Leveraging Logistic Regression for Phishing Threat Identification

By: KUKUTLA TEJONATH REDDY, International Center for AI and Cyber Security Research and Innovations (CCRI), Asia University, Taiwan, tejonath45@gmail.com

Abstract:

Sophisticated phishing attacks continually threaten individuals and organisations, so cybersecurity defences need continuous improvement. Logistic regression is an efficient statistical method, and this article discusses its advantages and drawbacks. Because it handles binary classification problems, logistic regression is well suited to distinguishing phishing patterns from legitimate activity. The article also looks at how it can help detect phishing attacks, which have become pervasive over time, and so improve protection against cybercrime.

Introduction:

Modern phishing attacks have advanced considerably and pose a serious threat to individuals and organizations. As cybercrime grows more sophisticated, it demands large-scale, advanced security controls. Logistic regression, a popular statistical approach in data science, can help reveal increasingly advanced and targeted phishing. This article examines how logistic regression is applied to phishing detection to strengthen cybersecurity defences.

Understanding Logistic Regression:

Logistic regression is a statistical method for binary classification problems. For phishing detection, the goal is to classify emails or web pages into two categories: legitimate or phishing [2]. Unlike linear regression, which predicts a continuous outcome, logistic regression passes a weighted sum of the input features through the logistic (sigmoid) function, which maps the output to a value between zero and one. That value is interpreted as the probability that the input belongs to the positive class, here phishing [3].
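As a minimal sketch of the idea above, the following Python code shows the sigmoid function and how a weighted sum of features becomes a probability. The function names, the example weights, and the 0.5 decision threshold are illustrative assumptions, not values taken from the article or its references.

```python
import math


def sigmoid(z: float) -> float:
    """Map any real-valued score z to a value strictly between 0 and 1."""
    # Two branches keep math.exp from overflowing for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_proba(weights, bias, features):
    """Probability of the positive class (phishing) for one sample."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


def classify(probability, threshold=0.5):
    """Turn a probability into a label; the 0.5 threshold is an assumption."""
    return "phishing" if probability >= threshold else "legitimate"


# Hypothetical weights for two features, for illustration only.
p = predict_proba(weights=[1.5, -0.8], bias=-0.2, features=[1.0, 0.0])
label = classify(p)
```

In practice the threshold would be tuned to balance missed phishing attempts against false alarms.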

Feature Selection and Extraction:

Successful phishing detection with logistic regression depends on the features on which the model is built. These features are extracted from the raw data, such as emails and web pages. Features relevant to phishing detection include URL structure, domain age, availability of emails, IP addresses, and email content. Careful feature selection and extraction is crucial: it improves the model’s performance and curbs overfitting.
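To make feature extraction concrete, here is a small Python sketch that derives a few lexical features from a URL using only the standard library. The specific features and their names are assumptions chosen for illustration; the article does not prescribe an exact feature set, and features such as domain age or email content would need external data that this sketch does not cover.

```python
import re
from urllib.parse import urlparse


def extract_url_features(url: str) -> dict:
    """Return a few simple, illustrative lexical features for one URL."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    # A host made of four dot-separated number groups is treated as a raw IP.
    is_ip = re.fullmatch(r"\d{1,3}(\.\d{1,3}){3}", host) is not None
    return {
        "url_length": len(url),
        "num_dots_in_host": host.count("."),
        "num_hyphens_in_host": host.count("-"),
        "has_ip_host": int(is_ip),
        "has_at_symbol": int("@" in url),
        "uses_https": int(parsed.scheme == "https"),
    }


# Hypothetical URL, used only to show the output shape.
features = extract_url_features("http://192.168.0.1/login")
```

Each dictionary can then be turned into a numeric vector, with the keys in a fixed order, before it is passed to the model.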

Data Preprocessing:

The data must also be pre-processed to improve its quality and suitability for logistic regression. Typical steps are handling missing values, normalizing numerical features, encoding categorical variables, and resampling an imbalanced data set. A well pre-processed data set improves the precision and generalization of the model.
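The four preprocessing steps named above can be sketched in plain Python as follows. The concrete choices here (mean imputation, min-max scaling, one-hot encoding, and random oversampling of the minority class) are assumptions picked as common defaults; the article does not say which variants should be used.

```python
import random


def impute_mean(column):
    """Replace missing values (None) with the mean of the present values."""
    present = [v for v in column if v is not None]
    mean = sum(present) / len(present)
    return [mean if v is None else v for v in column]


def min_max_scale(column):
    """Rescale a numeric column to the range [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:  # constant column: avoid division by zero
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


def one_hot(column):
    """Encode a categorical column as 0/1 vectors, one slot per category."""
    categories = sorted(set(column))
    return [[int(v == c) for c in categories] for v in column]


def oversample_minority(rows, labels, seed=0):
    """Duplicate random minority-class rows until both classes are equal.

    Assumes binary labels 0/1 and at least one row in each class.
    """
    rng = random.Random(seed)
    by_class = {0: [], 1: []}
    for row, label in zip(rows, labels):
        by_class[label].append(row)
    minority = min(by_class, key=lambda k: len(by_class[k]))
    deficit = len(by_class[1 - minority]) - len(by_class[minority])
    extra = [rng.choice(by_class[minority]) for _ in range(deficit)]
    return list(rows) + extra, list(labels) + [minority] * deficit
```

Resampling should be applied to the training portion only, so that the test data still reflects the real class balance.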

Training the Logistic Regression Model:

Once the dataset has been prepared, the next step is to train the logistic regression model. During training, the algorithm learns a weight for each feature so that the predicted probabilities match the known labels as closely as possible. Cross-validation is commonly used to check that the model’s accuracy holds up on data it was not trained on.
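The sketch below shows one way the weights could be learned: batch gradient descent on the log-loss, plus a helper that produces k-fold cross-validation splits. The learning rate, epoch count, and function names are assumptions for illustration; production systems would normally rely on an established library rather than hand-written training code.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def train_logistic_regression(X, y, learning_rate=0.1, epochs=1000):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n_samples, n_features = len(X), len(X[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for features, label in zip(X, y):
            z = bias + sum(w * x for w, x in zip(weights, features))
            error = sigmoid(z) - label  # gradient of log-loss w.r.t. z
            for j, x in enumerate(features):
                grad_w[j] += error * x
            grad_b += error
        # Step against the average gradient.
        weights = [w - learning_rate * g / n_samples
                   for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n_samples
    return weights, bias


def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    for i in range(k):
        train = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train, folds[i]


# Toy data with one feature; label 1 stands for phishing.
X = [[0.0], [0.1], [0.9], [1.0]]
y = [0, 0, 1, 1]
weights, bias = train_logistic_regression(X, y)
```

For cross-validation, the model is trained once per split on the training indices and scored on the held-out indices, and the scores are averaged.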

Evaluating Model Performance:

Finally, the performance of the trained model must be assessed on an independent test data set. Common metrics for binary classification are accuracy, precision, recall, and F1-score. These metrics show how well the model distinguishes phishing from legitimate activity.
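As an illustration of the four metrics, the following Python sketch computes them from true and predicted labels, treating 1 (phishing) as the positive class. The function names and the sample labels are made up for the example and do not come from any experiment in the article.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives (1 = phishing)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred):
    """Return accuracy, precision, recall and F1 as a dictionary."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up labels: 3 phishing and 5 legitimate samples.
metrics = classification_metrics(
    y_true=[1, 1, 1, 0, 0, 0, 0, 0],
    y_pred=[1, 1, 0, 1, 0, 0, 0, 0],
)
```

On imbalanced phishing data, accuracy alone can look high even when many attacks are missed, which is why precision and recall are reported alongside it.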

Benefits of Logistic Regression in Phishing Detection:

Interpretability: Logistic regression shows clearly how each feature affects the classification result. This matters because it is vital to understand what each indicator contributes when identifying a malware-based phishing attack.

Efficiency: Logistic regression is computationally efficient and therefore suitable for real-time or near-real-time phishing detection systems.

Adaptation: The model is easy to update, so newly emerging phishing techniques can be incorporated quickly [6] [4].
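The interpretability benefit can be illustrated with odds ratios. In logistic regression, exponentiating a feature's weight gives the factor by which the odds of the positive class change when that feature increases by one unit, with the other features held fixed. The feature names and weights below are hypothetical values for illustration, not results from a trained model.

```python
import math


def odds_ratios(feature_names, weights):
    """Map each feature to exp(weight), its multiplicative effect on the odds."""
    return {name: math.exp(w) for name, w in zip(feature_names, weights)}


# Hypothetical learned weights for three binary URL features.
names = ["has_ip_host", "has_at_symbol", "uses_https"]
example_weights = [math.log(4.0), 0.0, -math.log(2.0)]
ratios = odds_ratios(names, example_weights)
```

A ratio above 1 means the feature pushes a sample towards the phishing class, a ratio below 1 pushes it towards the legitimate class, and a ratio of exactly 1 means the feature has no effect.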

Conclusion:

Logistic regression is a valuable asset in phishing detection because of its simplicity and interpretability. By integrating this technique into a broader cybersecurity strategy, organizations can enhance their security measures and stay one step ahead of cyber threats. As phishing attacks continue to evolve, logistic regression will play an increasingly important role in strengthening cybersecurity.

References

[1] Bapat, R., Mandya, A., Liu, X., Abraham, B., Brown, D. E., Kang, H., & Veeraraghavan, M. (2018, April). Identifying malicious botnet traffic using logistic regression. In 2018 systems and information engineering design symposium (SIEDS) (pp. 266-271). IEEE.
[2] Soumya, T. R., Ramesh, P., Rosy, N. A., Pughazendi, N., Padmapriya, S., & Khilar, R. (2022, September). Logistic Regression based Machine Learning Technique for Phishing Website Detection. In 2022 4th International Conference on Inventive Research in Computing Applications (ICIRCA) (pp. 683-686). IEEE.
[3] Chiramdasu, R., Srivastava, G., Bhattacharya, S., Reddy, P. K., & Gadekallu, T. R. (2021, August). Malicious url detection using logistic regression. In 2021 IEEE International Conference on Omni-Layer Intelligent Systems (COINS) (pp. 1-6). IEEE.
[4] Vanitha, N., & Vinodhini, V. (2019). Malicious-URL detection using logistic regression technique. International Journal of Engineering and Management Research, 9(6), 108-113.
[5] Rymarczyk, T., Kozłowski, E., Kłosowski, G., & Niderla, K. (2019). Logistic regression for machine learning in process tomography. Sensors, 19(15), 3400.
[6] Bhatti, M. H., Khan, J., Khan, M. U. G., Iqbal, R., Aloqaily, M., Jararweh, Y., & Gupta, B. (2019). Soft computing-based EEG classification by optimal feature selection and neural networks. IEEE Transactions on Industrial Informatics, 15(10), 5747-5754.
[7] Sahoo, S. R., & Gupta, B. B. (2019). Hybrid approach for detection of malicious profiles in twitter. Computers & Electrical Engineering, 76, 65-81.
[8] Gupta, B. B., Yadav, K., Razzak, I., Psannis, K., Castiglione, A., & Chang, X. (2021). A novel approach for phishing URLs detection using lexical based machine learning in a real-time environment. Computer Communications, 175, 47-57.
[9] Cvitić, I., Perakovic, D., Gupta, B. B., & Choo, K. K. R. (2021). Boosting-based DDoS detection in internet of things systems. IEEE Internet of Things Journal, 9(3), 2109-2123.

Cite As

REDDY K. T. (2023) Leveraging Logistic Regression for Phishing Threat Identification, Insights2Techinfo, pp.1
