Machine Learning Models that Excel in Detecting Phishing Attacks

By: Jampula Navaneeth¹

¹Vel Tech University, Chennai, India

²International Center for AI and Cyber Security Research and Innovations, Asia University, Taiwan

Email: navaneethjampula@gmail.com

Abstract

Phishing remains one of the most common threats in cybersecurity: an attacker tries to lure a victim into submitting sensitive information. Machine learning (ML) models have become vital in countering these attacks because they provide efficient, automatic detection. This article discusses machine learning approaches and how they perform when trained on datasets whose features are effective in distinguishing a phishing website from a normal, safe one. The ability to tell these sites apart is important in modern internet browsing.

Keywords: Phishing, Cyber Security, Machine Learning

Introduction

Phishing has become both prevalent and dynamic in today's information technology environment, and it affects individual users and companies alike. In a typical attack, the attacker poses as a trustworthy party to make targets reveal personal information such as passwords or financial details [1]. This is where machine learning (ML) comes into play as an efficient and flexible approach to phishing detection [2]. This article discusses some of the best ML models for detecting phishing attacks and highlights their strengths for the cybersecurity community.

ML Models for Detecting Phishing Attacks

Figure 1: Various Models for Phishing Detection

Random Forest: Random Forest is an ensemble learning model that constructs many decision trees and combines their predictions to improve accuracy. In phishing detection, Random Forest can examine features such as URL length, domain age, and whether the URL contains special characters, which makes it easier to differentiate a genuine website from a phishing one. The model is simple to use, can accommodate large datasets, and resists overfitting to a given training set [3].
Support Vector Machines: An SVM is a strong classifier that selects the hyperplane that best separates the classes of data. In phishing detection, SVMs are most useful when the data has many features, such as URL contents or the body text of email messages. Because it can map the data into a higher-dimensional space, an SVM can separate phishing from non-phishing instances even in complicated cases [3].
Neural Networks: Deep learning models based on neural networks have been highly successful at identifying phishing attacks. Loosely modeled on the human brain, they can capture complex patterns and relationships within the data. Convolutional neural networks (CNNs) and recurrent neural networks (RNNs) have been applied to analysing phishing URLs, emails, and web content. Because they learn features automatically and improve as new data arrives, they are effective tools against changing phishing strategies [4].
Gradient Boosting Machines: XGBoost and LightGBM are among the best-known implementations of gradient boosting machines (GBMs). The technique builds a sequence of models one after another, each correcting the errors made by its predecessors. GBMs are effective for phishing detection because they can deal with imbalanced datasets and concentrate on the examples that are hardest to classify. They are useful in real-time phishing detection, where both speed and accuracy are essential [5].
Naive Bayes: Naive Bayes is a probabilistic classifier based on Bayes' theorem that assumes the features are independent of one another. Despite this simplifying assumption, Naive Bayes has been used successfully in phishing detection, especially in email filters [3]. Combined with feature extraction, it can classify a given URL or email as phishing or not with reasonable accuracy in reasonable time, which is why it can be used in real-time applications [6].
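
To make the idea of feature extraction followed by classification more concrete, the following is a minimal Python sketch and not an implementation taken from the cited works. It derives a few binary lexical features of the kind mentioned above (URL length, special characters) and feeds them to a small Bernoulli Naive Bayes classifier written with the standard library only. The function url_features, the class TinyBernoulliNB, the thresholds (54 characters, 3 dots), and the six training URLs are all assumptions made up for illustration; the URLs use reserved example domains and documentation IP ranges. Domain age is left out because it would require a WHOIS lookup.

```python
import math
from collections import Counter
from urllib.parse import urlparse


def url_features(url):
    """Turn a URL into a tuple of binary lexical features (illustrative only)."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return (
        int(len(url) > 54),                    # long URL (threshold is an assumption)
        int("@" in url),                       # '@' special character present
        int(host.count(".") > 3),              # many dots in the host (assumed threshold)
        int("-" in host),                      # hyphen in the host name
        int(host.replace(".", "").isdigit()),  # host is a raw IPv4 address
        int(parsed.scheme != "https"),         # scheme is not HTTPS
    )


class TinyBernoulliNB:
    """Bernoulli Naive Bayes with Laplace smoothing (hypothetical helper)."""

    def fit(self, X, y):
        self.classes = sorted(set(y))
        counts = Counter(y)
        n_features = len(X[0])
        # Log prior P(class) estimated from label frequencies.
        self.log_prior = {c: math.log(counts[c] / len(y)) for c in self.classes}
        # Smoothed estimate of P(feature_j = 1 | class).
        self.p = {}
        for c in self.classes:
            rows = [x for x, label in zip(X, y) if label == c]
            self.p[c] = [
                (sum(r[j] for r in rows) + 1) / (len(rows) + 2)
                for j in range(n_features)
            ]
        return self

    def predict(self, x):
        # The independence assumption lets the per-feature log-likelihoods add up.
        def log_posterior(c):
            return self.log_prior[c] + sum(
                math.log(p if v else 1.0 - p) for v, p in zip(x, self.p[c])
            )
        return max(self.classes, key=log_posterior)


# Made-up toy data: label 1 = phishing, 0 = legitimate.
train = [
    ("https://www.example.com/login", 0),
    ("https://docs.example.org/guide/start", 0),
    ("https://shop.example.net/cart", 0),
    ("http://192.0.2.10/secure/update/account/verify/session/id/12345", 1),
    ("http://secure-login.example.com.account.verify.example.net/signin", 1),
    ("http://user@pay-update.example.org/confirm", 1),
]
X = [url_features(u) for u, _ in train]
y = [label for _, label in train]
model = TinyBernoulliNB().fit(X, y)

for url in ("https://www.example.org/about",
            "http://login-verify.example.com.secure.session.example.net/"):
    print(url, "->", model.predict(url_features(url)))
```

Six hand-written URLs are far too few to say anything about real accuracy; the sketch only shows the shape of the pipeline. The same feature tuples could equally be passed to a Random Forest, SVM, or gradient boosting implementation from an ML library.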

Conclusion

Machine learning models are a major asset in identifying phishing attacks because detection is automated and the models can learn about new threats. Although each model has its own strengths, the choice of model is usually determined by the nature of the problem at hand, the requirement for real-time detection, and the power of the available hardware. Because phishing tactics change constantly, research and development of advanced ML models will remain essential for protecting sensitive data.

References

[1] M. Rahaman, S. S. Bakkireddygari, S. Chattopadhyay, A. L. Gomez, V. Arya, and S. Bansal, “Infrastructure and Network Security,” in Metaverse Security Paradigms, IGI Global, 2024, pp. 108–144. doi: 10.4018/979-8-3693-3824-7.ch005.
[2] E. Gandotra and D. Gupta, “An Efficient Approach for Phishing Detection using Machine Learning,” in Multimedia Security: Algorithm Development, Analysis and Applications, K. J. Giri, S. A. Parah, R. Bashir, and K. Muhammad, Eds., Singapore: Springer, 2021, pp. 239–253. doi: 10.1007/978-981-15-8711-5_12.
[3] S. Hossain, D. Sarma, and R. Joyti, “Machine Learning-Based Phishing Attack Detection,” IJACSA, vol. 11, no. 9, 2020, doi: 10.14569/IJACSA.2020.0110945.
[4] O. K. Sahingoz, S. Işılay Baykal, and D. Bulut, “PHISHING DETECTION FROM URLS BY USING NEURAL NETWORKS,” in Computer Science & Information Technology (CS & IT), AIRCC Publication Corporation, Dec. 2018, pp. 41–54. doi: 10.5121/csit.2018.81705.
[5] K. Omari, “Phishing Detection using Gradient Boosting Classifier,” Procedia Computer Science, vol. 230, pp. 120–127, Jan. 2023, doi: 10.1016/j.procs.2023.12.067.
[6] L. Triyono, R. Gernowo, P. Prayitno, M. Rahaman, and T. R. Yudantoro, “Fake News Detection in Indonesian Popular News Portal Using Machine Learning For Visual Impairment,” JOIV : International Journal on Informatics Visualization, vol. 7, no. 3, pp. 726–732, Sep. 2023, doi: 10.30630/joiv.7.3.1243.
[7] K. C. Li, B. B. Gupta, and D. P. Agrawal, Eds., Recent advances in security, privacy, and trust for internet of things (IoT) and cyber-physical systems (CPS), 2020.
[8] P. Chaudhary, B. B. Gupta, C. Choi, and K. T. Chui, “Xsspro: Xss attack detection proxy to defend social networking platforms,” in Computational Data and Social Networks: 9th International Conference, CSoNet 2020, Dallas, TX, USA, December 11–13, 2020, Proceedings 9, Springer International Publishing, 2020, pp. 411–422.
[9] B. B. Gupta, A. Gaurav, V. Arya, W. Alhalabi, D. Alsalman, and P. Vijayakumar, “Enhancing user prompt confidentiality in Large Language Models through advanced differential encryption,” Computers and Electrical Engineering, vol. 116, 109215, 2024.

Cite As

Navaneeth J. (2024) Machine Learning Models that Excel in Detecting Phishing Attacks, Insights2Techinfo, pp.1
