By: Rekarius, CCRI, Asia University, Taiwan
Abstract
Phishing attacks have become one of the most damaging cyberattacks, responsible for a large percentage of losses, such as financial theft and data breaches that affect companies or individuals. This study examines how three papers that use AI-driven approaches are utilized for a phishing attack detection system. These three AI-driven approaches employ Reinforcement Learning, Machine Learning techniques such as Logistic Regression, Random Forest, Support Vector Ma- chine (SVM), CatBoost, and XGBoost, as well as Feedforward Deep Neural Networks (FC-DNN), Long Short-Term Memory (LSTM) and Inception-based Convolutional Networks (ID CNN).
Keywords Phishing Detection, AI-driven Approaches Detection, Reinforcement Learning, Machine Learning Techniques, Deep Learning
Introduction
The massive global use of the internet today has opened up opportunities for various cybercrimes. Phishing has established itself as one of the most pervasive and damaging forms of cyberattacks, responsible for a large percentage of data breaches, financial theft, and personal information compromises [3]. Email phishing attacks are among the most prevalent and dangerous types of cybercrime [2]. Several solutions have been developed for detecting phishing attacks. How- ever, they mainly rely on a blacklist approach, which has been inefficient in detecting a zero-day phishing attack. This study examines how three papers that use AI-driven approaches are utilized for a phishing attack detection system. These three AI-driven approaches employ Reinforcement Learning, Machine Learning techniques such as Logistic Regression, Random Forest, Support Vector Machine (SVM), CatBoost, and XGBoost, as well as Feedforward Deep Neural Networks (FC-DNN), Long Short-Term Memory (LSTM) and Inception-based Convolutional Networks (ID CNN).
Proposed Methodology
The methodology used in this study focuses on the proposed methodology from three selected papers on AI-driven phishing detection.
Reinforcement Learning
Reinforcement learning was developed to address the shortcomings of tradi- tional phishing detection strategies such as rule-based methods and heuristic approaches.
Data Collection and Feature Extraction
In this method there are 2 datasets used:
-
- Dataset 1(Real-World Dataset): A dataset consisting of 5000 emails, evenly distributed with 2500 phishing emails (sourced from PhishTank and OpenPhish archives) and 2500 benign emails (collected from public corporate email traffic and open datasets). Dataset 1 was split using an 80/20 ratio (4000 for training and 1000 for testing);
- Dataset 2 (Synthetic Phishing Dataset): A synthetic dataset containing 1000 phishing emails generated using templates based on real-world phishing attack patterns, in- cluding domain spoofing, urgent calls-to-action, credential harvesting requests, and misleading hyperlinks. Dataset 2 was reserved exclu- sively for external validation without prior model exposure.
They focused on extracting both content-based and header- based features, including the following:
URL Features: Length of URLs, number of subdomains, presence of IP addresses, and suspicious domain patterns;
HTML Structure: Presence of embedded scripts, hidden form fields, and iframe usage;
Sender Reputation: Mismatched “From” and “Reply-To” addresses, domain age, and SPF/DKIM validation status;
Keyword Patterns: Occurrence of phishing-related keywords like “urgent”, “verify your account”, and “password reset”.
Reinforcement Learning Model
Their proposed phishing detection system is based on DQN architecture, a widely adopted RL technique that combines Q-Learning with deep neural net- works to handle large, complex feature spaces effectively [6].
Agent-Environment Interaction
In the RL framework, the phishing detection model operates as an agent in- teracting with an environment (the dataset). The environment presents emails with various features, and the agent’s task is to classify each as either phishing or benign. The interaction is formalized as a Markov Decision Process (MDP),
defined by the tuple (S, A, R, P, ), where the following variables are used: S = Set of states (email features); A = Set of actions (phishing, benign); R = Re- ward function; P = State transition probability; = Discount factor for future rewards.
Q-Learning Algorithm
The model was trained in over 10,000 iterations to ensure convergence. The reward system was defined as follows:

TP (True Positive): Correctly identified phishing email
TN (True Negative): Correctly identified legitimate email
FP (False Positive): Legitimate email misclassified as phishing FN (False Negative): Phishing email misclassified as legitimate
Table 1 below summarizes the phishing detection accuracy and false positive rates during the training phase across different models.
Table 1: Comparative performance metrics (training phase)
Model | Accuracy (%) | False Positive Rate (%) |
RL-Based DQN | 95 | 2 |
SVM | 85 | 12 |
Random Forest | 87 | 10 |
Linking Back to Previous Strategies
Unlike rule-based systems that rely on static thresholds [4], or heuristic models prone to high false positives [1], RL-based approach learns from real-time feedback, dynamically adjusting its decision-making process. It also overcomes the data-dependency issues seen in supervised ML techniques [5], offering the following:
- Real-Time Adaptation: Continuous learning from new phishing patterns;
- Lower False Positive Rates: Optimized through tailored reward mechanisms;
- Scalability: Effective across diverse phishing attack vectors.
This methodological framework demonstrates how RL bridges the gaps left by traditional phishing detection models, providing
Machine Learning Techniques 1
The methodology employed for this study is based on a systematic approach to data collection and preparation for machine learning applications. The dataset for this study was gathered from two sources, both of which were obtained via Kaggle. The first dataset consists of 18,650 emails, with 7,328 categorised as phishing attacks and 11,322 as safe. The second sample consists of 5,128 emails, with 2,868 categorised as safe and 2,239 as phishing. The dataset’s main features are ”Email Text” and ”Email Type”, where the email text is used as input and the email type (safe or phishing) is used as the classification label.
The dataset was split into training and testing subsets with an 80:20 ratio, resulting in 14,920 emails for training and 3,730 for testing in the first dataset, and 3,597 for training and 1,531 for testing in the second dataset. This ensures that the model is trained on a substantial portion of the data while still being evaluated on unseen examples to measure generalisation ability.
Data preprocessing is a crucial step in building a robust phishing email detection system. This study carefully handles null values by removing incomplete data entries to maintain the integrity of the dataset. Text cleaning techniques, such as removal of special characters, stop word removal, and lemmatisation, are applied to standardise email content. Feature extraction and engineering are then used to convert the text data into structured numerical representations, making it suitable for machine learning models.
The machine learning models used in the email categorisation system were carefully chosen and supported by empirical data. CatBoost, Support Vector Machine (SVM), Logistic Regression, Random Forest, and XGBoost were chosen for their superior performance in classification tests.
Machine Learning Techniques 2
In this section described how they derived their proposed features for predicting phishing PDC web pages. They also describe and illustrate the proposed system architecture of our prediction model.
Phishing web page prediction features
First, [7] describe how they identified PDC web pages, the web pages that collect users’ personal data. From their observations, PDC web pages usually consist of at least one word or phrase (we term as PDC phrase) in their structure and contents which is related to the specific personal data being collected. The importance of differentiating PDC from non-PDC web pages is that we avoid predicting web pages which do not pose any phishing threat. This will avoid degrading of user’s experience when accessing the non-PDC web pages and the potential false positives on these web pages which will prevent users from accessing them, causing significant implications to users and websites’ owners (e.g., denial of services and losses of revenues).
To determine common PDC phrases used by PDC web pages, we investigated 100 samples of phishing and legitimate web pages capturing the data from which we obtained a list of 43 PDC phrases (indicated in Table 2).
Table 2: Common PDC Phrases in PDC Web Pages
Category | PDC Phrases |
Login Credentials | Username, Login, Sign in, Sign up, Create an account, Forgot password, Forgotten your password, Reset password, Remember me, Passcode, Secret key, Security key, Security number, ID |
Social Media Login | Log in with Facebook, Sign in with Facebook, Log in with Twitter, Sign in with Twitter, Log in with Google, Sign in with Google |
Account and Member- ship | Customer number, Membership number, Account number, Ac- count |
Financial Information | Debit card number, Credit card number, Card number, Card- holder, Billing information, Billing address, Security code, Expiry date |
Other Personal Data | Email, PIN, Date of birth, Birth date, Phone |
System architecture of the prediction model
Training Process
Our prediction model based on the proposed features earlier is built using the following six-step process (illustrated in Figure 1 as steps 1 to 6)

Figure 1: Training Process
Prediction Process The process of predicting a new web page requested by a user is shown in Figure 2.

Figure 2: Prediction Process
Conclusion
Based on the studies conducted in these three papers, it is concluded that the reinforcement learning-based phishing detection model uses Deep Q-Network (DQN) to improve adaptability, accuracy, and continuous learning, achieving 95% accuracy, 96% precision, 94% recall, 2% false positive rate, and AUC of 0.92. The XGBoost model achieves an outstanding accuracy of 98%, with 95% precision and 99% recall for phishing emails, demonstrating strong capabilities in phishing email classification. The CatBoost model achieves 98% accuracy, with 98% recall for phishing emails and 96% precision, making it highly efficient in identifying phishing attempts while maintaining minimal classification errors.
References
- Carlo Marcelo Revoredo da Silva, Eduardo Luzeiro Feitosa, and Vini- cius Cardoso Garcia. Heuristic-based strategy for phishing prediction: A survey of url-based approach. Computers & Security, 88:101613, 2020.
- Opeyemi Isaiah Enitan. An ai-powered approach to real-time phishing de- tection for cybersecurity. International Journal, 12(6), 2023.
- Haidar Jabbar and Samir Al-Janabi. Ai-driven phishing detection: Enhanc- ing cybersecurity with reinforcement learning. Journal of Cybersecurity and Privacy, 5(2):26, 2025.
- Haidar Jabbar, Samir Al-Janabi, and Francis Syms. Ai-integrated cyber security risk management framework for it projects. In 2024 International Jordanian Cybersecurity Conference (IJCC), pages 76–81. IEEE, 2024.
- V Santhana Lakshmi and MS Vijaya. Efficient prediction of phishing web- sites using supervised learning algorithms. Procedia Engineering, 30:798– 805, 2012.
- Seow Wooi Liew, Nor Fazlida Mohd Sani, Mohd Taufik Abdullah, Razali Yaakob, and Mohd Yunus Sharum. An effective security alert mechanism for real-time phishing tweet detection on twitter. Computers & security, 83:201–207, 2019.
- Thomas Nagunwa. Ai-driven approach for robust real-time detection of zero- day phishing websites. International Journal of Information and Computer Security, 23(1):79–118, 2024.
- Arya, V., Gaurav, A., Gupta, B. B., Hsu, C. H., & Baghban, H. (2022, December). Detection of malicious node in vanets using digital twin. In International Conference on Big Data Intelligence and Computing (pp. 204-212). Singapore: Springer Nature Singapore.
- Lu, Y., Guo, Y., Liu, R. W., Chui, K. T., & Gupta, B. B. (2022). GradDT: Gradient-guided despeckling transformer for industrial imaging sensors. IEEE Transactions on Industrial Informatics, 19(2), 2238-2248.
- Sedik, A., Maleh, Y., El Banby, G. M., Khalaf, A. A., Abd El-Samie, F. E., Gupta, B. B., … & Abd El-Latif, A. A. (2022). AI-enabled digital forgery analysis and crucial interactions monitoring in smart communities. Technological Forecasting and Social Change, 177, 121555.
Cite As
Rekarius (2025) AI-Driven Real-Time Detection of Zero-Day Phishing Threats for Enhanced Cybersecurity, Insights2Techinfo, pp.1