AI-Based Intrusion Detection Systems

By: Dadapeer Agraharam Shaik, Department of Computer Science and Technology, Student of Computer Science and Technology, Madanapalle Institute of Technology and Science, Angallu, 517325, Andhra Pradesh.

Abstract:

Artificial Intelligence (AI) has had a major influence on cybersecurity, especially in the creation of Intrusion Detection Systems (IDS). These systems use machine learning and deep learning to identify and counter unauthorized activity in a network in real time. IDS based on AI methods can therefore offer greater speed, accuracy, and flexibility than purely mathematical models that rely on comparing sets of parameters. This paper outlines and explains the architectural model of AI-based IDS, its advantages, limitations, and potential, as well as its significance in modern security systems.

Keywords: AI, Cyber-threat, Cyber-security, intrusion detection system.

1.Introduction

In the modern world, where attacks are becoming more complex and diverse, security measures that do not use artificial intelligence are proving ineffective. Intrusion Detection Systems (IDS) are now widely deployed as an important element of security, intended to reduce the incidence of unauthorized activity. AI has transformed IDS and made them more efficient and effective. AI-based IDS use sophisticated machine learning and deep learning techniques to scrutinize network traffic and identify possible threats in the shortest possible time. These systems can learn from new data, which suits them to the ever-evolving landscape of threats and attacks and makes them valuable in cybersecurity. The deployment of AI in IDS has not only improved the capacity of these systems to detect attacks but has also helped reduce the rate of false alarms, making IDS a more complete defence structure. This paper explores the structure of AI-based IDS, the advantages of the new generation of systems over classical ones, the problems in implementing them, and the prospects for further development of the technology.

2.Classification and Comparison of AI-Based Intrusion Detection Systems

Security is now an essential need because of the greatly increased use of data and the internet. Current research focuses on building systems that can reliably detect irregular network traffic. As more industries than ever adopt new technologies such as cloud computing, the number of security threats has grown sharply. Satisfactory solutions must be provided to keep cloud and network facilities functioning. The use of artificial intelligence in intrusion detection is considered one of the promising approaches to identifying attacks and classifying them.

The principal aim of an Intrusion Detection System (IDS) is to inspect network packets closely and alert administrators when something suspicious is found. An IDS serves as an effective backup security measure in the network when other conventional technologies prove ineffective. In cloud infrastructure the volume of traffic is enormous, and conventional security technologies cannot detect and differentiate attacks. An IDS can operate in different modes, namely binary and multiclass classification. In binary classification, the output is either normal or attack, while in multiclass classification the output takes one of several values representing different types of attack. The focus is therefore on both detecting intruders and classifying attacks according to the situation. Prevention functions can also be added to stop potential intrusions before they occur. The combined mechanism is then termed an intrusion detection and prevention system.

IDS are classified by the detection method used, which is either signature-based or anomaly-based. Signature-based IDS look for known patterns and compare them against observed events. While this method can successfully identify known attacks, it can be ineffective at identifying new ones. An IDS can also work by discovering deviations from normal behaviour; this is called anomaly-based detection. This detection mechanism is based on three elements: parameterization, training, and detection.
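
The difference between the two detection methods can be illustrated with a minimal sketch. It is not taken from any cited system: the signature strings, the packets-per-second feature, and the three-standard-deviation threshold are all assumptions chosen for illustration.

```python
from statistics import mean, stdev

# Hypothetical signatures: substrings assumed to mark known attacks.
KNOWN_SIGNATURES = {"' OR 1=1 --", "../../etc/passwd"}


def signature_match(payload: str) -> bool:
    """Signature-based check: flag payloads that contain a known pattern."""
    return any(sig in payload for sig in KNOWN_SIGNATURES)


def fit_baseline(normal_values):
    """Parameterization and training: summarize normal behaviour by mean and spread."""
    return mean(normal_values), stdev(normal_values)


def is_anomalous(value, baseline, threshold=3.0):
    """Detection: flag values more than `threshold` standard deviations from the mean."""
    mu, sigma = baseline
    return abs(value - mu) > threshold * sigma


# Assumed feature: packets per second observed during normal operation.
baseline = fit_baseline([100, 110, 95, 105, 98, 102])

print(signature_match("GET /index.php?id=' OR 1=1 --"))  # known pattern present
print(is_anomalous(104, baseline))  # close to the learned baseline
print(is_anomalous(900, baseline))  # far from the learned baseline
```

The signature check can only recognize patterns it already holds, whereas the anomaly check flags anything far from the learned baseline, including traffic it has never seen before.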

Most of the studies referred to rely on the traditional models described above, and none of these methods proves sufficiently powerful. The problem can be framed in terms of machine learning (ML) and deep learning (DL), which is a branch of ML. The output of an ML model for this problem is a label, which takes the value normal or attack. While earlier ML models gave better results than conventional algorithms, they did not achieve high accuracy together with low false alarm rates. It has been reported that DL methods for intrusion detection can anticipate attacks with high probability. In other words, DL-based IDS are considered more effective in the prediction phase than classical ML-based IDS.
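
Accuracy and false alarm rate, the two measures mentioned above, can both be computed from the counts of a binary confusion matrix. The sketch below uses made-up labels purely for illustration; the function names are hypothetical, and the false alarm rate is taken here as FP / (FP + TN), which is the usual definition.

```python
def confusion_counts(y_true, y_pred, positive="attack"):
    """Count true/false positives and negatives for a binary IDS output."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1  # normal traffic flagged as attack: a false alarm
        else:
            if truth == positive:
                fn += 1  # missed attack
            else:
                tn += 1
    return tp, fp, tn, fn


def accuracy(tp, fp, tn, fn):
    """Fraction of all samples that were labelled correctly."""
    return (tp + tn) / (tp + fp + tn + fn)


def false_alarm_rate(tp, fp, tn, fn):
    """Fraction of normal samples that were wrongly flagged as attacks."""
    return fp / (fp + tn) if (fp + tn) else 0.0


# Illustrative labels only, not results from any study.
y_true = ["normal"] * 4 + ["attack"] * 4
y_pred = ["normal", "normal", "normal", "attack",
          "attack", "attack", "attack", "normal"]

counts = confusion_counts(y_true, y_pred)
print(counts)
print(accuracy(*counts), false_alarm_rate(*counts))
```

A model can score well on accuracy while still raising many false alarms when normal traffic dominates the data, which is why the two measures are reported together.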

Fig.1 Classification and comparison of AI based Intrusion Detection System

For ML, DL, and ensemble-based IDS, the findings of related studies have been summarized in the form of a classification. Performance and algorithms are the main components of this classification. Various performance measures have been used to compare the results of existing studies on ML, DL, and ensemble-based IDS.

The following sub-section presents a classification of AI-based IDS. The classification gives a detailed view of AI-based intrusion detection techniques from the perspective of ML, DL, and ensemble learning. Its novelty lies in gathering together all the ML, DL, and ensemble learning techniques that are able to identify intruders. This classification may be useful to researchers in the field of cybersecurity who work on detecting and classifying attacks.[1]

3.SVM-Based Intrusion Detection System

A Support Vector Machine (SVM) is a supervised machine learning algorithm grounded in Statistical Learning Theory (SLT) and suitable for classification and regression. The idea of an SVM is to find the hyperplane that separates the classes with the maximum margin on either side. The support vectors, which are examples from the training data that lie closest to the hyperplane, define the SVM model.

For the classifier in the proposed architecture, an SVM classifier from the ‘scikit-learn’ machine learning package is used. It is configured with an ‘rbf’ kernel, and the other model parameters are left at their default values. Once trained, the model assigns new incoming data (monitored network traffic) to one of two classes: Normal and Attack. The training and prediction phases of the IDS are described in the following sections of the architecture.
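
A minimal sketch of such a classifier is given below. It assumes scikit-learn and NumPy are installed, and it uses synthetic, well-separated data as a stand-in for real traffic features. The feature scaling step and the train/test split are assumptions added for illustration and are not described in the text above.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-in for preprocessed traffic features (NOT real IDS data).
normal = rng.normal(loc=0.0, scale=1.0, size=(200, 4))
attack = rng.normal(loc=4.0, scale=1.0, size=(200, 4))
X = np.vstack([normal, attack])
y = np.array([0] * 200 + [1] * 200)  # 0 = Normal, 1 = Attack

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Assumption: features are standardized before training.
scaler = StandardScaler().fit(X_train)

# RBF kernel, all other parameters left at scikit-learn's defaults.
clf = SVC(kernel="rbf")
clf.fit(scaler.transform(X_train), y_train)

# Prediction phase: assign each held-out record to Normal (0) or Attack (1).
pred = clf.predict(scaler.transform(X_test))
test_accuracy = float((pred == y_test).mean())
print(f"held-out accuracy: {test_accuracy:.3f}")
```

On real traffic the classes overlap far more than in this toy data, so the accuracy printed here says nothing about performance on an actual dataset.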

Classification performance is evaluated using the NSL-KDD dataset, which addresses problems encountered in the original KDD’99 dataset. The data used in this work derive from the KDD’99 archive, which consists of network connection records obtained from raw data gathered by Lincoln Labs at MIT for the DARPA 1998 IDS evaluation. NSL-KDD contains a training dataset with 21 attacks and a test dataset with 37 attacks, categorized into five classes: normal, probe, denial of service (DoS), user to root (U2R), and remote to local (R2L). For this study, the dataset is divided into two classes, Normal and Attack, for the task of binary classification. [2]
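
The reduction from individual attack names to the five classes, and from there to the two binary classes, can be sketched as a simple lookup. The mapping below covers only a small, illustrative subset of NSL-KDD attack names, and the function names are hypothetical.

```python
# Illustrative subset of NSL-KDD record labels mapped to the five classes.
CATEGORY_OF = {
    "normal": "normal",
    "neptune": "DoS",
    "smurf": "DoS",
    "ipsweep": "probe",
    "portsweep": "probe",
    "guess_passwd": "R2L",
    "buffer_overflow": "U2R",
    "rootkit": "U2R",
}


def to_multiclass(label: str) -> str:
    """Multiclass mode: map a record label to one of the five classes."""
    return CATEGORY_OF[label]


def to_binary(label: str) -> str:
    """Binary mode: every label other than 'normal' counts as an attack."""
    return "Normal" if label == "normal" else "Attack"


labels = ["normal", "neptune", "ipsweep", "rootkit"]
print([to_multiclass(name) for name in labels])
print([to_binary(name) for name in labels])
```

Binary mode answers only whether a record is an attack, while multiclass mode retains the attack family.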

4.Methodology

Data Collection

The CSE-CIC-IDS2018 dataset is used for data collection. The dataset is built around two important parameters that affect the performance of an IDS.

Data Pre-processing

In the data pre-processing stage, several steps are performed, including:

  • Denoising: Cleaning the data and eliminating irrelevant records.
  • Contrast Improvement: Enhancing contrast so that important patterns are clearly distinguishable to the human eye.
  • Data Sorting: Arranging the data so that it can be processed more quickly.

Data Transferring

Symbolic features of the dataset are converted into integer type. For instance, the CSE-CIC-IDS2018 dataset contains both symbolic and integer features. Symbolic features such as TCP and UDP can be replaced by corresponding integer values so that all features share a consistent numeric type.[3]
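
One simple way to perform this conversion is to assign each distinct symbolic value an integer code. The sketch below is illustrative only: the field names `protocol` and `dst_port` and the helper `build_encoder` are hypothetical and are not taken from the dataset.

```python
def build_encoder(values):
    """Assign each distinct symbolic value an integer, in sorted order for reproducibility."""
    return {value: code for code, value in enumerate(sorted(set(values)))}


# Hypothetical records with one symbolic and one integer feature.
records = [
    {"protocol": "tcp", "dst_port": 443},
    {"protocol": "udp", "dst_port": 53},
    {"protocol": "tcp", "dst_port": 80},
]

encoder = build_encoder(rec["protocol"] for rec in records)

# Replace the symbolic value with its integer code; other fields are unchanged.
encoded = [{**rec, "protocol": encoder[rec["protocol"]]} for rec in records]

print(encoder)
print(encoded)
```

The same encoder has to be reused at prediction time so that a given symbol always maps to the same integer.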

Data Normalization

Normalization is carried out in three stages:

  1. Duplicate and Unwanted Data Removal: Removing duplicate records and data that are unnecessary for the analysis.
  2. Field Value Generalization: Standardizing specific field values.
  3. Null Value Handling: Setting fields that have no value, or whose value is an empty space, to zero. This step also makes the data easier to compare.
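
The three stages can be sketched as three small functions applied in order. The records and field names below are made up, and the generalization rule used here (trimming and lower-casing string values) is only one assumed example of standardizing field values.

```python
def remove_duplicates(records):
    """Stage 1: drop exact duplicate records, keeping the first occurrence."""
    seen, unique = set(), []
    for rec in records:
        key = tuple(sorted(rec.items()))
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


def generalize_fields(records):
    """Stage 2: standardize string values (assumed rule: trim and lower-case)."""
    return [
        {k: v.strip().lower() if isinstance(v, str) else v for k, v in rec.items()}
        for rec in records
    ]


def fill_nulls(records):
    """Stage 3: set missing or blank values to zero."""
    return [
        {k: 0 if v is None or v == "" else v for k, v in rec.items()}
        for rec in records
    ]


# Hypothetical raw records.
raw = [
    {"service": "HTTP", "bytes": 120},
    {"service": "HTTP", "bytes": 120},    # exact duplicate
    {"service": " http ", "bytes": None},  # untidy value and a missing value
    {"service": " ", "bytes": 40},         # blank value
]

clean = fill_nulls(generalize_fields(remove_duplicates(raw)))
print(clean)
```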

Feature Selection

In this step, the relevance of each feature is determined with respect to the target values and the training dataset.

Two algorithms are used for feature selection. Of these, flexible mutual information-based feature selection is termed flexible because one or both of its measures may be replaced by another of the same type with a different parameter set.

Flexible LCC in Feature Selection

With these algorithms, only a few features are chosen, and they are used to improve the performance of the model by keeping the features most relevant to the input.
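
The idea of ranking features by mutual information with the class label can be sketched as follows. This is plain mutual information between discrete features and labels, not the flexible variant referred to above, and the feature names and values are invented for illustration.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information (in bits) between two discrete sequences of equal length."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (count / n) * log2((count / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), count in pxy.items()
    )


def rank_features(columns, labels):
    """Return (feature name, score) pairs sorted from most to least informative."""
    scores = {name: mutual_information(values, labels) for name, values in columns.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy data: 'flag' determines the label exactly, 'protocol' is independent of it.
labels = ["attack", "attack", "normal", "normal"]
columns = {
    "protocol": ["tcp", "udp", "tcp", "udp"],
    "flag": ["S0", "S0", "SF", "SF"],
}

ranked = rank_features(columns, labels)
print(ranked)
```

Keeping only the top-ranked features reduces the input dimension while retaining the features that carry the most information about the label.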

Attack Recognition

Attack recognition consists of two main steps:

1. Initial Comparison: Samples of system data are checked against the trained data to distinguish a normal state from an attack state[4].

2. Attack Classification: If an attack is detected, the sample goes through a second stage of subclassification. The type of attack is identified by comparing it to the trained dataset, which allows specific response actions to be taken.
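
The two steps can be sketched with a deliberately simple nearest-centroid comparison. This is an assumed stand-in for whatever trained model a real system would use; the two-dimensional training points, class names, and function names are made up for illustration.

```python
from math import dist


def centroid(points):
    """Mean point of a list of equal-length tuples."""
    return tuple(sum(coord) / len(points) for coord in zip(*points))


def nearest(sample, centroids):
    """Label of the centroid closest to the sample (Euclidean distance)."""
    return min(centroids, key=lambda label: dist(sample, centroids[label]))


# Hypothetical training data with two features per sample.
training = {
    "normal": [(1.0, 1.0), (1.2, 0.8)],
    "DoS": [(9.0, 1.0), (8.8, 1.2)],
    "probe": [(1.0, 9.0), (1.2, 8.8)],
}

# Step 1 model: normal versus all attack samples pooled together.
all_attacks = [p for name, pts in training.items() if name != "normal" for p in pts]
stage1 = {"normal": centroid(training["normal"]), "attack": centroid(all_attacks)}

# Step 2 model: one centroid per attack type.
stage2 = {name: centroid(pts) for name, pts in training.items() if name != "normal"}


def recognize(sample):
    """Return 'normal', or the attack type if step 1 flags the sample."""
    if nearest(sample, stage1) == "normal":
        return "normal"
    return nearest(sample, stage2)


print(recognize((1.1, 0.9)))
print(recognize((9.1, 1.1)))
print(recognize((0.9, 9.2)))
```

Splitting recognition in this way means the second, finer comparison runs only on samples that the first step has already flagged.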

In this way, the system uses advanced machine learning and data processing methods to ensure reliable and effective detection and classification of network intrusions.[5]

Conclusion

In conclusion, AI-based IDS represent a significant advance in cybersecurity, offering a number of advantages for analysing existing threats and responding to them in real time. These systems can sift through large quantities of network traffic data, employing methods such as Support Vector Machines (SVM) and Deep Learning (DL) to scan for known and unknown risks. The ability to identify patterns and anomalies in real time gives them an edge over more conventional approaches, which may fail to keep up with constantly adapting threats and attack types. Moreover, combining AI with cloud computing offers greater scalability and flexibility, making AI-based IDS applicable to networks of all kinds. The ability to perform both binary and multiclass classification provides broad coverage of different kinds of cyber-attacks, which in turn improves overall network security.

However, IDS that employ the analytical tools and features of artificial intelligence are not without difficulties. Several limitations should be noted. A crucial one is maintaining the quality of the data used and handling the large amounts of data needed to train these models. Another is that false positive and false negative results may occur, reducing the effectiveness of the system. Continual research and innovation are needed to address these issues and to improve the technology. Nevertheless, ongoing research in AI makes IDS increasingly promising for future protection against advanced cyber threats. The adaptive and flexible character of AI-based systems means they are likely to remain a significant part of contemporary cybersecurity, acting as a defence that develops alongside a constantly changing threat environment.

References:

  1. T. Sowmya and E. A. Mary Anita, “A comprehensive review of AI based intrusion detection system,” Meas. Sens., vol. 28, p. 100827, Aug. 2023, doi: 10.1016/j.measen.2023.100827.
  2. E. Tcydenova, T. W. Kim, C. Lee, and J. H. Park, “Detection of Adversarial Attacks in AI-Based Intrusion Detection Systems Using Explainable AI,” Hum.-Centric Comput. Inf. Sci., vol. 11, no. 0, pp. 1–1, Sep. 2021, doi: 10.22967/HCIS.2021.11.035.
  3. L. Triyono, R. Gernowo, P. Prayitno, M. Rahaman, and T. R. Yudantoro, “Fake News Detection in Indonesian Popular News Portal Using Machine Learning For Visual Impairment,” JOIV Int. J. Inform. Vis., vol. 7, no. 3, pp. 726–732, Sep. 2023, doi: 10.30630/joiv.7.3.1243.
  4. S. Manikandan, M. Rahaman, and Y.-L. Song, “Active Authentication Protocol for IoV Environment with Distributed Servers,” Comput. Mater. Contin., vol. 73, no. 3, pp. 5789–5808, 2022, doi: 10.32604/cmc.2022.031490.
  5. R. Sharma, V. R. Kumar, and R. Sharma, “Ai Based Intrusion Detection System,” Think India J., vol. 22, no. 3, Art. no. 3, Jul. 2019.
  6. B. B. Gupta and S. Narayan, “A key-based mutual authentication framework for mobile contactless payment system using authentication server,” Journal of Organizational and End User Computing (JOEUC), vol. 33, no. 2, pp. 1–16, 2021.
  7. V. Vajrobol, B. B. Gupta, and A. Gaurav, “Mutual information based logistic regression for phishing URL detection,” Cyber Security and Applications, vol. 2, p. 100044, 2024.
  8. B. B. Gupta, A. Gaurav, P. K. Panigrahi, and V. Arya, “Analysis of cutting-edge technologies for enterprise information system and management,” Enterprise Information Systems, vol. 17, no. 11, p. 2197406, 2023.

Cite As

Shaik D. A. (2024) AI-Based Intrusion Detection Systems, Insights2Techinfo, pp.1
