Reducing the number of features in a dataset, known as dimensionality reduction, can be done with two broad approaches:
- Feature Extraction
- Feature Selection
1. Feature Extraction
Feature extraction transforms the provided raw features and generates new ones by combining them. The goal is to reduce the number of features in a dataset by deriving new features from the existing ones and then discarding the originals [1-2].
Regularization can help limit the risk of overfitting, but using feature extraction techniques instead can bring a variety of additional benefits:
- Reduced risk of overfitting
- Faster training
- Better data visualization
- Improved model explainability
Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA), and Multidimensional Scaling (MDS) are among the most frequently used methods. In Python, scikit-learn provides the corresponding imports:
from sklearn.decomposition import PCA  # Principal Component Analysis
from sklearn.decomposition import TruncatedSVD  # SVD-based linear reduction
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis  # LDA
from sklearn.manifold import MDS  # Multidimensional Scaling
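As an illustration, the minimal sketch below applies PCA to reduce a dataset from four features to two. The Iris dataset, the standardization step, and n_components=2 are assumptions made for demonstration, not choices prescribed by the text above.

from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)  # 150 samples, 4 original features

# PCA is sensitive to feature scale, so standardize first
X_scaled = StandardScaler().fit_transform(X)

# Combine the 4 original features into 2 new components
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X_scaled)

print(X_reduced.shape)                # (150, 2)
print(pca.explained_variance_ratio_)  # variance captured per component

Note that each resulting component is a linear combination of all four original features, which is precisely what separates extraction from selection.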
2. Feature Selection
Feature selection can be thought of as a pre-processing step: it introduces no new features, but instead picks a subset of the raw features that is more interpretable [3-4].
Identifying the most valuable features in a large initial set helps extract useful information and discover new knowledge.
In classification problems, the importance of a feature is judged by its ability to discriminate between the different classes.
The term “feature relevance” refers to this property: an evaluation of how useful each feature is in distinguishing between classes.
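As a minimal sketch of relevance-based selection, the example below scores each feature by its mutual information with the class label and keeps only the top-scoring ones. The Iris dataset, SelectKBest, and k=2 are illustrative assumptions rather than a method prescribed here.

from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, mutual_info_classif

X, y = load_iris(return_X_y=True)

# Score each original feature by its mutual information with the class
# label and keep the two most relevant; no new features are created
selector = SelectKBest(score_func=mutual_info_classif, k=2)
X_selected = selector.fit_transform(X, y)

print(selector.scores_)        # relevance score per original feature
print(selector.get_support())  # boolean mask of the retained features
print(X_selected.shape)        # (150, 2)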
Feature selection pursues several goals:
- It eliminates irrelevant and noisy features while retaining those with the least redundancy and the greatest relevance to the target variable.
- It reduces the computational effort and complexity of training and testing a classifier, yielding more cost-effective models.
- It improves the effectiveness of learning algorithms, prevents overfitting, and helps build more general models.
References:
1. Alweshah, M., et al. (2020). The monarch butterfly optimization algorithm for solving feature selection problems. Neural Computing and Applications, 1-15.
2. Hammad, M., et al. (2021). Myocardial infarction detection based on deep neural network on imbalanced data. Multimedia Systems, 1-13.
3. García-Peñalvo, et al. (2021). A survey on data mining classification approaches.
4. Jain, A. K., et al. (2018). Rule-based framework for detection of smishing messages in mobile environment. Procedia Computer Science, 125, 617-623.