A model-free feature selection technique of feature screening and random forest based recursive feature elimination

Xia, Siwei; Yang, Yuehan

arXiv.org Artificial Intelligence 

With advances in data technology, feature selection plays an important role in both statistics and machine learning. High-dimensional and ultra-high-dimensional datasets are widely used in fields such as finance, image recognition, and text classification. Although more dimensions provide more detailed information, a large number of redundant features weakens the generalization ability of models and makes data analysis more difficult (Jain et al., 2000). Efficient feature selection is therefore crucial: it aims to choose a small subset of informative features that retains the information in the data and identifies the source of the specific concerns of the study. Feature selection is a frequently used dimensionality reduction technique and is often considered a key preprocessing step in data analysis, with benefits for the model such as interpretability, accuracy, lower computational cost, and reduced risk of overfitting (Sheikhpour et al., 2017; Khaire and Dhanalakshmi, 2022).
