Feature Selection with Redundancy-complementariness Dispersion

Chen, Zhijun, Wu, Chaozhong, Zhang, Yishi, Huang, Zhen, Ran, Bin, Zhong, Ming, Lyu, Nengchao

arXiv.org Machine Learning 

Feature selection has attracted significant attention in data mining and machine learning over the past decades. Many existing feature selection methods eliminate redundancy by measuring the pairwise inter-correlation of features, whereas the complementariness of features and higher-order inter-correlation among more than two features are ignored. In this study, a modification term concerning the complementariness of features is introduced into the evaluation criterion of features. Additionally, in order to identify the interference effect of already-selected False Positives (FPs), the redundancy-complementariness dispersion is also taken into account to adjust the measurement of the pairwise inter-correlation of features. To illustrate the effectiveness of the proposed method, classification experiments are conducted with four frequently used classifiers on ten datasets. The classification results verify the superiority of the proposed method over five representative feature selection methods.

Keywords: Classification, Feature selection, Relevance, Redundancy, Complementariness, Redundancy-complementariness dispersion

1. Introduction

The dimensionality and size of data are growing rapidly in most fields, which challenges data mining and machine learning techniques. Feature selection is an important and useful method that can effectively reduce the dimensionality of the feature space while retaining a relatively high accuracy in representing the original data. Feature selection [9] has been widely recognized for its ability to facilitate data interpretation, reduce acquisition and storage requirements, increase learning speed, and improve generalization performance, among other benefits.
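To make the pairwise redundancy elimination mentioned in the abstract concrete, the following is a minimal sketch of a generic relevance-minus-redundancy greedy selector. It is not the criterion proposed in this paper: it omits the complementariness term and the redundancy-complementariness dispersion adjustment, and only illustrates the kind of pairwise baseline the paper sets out to improve. The function names `mutual_information` and `greedy_select`, the use of mutual information as the correlation measure, and the toy data are all assumptions made for illustration.

```python
import math
from collections import Counter


def mutual_information(x, y):
    """Empirical mutual information (in bits) between two discrete sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    mi = 0.0
    for (a, b), count in pxy.items():
        # p(a, b) * log2( p(a, b) / (p(a) * p(b)) ), written with raw counts
        mi += (count / n) * math.log2(count * n / (px[a] * py[b]))
    return mi


def greedy_select(features, target, k):
    """Hypothetical helper: greedily pick up to k feature names.

    Each candidate is scored as its relevance to the target minus its
    mean pairwise redundancy with the features selected so far.
    """
    selected = []
    remaining = list(features)
    while remaining and len(selected) < k:
        def score(name):
            relevance = mutual_information(features[name], target)
            if not selected:
                return relevance
            redundancy = sum(
                mutual_information(features[name], features[s]) for s in selected
            ) / len(selected)
            return relevance - redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data (assumed): the target is a AND b; a_copy duplicates a.
a = [0, 0, 0, 0, 1, 1, 1, 1]
b = [0, 0, 1, 1, 0, 0, 1, 1]
target = [ai & bi for ai, bi in zip(a, b)]
features = {"a": a, "a_copy": list(a), "b": b}

chosen = greedy_select(features, target, k=2)
print(chosen)
```

In this toy setup the duplicated feature is penalized by the pairwise redundancy term, so the selector prefers the pair `a` and `b`. A purely pairwise score of this kind cannot reward features that are only informative jointly, which is the gap the complementariness term in the abstract is meant to address.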
