Feature Selection with Redundancy-complementariness Dispersion

Chen, Zhijun, Wu, Chaozhong, Zhang, Yishi, Huang, Zhen, Ran, Bin, Zhong, Ming, Lyu, Nengchao

Feb-1-2015–arXiv.org Machine Learning

Feature selection has attracted significant attention in data mining and machine learning over the past decades. Many existing feature selection methods eliminate redundancy by measuring the pairwise inter-correlation of features, whereas the complementariness of features and higher-order inter-correlation among more than two features are ignored. In this study, a modification term concerning the complementariness of features is introduced into the feature evaluation criterion. Additionally, to identify the interference effect of already-selected false positives (FPs), the redundancy-complementariness dispersion is taken into account to adjust the measurement of pairwise inter-correlation of features. To illustrate the effectiveness of the proposed method, classification experiments are conducted with four frequently used classifiers on ten datasets. The classification results verify the superiority of the proposed method over five representative feature selection methods.

Keywords: Classification, Feature selection, Relevance, Redundancy, Complementariness, Redundancy-complementariness dispersion

1. Introduction

The dimensionality and size of data are growing rapidly in most fields, which challenges data mining and machine learning techniques. Feature selection is an important and useful method that can effectively reduce the dimensionality of the feature space while retaining relatively high accuracy in representing the original data. The benefits of feature selection [9] have been widely recognized: it facilitates data interpretation, reduces acquisition and storage requirements, increases learning speed, improves generalization performance, etc.
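The notion of complementariness can be illustrated with a toy case. The sketch below is not the paper's criterion; it only assumes that relevance and redundancy are measured with empirical mutual information, a common choice in this literature. The helper `mutual_information` and the XOR data are hypothetical and introduced purely for illustration. It shows two features that each look irrelevant on their own and are not redundant with each other, yet jointly determine the class label.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Empirical mutual information I(X;Y) in bits for two discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x,y) * log2(p(x,y) / (p(x) * p(y))), written in terms of counts
        mi += (c / n) * log2(c * n / (px[x] * py[y]))
    return mi


# Hypothetical toy data: the class label is the XOR of two binary features.
f1 = [0, 0, 1, 1]
f2 = [0, 1, 0, 1]
label = [a ^ b for a, b in zip(f1, f2)]

# Individual relevance of each feature to the label.
relevance_f1 = mutual_information(f1, label)
relevance_f2 = mutual_information(f2, label)

# Pairwise inter-correlation (redundancy) between the two features.
redundancy = mutual_information(f1, f2)

# Relevance of the two features taken together as one joint variable.
joint_relevance = mutual_information(list(zip(f1, f2)), label)

print(relevance_f1, relevance_f2, redundancy, joint_relevance)
```

On this toy data each feature alone carries 0 bits about the label and the pairwise redundancy is 0 bits, while the pair carries 1 bit. A criterion built only from individual relevance and pairwise redundancy would rank both features as useless, which is the kind of gap a complementariness term is meant to address.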


