A Uniformly Stable Algorithm For Unsupervised Feature Selection
High-dimensional data presents challenges for data management. Feature selection is an important dimensionality reduction technique: it reduces the dimensionality of data by identifying an essential subset of the input features, and it can provide interpretable, effective, and efficient insights for analysis and decision-making. Algorithmic stability characterizes how sensitive an algorithm is to perturbations of its input samples. In this paper, we first propose a new unsupervised feature selection algorithm whose architecture consists of a feature scorer and a feature selector. The scorer trains a neural network (NN) to score all the features globally, and the selector uses a dependent sub-NN that locally evaluates the representation ability of candidate features in order to select them. We then present an algorithmic stability analysis and show that our algorithm has a performance guarantee in the form of a generalization error bound. Empirically, extensive experiments on ten real-world datasets corroborate the superior generalization performance of our algorithm over contemporary algorithms. Notably, the features selected by our algorithm perform comparably to the original features, so the algorithm significantly facilitates data management.
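To make the notion of sensitivity to input perturbations concrete, the following is a minimal sketch of an empirical stability check. It is not the paper's algorithm or its formal stability definition: the variance-based selector is a stand-in, the Jaccard overlap is only a rough proxy, and the names `select_top_k_by_variance`, `jaccard`, and `leave_one_out_stability`, as well as the synthetic data, are hypothetical.

```python
import numpy as np


def select_top_k_by_variance(X, k):
    """Stand-in selector (NOT the paper's method): keep the k highest-variance features."""
    scores = X.var(axis=0)
    return set(np.argsort(scores)[-k:].tolist())


def jaccard(a, b):
    """Jaccard overlap between two non-empty sets of feature indices."""
    return len(a & b) / len(a | b)


def leave_one_out_stability(X, k, selector=select_top_k_by_variance):
    """Mean overlap between the full-data selection and each leave-one-out selection.

    Values near 1 mean that removing a single sample rarely changes the
    selected subset; this is an empirical proxy, not a formal bound.
    """
    full = selector(X, k)
    overlaps = []
    for i in range(X.shape[0]):
        X_loo = np.delete(X, i, axis=0)  # perturb the input by dropping sample i
        overlaps.append(jaccard(full, selector(X_loo, k)))
    return float(np.mean(overlaps))


# Hypothetical data: 50 samples, 20 features with increasing scale.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 20)) * np.arange(1, 21)
stability = leave_one_out_stability(X, k=5)
```

Any selector with the same `(X, k)` signature can be passed in place of the stand-in, which makes it possible to compare how different selection rules react to the removal of a single sample.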
Oct-19-2020
- Genre:
- Research Report (0.64)