AITopics | Asia

Reconsidering Mutual Information Based Feature Selection: A Statistical Significance View

Vinh, Nguyen Xuan (The University of Melbourne) | Chan, Jeffrey (The University of Melbourne) | Bailey, James (The University of Melbourne)

AAAI Conferences | Jul-14-2014

Mutual information (MI) based approaches are a popular feature selection paradigm. Although the stated goal of MI-based feature selection is to identify a subset of features that share the highest mutual information with the class variable, most current MI-based techniques are greedy methods that make use of low-dimensional MI quantities. The use of low-dimensional approximations has mostly been attributed to the difficulty of estimating high-dimensional MI from limited samples. In this paper, we argue a different viewpoint: even given a very large amount of data, the high-dimensional MI objective is still problematic as an optimization criterion because of its overfitting nature. The MI almost always increases as more features are added, which leads to a trivial solution that includes all features. We propose a novel approach to the MI-based feature selection problem, in which the overfitting phenomenon is controlled rigorously by means of a statistical test. We develop local and global optimization algorithms for this new feature selection model, and demonstrate its effectiveness in the applications of explaining variables and objects.
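
The overfitting behaviour described in the abstract can be illustrated with a small sketch. It is not the authors' method: it only computes the plug-in (empirical) MI between a growing feature subset and the class label on synthetic data. The function name `plugin_mi`, the synthetic data, and all parameter values are assumptions made for illustration.

```python
import math
import random
from collections import Counter

def plugin_mi(rows, labels):
    """Plug-in (empirical) mutual information, in nats, between
    feature tuples and class labels. Hypothetical helper."""
    n = len(labels)
    joint = Counter(zip(rows, labels))  # counts of (x, y) pairs
    px = Counter(rows)                  # counts of feature tuples
    py = Counter(labels)                # counts of labels
    # sum over observed pairs of p(x,y) * log(p(x,y) / (p(x) p(y)))
    return sum(
        (c / n) * math.log(c * n / (px[x] * py[y]))
        for (x, y), c in joint.items()
    )

random.seed(0)
n_samples, n_features = 200, 8  # assumed sizes, chosen for illustration
labels = [random.randint(0, 1) for _ in range(n_samples)]
# Feature 0 is a noisy copy of the label; all other features are pure noise.
features = [
    [y if random.random() < 0.8 else 1 - y]
    + [random.randint(0, 1) for _ in range(n_features - 1)]
    for y in labels
]

# Empirical MI of the first k features with the label, for k = 1..n_features.
mi_by_size = [
    plugin_mi([tuple(row[:k]) for row in features], labels)
    for k in range(1, n_features + 1)
]
for k, mi in enumerate(mi_by_size, start=1):
    print(f"first {k} feature(s): MI = {mi:.4f} nats")
```

Because the plug-in estimate obeys the chain rule on the empirical distribution, the printed values never decrease as noise features are added, even though those features carry no information about the label. This is why maximizing high-dimensional MI alone tends to select every feature.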

dataset, mimlmix, partial example, (16 more...)

AAAI Conferences

Twenty-Eighth AAAI Conference on Artificial Intelligence

Country:

Asia > Middle East > Jordan (0.05)
Asia > Vietnam > Hanoi > Hanoi (0.04)
Asia > China > Jiangsu Province > Nanjing (0.04)

Genre:

Research Report (0.34)
Overview (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.53)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)


Semantical Clustering of Morphologically Related Chinese Words

Lee, Chia-Ling (National Taiwan University) | Chang, Ya-Ning (Academia Sinica) | Liu, Chao-Lin (National Chengchi University) | Lee, Chia-Ying (Academia Sinica) | Hsu, Jane Yung-jen (National Taiwan University)

AAAI Conferences | Jul-14-2014

A Chinese character embedded in different compound words may carry different meanings. In this paper, we aim at semantic clustering of a given family of morphologically related Chinese words. In Experiment 1, we employed linguistic features at the word, syntactic, semantic, and contextual levels in aggregated computational linguistics methods to handle the clustering task. In Experiment 2, we recruited adults and children to perform the clustering task. Experimental results indicate that our computational model achieved a level of performance similar to that of children.

experiment 2, feature vector, target word, (12 more...)

AAAI Conferences

Twenty-Eighth AAAI Conference on Artificial Intelligence

Country:

Asia > Taiwan > Taiwan Province > Taipei (0.06)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.05)
North America > United States > California > Santa Clara County > San Jose (0.05)
North America > United States > California > Santa Clara County > Palo Alto (0.05)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.99)

Tailoring Local Search for Partial MaxSAT

Cai, Shaowei (Chinese Academy of Sciences) | Luo, Chuan (Peking University) | Thornton, John (Griffith University) | Su, Kaile (Griffith University)

AAAI Conferences | Jul-14-2014

Partial MaxSAT (PMS) is a generalization of SAT and MaxSAT. Many real-world problems can be encoded into PMS in a more natural and compact way than into SAT or MaxSAT. In this paper, we propose new ideas for local search for PMS, which mainly rely on the distinction between hard and soft clauses. We use these ideas to develop a local search PMS algorithm called Dist. Experimental results on PMS benchmarks from MaxSAT Evaluation 2013 show that Dist significantly outperforms state-of-the-art PMS algorithms, including both local search algorithms and complete ones, on random and crafted benchmarks. On the industrial benchmarks, Dist dramatically outperforms previous local search algorithms and is comparable with complete algorithms.
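
The hard/soft distinction mentioned in the abstract can be made concrete with a minimal sketch. It assumes the standard PMS definition (every hard clause must be satisfied, and the cost of a feasible assignment is the number of falsified soft clauses). It is not the Dist algorithm. The names `clause_satisfied` and `pms_cost` and the toy instance are hypothetical.

```python
from typing import Dict, List, Optional

# DIMACS-style literals: 3 means x3 is true, -3 means x3 is false.
Clause = List[int]

def clause_satisfied(clause: Clause, assignment: Dict[int, bool]) -> bool:
    """A clause holds if at least one of its literals is true."""
    return any(assignment[abs(lit)] == (lit > 0) for lit in clause)

def pms_cost(hard: List[Clause], soft: List[Clause],
             assignment: Dict[int, bool]) -> Optional[int]:
    """Return the number of falsified soft clauses, or None if any
    hard clause is violated (the assignment is then infeasible)."""
    if not all(clause_satisfied(c, assignment) for c in hard):
        return None
    return sum(not clause_satisfied(c, assignment) for c in soft)

# Toy instance (assumed for illustration only).
hard = [[1, 2], [-1, 3]]   # (x1 OR x2) AND (NOT x1 OR x3)
soft = [[-2], [-3], [1]]   # prefer NOT x2, NOT x3, x1

feasible = {1: True, 2: False, 3: True}
infeasible = {1: True, 2: False, 3: False}

print(pms_cost(hard, soft, feasible))    # one soft clause (NOT x3) is falsified
print(pms_cost(hard, soft, infeasible))  # hard clause (NOT x1 OR x3) is violated
```

A local search solver would flip variables to reduce this cost while keeping, or first reaching, feasibility with respect to the hard clauses.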

algorithm, hard clause, soft clause, (15 more...)

AAAI Conferences

Twenty-Eighth AAAI Conference on Artificial Intelligence

Country:

Oceania > Australia > Queensland > Brisbane (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (0.46)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)

Qualitative Planning with Quantitative Constraints for Online Learning of Robotic Behaviours

Wiley, Timothy (The University of New South Wales) | Sammut, Claude (The University of New South Wales) | Bratko, Ivan (University of Ljubljana)

AAAI Conferences | Jul-14-2014

This paper resolves previous problems in the Multi-Strategy architecture for online learning of robotic behaviours. The hybrid method includes a symbolic qualitative planner that constructs an approximate solution to a control problem. The approximate solution provides constraints for a numerical optimisation algorithm, which is used to refine the qualitative plan into an operational policy. Introducing quantitative constraints into the planner enables domain-independent reasoning that was previously unachievable. The method is demonstrated on a multi-tracked robot intended for urban search and rescue.

constraint, quantitative constraint, sequence, (16 more...)

AAAI Conferences

Twenty-Eighth AAAI Conference on Artificial Intelligence

Country:

Europe > Slovenia > Central Slovenia > Municipality of Ljubljana > Ljubljana (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Industry: Education > Educational Setting > Online (0.61)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Qualitative Reasoning (0.70)

Novel Density-Based Clustering Algorithms for Uncertain Data

Zhang, Xianchao (Dalian University of Technology) | Liu, Han (Dalian University of Technology) | Zhang, Xiaotong (Dalian University of Technology) | Liu, Xinyue (Dalian University of Technology)

AAAI Conferences | Jul-14-2014

Density-based techniques seem promising for handling data uncertainty in uncertain data clustering. Nevertheless, some issues have not been addressed well in existing algorithms. In this paper, we firstly propose a novel density-based uncertain data clustering algorithm, which improves upon existing algorithms from the following two aspects: (1) it employs an exact method to compute the probability that the distance between two uncertain objects is less than or equal to a boundary value, instead of the sampling-based method in previous work; (2) it introduces new definitions of core object probability and direct reachability probability, thus reducing the complexity and avoiding sampling. We then further improve the algorithm by using a novel assignment strategy to ensure that every object will be assigned to the most appropriate cluster. Experimental results show the superiority of our proposed algorithms over existing ones.

algorithm, minp ts, probability, (15 more...)

AAAI Conferences

Twenty-Eighth AAAI Conference on Artificial Intelligence

Country:

Asia > China > Liaoning Province > Dalian (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
