AITopics

Multi-view clustering, which seeks a partition of the data inmultiple views that often provide complementary information to eachother, has received considerable attention in recent years. In reallife clustering problems, the data in each view may haveconsiderable noise. However, existing clustering methods blindlycombine the information from multi-view data with possiblyconsiderable noise, which often degrades their performance. In thispaper, we propose a novel Markov chain method for RobustMulti-view Spectral Clustering (RMSC). Our method has a flavor oflow-rank and sparse decomposition, where we firstly construct atransition probability matrix from each single view, and then usethese matrices to recover a shared low-rank transition probabilitymatrix as a crucial input to the standard Markov chain methodfor clustering. The optimization problem of RMSC has a low-rankconstraint on the transition probability matrix, and simultaneouslya probabilistic simplex constraint on each of its rows. To solvethis challenging optimization problem, we propose an optimization procedurebased on the Augmented Lagrangian Multiplier scheme. Experimentalresults on various real world datasets show that theproposed method has superior performance over severalstate-of-the-art methods for multi-view clustering.

artificial intelligence, machine learning, matrix, (15 more...)

Twenty-Eighth AAAI Conference on Artificial Intelligence

Country:

South America > Paraguay > Asunción > Asunción (0.04)
Asia > China > Guangdong Province > Guangzhou (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Learning Relative Similarity by Stochastic Dual Coordinate Ascent

Wu, Pengcheng (Nanyang Technological University) | Ding, Yi (Nanyang Technological University) | Zhao, Peilin (Rutgers University) | Miao, Chunyan (Nanyang Technological University) | Hoi, Steven C. H. (Nanyang Technological University)

Learning relative similarity from pairwise instances is an important problem in machine learning and has a wide range of applications. Despite being studied for years, some existing methods solved by Stochastic Gradient Descent (SGD) techniques generally suffer from slow convergence. In this paper, we investigate the application of Stochastic Dual Coordinate Ascent (SDCA) technique to tackle the optimization task of relative similarity learning by extending from vector to matrix parameters. Theoretically, we prove the optimal linear convergence rate for the proposed SDCA algorithm, beating the well-known sublinear convergence rate by the previous best metric learning algorithms. Empirically, we conduct extensive experiments on both standard and large-scale data sets to validate the effectiveness of the proposed algorithm for retrieval tasks.

algorithm, artificial intelligence, machine learning, (14 more...)

Twenty-Eighth AAAI Conference on Artificial Intelligence

Country:

Asia > Singapore (0.14)
North America > United States > New Jersey > Middlesex County > Piscataway (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.54)

Small-Variance Asymptotics for Dirichlet Process Mixtures of SVMs

Wang, Yining (Tsinghua University) | Zhu, Jun (Tsinghua University)

Infinite SVM (iSVM) is a Dirichlet process (DP) mixture of large-margin classifiers. Though flexible in learning nonlinear classifiers and discovering latent clustering structures, iSVM has a difficult inference task and existing methods could hinder its applicability to large-scale problems. This paper presents a small-variance asymptotic analysis to derive a simple and efficient algorithm, which monotonically optimizes a max-margin DP-means (M2DPM) problem, an extension of DP-means for both predictive learning and descriptive clustering. Our analysis is built on Gibbs infinite SVMs, an alternative DP mixture of large-margin machines, which admits a partially collapsed Gibbs sampler without truncation by exploring data augmentation techniques. Experimental results show that M2DPM runs much faster than similar algorithms without sacrificing prediction accuracies.

artificial intelligence, bayesian inference, machine learning, (17 more...)

Twenty-Eighth AAAI Conference on Artificial Intelligence

Country:

Asia > Middle East > Jordan (0.05)
Asia > China > Beijing > Beijing (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(2 more...)

Genre: Research Report (0.48)

Industry: Health & Medicine (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Cross-Domain Metric Learning Based on Information Theory

Wang, Hao (Chinese Academy of Sciences) | Wang, Wei (Chinese Academy of Sciences) | Zhang, Chen (Chinese Academy of Sciences) | Xu, Fanjiang (Chinese Academy of Sciences)

Supervised metric learning plays a substantial role in statistical classification. Conventional metric learning algorithms have limited utility when the training data and testing data are drawn from related but different domains (i.e., source domain and target domain). Although this issue has got some progress in feature-based transfer learning, most of the work in this area suffers from non-trivial optimization and pays little attention to preserving the discriminating information. In this paper, we propose a novel metric learning algorithm to transfer knowledge from the source domain to the target domain in an information-theoretic setting, where a shared Mahalanobis distance across two domains is learnt by combining three goals together: 1) reducing the distribution difference between different domains; 2) preserving the geometry of target domain data; 3) aligning the geometry of source domain data with its label information. Based on this combination, the learnt Mahalanobis distance effectively transfers the discriminating power and propagates standard classifiers across these two domains. More importantly, our proposed method has closed-form solution and can be efficiently optimized. Experiments in two real-world applications demonstrate the effectiveness of our proposed method.

artificial intelligence, domain data, machine learning, (16 more...)

Twenty-Eighth AAAI Conference on Artificial Intelligence

Country:

Asia > China > Beijing > Beijing (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Reconsidering Mutual Information Based Feature Selection: A Statistical Significance View

Vinh, Nguyen Xuan (The University of Melbourne) | Chan, Jeffrey (The University of Melbourne) | Bailey, James (The University of Melbourne)

Mutual information (MI) based approaches are a popular feature selection paradigm. Although the stated goal of MI-based feature selection is to identify a subset of features that share the highest mutual information with the class variable, most current MI-based techniques are greedy methods that make use of low dimensional MI quantities. The reason for using low dimensional approximation has been mostly attributed to the difficulty associated with estimating the high dimensional MI from limited samples. In this paper, we argue a different viewpoint that, given a very large amount of data, the high dimensional MI objective is still problematic to be employed as a meaningful optimization criterion, due to its overfitting nature: the MI almost always increases as more features are added, thus leading to a trivial solution which includes all features. We propose a novel approach to the MI-based feature selection problem, in which the overfitting phenomenon is controlled rigourously by means of a statistical test. We develop local and global optimization algorithms for this new feature selection model, and demonstrate its effectiveness in the applications of explaining variables and objects.

artificial intelligence, machine learning, optimization problem, (18 more...)

Twenty-Eighth AAAI Conference on Artificial Intelligence

Country:

Asia > Middle East > Jordan (0.05)
Asia > Vietnam > Hanoi > Hanoi (0.04)
Asia > China > Jiangsu Province > Nanjing (0.04)

Genre:

Research Report (0.34)
Overview (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.53)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Locality Preserving Projection for Domain Adaptation with Multi-Objective Learning

Shu, Le (Temple University) | Ma, Tianyang (Temple University) | Latecki, Longin Jan (Temple University)

In many practical cases, we need to generalize a model trained in a source domain to a new target domain.However, the distribution of these two domains may differ very significantly, especially sometimes some crucial target features may not have support in the source domain.This paper proposes a novel locality preserving projection method for domain adaptation task,which can find a linear mapping preserving the 'intrinsic structure' for both source and target domains.We first construct two graphs encoding the neighborhood information for source and target domains separately.We then find linear projection coefficients which have the property of locality preserving for each graph.Instead of combing the two objective terms under compatibility assumption and requiring the user to decide the importance of each objective function,we propose a multi-objective formulation for this problem and solve it simultaneously using Pareto optimization.The Pareto frontier captures all possible good linear projection coefficients that are preferred by one or more objectives.The effectiveness of our approach is justified by both theoretical analysis and empirical results on real world data sets.The new feature representation shows better prediction accuracy as our experiments demonstrate.

artificial intelligence, machine learning, optimization problem, (15 more...)

Twenty-Eighth AAAI Conference on Artificial Intelligence

Country:

North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
North America > United States > Maryland > Baltimore (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)

Genre: Research Report (0.47)

Industry: Health & Medicine > Therapeutic Area > Oncology (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

Sparse Compositional Metric Learning

Shi, Yuan (University of Southern California) | Bellet, Aurélien (University of Southern California) | Sha, Fei (University of Southern California)

We propose a new approach for metric learning by framing it as learning a sparse combination of locally discriminative metrics that are inexpensive to generate from the training data. This flexible framework allows us to naturally derive formulations for global, multi-task and local metric learning. The resulting algorithms have several advantages over existing methods in the literature: a much smaller number of parameters to be estimated and a principled way to generalize learned metrics to new testing data points. To analyze the approach theoretically, we derive a generalization bound that justifies the sparse combination. Empirically, we evaluate our algorithms on several datasets against state-of-the-art metric learning methods. The results are consistent with our theoretical findings and demonstrate the superiority of our approach in terms of classification performance and scalability.

artificial intelligence, machine learning, metric learning, (14 more...)

Twenty-Eighth AAAI Conference on Artificial Intelligence

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.28)
Asia > Middle East > Jordan (0.04)

Industry: Government > Regional Government > North America Government > United States Government (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

Ramirez-Loaiza, Maria Eugenia (Illinois Institute of Technology) | Culotta, Aron (Illinois Institute of Technology) | Bilgic, Mustafa (Illinois Institute of Technology)

Anytime Active Learning

A common bottleneck in deploying supervised learning systems is collecting human-annotated examples. In many domains, annotators form an opinion about the label of an example incrementally -- e.g., each additional word read from a document or each additional minute spent inspecting a video helps inform the annotation. In this paper, we investigate whether we can train learning systems more efficiently by requesting an annotation before inspection is fully complete -- e.g., after reading only 25 words of a document. While doing so may reduce the overall annotation time, it also introduces the risk that the annotator might not be able to provide a label if interrupted too early. We propose an anytime active learning approach that optimizes the annotation time and response rate simultaneously. We conduct user studies on two document classification datasets and develop simulated annotators that mimic the users. Our simulated experiments show that anytime active learning outperforms several baselines on these two datasets. For example, with an annotation budget of one hour, training a classifier by annotating the first 25 words of each document reduces classification error by 17% over annotating the first 100 words of each document.

artificial intelligence, inductive learning, machine learning, (16 more...)

Twenty-Eighth AAAI Conference on Artificial Intelligence

Country:

North America > United States > Oregon > Multnomah County > Portland (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)

Genre:

Questionnaire & Opinion Survey (0.97)
Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Bagging by Design (on the Suboptimality of Bagging)

Papakonstantinou, Periklis (Tsinghua University) | Xu, Jia (Tsinghua University) | Cao, Zhu (Tsinghua University)

Bagging (Breiman 1996) and its variants is one of the most popular methods in aggregating classifiers and regressors. Originally, its analysis assumed that the bootstraps are built from an unlimited, independent source of samples, therefore we call this form of bagging ideal-bagging. However in the real world, base predictors are trained on data subsampled from a limited number of training samples and thus they behave very differently. We analyze the effect of intersections between bootstraps, obtained by subsampling, to train different base predictors. Most importantly, we provide an alternative subsampling method called design-bagging based on a new construction of combinatorial designs, and prove it universally better than bagging. Methodologically, we succeed at this level of generality because we compare the prediction accuracy of bagging and design-bagging relative to the accuracy ideal-bagging. This finds potential applications in more involved bagging-based methods. Our analytical results are backed up by experiments on classification and regression settings.

artificial intelligence, machine learning, predictor, (19 more...)

Twenty-Eighth AAAI Conference on Artificial Intelligence

Country: Asia > China (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.46)

Robust Non-Negative Dictionary Learning

Pan, Qihe (Beihang University) | Kong, Deguang (University of Texas Arlington) | Ding, Chris (University of Texas Arlington) | Luo, Bin (Anhui University)

Dictionary learning plays an important role in machine learning, where data vectors are modeled as a sparse linear combinations of basis factors (i.e., dictionary). However, how to conduct dictionary learning in noisy environment has not been well studied. Moreover, in practice, the dictionary (i.e., the lower rank approximation of the data matrix) and the sparse representations are required to be nonnegative, such as applications for image annotation, document summarization, microarray analysis. In this paper, we propose a new formulation for non-negative dictionary learning in noisy environment, where structure sparsity is enforced on sparse representation. The proposed new formulation is also robust for data with noises and outliers, due to a robust loss function used. We derive an efficient multiplicative updating algorithm to solve the optimization problem, where dictionary and sparse representation are updated iteratively. We prove the convergence and correctness of proposed algorithm rigorously.We show the differences of dictionary at different level of sparsity constraint.The proposed algorithm can be adapted for clustering and semi-supervised learning.

algorithm, artificial intelligence, machine learning, (15 more...)

Twenty-Eighth AAAI Conference on Artificial Intelligence

Country:

Asia > China (0.04)
North America > United States > Texas > Tarrant County > Arlington (0.04)
North America > United States > California (0.04)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)