AITopics

Hubness has been recently identified as a problematic phenomenon occurring in high-dimensional space. In this paper, we address a different type of hubness that occurs when the number of samples is large. We investigate the difference between the hubness in high-dimensional data and the one in large-sample data. One finding is that centering, which is known to reduce the former, does not work for the latter. We then propose a new hub-reduction method, called localized centering. It is an extension of centering, yet works effectively for both types of hubness. Using real-world datasets consisting of a large number of documents, we demonstrate that the proposed method improves the accuracy of k-nearest neighbor classification.
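To make the notion of hubness concrete, the following minimal sketch counts how often each point appears in other points' k-nearest-neighbor lists (its k-occurrence) under inner-product similarity, before and after ordinary global centering. It illustrates only the baseline centering that the abstract refers to, not the proposed localized centering. The function names and the toy data are assumptions made for illustration.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def center(data):
    """Global centering: subtract the data centroid from every vector."""
    n, dim = len(data), len(data[0])
    centroid = [sum(x[j] for x in data) / n for j in range(dim)]
    return [[x[j] - centroid[j] for j in range(dim)] for x in data]


def k_occurrence(data, k):
    """counts[i] = number of other points whose k nearest neighbors
    (by inner-product similarity) include point i."""
    n = len(data)
    counts = [0] * n
    for q in range(n):
        neighbors = sorted(
            (i for i in range(n) if i != q),
            key=lambda i: dot(data[q], data[i]),
            reverse=True,
        )
        for i in neighbors[:k]:
            counts[i] += 1
    return counts


# Hypothetical toy data: the first point has a large norm, so it dominates
# raw inner-product similarity and acts as a hub.
toy = [[3.0, 3.0], [1.0, 0.2], [0.2, 1.0], [0.8, 0.1], [0.1, 0.9]]
before = k_occurrence(toy, 1)
after = k_occurrence(center(toy), 1)
print("k-occurrence before centering:", before)
print("k-occurrence after centering: ", after)
```

A strongly skewed k-occurrence distribution (a few points with very large counts) is the usual symptom of hubness; comparing the two printed lists shows how centering can flatten it on this toy input.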

artificial intelligence, machine learning, natural language

Twenty-Ninth AAAI Conference on Artificial Intelligence

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
Europe > Serbia > Vojvodina > South Bačka District > Novi Sad (0.04)
Asia > Japan > Honshū > Chūbu > Shizuoka Prefecture > Shizuoka (0.04)
Europe > Portugal > Lisbon > Lisbon (0.04)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.54)

Learning Multi-Level Task Groups in Multi-Task Learning

Han, Lei (Hong Kong Baptist University) | Zhang, Yu (Hong Kong Baptist University)

In multi-task learning (MTL), multiple related tasks are learned jointly by sharing information across them. Many MTL algorithms have been proposed to learn the underlying task groups. However, those methods are limited to learning the task groups at only a single level, which may not be sufficient to model the complex structure among tasks in many real-world applications. In this paper, we propose a Multi-Level Task Grouping (MeTaG) method to learn the multi-level grouping structure instead of only one level among tasks. Specifically, by assuming the number of levels to be H, we decompose the parameter matrix into a sum of H component matrices, each of which is regularized with an ℓ2 norm on the pairwise difference among parameters of all the tasks to construct level-specific task groups. For optimization, we employ the smoothing proximal gradient method to efficiently solve the objective function of the MeTaG model. Moreover, we provide theoretical analysis to show that under certain conditions the MeTaG model can recover the true parameter matrix and the true task groups in each level with high probability. We evaluate our approach on both synthetic and real-world datasets, showing competitive performance against state-of-the-art MTL methods.
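The multi-level penalty described above can be sketched as follows. This is a minimal illustration, assuming each level's component is stored as a list of per-task parameter vectors and that each level has its own non-negative weight; the weights, the function names, and the toy values are assumptions, and the loss term and the smoothing proximal gradient solver are not shown.

```python
import math


def pairwise_l2_penalty(component):
    """Sum over task pairs i < j of ||w_i - w_j||_2 for one level.

    component[i] is the parameter vector of task i at this level.
    """
    total = 0.0
    m = len(component)
    for i in range(m):
        for j in range(i + 1, m):
            total += math.sqrt(
                sum((a - b) ** 2 for a, b in zip(component[i], component[j]))
            )
    return total


def multilevel_penalty(components, weights):
    """Weighted sum of the pairwise penalties of the H level components.

    The per-level weights are an assumption of this sketch.
    """
    return sum(lam * pairwise_l2_penalty(c) for lam, c in zip(weights, components))


def combine(components):
    """Parameter matrix as the element-wise sum of the H components."""
    m, d = len(components[0]), len(components[0][0])
    return [[sum(c[i][j] for c in components) for j in range(d)] for i in range(m)]


# Hypothetical example with H = 2 levels and 3 tasks in 2 dimensions.
level_1 = [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]  # all tasks fused into one group
level_2 = [[0.0, 0.0], [0.0, 0.0], [3.0, 4.0]]  # task 2 separates at this level
penalty = multilevel_penalty([level_1, level_2], [1.0, 0.5])
params = combine([level_1, level_2])
```

A pair of tasks whose vectors coincide at a level contributes zero to that level's penalty, which is what lets the penalty fuse tasks into level-specific groups.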

artificial intelligence, machine learning, task group

Twenty-Ninth AAAI Conference on Artificial Intelligence

Country:

Asia > China > Hong Kong (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.69)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Discriminative Feature Grouping

Han, Lei (Hong Kong Baptist University) | Zhang, Yu (Hong Kong Baptist University)

Feature grouping has been demonstrated to be promising in learning with high-dimensional data. It helps reduce the variances in the estimation and improves the stability of feature selection. One major limitation of existing feature grouping approaches is that some similar but different feature groups are often mis-fused, leading to impaired performance. In this paper, we propose a Discriminative Feature Grouping (DFG) method to discover the feature groups with enhanced discrimination. Different from existing methods, DFG adopts a novel regularizer for the feature coefficients to trade off between fusing and discriminating feature groups. The proposed regularizer consists of an ℓ1 norm to enforce feature sparsity and a pairwise ℓ∞ norm to encourage the absolute differences among any three feature coefficients to be similar. To achieve better asymptotic properties, we generalize the proposed regularizer to an adaptive one where the feature coefficients are weighted based on the solution of some estimator with root-n consistency. For optimization, we employ the alternating direction method of multipliers to solve the proposed methods efficiently. Experimental results on synthetic and real-world datasets demonstrate that the proposed methods perform well compared with state-of-the-art feature grouping methods.

artificial intelligence, feature group, machine learning

Twenty-Ninth AAAI Conference on Artificial Intelligence

Country:

Asia > China > Hong Kong (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)

Industry: Health & Medicine > Therapeutic Area > Oncology (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Bayesian Maximum Margin Principal Component Analysis

Supervised dimensionality reduction has shown great advantages in finding predictive subspaces. Previous methods rarely consider the popular maximum margin principle and are prone to overfitting when training data are small, as is usually the case, especially those under the maximum likelihood framework. In this paper, we present a posterior-regularized Bayesian approach to combine Principal Component Analysis (PCA) with max-margin learning. Based on the data augmentation idea for max-margin learning and the probabilistic interpretation of PCA, our method can automatically infer the weight and penalty parameter of the max-margin learning machine, while simultaneously finding the most appropriate PCA subspace under the Bayesian framework. We develop a fast mean-field variational inference algorithm to approximate the posterior. Experimental results on various classification tasks show that our method outperforms a number of competitors.

artificial intelligence, bayesian inference, machine learning

Twenty-Ninth AAAI Conference on Artificial Intelligence

Country:

North America > Canada > Ontario > Toronto (0.14)
Asia > Middle East > Jordan (0.04)
Asia > China > Beijing > Beijing (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Random Gradient Descent Tree: A Combinatorial Approach for SVM with Outliers

Ding, Hu (State University of New York at Buffalo) | Xu, Jinhui (State University of New York at Buffalo)

Support Vector Machine (SVM) is a fundamental technique in machine learning. A long-standing challenge facing SVM is how to deal with outliers (caused by mislabeling), as they can make the classes in SVM nonseparable. Existing techniques, such as soft margin SVM, ν-SVM, and Core-SVM, can alleviate the problem to a certain extent, but cannot completely resolve the issue. Recently, techniques for explicit outlier removal have also become available, but they suffer from high time complexity and cannot guarantee the quality of the solution. In this paper, we present a new combinatorial approach, called Random Gradient Descent Tree (or RGD-tree), to explicitly deal with outliers; this results in a new algorithm called RGD-SVM. Our technique yields a provably good solution and can be efficiently implemented for practical purposes. The time and space complexities of our approach depend only linearly on the input size and the dimensionality of the space, which is significantly better than existing approaches. Experiments on benchmark datasets suggest that our technique considerably outperforms several popular techniques in most cases.

artificial intelligence, machine learning, outlier

Twenty-Ninth AAAI Conference on Artificial Intelligence

Country:

North America > United States > New York (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.61)

Graph-Sparse LDA: A Topic Model with Structured Sparsity

Doshi-Velez, Finale (Harvard University) | Wallace, Byron C. (University of Texas at Austin) | Adams, Ryan (Harvard University)

Topic modeling is a powerful tool for uncovering latent structure in many domains, including medicine, finance, and vision. The goals for the model vary depending on the application: sometimes the discovered topics are used for prediction or another downstream task. In other cases, the content of the topic may be of intrinsic scientific interest. Unfortunately, even when one uses modern sparse techniques, discovered topics are often difficult to interpret due to the high dimensionality of the underlying space. To improve topic interpretability, we introduce Graph-Sparse LDA, a hierarchical topic model that uses knowledge of relationships between words (e.g., as encoded by an ontology). In our model, topics are summarized by a few latent concept-words from the underlying graph that explain the observed words. Graph-Sparse LDA recovers sparse, interpretable summaries on two real-world biomedical datasets while matching state-of-the-art prediction performance.

artificial intelligence, natural language, topic model

Twenty-Ninth AAAI Conference on Artificial Intelligence

Country:

North America > United States > Texas > Travis County > Austin (0.14)
Asia > Middle East > Jordan (0.05)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (0.94)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.73)
Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (0.50)

Policy Tree: Adaptive Representation for Policy Gradient

Gupta, Ujjwal Das (University of Alberta) | Talvitie, Erik (Franklin and Marshall College) | Bowling, Michael (University of Alberta)

Much of the focus on finding good representations in reinforcement learning has been on learning complex non-linear predictors of value. Policy gradient algorithms, which directly represent the policy, often need fewer parameters to learn good policies. However, they typically employ a fixed parametric representation that may not be sufficient for complex domains. This paper introduces the Policy Tree algorithm, which can learn an adaptive representation of policy in the form of a decision tree over different instantiations of a base policy. Policy gradient is used both to optimize the parameters and to grow the tree by choosing splits that enable the maximum local increase in the expected return of the policy. Experiments show that this algorithm can choose genuinely helpful splits and significantly improve upon the commonly used linear Gibbs softmax policy, which we choose as our base policy.
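For readers unfamiliar with the base policy named above, the following is a minimal sketch of a linear Gibbs (softmax) policy and the gradient of its log-probability, which is the quantity a policy gradient method works with. It assumes one feature vector per action in the current state; the function names and the example values are assumptions, and the tree-growing and split-selection steps of Policy Tree are not shown.

```python
import math


def gibbs_policy(theta, features):
    """Action probabilities proportional to exp(theta . phi(s, a)).

    features[a] is the feature vector of action a in the current state.
    """
    scores = [sum(t * f for t, f in zip(theta, phi)) for phi in features]
    top = max(scores)  # subtract the max score for numerical stability
    exps = [math.exp(s - top) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def grad_log_policy(theta, features, action):
    """Gradient of log pi(action | s) with respect to theta:
    phi(s, action) minus the expected feature vector under the policy."""
    probs = gibbs_policy(theta, features)
    dim = len(theta)
    expected = [sum(p * phi[j] for p, phi in zip(probs, features)) for j in range(dim)]
    return [features[action][j] - expected[j] for j in range(dim)]


# Hypothetical state with two actions and two-dimensional features.
features = [[1.0, 0.0], [0.0, 1.0]]
theta = [0.0, 0.0]
probs = gibbs_policy(theta, features)
grad = grad_log_policy(theta, features, action=0)
```

With all parameters at zero the policy is uniform over actions, and the gradient points toward the features of the chosen action and away from the policy's average features.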

artificial intelligence, machine learning, reinforcement learning

Twenty-Ninth AAAI Conference on Artificial Intelligence

Country:

North America > Canada > Alberta (0.14)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.46)

Industry: Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.50)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.49)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.46)

Learning Relational Kalman Filtering

Choi, Jaesik (Ulsan National Institute of Science and Technology) | Amir, Eyal (University of Illinois at Urbana-Champaign) | Xu, Tianfang (University of Illinois at Urbana-Champaign) | Valocchi, Albert J. (University of Illinois at Urbana-Champaign)

The Kalman Filter (KF) is pervasively used to control a vast array of consumer, health and defense products. By grouping sets of symmetric state variables, the Relational Kalman Filter (RKF) enables us to scale the exact KF to large-scale dynamic systems. In this paper, we provide a parameter learning algorithm for RKF, and a regrouping algorithm that prevents the degeneration of the relational structure for efficient filtering. The proposed algorithms significantly expand the applicability of RKFs by answering the following questions: (1) how to learn parameters for RKF from partial observations; and (2) how to regroup the degenerated state variables by noisy real-world observations. To our knowledge, this is the first paper on learning parameters in relational continuous probabilistic models. We show that our new algorithms significantly improve the accuracy and the efficiency of filtering large-scale dynamic systems.
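As background, one predict-and-update step of the standard scalar Kalman filter can be sketched as below. This is the ordinary, non-relational filter, not the RKF or the learning and regrouping algorithms described above; the parameter names (`a`, `h`, `q`, `r`) and their default values are assumptions chosen for illustration.

```python
def kalman_step(x, p, z, a=1.0, h=1.0, q=0.01, r=0.1):
    """One step of a scalar Kalman filter.

    x, p : previous state estimate and its variance
    z    : new observation
    a, h : state-transition and observation coefficients (assumed linear model)
    q, r : process-noise and observation-noise variances
    """
    # Predict: propagate the estimate and its uncertainty through the dynamics.
    x_pred = a * x
    p_pred = a * p * a + q

    # Update: blend prediction and observation using the Kalman gain.
    gain = p_pred * h / (h * p_pred * h + r)
    x_new = x_pred + gain * (z - h * x_pred)
    p_new = (1.0 - gain * h) * p_pred
    return x_new, p_new


# Hypothetical usage: filter a short sequence of noisy readings of a constant.
estimate, variance = 0.0, 1.0
for reading in [1.1, 0.9, 1.05, 0.98]:
    estimate, variance = kalman_step(estimate, variance, reading)
```

The variance shrinks as observations accumulate, which reflects the filter's growing confidence in its estimate.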

artificial intelligence, machine learning, state variable

Twenty-Ninth AAAI Conference on Artificial Intelligence

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > Nebraska (0.04)
North America > United States > Kansas (0.04)

Industry: Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.88)

A Convex Formulation for Spectral Shrunk Clustering

Chang, Xiaojun (University of Technology Sydney) | Nie, Feiping (University of Texas at Arlington) | Ma, Zhigang (Carnegie Mellon University) | Yang, Yi (University of Technology Sydney) | Zhou, Xiaofang (The University of Queensland)

Spectral clustering is a fundamental technique in the field of data mining and information processing. Most existing spectral clustering algorithms integrate dimensionality reduction into the clustering process assisted by manifold learning in the original space. However, the manifold in the reduced-dimensional subspace is likely to exhibit altered properties in contrast with the original space. Thus, applying manifold information obtained from the original space to the clustering process in a low-dimensional subspace is prone to inferior performance. To address this issue, we propose a novel convex algorithm that mines the manifold structure in the low-dimensional subspace. In addition, our unified learning process tailors the manifold learning to the clustering task. Compared with other related methods, the proposed algorithm produces more structured clustering results. To validate the efficacy of the proposed algorithm, we perform extensive experiments on several benchmark datasets in comparison with some state-of-the-art clustering approaches. The experimental results demonstrate that the proposed algorithm achieves promising clustering performance.

data mining, machine learning, spectral

Twenty-Ninth AAAI Conference on Artificial Intelligence

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > Canada > Quebec > Capitale-Nationale Region > Québec (0.04)
North America > Canada > Quebec > Capitale-Nationale Region > Quebec City (0.04)

Genre: Research Report > New Finding (0.34)

Industry: Education (0.55)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Deep Modeling Complex Couplings within Financial Markets

Cao, Wei (University of Technology, Sydney) | Hu, Liang (University of Technology and Shanghai Jiaotong University) | Cao, Longbing (University of Technology)

The global financial crisis of 2008, its contagion to other regions, and its long-lasting impact on different markets show that it is increasingly important to understand the complicated coupling relationships across financial markets. This is very difficult, as complex hidden coupling relationships exist between different financial markets in various countries and are hard to model. The couplings involve interactions between homogeneous markets from various countries (which we call intra-market coupling), interactions between heterogeneous markets (inter-market coupling) and interactions between current and past market behaviors (temporal coupling). Very limited work has been done towards modeling such complex couplings; some existing methods predict market movement by simply aggregating indicators from various markets while ignoring the inbuilt couplings. As a result, these methods are highly sensitive to observations, and may often fail when financial indicators change slightly. In this paper, a coupled deep belief network is designed to accommodate the above three types of couplings across financial markets. With a deep-architecture model to capture the high-level coupled features, the proposed approach can infer market trends. Experimental results on data of stock and currency markets from three countries show that our approach outperforms other baselines, from both technical and business perspectives.

artificial intelligence, coupling, machine learning

Twenty-Ninth AAAI Conference on Artificial Intelligence

Country:

North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.06)
Asia > India (0.05)
South America > Brazil (0.04)

Industry: Banking & Finance > Trading (1.00)