AITopics

1210.5135

Country: Asia > China (0.14)

Genre: Research Report (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

arXiv.org Machine Learning Oct-18-2012

Least Absolute Gradient Selector: Statistical Regression via Pseudo-Hard Thresholding

Yang, Kun

Variable selection in linear models plays a pivotal role in modern statistics. Hard-thresholding methods such as $l_0$ regularization are theoretically ideal but computationally infeasible. In this paper, we propose a new approach, called LAGS, short for "least absolute gradient selector", to this challenging yet interesting problem by mimicking the discrete selection process of $l_0$ regularization. To estimate $\beta$ under the influence of noise, we consider the following convex program: \[\hat{\beta} = \arg\min_{\beta} \frac{1}{n}\|X^{T}(y - X\beta)\|_1 + \lambda_n\sum_{i = 1}^{p} w_i(y;X;n)|\beta_i|,\] where $\lambda_n > 0$ controls the sparsity, the $w_i > 0$, which depend on $y$, $X$ and $n$, are the weights on the different $\beta_i$, and $n$ is the sample size. Surprisingly, we show, both geometrically and analytically, that LAGS enjoys two attractive properties: (1) with strategically chosen $w_i$, LAGS exhibits the same discrete selection behavior and hard-thresholding property as $l_0$ regularization, a property we call "pseudo-hard thresholding"; (2) asymptotically, LAGS is consistent and capable of discovering the true model; nonasymptotically, LAGS is capable of identifying the sparsity in the model, and the prediction error of the coefficients is bounded at the noise level up to a logarithmic factor, $\log p$, where $p$ is the number of predictors. Computationally, LAGS can be solved efficiently by convex programming routines owing to its convexity, or by the simplex algorithm after recasting it as a linear program. Numerical simulations show that LAGS is superior to soft-thresholding methods in terms of mean squared error and parsimony of the model.
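To make the objective in the abstract above concrete, here is a minimal sketch that only evaluates it for a given $\beta$; it does not solve the convex program. The function name `lags_objective`, the toy data, and the unit weights are assumptions made for illustration, since the abstract does not say how the weights $w_i$ are chosen.

```python
def lags_objective(X, y, beta, lam, weights):
    """Evaluate (1/n) * ||X^T (y - X beta)||_1 + lam * sum_i w_i * |beta_i|.

    X is a list of n rows with p entries each; y has n entries;
    beta and weights have p entries. The weights are supplied by the
    caller because the abstract does not specify how they are chosen.
    """
    n, p = len(X), len(beta)
    # Residual r = y - X beta, one entry per sample.
    residual = [y[k] - sum(X[k][j] * beta[j] for j in range(p)) for k in range(n)]
    # Vector g = X^T r, one entry per predictor.
    grad = [sum(X[k][j] * residual[k] for k in range(n)) for j in range(p)]
    fit = sum(abs(g) for g in grad) / n
    penalty = lam * sum(w * abs(b) for w, b in zip(weights, beta))
    return fit + penalty


# Toy data (made up for illustration): y is generated by beta = (2, 0) without noise.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y = [2.0, 0.0, 2.0]
value_at_truth = lags_objective(X, y, [2.0, 0.0], lam=0.5, weights=[1.0, 1.0])
value_at_zero = lags_objective(X, y, [0.0, 0.0], lam=0.5, weights=[1.0, 1.0])
```

On this toy data the first term vanishes at the generating coefficients, so only the weighted penalty remains there, while at $\beta = 0$ only the first term contributes.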

artificial intelligence, machine learning, predictor, (18 more...)

1204.2353

Genre: Research Report (0.64)

Industry:

Health & Medicine > Therapeutic Area > Oncology (0.46)
Health & Medicine > Therapeutic Area > Urology (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Tembine, Hamidou, Tempone, Raul, Vilanova, Pedro

Mean-Field Learning: a Survey

arXiv.org Machine Learning Oct-17-2012

In this paper we study iterative procedures for stationary equilibria in games with a large number of players. Most learning algorithms for games with continuous action spaces are limited to strict contraction best reply maps, for which the Banach-Picard iteration converges at a geometric rate. When the best reply map is not a contraction, Ishikawa-based learning is proposed. The algorithm is shown to behave well for Lipschitz continuous and pseudo-contractive maps. However, the convergence rate is still unsatisfactory. Several acceleration techniques are presented. We explain how cognitive users can improve the convergence rate based on only a few measurements. The methodology has useful properties in mean-field games where the payoff function depends only on the player's own action and the mean of the mean-field (first moment mean-field games). A learning framework that exploits the structure of such games, called mean-field learning, is proposed. The proposed mean-field learning framework is suitable not only for games but also for non-convex global optimization problems. Then, we introduce mean-field learning without feedback and examine the convergence to equilibria in beauty contest games, which have interesting applications in financial markets. Finally, we provide a fully distributed mean-field learning scheme and its speedup versions for satisfactory solutions in wireless networks. We illustrate the convergence rate improvement with numerical examples.
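The contrast between the Banach-Picard iteration and an Ishikawa-type iteration can be sketched on a one-dimensional toy map. Everything below is an illustrative assumption rather than the paper's algorithm: the map `best_reply`, the constant step sizes `alpha` and `beta`, and the function names are all hypothetical.

```python
def picard(T, x0, steps):
    """Banach-Picard iteration: x_{k+1} = T(x_k)."""
    x = x0
    for _ in range(steps):
        x = T(x)
    return x


def ishikawa(T, x0, steps, alpha=0.5, beta=0.5):
    """Two-step Ishikawa iteration with constant step sizes (an assumption).

    y_k     = (1 - beta)  * x_k + beta  * T(x_k)
    x_{k+1} = (1 - alpha) * x_k + alpha * T(y_k)
    """
    x = x0
    for _ in range(steps):
        y = (1 - beta) * x + beta * T(x)
        x = (1 - alpha) * x + alpha * T(y)
    return x


def best_reply(x):
    # Toy map with fixed point 0.5; its Lipschitz constant is 1,
    # so it is not a strict contraction.
    return 1.0 - x


picard_result = picard(best_reply, 0.0, 50)      # keeps flipping between 0 and 1
ishikawa_result = ishikawa(best_reply, 0.0, 50)  # approaches the fixed point 0.5
```

On this map the plain iteration never settles, while the averaged two-step update halves the distance to the fixed point at every step.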

artificial intelligence, machine learning, reinforcement learning, (19 more...)

1210.4657

Country: North America > United States (0.28)

Genre: Research Report (1.00)

Industry:

Leisure & Entertainment > Games (0.46)
Banking & Finance (0.34)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.68)

Lin, Hui, Bilmes, Jeff A.

Learning Mixtures of Submodular Shells with Application to Document Summarization

We introduce a method to learn a mixture of submodular "shells" in a large-margin setting. A submodular shell is an abstract submodular function that can be instantiated with a ground set and a set of parameters to produce a submodular function. A mixture of such shells can then also be so instantiated to produce a more complex submodular function. What our algorithm learns are the mixture weights over such shells. We provide a risk bound guarantee when learning in a large-margin structured-prediction setting using a projected subgradient method when only approximate submodular optimization is possible (such as with submodular function maximization). We apply this method to the problem of multi-document summarization and produce the best results reported so far on the widely used NIST DUC-05 through DUC-07 document summarization corpora.
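The idea of a shell that is instantiated with a ground set and parameters, and of a weighted mixture of such instantiations, can be sketched as follows. The two shells (a coverage count and a capped total length), the fixed mixture weights, and the greedy maximizer are assumptions chosen for illustration; the large-margin learning of the weights described in the abstract is not shown.

```python
def coverage_shell(ground_set, concepts_of):
    """Instantiate a coverage function: number of distinct concepts covered by S."""
    concepts = {item: set(concepts_of[item]) for item in ground_set}

    def f(S):
        covered = set()
        for item in S:
            covered |= concepts[item]
        return float(len(covered))
    return f


def capped_length_shell(ground_set, length_of, cap):
    """Instantiate min(total length of S, cap), a truncated modular function."""
    lengths = {item: length_of[item] for item in ground_set}

    def f(S):
        return float(min(sum(lengths[item] for item in S), cap))
    return f


def mixture(functions, weights):
    """A nonnegative weighted sum of submodular functions is again submodular."""
    def f(S):
        return sum(w * g(S) for w, g in zip(weights, functions))
    return f


def greedy(f, ground_set, budget):
    """Approximate maximization of f under a cardinality budget."""
    chosen = []
    for _ in range(budget):
        remaining = [x for x in ground_set if x not in chosen]
        if not remaining:
            break
        best = max(remaining, key=lambda x: f(chosen + [x]) - f(chosen))
        if f(chosen + [best]) - f(chosen) <= 0:
            break
        chosen.append(best)
    return chosen


# Toy "sentences" with made-up concepts and lengths.
ground = ["a", "b", "c"]
concepts_of = {"a": {1, 2}, "b": {2, 3}, "c": {1, 2}}
length_of = {"a": 3, "b": 1, "c": 2}
shells = [coverage_shell(ground, concepts_of),
          capped_length_shell(ground, length_of, cap=4)]
score = mixture(shells, weights=[1.0, 0.5])  # weights fixed by hand here
summary = greedy(score, ground, budget=2)
```

In the setting the abstract describes, the weights passed to `mixture` would be the learned quantities rather than hand-picked constants.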

inductive learning, optimization problem, submodular function, (18 more...)

1210.4871

Country:

North America > United States > New York (0.14)
North America > United States > Colorado (0.14)
North America > United States > California > Los Angeles County > Los Angeles (0.14)
(4 more...)

Genre: Research Report (0.50)

Industry: Energy > Oil & Gas (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Data Science > Data Mining (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.89)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)

Amizadeh, Saeed, Thiesson, Bo, Hauskrecht, Milos

Variational Dual-Tree Framework for Large-Scale Transition Matrix Approximation

In recent years, non-parametric methods utilizing random walks on graphs have been used to solve a wide range of machine learning problems, but in their simplest form they do not scale well due to the quadratic complexity. In this paper, a new dual-tree based variational approach for approximating the transition matrix and efficiently performing the random walk is proposed. The approach exploits a connection between kernel density estimation, mixture modeling, and random walk on graphs in an optimization of the transition matrix for the data graph that ties together edge transition probabilities that are similar. Compared to the de facto standard approximation method based on k-nearest neighbors, we demonstrate orders-of-magnitude speedup without sacrificing accuracy for Label Propagation tasks on benchmark data sets in semi-supervised learning.

graph, partition, refinement, (16 more...)

1210.4846

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > Washington > King County > Redmond (0.04)
North America > United States > New York > New York County > New York City (0.04)
(5 more...)

Genre:

Research Report (0.50)
Instructional Material (0.46)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Yuan, Changhe, Malone, Brandon

An Improved Admissible Heuristic for Learning Optimal Bayesian Networks

Recently two search algorithms, A* and breadth-first branch and bound (BFBnB), were developed based on a simple admissible heuristic for learning Bayesian network structures that optimize a scoring function. The heuristic represents a relaxation of the learning problem such that each variable chooses optimal parents independently. As a result, the heuristic may contain many directed cycles and result in a loose bound. This paper introduces an improved admissible heuristic that tries to avoid directed cycles within small groups of variables. A sparse representation is also introduced to store only the unique optimal parent choices. Empirical results show that the new techniques significantly improved the efficiency and scalability of A* and BFBnB on most of the datasets tested in this paper.
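The relaxation behind the simple heuristic, and the reason it can yield a loose bound, can be illustrated with a small sketch. The score table below is made up, lower scores are assumed to be better, and the function names are hypothetical; this is not the improved heuristic the paper introduces.

```python
def simple_heuristic(variables, local_scores):
    """Sum of each variable's best local score, chosen independently."""
    return sum(min(local_scores[x].values()) for x in variables)


def independent_choices(variables, local_scores):
    """Parent set each variable would pick when ignoring acyclicity."""
    return {x: min(local_scores[x], key=local_scores[x].get) for x in variables}


def has_directed_cycle(parents):
    """Detect a directed cycle by following child -> parent links."""
    state = {}  # 1 = currently being visited, 2 = fully explored

    def visit(node):
        if state.get(node) == 1:
            return True
        if state.get(node) == 2:
            return False
        state[node] = 1
        for parent in parents.get(node, ()):
            if visit(parent):
                return True
        state[node] = 2
        return False

    return any(visit(v) for v in parents)


# Made-up local scores: local_scores[x][parent_set] is the cost of x given parent_set.
local_scores = {
    "A": {frozenset(): 10.0, frozenset({"B"}): 4.0},
    "B": {frozenset(): 9.0, frozenset({"A"}): 3.0},
    "C": {frozenset(): 5.0, frozenset({"A"}): 5.5},
}
variables = ["A", "B", "C"]
bound = simple_heuristic(variables, local_scores)
picks = independent_choices(variables, local_scores)
cyclic = has_directed_cycle(picks)
```

In this toy table A prefers B as a parent and B prefers A, so the independent choices form a cycle and the bound of 12 undershoots the best acyclic total of 18.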

artificial intelligence, machine learning, pattern database, (17 more...)

1210.4913

Country: North America > United States (0.93)

Genre: Research Report > New Finding (0.34)

Industry: Education (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Virtanen, Seppo, Jia, Yangqing, Klami, Arto, Darrell, Trevor

Factorized Multi-Modal Topic Model

Multi-modal data collections, such as corpora of paired images and text snippets, require analysis methods beyond single-view component and topic models. For continuous observations the current dominant approach is based on extensions of canonical correlation analysis, factorizing the variation into components shared by the different modalities and those private to each of them. For count data, multiple variants of topic models attempting to tie the modalities together have been presented. All of these, however, lack the ability to learn components private to one modality, and consequently will try to force dependencies even between minimally correlating modalities. In this work we combine the two approaches by presenting a novel HDP-based topic model that automatically learns both shared and private topics. The model is shown to be especially useful for querying the contents of one domain given samples of the other.

artificial intelligence, machine learning, natural language, (19 more...)

1210.492

Country:

Europe (0.93)
Asia > Middle East (0.15)

Genre: Research Report (0.40)

Industry: Transportation (0.94)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Latent Dirichlet Allocation Uncovers Spectral Characteristics of Drought Stressed Plants

Wahabzada, Mirwaes, Kersting, Kristian, Bauckhage, Christian, Roemer, Christoph, Ballvora, Agim, Pinto, Francisco, Rascher, Uwe, Leon, Jens, Ploemer, Lutz

Understanding the adaptation process of plants to drought stress is essential in improving management practices, breeding strategies as well as engineering viable crops for a sustainable agriculture in the coming decades. Hyper-spectral imaging provides a particularly promising approach to gain such understanding since it allows the non-destructive discovery of spectral characteristics of plants governed primarily by scattering and absorption characteristics of the leaf internal structure and biochemical constituents. Several drought stress indices have been derived using hyper-spectral imaging. However, they are typically based on only a few hyper-spectral images, rely on interpretations of experts, and consider only a few wavelengths. In this study, we present the first data-driven approach to discovering spectral drought stress indices, treating it as an unsupervised labeling problem at massive scale. To make use of short range dependencies of spectral wavelengths, we develop an online variational Bayes algorithm for latent Dirichlet allocation with a convolved Dirichlet regularizer. This approach scales to massive datasets and, hence, provides a more objective complement to plant physiological practices. The spectral topics found conform to plant physiological knowledge and can be computed in a fraction of the time compared to existing LDA approaches.

artificial intelligence, machine learning, natural language, (22 more...)

1210.4919

Country: North America > United States (0.93)

Genre: Research Report > New Finding (0.88)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Food & Agriculture > Agriculture (1.00)
Education (0.93)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.71)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.61)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

Walsh, Thomas J., Goschin, Sergiu

Dynamic Teaching in Sequential Decision Making Environments

We describe theoretical bounds and a practical algorithm for teaching a model by demonstration in a sequential decision making environment. Unlike previous efforts that have optimized learners that watch a teacher demonstrate a static policy, we focus on the teacher as a decision maker who can dynamically choose different policies to teach different parts of the environment. We develop several teaching frameworks based on previously defined supervised protocols, such as Teaching Dimension, extending them to handle noise and sequences of inputs encountered in an MDP. We provide theoretical bounds on the learnability of several important model classes in this setting and suggest a practical algorithm for dynamic teaching.

learner, machine learning, reinforcement learning, (18 more...)

1210.4918

Country: North America > United States > Kansas (0.28)

Genre: Research Report (0.50)

Industry:

Education (0.69)
Transportation > Passenger (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Fast Graph Construction Using Auction Algorithm

Wang, Jun, Xia, Yinglong

In practical machine learning systems, graph-based data representation has been widely used in various learning paradigms, ranging from unsupervised clustering to supervised classification. Besides applications with natural graph or network structure data, such as social network analysis and relational learning, many other applications involve a critical step of converting data vectors to an adjacency graph. In particular, a sparse subgraph extracted from the original graph is often required due to both theoretical and practical needs. Previous studies clearly show that the performance of different learning algorithms, e.g., clustering and classification, benefits from such sparse subgraphs with balanced node connectivity. However, the existing graph construction methods are either computationally expensive or perform unsatisfactorily. In this paper, we utilize a scalable method called the auction algorithm and its parallel extension to recover a sparse yet nearly balanced subgraph with significantly reduced computational cost. Empirical study and comparison with the state-of-the-art approaches clearly demonstrate the superiority of the proposed method in both efficiency and accuracy.
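For readers unfamiliar with the auction algorithm, the sketch below shows its classic sequential form for a square assignment problem. It is an assumption-laden illustration only: the paper applies the method and a parallel extension to sparse graph construction, which this sketch does not attempt, and the function name, benefit matrix, and `eps` value are made up.

```python
def auction_assignment(benefit, eps):
    """Forward auction for an n-by-n assignment problem.

    benefit[i][j] is the value of giving object j to person i.
    Returns a list whose entry i is the object assigned to person i.
    """
    n = len(benefit)
    prices = [0.0] * n
    object_of = [None] * n   # object currently held by each person
    person_of = [None] * n   # person currently holding each object
    unassigned = list(range(n))
    while unassigned:
        i = unassigned.pop()
        # Net value of each object for person i at the current prices.
        values = [benefit[i][j] - prices[j] for j in range(n)]
        best = max(range(n), key=values.__getitem__)
        second = max((values[j] for j in range(n) if j != best),
                     default=values[best])
        # Bid: raise the price by the value gap plus eps, so prices always increase.
        prices[best] += values[best] - second + eps
        previous = person_of[best]
        if previous is not None:
            # The previous holder is outbid and re-enters the auction.
            object_of[previous] = None
            unassigned.append(previous)
        person_of[best] = i
        object_of[i] = best
    return object_of


# Made-up integer benefits; eps is chosen below 1/n.
benefit = [[10, 5, 1],
           [9, 8, 2],
           [3, 4, 7]]
assignment = auction_assignment(benefit, eps=0.25)
total = sum(benefit[i][j] for i, j in enumerate(assignment))
```

With integer benefits and `eps` below 1/n, this form of the auction is known to end at an optimal assignment; larger `eps` values trade accuracy for fewer bidding rounds.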

artificial intelligence, graph, machine learning, (14 more...)

1210.4917

Country:

North America > United States (0.46)
North America > Canada (0.28)

Genre: Research Report (1.00)

Industry: Information Technology (0.35)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.68)