AITopics

1009.0571

Country: North America > United States (1.00)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.94)

Fox, Emily B., Sudderth, Erik B., Jordan, Michael I., Willsky, Alan S.

Joint Modeling of Multiple Related Time Series via the Beta Process

arXiv.org Machine LearningNov-17-2011

We propose a Bayesian nonparametric approach to the problem of jointly modeling multiple related time series. Our approach is based on the discovery of a set of latent, shared dynamical behaviors. Using a beta process prior, the size of the set and the sharing pattern are both inferred from data. We develop efficient Markov chain Monte Carlo methods based on the Indian buffet process representation of the predictive distribution of the beta process, without relying on a truncated model. In particular, our approach uses the sum-product algorithm to efficiently compute Metropolis-Hastings acceptance probabilities, and explores new dynamical behaviors via birth and death proposals. We examine the benefits of our proposed feature-based model on several synthetic datasets, and also demonstrate promising results on unsupervised segmentation of visual motion capture data.

bayesian inference, health & medicine, time series, (17 more...)

1111.4226

Country:

North America > United States > California > Alameda County > Berkeley (0.14)
North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.14)
North America > United States > Massachusetts (0.14)

Genre: Research Report (0.63)

Industry:

Health & Medicine (0.93)
Energy > Oil & Gas (0.46)

Vinyals, Oriol, Povey, Daniel

Krylov Subspace Descent for Deep Learning

arXiv.org Machine LearningNov-17-2011

Daniel Povey Microsoft Research One Microsoft Way Redmond, WA 98052 In this paper, we propose a second order optimization method to learn models where both the dimensionality of the parameter space and the number of training samples is high. In our method, we construct on each iteration a Krylov subspace formed by the gradient and an approximation to the Hessian matrix, and then use a subset of the training data samples to optimize over this subspace. As with the Hessian Free (HF) method of [7], the Hessian matrix is never explicitly constructed, and is computed using a subset of data. In practice, as in HF, we typically use a positive definite substitute for the Hessian matrix such as the Gauss-Newton matrix. We investigate the effectiveness of our proposed method on deep neural networks, and compare its performance to widely used methods such as stochastic gradient descent, conjugate gradient descent and L-BFGS, and also to HF. Our method leads to faster convergence than either L-BFGS or HF, and generally performs better than either of them in cross-validation accuracy. It is also simpler and more general than HF, as it does not require a positive semi-definite approximation of the Hessian matrix to work well nor the setting of a damping parameter. The chief drawback versus HF is the need for memory to store a basis for the Krylov subspace.

artificial intelligence, machine learning, matrix, (17 more...)

1111.4259

Country: North America > United States > Washington > King County > Redmond (0.24)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Charles, Adam S., Garrigues, Pierre, Rozell, Christopher J.

Analog Sparse Approximation with Applications to Compressed Sensing

arXiv.org Machine LearningNov-17-2011

Recent research has shown that performance in signal processing tasks can often be significantly improved by using signal models based on sparse representations, where a signal is approximated using a small number of elements from a fixed dictionary. Unfortunately, inference in this model involves solving non-smooth optimization problems that are computationally expensive. While significant efforts have focused on developing digital algorithms specifically for this problem, these algorithms are inappropriate for many applications because of the time and power requirements necessary to solve large optimization problems. Based on recent work in computational neuroscience, we explore the potential advantages of continuous time dynamical systems for solving sparse approximation problems if they were implemented in analog VLSI. Specifically, in the simulated task of recovering synthetic and MRI data acquired via compressive sensing techniques, we show that these systems can potentially perform recovery at time scales of 10-20{\mu}s, supporting datarates of 50-100 kHz (orders of magnitude faster that digital algorithms). Furthermore, we show analytically that a wide range of sparse approximation problems can be solved in the same basic architecture, including approximate $\ell^p$ norms, modified $\ell^1$ norms, re-weighted $\ell^1$ and $\ell^2$, the block $\ell^1$ norm and classic Tikhonov regularization.

artificial intelligence, bayesian inference, machine learning, (19 more...)

1111.4118

Country: North America > United States (1.00)

Genre: Research Report (0.64)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.54)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)

Rodriguez-Toro, Victor A., Garzon, Jaime E., Lopez, Jesus A.

Control Neuronal por Modelo Inverso de un Servosistema Usando Algoritmos de Aprendizaje Levenberg-Marquardt y Bayesiano

arXiv.org Artificial IntelligenceNov-17-2011

In this paper we present the experimental results of the neural network control of a servo-system in order to control its speed. The control strategy is implemented by using an inverse-model control based on Artificial Neural Networks (ANNs). The network training was performed using two learning algorithms: Levenberg-Marquardt and Bayesian regularization. We evaluate the generalization capability for each method according to both the correct operation of the controller to follow the reference signal, and the control efforts developed by the ANN-based controller.

artificial intelligence, entrenamiento, machine learning, (15 more...)

1111.4267

Country: North America > United States (0.29)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Dimitrakakis, Christos, Rothkopf, Constantin

Bayesian multitask inverse reinforcement learning

arXiv.org Artificial IntelligenceNov-17-2011

We generalise the problem of inverse reinforcement learning to multiple tasks, from multiple demonstrations. Each one may represent one expert trying to solve a different task, or as different experts trying to solve the same task. Our main contribution is to formalise the problem as statistical preference elicitation, via a number of structured priors, whose form captures our biases about the relatedness of different tasks or expert policies. In doing so, we introduce a prior on policy optimality, which is more natural to specify. We show that our framework allows us not only to learn to efficiently from multiple experts but to also effectively differentiate between the goals of each. Possible applications include analysing the intrinsic motivations of subjects in behavioural experiments and learning from multiple teachers.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

doi: 10.1007/978-3-642-29946-9_27

1106.3655

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.94)

arXiv.org Machine LearningNov-16-2011

Fast Learning Rate of Non-Sparse Multiple Kernel Learning and Optimal Regularization Strategies

Suzuki, Taiji

In this paper, we give a new generalization error bound of Multiple Kernel Learning (MKL) for a general class of regularizations, and discuss what kind of regularization gives a favorable predictive accuracy. Our main target in this paper is dense type regularizations including \ellp-MKL. According to the recent numerical experiments, the sparse regularization does not necessarily show a good performance compared with dense type regularizations. Motivated by this fact, this paper gives a general theoretical tool to derive fast learning rates of MKL that is applicable to arbitrary mixed-norm-type regularizations in a unifying manner. This enables us to compare the generalization performances of various types of regularizations. As a consequence, we observe that the homogeneity of the complexities of candidate reproducing kernel Hilbert spaces (RKHSs) affects which regularization strategy (\ell1 or dense) is preferred. In fact, in homogeneous complexity settings where the complexities of all RKHSs are evenly same, \ell1-regularization is optimal among all isotropic norms. On the other hand, in inhomogeneous complexity settings, dense type regularizations can show better learning rate than sparse \ell1-regularization. We also show that our learning rate achieves the minimax lower bound in homogeneous complexity settings.

artificial intelligence, machine learning, regularization, (16 more...)

1111.3781

Country: North America > Canada (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)

Synnaeve, Gabriel, Bessière, Pierre

A Bayesian Model for Plan Recognition in RTS Games applied to StarCraft

arXiv.org Artificial IntelligenceNov-16-2011

The task of keyhole (unobtrusive) plan recognition is central to adaptive game AI. "Tech trees" or "build trees" are the core of real-time strategy (RTS) game strategic (long term) planning. This paper presents a generic and simple Bayesian model for RTS build tree prediction from noisy observations, which parameters are learned from replays (game logs). This unsupervised machine learning approach involves minimal work for the game developers as it leverage players' data (com- mon in RTS). We applied it to StarCraft1 and showed that it yields high quality and robust predictions, that can feed an adaptive AI.

artificial intelligence, build tree, machine learning, (16 more...)

1111.3735

Country: Europe (0.14)

Genre: Research Report (0.50)

Industry: Leisure & Entertainment > Games > Computer Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling > Plan Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Chevaleyre, Yann, Lang, Jérôme, Maudet, Nicolas, Monnot, Jérôme, Xia, Lirong

New Candidates Welcome! Possible Winners with respect to the Addition of New Candidates

arXiv.org Artificial IntelligenceNov-15-2011

In voting contexts, some new candidates may show up in the course of the process. In this case, we may want to determine which of the initial candidates are possible winners, given that a fixed number $k$ of new candidates will be added. We give a computational study of this problem, focusing on scoring rules, and we provide a formal comparison with related problems such as control via adding candidates or cloning.

artificial intelligence, cowinner, new candidate, (16 more...)

1111.369

Genre: Research Report (0.81)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)

Journal of Artificial Intelligence ResearchNov-14-2011

Learning to Make Predictions In Partially Observable Environments Without a Generative Model

Talvitie, E., Singh, S.

When faced with the problem of learning a model of a high-dimensional environment, a common approach is to limit the model to make only a restricted set of predictions, thereby simplifying the learning problem. These partial models may be directly useful for making decisions or may be combined together to form a more complete, structured model. However, in partially observable (non-Markov) environments, standard model-learning methods learn generative models, i.e. models that provide a probability distribution over all possible futures (such as POMDPs). It is not straightforward to restrict such models to make only certain predictions, and doing so does not always simplify the learning problem. In this paper we present prediction profile models: non-generative partial models for partially observable systems that make only a given set of predictions, and are therefore far simpler than generative models in some cases. We formalize the problem of learning a prediction profile model as a transformation of the original model-learning problem, and show empirically that one can learn prediction profile models that make a small set of important predictions even in systems that are too complex for standard generative models.

prediction, prediction profile model, prediction profile system, (12 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.3396

AI Access Foundation

10729

Journal of Artificial Intelligence Research

Country:

North America > United States > Michigan > Washtenaw County > Ann Arbor (0.14)
North America > United States > Massachusetts > Hampshire County > Amherst (0.04)
North America > United States > Rhode Island > Providence County > Providence (0.04)
(3 more...)

Genre: Research Report (0.87)

Industry: Education > Focused Education > Special Education (0.65)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Generation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.92)