AITopics

Most conventional Reinforcement Learning (RL) algorithms aim to optimize decision-making rules in terms of the expected returns. However, especially for risk management purposes, other risk-sensitive criteria such as the value-at-risk or the expected shortfall are sometimes preferred in real applications. Here, we describe a parametric method for estimating density of the returns, which allows us to handle various criteria in a unified manner. We first extend the Bellman equation for the conditional expected return to cover a conditional probability density of the returns. Then we derive an extension of the TD-learning algorithm for estimating the return densities in an unknown environment. As test instances, several parametric density estimation algorithms are presented for the Gaussian, Laplace, and skewed Laplace distributions. We show that these algorithms lead to risk-sensitive as well as robust RL paradigms through numerical experiments.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

1203.3497

Country: Asia > Japan > Honshū (0.14)

Genre: Research Report (0.50)

Industry: Information Technology (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

Meila, Marina, Chen, Harr

Dirichlet Process Mixtures of Generalized Mallows Models

We present a Dirichlet process mixture model over discrete incomplete rankings and study two Gibbs sampling inference techniques for estimating posterior clusterings. The first approach uses a slice sampling subcomponent for estimating cluster parameters. The second approach marginalizes out several cluster parameters by taking advantage of approximations to the conditional posteriors. We empirically demonstrate (1) the effectiveness of this approximation for improving convergence, (2) the benefits of the Dirichlet process model over alternative clustering techniques for ranked data, and (3) the applicability of the approach to exploring large realworld ranking datasets.

artificial intelligence, bayesian inference, machine learning, (16 more...)

1203.3496

Country: North America > United States > Massachusetts (0.28)

Genre: Research Report (1.00)

Industry: Education (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)

Inference-less Density Estimation using Copula Bayesian Networks

Elidan, Gal

We consider learning continuous probabilistic graphical models in the face of missing data. For non-Gaussian models, learning the parameters and structure of such models depends on our ability to perform efficient inference, and can be prohibitive even for relatively modest domains. Recently, we introduced the Copula Bayesian Network (CBN) density model - a flexible framework that captures complex high-dimensional dependency structures while offering direct control over the univariate marginals, leading to improved generalization. In this work we show that the CBN model also offers significant computational advantages when training data is partially observed. Concretely, we leverage on the specialized form of the model to derive a computationally amenable learning objective that is a lower bound on the log-likelihood function. Importantly, our energy-like bound circumvents the need for costly inference of an auxiliary distribution, thus facilitating practical learning of highdimensional densities. We demonstrate the effectiveness of our approach for learning the structure and parameters of a CBN model for two reallife continuous domains.

artificial intelligence, bayesian inference, machine learning, (16 more...)

1203.3476

Country: North America > United States > California (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Blundell, Charles, Teh, Yee Whye, Heller, Katherine A.

Bayesian Rose Trees

Hierarchical structure is ubiquitous in data across many domains. There are many hierarchical clustering methods, frequently used by domain experts, which strive to discover this structure. However, most of these methods limit discoverable hierarchies to those with binary branching structure. This limitation, while computationally convenient, is often undesirable. In this paper we explore a Bayesian hierarchical clustering algorithm that can produce trees with arbitrary branching structure at each node, known as rose trees. We interpret these trees as mixtures over partitions of a data set, and use a computationally efficient, greedy agglomerative algorithm to find the rose trees which have high marginal likelihood given the data. Lastly, we perform experiments which demonstrate that rose trees are better models of data than the typical binary trees returned by other hierarchical clustering algorithms.

artificial intelligence, machine learning, partition, (19 more...)

1203.3468

Country: Europe > United Kingdom > England (0.28)

Genre: Research Report (0.82)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)

Armagan, Artin, Dunson, David B., Clyde, Merlise

Generalized Beta Mixtures of Gaussians

arXiv.org Machine LearningMar-13-2012

In recent years, a rich variety of shrinkage priors have been proposed that have great promise in addressing massive regression problems. In general, these new priors can be expressed as scale mixtures of normals, but have more complex forms and better properties than traditional Cauchy and double exponential priors. We first propose a new class of normal scale mixtures through a novel generalized beta distribution that encompasses many interesting priors as special cases. This encompassing framework should prove useful in comparing competing priors, considering properties and revealing close connections. We then develop a class of variational Bayes approximations through the new hierarchy presented that will scale more efficiently to the types of truly massive data sets that are now encountered routinely.

artificial intelligence, hierarchy, machine learning, (18 more...)

1107.4976

Country: North America > United States (1.00)

Genre: Research Report (0.50)

Industry: Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Ahmed, Mohamed Osama, Bibalan, Pouyan T., de Freitas, Nando, Fauvel, Simon

Decentralized, Adaptive, Look-Ahead Particle Filtering

arXiv.org Machine LearningMar-11-2012

The decentralized particle filter (DPF) was proposed recently to increase the level of parallelism of particle filtering. Given a decomposition of the state space into two nested sets of variables, the DPF uses a particle filter to sample the first set and then conditions on this sample to generate a set of samples for the second set of variables. The DPF can be understood as a variant of the popular Rao-Blackwellized particle filter (RBPF), where the second step is carried out using Monte Carlo approximations instead of analytical inference. As a result, the range of applications of the DPF is broader than the one for the RBPF. In this paper, we improve the DPF in two ways. First, we derive a Monte Carlo approximation of the optimal proposal distribution and, consequently, design and implement a more efficient look-ahead DPF. Although the decentralized filters were initially designed to capitalize on parallel implementation, we show that the look-ahead DPF can outperform the standard particle filter even on a single machine. Second, we propose the use of bandit algorithms to automatically configure the state space decomposition of the DPF.

algorithm, artificial intelligence, machine learning, (17 more...)

1203.2394

Country: North America > Canada > British Columbia (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Caron, François, Bornn, Luke, Doucet, Arnaud

Sparsity-Promoting Bayesian Dynamic Linear Models

arXiv.org Machine LearningMar-1-2012

Sparsity-promoting priors have become increasingly popular over recent years due to an increased number of regression and classification applications involving a large number of predictors. In time series applications where observations are collected over time, it is often unrealistic to assume that the underlying sparsity pattern is fixed. We propose here an original class of flexible Bayesian linear models for dynamic sparsity modelling. The proposed class of models expands upon the existing Bayesian literature on sparse regression using generalized multivariate hyperbolic distributions. The properties of the models are explored through both analytic results and simulation studies. We demonstrate the model on a financial application where it is shown that it accurately represents the patterns seen in the analysis of stock and derivative data, and is able to detect major events by filtering an artificial portfolio of assets.

banking & finance, bayesian inference, sparsity pattern, (19 more...)

1203.0106

Country:

Europe > France (0.28)
North America > United States (0.14)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)

Genre: Research Report (1.00)

Industry:

Energy > Oil & Gas (1.00)
Banking & Finance > Real Estate (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

arXiv.org Artificial IntelligenceFeb-28-2012

One Decade of Universal Artificial Intelligence

Hutter, Marcus

The first decade of this century has seen the nascency of the first mathematical theory of general artificial intelligence. This theory of Universal Artificial Intelligence (UAI) has made significant contributions to many theoretical, philosophical, and practical AI questions. In a series of papers culminating in book (Hutter, 2005), an exciting sound and complete mathematical model for a super intelligent agent (AIXI) has been developed and rigorously analyzed. While nowadays most AI researchers avoid discussing intelligence, the award-winning PhD thesis (Legg, 2008) provided the philosophical embedding and investigated the UAI-based universal measure of rational intelligence, which is formal, objective and non-anthropocentric. Recently, effective approximations of AIXI have been derived and experimentally investigated in JAIR paper (Veness et al. 2011). This practical breakthrough has resulted in some impressive applications, finally muting earlier critique that UAI is only a theory. For the first time, without providing any domain knowledge, the same agent is able to self-adapt to a diverse range of interactive environments. For instance, AIXI is able to learn from scratch to play TicTacToe, Pacman, Kuhn Poker, and other games by trial and error, without even providing the rules of the games. These achievements give new hope that the grand goal of Artificial General Intelligence is not elusive. This article provides an informal overview of UAI in context. It attempts to gently introduce a very theoretical, formal, and mathematical subject, and discusses philosophical and technical ingredients, traits of intelligence, some social questions, and the past and future of UAI.

artificial intelligence, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

1202.6153

Country:

North America > United States (0.68)
Europe > United Kingdom > England (0.14)

Genre: Research Report (0.82)

Industry: Leisure & Entertainment > Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.93)
(2 more...)

Journal of Artificial Intelligence ResearchFeb-27-2012

Exploiting Model Equivalences for Solving Interactive Dynamic Influence Diagrams

Zeng, Y., Doshi, P.

We focus on the problem of sequential decision making in partially observable environments shared with other agents of uncertain types having similar or conflicting objectives. This problem has been previously formalized by multiple frameworks one of which is the interactive dynamic influence diagram (I-DID), which generalizes the well-known influence diagram to the multiagent setting. I-DIDs are graphical models and may be used to compute the policy of an agent given its belief over the physical state and others' models, which changes as the agent acts and observes in the multiagent setting. As we may expect, solving I-DIDs is computationally hard. This is predominantly due to the large space of candidate models ascribed to the other agents and its exponential growth over time. We present two methods for reducing the size of the model space and stemming its exponential growth. Both these methods involve aggregating individual models into equivalence classes. Our first method groups together behaviorally equivalent models and selects only those models for updating which will result in predictive behaviors that are distinct from others in the updated model space. The second method further compacts the model space by focusing on portions of the behavioral predictions. Specifically, we cluster actionally equivalent models that prescribe identical actions at a single time step. Exactly identifying the equivalences would require us to solve all models in the initial set. We avoid this by selectively solving some of the models, thereby introducing an approximation. We discuss the error introduced by the approximation, and empirically demonstrate the improved efficiency in solving I-DIDs due to the equivalences.

agent, i-did, node, (15 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.3461

AI Access Foundation

10749

Journal of Artificial Intelligence Research

Country:

North America > United States > Georgia > Clarke County > Athens (0.14)
Europe > Denmark > North Jutland > Aalborg (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Industry: Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Jiao, Jiantao, Zhang, Lin, Nowak, Robert

Minimax-Optimal Bounds for Detectors Based on Estimated Prior Probabilities

arXiv.org Machine LearningFeb-27-2012

In many signal detection and classification problems, we have knowledge of the distribution under each hypothesis, but not the prior probabilities. This paper is aimed at providing theory to quantify the performance of detection via estimating prior probabilities from either labeled or unlabeled training data. The error or {\em risk} is considered as a function of the prior probabilities. We show that the risk function is locally Lipschitz in the vicinity of the true prior probabilities, and the error of detectors based on estimated prior probabilities depends on the behavior of the risk function in this locality. In general, we show that the error of detectors based on the Maximum Likelihood Estimate (MLE) of the prior probabilities converges to the Bayes error at a rate of $n^{-1/2}$, where $n$ is the number of training data. If the behavior of the risk function is more favorable, then detectors based on the MLE have errors converging to the corresponding Bayes errors at optimal rates of the form $n^{-(1+\alpha)/2}$, where $\alpha>0$ is a parameter governing the behavior of the risk function with a typical value $\alpha = 1$. The limit $\alpha \rightarrow \infty$ corresponds to a situation where the risk function is flat near the true probabilities, and thus insensitive to small errors in the MLE; in this case the error of the detector based on the MLE converges to the Bayes error exponentially fast with $n$. We show the bounds are achievable no matter given labeled or unlabeled training data and are minimax-optimal in labeled case.

artificial intelligence, bayesian inference, machine learning, (17 more...)

doi: 10.1109/TIT.2012.2201914

1107.6027

Country: North America > United States > Wisconsin (0.28)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)