AITopics

We present a nonparametric Bayesian approach to inverse reinforcement learning (IRL) for multiple reward functions. Most previous IRL algorithms assume that the behaviour data is obtained from an agent who is optimizing a single reward function, but this assumption is hard to be met in practice. Our approach is based on integrating the Dirichlet process mixture model into Bayesian IRL. We provide an efficient Metropolis-Hastings sampling algorithm utilizing the gradient of the posterior to estimate the underlying reward functions, and demonstrate that our approach outperforms the previous ones via experiments on a number of problem domains.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.49)

Coding efficiency and detectability of rate fluctuations with non-Poisson neuronal firing

Koyama, Shinsuke

Statistical features of neuronal spike trains are known to be non-Poisson. Here, we investigate the extent to which the non-Poissonian feature affects the efficiency of transmitting information on fluctuating firing rates. For this purpose, we introduce the Kullbuck-Leibler (KL) divergence as a measure of the efficiency of information encoding, and assume that spike trains are generated by time-rescaled renewal processes. We show that the KL divergence determines the lower bound of the degree of rate fluctuations below which the temporal variation of the firing rates is undetectable from sparse data. We also show that the KL divergence, as well as the lower bound, depends not only on the variability of spikes in terms of the coefficient of variation, but also significantly on the higher-order moments of interspike interval (ISI) distributions. We examine three specific models that are commonly used for describing the stochastic nature of spikes (the gamma, inverse Gaussian (IG) and lognormal ISI distributions), and find that the time-rescaled renewal process with the IG distribution achieves the largest KL divergence, followed by the lognormal and gamma distributions.

artificial intelligence, kl divergence, machine learning, (17 more...)

Country: Asia > Japan > Honshū (0.28)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.46)

Azari, Hossein, Parks, David, Xia, Lirong

Random Utility Theory for Social Choice

A special case that has received significant attention is the Plackett-Luce model, for which fast inference methods for maximum likelihood estimators are available. This paper develops conditions on general random utility models that enable fast inference within a Bayesian framework through MC-EM, providing concave loglikelihood functionsand bounded sets of global maxima solutions. Results on both real-world and simulated data provide support for the scalability of the approach andcapability for model selection among general random utility models including Plackett-Luce.

artificial intelligence, machine learning, variance, (17 more...)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.87)

Adamskiy, Dmitry, Warmuth, Manfred K., Koolen, Wouter M.

Putting Bayes to sleep

We consider sequential prediction algorithms that are given the predictions from a set of models as inputs. If the nature of the data is changing over time in that different models predict well on different segments of the data, then adaptivity is typically achieved by mixing into the weights in each round a bit of the initial prior (kind of like a weak restart). However, what if the favored models in each segment are from a small subset, i.e. the data is likely to be predicted well by models that predicted well before? Curiously, fitting such ''sparse composite models'' is achieved by mixing in a bit of all the past posteriors. This self-referential updating method is rather peculiar, but it is efficient and gives superior performance on many natural data sets. Also it is important because it introduces a long-term memory: any model that has done well in the past can be recovered quickly. While Bayesian interpretations can be found for mixing in a bit of the initial prior, no Bayesian interpretation is known for mixing in past posteriors. We build atop the ''specialist'' framework from the online learning literature to give the Mixing Past Posteriors update a proper Bayesian foundation. We apply our method to a well-studied multitask learning problem and obtain a new intriguing efficient update that achieves a significantly better bound.

artificial intelligence, machine learning, specialist, (15 more...)

Country: North America > United States (0.46)

Industry: Education > Educational Setting > Online (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Bayesian Hierarchical Reinforcement Learning

Cao, Feng, Ray, Soumya

We describe an approach to incorporating Bayesian priors in the maxq framework for hierarchical reinforcement learning (HRL). We define priors on the primitive environment model and on task pseudo-rewards. Since models for composite tasks can be complex, we use a mixed model-based/model-free learning approach to find an optimal hierarchical policy. We show empirically that (i) our approach results in improved convergence over non-Bayesian baselines, given sensible priors, (ii) task hierarchies and Bayesian priors can be complementary sources of information, and using both sources is better than either alone, (iii) taking advantage of the structural decomposition induced by the task hierarchy significantly reduces the computational cost of Bayesian reinforcement learning and (iv) in this framework, task pseudo-rewards can be learned instead of being manually specified, leading to automatic learning of hierarchically optimal rather than recursively optimal policies.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

Country: North America > United States (0.28)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.97)

Nonparametric Max-Margin Matrix Factorization for Collaborative Prediction

Xu, Minjie, Zhu, Jun, Zhang, Bo

We present a probabilistic formulation of max-margin matrix factorization and build accordingly a nonparametric Bayesian model which automatically resolves the unknown number of latent factors. Our work demonstrates a successful example thatintegrates Bayesian nonparametrics and max-margin learning, which are conventionally two separate paradigms and enjoy complementary advantages. We develop an efficient variational algorithm for posterior inference, and our extensive empiricalstudies on large-scale MovieLens and EachMovie data sets appear to justify the aforementioned dual advantages.

artificial intelligence, machine learning, matrix factorization, (14 more...)

Country: Asia > China (0.14)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)

Weller, Adrian, Jebara, Tony

Bethe Bounds and Approximating the Global Optimum

arXiv.org Machine LearningDec-31-2012

Inference in general Markov random fields (MRFs) is NP-hard, though identifying the maximum a posteriori (MAP) configuration of pairwise MRFs with submodular cost functions is efficiently solvable using graph cuts. Marginal inference, however, even for this restricted class, is in #P. We prove new formulations of derivatives of the Bethe free energy, provide bounds on the derivatives and bracket the locations of stationary points, introducing a new technique called Bethe bound propagation. Several results apply to pairwise models whether associative or not. Applying these to discretized pseudo-marginals in the associative case we present a polynomial time approximation scheme for global optimization provided the maximum degree is $O(\log n)$, and discuss several extensions.

artificial intelligence, bayesian inference, machine learning, (13 more...)

arXiv.org Machine Learning

1301.0015

Genre: Research Report (0.64)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)

Radinsky, K., Davidovich, S., Markovitch, S.

Learning to Predict from Textual Data

Journal of Artificial Intelligence ResearchDec-26-2012

Given a current news event, we tackle the problem of generating plausible predictions of future events it might cause. We present a new methodology for modeling and predicting such future news events using machine learning and data mining techniques. Our Pundit algorithm generalizes examples of causality pairs to infer a causality predictor. To obtain precisely labeled causality examples, we mine 150 years of news articles and apply semantic natural language modeling techniques to headlines containing certain predefined causality patterns. For generalization, the model uses a vast number of world knowledge ontologies. Empirical evaluation on real news articles shows that our Pundit algorithm performs as well as non-expert humans.

algorithm, prediction, proceedings, (15 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.3865

AI Access Foundation

10792

Journal of Artificial Intelligence Research

Country:

Europe > Germany (0.14)
Asia > Middle East > Iraq > Baghdad Governorate > Baghdad (0.04)
North America > United States > Texas (0.04)
(31 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Government > Military (0.68)
Government > Regional Government > North America Government > United States Government (0.68)
(3 more...)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)
(8 more...)

arXiv.org Machine LearningDec-20-2012

An Experiment with Hierarchical Bayesian Record Linkage

Larsen, Michael D.

In record linkage (RL), or exact file matching, the goal is to identify the links between entities with information on two or more files. RL is an important activity in areas including counting the population, enhancing survey frames and data, and conducting epidemiological and follow-up studies. RL is challenging when files are very large, no accurate personal identification (ID) number is present on all files for all units, and some information is recorded with error. Without an unique ID number one must rely on comparisons of names, addresses, dates, and other information to find the links. Latent class models can be used to automatically score the value of information for determining match status. Data for fitting models come from comparisons made within groups of units that pass initial file blocking requirements. Data distributions can vary across blocks. This article examines the use of prior information and hierarchical latent class models in the context of RL.

artificial intelligence, bayesian inference, machine learning, (16 more...)

arXiv.org Machine Learning

1212.5203

Country: North America > United States (1.00)

Genre: Research Report > Experimental Study (0.46)

Industry: Government > Regional Government > North America Government > United States Government (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)