AITopics

1312.5398

Country: North America > United States (0.14)

Genre: Research Report (0.65)

Industry:

Education > Educational Setting > Continuing Education (0.41)
Education > Curriculum > Subject-Specific Education (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.30)

Aravkin, Aleksandr Y., Choromanska, Anna, Jebara, Tony, Kanevsky, Dimitri

Semistochastic Quadratic Bound Methods

arXiv.org Machine LearningFeb-17-2014

Partition functions arise in a variety of settings, including conditional random fields, logistic regression, and latent gaussian models. In this paper, we consider semistochastic quadratic bound (SQB) methods for maximum likelihood estimation based on partition function optimization. Batch methods based on the quadratic bound were recently proposed for this class of problems, and performed favorably in comparison to state-of-the-art techniques. Semistochastic methods fall in between batch algorithms, which use all the data, and stochastic gradient type methods, which use small random selections at each iteration. We build semistochastic quadratic bound-based methods, and prove both global convergence (to a stationary point) under very weak assumptions, and linear convergence rate under stronger assumptions on the objective. To make the proposed methods faster and more stable, we consider inexact subproblem minimization and batch-size selection schemes. The efficacy of SQB methods is demonstrated via comparison with several state-of-the-art techniques on commonly used datasets.

artificial intelligence, iteration, machine learning, (14 more...)

1309.1369

Country: Europe (0.28)

Genre:

Research Report > Promising Solution (0.55)
Research Report > New Finding (0.49)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.37)

Reverdy, Paul, Srivastava, Vaibhav, Leonard, Naomi E.

Modeling Human Decision-making in Generalized Gaussian Multi-armed Bandits

We present a formal model of human decision-making in explore-exploit tasks using the context of multi-armed bandit problems, where the decision-maker must choose among multiple options with uncertain rewards. We address the standard multi-armed bandit problem, the multi-armed bandit problem with transition costs, and the multi-armed bandit problem on graphs. We focus on the case of Gaussian rewards in a setting where the decision-maker uses Bayesian inference to estimate the reward values. We model the decision-maker's prior knowledge with the Bayesian prior on the mean reward. We develop the upper credible limit (UCL) algorithm for the standard multi-armed bandit problem and show that this deterministic algorithm achieves logarithmic cumulative expected regret, which is optimal performance for uninformative priors. We show how good priors and good assumptions on the correlation structure among arms can greatly enhance decision-making performance, even over short time horizons. We extend to the stochastic UCL algorithm and draw several connections to human decision-making behavior. We present empirical data from human experiments and show that human performance is efficiently captured by the stochastic UCL algorithm with appropriate parameters. For the multi-armed bandit problem with transition costs and the multi-armed bandit problem on graphs, we generalize the UCL algorithm to the block UCL algorithm and the graphical block UCL algorithm, respectively. We show that these algorithms also achieve logarithmic cumulative expected regret and require a sub-logarithmic expected number of transitions among arms. We further illustrate the performance of these algorithms with numerical examples.

algorithm, bayesian inference, upstream oil & gas, (20 more...)

1307.6134

Country:

North America > United States > Wisconsin (0.14)
Europe > United Kingdom > Scotland (0.14)
Europe > Spain (0.14)
(3 more...)

Genre:

Research Report > Experimental Study (0.92)
Research Report > New Finding (0.67)

Industry:

Health & Medicine (1.00)
Energy > Oil & Gas > Upstream (0.46)

Technology:

Information Technology > Data Science > Data Mining > Big Data (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.45)

Brümmer, Niko, Garcia-Romero, Daniel

Generative Modelling for Unsupervised Score Calibration

ABSTRACT Score calibration enables automatic speaker recognizers to make cost-effective accept / reject decisions. Traditional calibration requires supervised data, which is an expensive resource. We propose a 2-component GMM for unsupervised calibration and demonstrate good performance relative to a supervised baseline on NIST SRE'10 and SRE'12. A Bayesian analysis demonstrates that the uncertainty associated with the unsupervised calibration parameter estimates is surprisingly small. Index Terms-- calibration, unsupervised learning, Laplace approximation, automatic speaker recognition 1. INTRODUCTION Automatic speaker recognizers map trials to scores.

calibration, machine learning, pattern recognition, (18 more...)

1311.0707

Country:

North America > United States (0.46)
Europe (0.46)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (0.36)

The Law of Total Odds

Tasche, Dirk

The law of total probability may be deployed in binary classification exercises to estimate the unconditional class probabilities if the class proportions in the training set are not representative of the population class proportions. We argue that this is not a conceptually sound approach and suggest an alternative based on the new law of total odds. We quantify the bias of the total probability estimator of the unconditional class probabilities and show that the total odds estimator is unbiased. The sample version of the total odds estimator is shown to coincide with a maximum-likelihood estimator known from the literature. The law of total odds can also be used for transforming the conditional class probabilities if independent estimates of the unconditional class probabilities of the population are available. Keywords: Total probability, likelihood ratio, Bayes' formula, binary classification, relative odds, unbiased estimator, supervised learning, dataset shift.

artificial intelligence, bayesian inference, machine learning, (19 more...)

1312.0365

Country: Europe > United Kingdom (0.14)

Genre: Research Report (0.50)

Industry: Government (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.50)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.50)

Korattikara, Anoop, Chen, Yutian, Welling, Max

Austerity in MCMC Land: Cutting the Metropolis-Hastings Budget

Markov chain Monte Carlo (MCMC) sampling has been the main workhorse of Bayesian computation since the 1990s. A canonical MCMC algorithm proposes samples from a distribution q and then accepts or rejects these proposals with a certain probability given by the Metropolis-Hastings (MH) formula [Metropolis et al., 1953, Hastings, 1970]. For each proposed sample, the MH rule needs to examine the likelihood of all dataitems. When the number of data-cases is large this is an awful lot of computation for one bit of information, namely whether to accept or reject a proposal. In today's Big Data world, we need to rethink our Bayesian inference algorithms. Standard MCMC methods do not meet the Big Data challenge for the reason described above. Researchers have made some progress in terms of making MCMC more efficient, mostly by focusing on parallelization. Very few question the algorithm itself: is the standard MCMC paradigm really optimally efficient in achieving its goals?

artificial intelligence, bayesian inference, machine learning, (19 more...)

1304.5299

Country: North America > United States (0.28)

Genre: Research Report > Experimental Study (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.88)

Wu, Yue, Lobato, Jose Miguel Hernandez, Ghahramani, Zoubin

Gaussian Process Volatility Model

arXiv.org Machine LearningFeb-13-2014

The accurate prediction of time-changing variances is an important task in the modeling of financial data. Standard econometric models are often limited as they assume rigid functional relationships for the variances. Moreover, function parameters are usually learned using maximum likelihood, which can lead to overfitting. To address these problems we introduce a novel model for time-changing variances using Gaussian Processes. A Gaussian Process (GP) defines a distribution over functions, which allows us to capture highly flexible functional relationships for the variances. In addition, we develop an online algorithm to perform inference. The algorithm has two main advantages. First, it takes a Bayesian approach, thereby avoiding overfitting. Second, it is much quicker than current offline inference procedures. Finally, our new model was evaluated on financial data and showed significant improvement in predictive performance over current standard models.

artificial intelligence, gp-v ol, machine learning, (18 more...)

1402.3085

Country:

Europe > United Kingdom (0.28)
Asia (0.28)

Genre: Research Report > New Finding (0.48)

Industry: Banking & Finance (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)

Kapelner, Adam, Bleich, Justin

Prediction with Missing Data via Bayesian Additive Regression Trees

arXiv.org Machine LearningFeb-12-2014

This article addresses prediction problems where covariate information is missing during model construction and is also missing in future observations for which we are obligated to generate a forecast. Our aim is to innovate a nonparametric statistical learning extension which incorporates missingness into both the training and the forecasting phases. In the spirit of nonparametric learning, we wish to incorporate the missingness in both phases automatically, without the need for pre-specified modeling. We limit our focus to tree-based statistical learning, which has demonstrated strong predictive performance and has consequently received considerable attention in recent years. State-of-the-art algorithms include Random Forests (RF, Breiman, 2001b), stochastic gradient boosting (Friedman, 2002), and Bayesian Additive and Regression Trees (BART, Chipman et al., 2010), the algorithm of interest in this study.

artificial intelligence, machine learning, missingness, (17 more...)

1306.0618

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.41)

arXiv.org Artificial IntelligenceFeb-12-2014

Bayesian Inference with Posterior Regularization and applications to Infinite Latent SVMs

Zhu, Jun, Chen, Ning, Xing, Eric P.

Existing Bayesian models, especially nonparametric Bayesian methods, rely on specially conceived priors to incorporate domain knowledge for discovering improved latent representations. While priors can affect posterior distributions through Bayes' rule, imposing posterior regularization is arguably more direct and in some cases more natural and general. In this paper, we present regularized Bayesian inference (RegBayes), a novel computational framework that performs posterior inference with a regularization term on the desired post-data posterior distribution under an information theoretical formulation. RegBayes is more flexible than the procedure that elicits expert knowledge via priors, and it covers both directed Bayesian networks and undirected Markov networks whose Bayesian formulation results in hybrid chain graph models. When the regularization is induced from a linear operator on the posterior distributions, such as the expectation operator, we present a general convex-analysis theorem to characterize the solution of RegBayes. Furthermore, we present two concrete examples of RegBayes, infinite latent support vector machines (iLSVM) and multi-task infinite latent support vector machines (MT-iLSVM), which explore the large-margin idea in combination with a nonparametric Bayesian model for discovering predictive latent features for classification and multi-task learning, respectively. We present efficient inference methods and report empirical studies on several benchmark datasets, which appear to demonstrate the merits inherited from both large-margin learning and Bayesian nonparametrics. Such results were not available until now, and contribute to push forward the interface between these two important subfields, which have been largely treated as isolated in the community.

artificial intelligence, constraint, machine learning, (13 more...)

arXiv.org Artificial Intelligence

1210.1766

Country: North America > United States (0.92)

Genre: Research Report (0.63)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Guez, Arthur, Silver, David, Dayan, Peter

Better Optimism By Bayes: Adaptive Planning with Rich Models

arXiv.org Machine LearningFeb-9-2014

The computational costs of inference and planning have confined Bayesian model-based reinforcement learning to one of two dismal fates: powerful Bayes-adaptive planning but only for simplistic models, or powerful, Bayesian non-parametric models but using simple, myopic planning strategies such as Thompson sampling. We ask whether it is feasible and truly beneficial to combine rich probabilistic models with a closer approximation to fully Bayesian planning. First, we use a collection of counterexamples to show formal problems with the over-optimism inherent in Thompson sampling. Then we leverage state-of-the-art techniques in efficient Bayes-adaptive planning and non-parametric Bayesian methods to perform qualitatively better than both existing conventional algorithms and Thompson sampling on two contextual bandit-like problems.

bayesian inference, mushroom, upstream oil & gas, (21 more...)

1402.1958

Country: North America > United States > Massachusetts (0.14)

Genre: Research Report (0.84)

Industry: Energy > Oil & Gas > Upstream (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.94)