AITopics | Bayesian Inference

Bayesian Inference

Bayes' theorem lets a program infer the probabilities of likely causes given an observed effect, when what it is given are the probabilities of the effects given the causes, together with the prior probabilities of the causes.
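As a minimal sketch of that inversion, the snippet below computes P(cause | effect) from P(effect | cause) and P(cause). The function name `posterior` and the flu/cold numbers are hypothetical, chosen only for illustration.

```python
def posterior(priors, likelihoods):
    """Return P(cause | effect) for each cause via Bayes' theorem.

    priors:      dict mapping cause -> P(cause)
    likelihoods: dict mapping cause -> P(effect | cause)
    """
    # Unnormalized posterior: P(effect | cause) * P(cause)
    joint = {c: likelihoods[c] * priors[c] for c in priors}
    # Evidence P(effect), by the law of total probability
    evidence = sum(joint.values())
    if evidence == 0:
        raise ValueError("effect has zero probability under every cause")
    return {c: j / evidence for c, j in joint.items()}


# Hypothetical numbers: one observed symptom, two candidate causes.
priors = {"flu": 0.1, "cold": 0.9}
likelihoods = {"flu": 0.8, "cold": 0.2}  # P(fever | cause)
post = posterior(priors, likelihoods)
```

Here the rarer cause ends up with a posterior of 0.08 / 0.26, roughly 0.31: the high likelihood of the effect under "flu" is partly offset by its low prior.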

Spatial Mixture Models with Learnable Deep Priors for Perceptual Grouping

Yuan, Jinyang, Li, Bin, Xue, Xiangyang

arXiv.org Machine Learning, Feb-7-2019

Humans perceive the seemingly chaotic world in a structured and compositional way with the prerequisite of being able to segregate conceptual entities from the complex visual scenes. The mechanism of grouping basic visual elements of scenes into conceptual entities is termed perceptual grouping. In this work, we propose a new type of spatial mixture model with learnable priors for perceptual grouping. Unlike existing methods, the proposed method disentangles the representation of an object into 'shape' and 'appearance', which are modeled separately by the mixture weights and the conditional probability distributions. More specifically, each object in the visual scene is modeled by one mixture component, whose mixture weights and the parameter of the conditional probability distribution are generated by two neural networks, respectively. The mixture weights focus on modeling spatial dependencies (i.e., shape) and the conditional probability distributions deal with intra-object variations (i.e., appearance). In addition, the background is separately modeled as a special component complementary to the foreground objects. Our extensive empirical tests on two perceptual grouping datasets demonstrate that the proposed method outperforms the state-of-the-art methods under most experimental configurations. The learned conceptual entities are generalizable to novel visual scenes and insensitive to the diversity of objects.

mixture weight, pixel, visual scene, (15 more...)

arXiv.org Machine Learning

1902.02502

Country:

Asia > China > Shanghai > Shanghai (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.77)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.77)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.71)

A Simple Baseline for Bayesian Uncertainty in Deep Learning

Maddox, Wesley, Garipov, Timur, Izmailov, Pavel, Vetrov, Dmitry, Wilson, Andrew Gordon

arXiv.org Machine Learning, Feb-7-2019

We propose SWA-Gaussian (SWAG), a simple, scalable, and general purpose approach for uncertainty representation and calibration in deep learning. Stochastic Weight Averaging (SWA), which computes the first moment of stochastic gradient descent (SGD) iterates with a modified learning rate schedule, has recently been shown to improve generalization in deep learning. With SWAG, we fit a Gaussian using the SWA solution as the first moment and a low rank plus diagonal covariance also derived from the SGD iterates, forming an approximate posterior distribution over neural network weights; we then sample from this Gaussian distribution to perform Bayesian model averaging. We empirically find that SWAG approximates the shape of the true posterior, in accordance with results describing the stationary distribution of SGD iterates. Moreover, we demonstrate that SWAG performs well on a wide variety of computer vision tasks, including out of sample detection, calibration, and transfer learning, in comparison to many popular alternatives including MC dropout, KFAC Laplace, and temperature scaling.
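The fit-then-sample idea in this abstract can be sketched roughly as follows. This is a simplified illustration, not the authors' implementation: it keeps only a diagonal covariance (the abstract describes low rank plus diagonal), and the toy logistic `predict` function, the made-up iterates, and the sample count are all assumptions.

```python
import math
import random


def fit_diag_gaussian(iterates):
    """Fit a mean and a diagonal variance to a list of weight vectors."""
    n = len(iterates)
    dim = len(iterates[0])
    # First moment of the iterates (the averaged-weights solution).
    mean = [sum(w[i] for w in iterates) / n for i in range(dim)]
    # Second moment, used to derive a per-coordinate variance.
    second = [sum(w[i] ** 2 for w in iterates) / n for i in range(dim)]
    # Clamp at zero to guard against tiny negative values from round-off.
    var = [max(second[i] - mean[i] ** 2, 0.0) for i in range(dim)]
    return mean, var


def sample_weights(mean, var, rng):
    """Draw one weight vector from N(mean, diag(var))."""
    return [rng.gauss(m, math.sqrt(v)) for m, v in zip(mean, var)]


def predict(weights, x):
    """Toy logistic model p(y=1 | x, w); stands in for a neural network."""
    z = sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))


def model_average(mean, var, x, num_samples, rng):
    """Average predictions over sampled weights (Bayesian model averaging)."""
    total = 0.0
    for _ in range(num_samples):
        total += predict(sample_weights(mean, var, rng), x)
    return total / num_samples


rng = random.Random(0)
# Hypothetical weight iterates collected late in training (made-up numbers).
iterates = [[1.0, -2.0], [1.2, -1.8], [0.8, -2.2], [1.0, -2.0]]
mean, var = fit_diag_gaussian(iterates)
p = model_average(mean, var, x=[0.5, 0.25], num_samples=200, rng=rng)
```

Averaging predictions over sampled weights, rather than predicting once from the mean weights, is what lets the spread of the iterates show up as predictive uncertainty.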

bayesian uncertainty, simple baseline, swag, (12 more...)

arXiv.org Machine Learning

1902.02476

Country:

North America > United States > New York > New York County > New York City (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.04)
Europe > Germany > North Rhine-Westphalia > Cologne Region > Bonn (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Improving Latent User Models in Online Social Media

Krishnan, Adit, Sharma, Ashish, Sundaram, Hari

arXiv.org Artificial Intelligence, Feb-7-2019

Modern social platforms are characterized by the presence of rich user-behavior data associated with the publication, sharing and consumption of textual content. Users interact with content and with each other in a complex and dynamic social environment while simultaneously evolving over time. In order to effectively characterize users and predict their future behavior in such a setting, it is necessary to overcome several challenges. Content heterogeneity and temporal inconsistency of behavior data result in severe sparsity at the user level. In this paper, we propose a novel mutual-enhancement framework to simultaneously partition and learn latent activity profiles of users. We propose a flexible user partitioning approach to effectively discover rare behaviors and tackle user-level sparsity. We extensively evaluate the proposed framework on massive datasets from real-world platforms including Q&A networks and interactive online courses (MOOCs). Our results indicate significant gains over state-of-the-art behavior models (15% avg) in a varied range of tasks and our gains are further magnified for users with limited interaction data. The proposed algorithms are amenable to parallelization, scale linearly in the size of datasets, and provide flexibility to model diverse facets of user behavior.

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

1711.11124

Country: North America > United States (0.15)

Genre:

Instructional Material > Online (0.71)
Instructional Material > Course Syllabus & Notes (0.48)
Research Report > New Finding (0.48)

Industry:

Education > Educational Technology > Educational Software > Computer Based Training (1.00)
Education > Educational Setting > Online (1.00)

Technology:

Information Technology > Enterprise Applications > Human Resources > Learning Management (1.00)
Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(3 more...)

Bidirectional Inference Networks: A Class of Deep Bayesian Networks for Health Profiling

Wang, Hao, Mao, Chengzhi, He, Hao, Zhao, Mingmin, Jaakkola, Tommi S., Katabi, Dina

arXiv.org Machine Learning, Feb-6-2019

We consider the problem of inferring the values of an arbitrary set of variables (e.g., risk of diseases) given other observed variables (e.g., symptoms and diagnosed diseases) and high-dimensional signals (e.g., MRI images or EEG). This is a common problem in healthcare since variables of interest often differ for different patients. Existing methods including Bayesian networks and structured prediction either do not incorporate high-dimensional signals or fail to model conditional dependencies among variables. To address these issues, we propose bidirectional inference networks (BIN), which stitch together multiple probabilistic neural networks, each modeling a conditional dependency. Predictions are then made via iteratively updating variables using backpropagation (BP) to maximize corresponding posterior probability. Furthermore, we extend BIN to composite BIN (CBIN), which involves the iterative prediction process in the training stage and improves both accuracy and computational efficiency by adaptively smoothing the optimization landscape. Experiments on synthetic and real-world datasets (a sleep study and a dermatology dataset) show that CBIN is a single model that can achieve state-of-the-art performance and obtain better accuracy in most inference tasks than multiple models each specifically trained for a different task.

bayesian inference, inference, neural network, (20 more...)

arXiv.org Machine Learning

1902.02037

Country: North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)

Genre: Research Report (1.00)

Industry:

Energy > Oil & Gas (1.00)
Health & Medicine > Therapeutic Area > Sleep (0.54)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.85)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.70)

Augmenting Learning Components for Safety in Resource Constrained Autonomous Robots

Ramakrishna, Shreyas, Dubey, Abhishek, Burruss, Matthew P, Hartsell, Charles, Mahadevan, Nagabhushan, Nannapaneni, Saideep, Laszka, Aron, Karsai, Gabor

arXiv.org Artificial Intelligence, Feb-6-2019

This paper deals with resource constrained autonomous robots commonly found in factories, hospitals, and education laboratories, which popularly use learning enabled components (LEC) to make control actions. However, these LECs do not provide any safety guarantees, and testing them is challenging. To overcome these challenges, we introduce a framework that performs confidence estimation, resource management, and supervised safety control of autonomous systems with LECs. Using this framework, we make the following contributions: (1) allow for seamless integration of safety controllers and different simplex strategies to aid the LEC, (2) introduce RL-Simplex and illustrate the use of Q-learning to learn the optimal weights for the arbitration logic of the Simplex Architecture, (3) design a system level monitor that uses the current state information and a discrete Bayesian network model learned from past data to estimate a metric, which indicates if the car will remain in the safe region, and (4) a Resource Manager which performs dynamic task offloading depending on the resource temperature and CPU utilization while continually adjusting vehicle speed to compensate for the latency overhead. We compare the speed, steering and safety performance of the different controllers and simplex strategies, and we find RL-Simplex to have 60% fewer safety violations and higher optimized speed during indoor driving (~0.40 m/s) than the original system (using only LEC).

machine learning, reinforcement learning, rl-simplex, (16 more...)

arXiv.org Artificial Intelligence

1902.02432

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Europe > Spain > Aragón > Zaragoza Province > Zaragoza (0.04)

Genre: Research Report (0.82)

Industry:

Automobiles & Trucks (0.68)
Aerospace & Defense (0.68)
Information Technology > Robotics & Automation (0.46)
Transportation > Ground > Road (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
(2 more...)

A Bayesian Approach for Accurate Classification-Based Aggregates

Meertens, Q. A., Diks, C. G. H., Herik, H. J. van den, Takes, F W

arXiv.org Machine Learning, Feb-6-2019

In this paper, we study the accuracy of values aggregated over classes predicted by a classification algorithm. The problem is that the resulting aggregates (e.g., sums of a variable) are known to be biased. The bias can be large even for highly accurate classification algorithms, in particular when dealing with class-imbalanced data. To correct this bias, the algorithm's classification error rates have to be estimated. In this estimation, two issues arise when applying existing bias correction methods. First, inaccuracies in estimating classification error rates have to be taken into account. Second, impermissible estimates, such as a negative estimate for a positive value, have to be dismissed. We show that both issues are relevant in applications where the true labels are known only for a small set of data points. We propose a novel bias correction method using Bayesian inference. The novelty of our method is that it imposes constraints on the model parameters. We show that our method solves the problem of biased classification-based aggregates as well as the two issues above, in the general setting of multi-class classification. In the empirical evaluation, using a binary classifier on a real-world dataset of company tax returns, we show that our method outperforms existing methods in terms of mean squared error.
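To make the bias concrete, here is a small worked calculation. It is not the Bayesian method proposed in the paper: it shows the expected count of predicted positives under assumed error rates, and then a plain algebraic inversion, included only as an assumption about what a simple correction looks like. All numbers and function names are hypothetical.

```python
def expected_predicted_positives(n_pos, n_neg, sensitivity, specificity):
    """Expected number of items a binary classifier labels positive."""
    true_pos = n_pos * sensitivity            # positives correctly flagged
    false_pos = n_neg * (1.0 - specificity)   # negatives wrongly flagged
    return true_pos + false_pos


def invert_count(pred_pos, n_total, sensitivity, specificity):
    """Solve pred_pos = s*P + (1 - c)*(N - P) for the true positive count P.

    The result can fall outside [0, N] when the error rates are misestimated.
    """
    denom = sensitivity + specificity - 1.0
    if denom == 0:
        raise ValueError("classifier is uninformative; cannot invert")
    return (pred_pos - (1.0 - specificity) * n_total) / denom


# Hypothetical imbalanced data set: 10 positives among 1000 items,
# with a classifier that is 95% accurate on each class.
observed = expected_predicted_positives(10, 990, 0.95, 0.95)

# With the exact error rates, the inversion recovers the true count.
corrected = invert_count(observed, 1000, 0.95, 0.95)

# With a slightly wrong specificity estimate, it returns a negative count.
impermissible = invert_count(observed, 1000, 0.95, 0.93)
```

With these numbers the classifier is expected to flag 59 items although only 10 are truly positive, and a two-point error in the specificity estimate turns the corrected count into -12.5, the kind of impermissible estimate the abstract mentions.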

application, bias correction method, posterior distribution, (14 more...)

arXiv.org Machine Learning

1902.02412

Country:

Europe > Netherlands > North Holland > Amsterdam (0.05)
North America > United States > New York (0.04)
Europe > Netherlands > South Holland > Leiden (0.04)
(2 more...)

Genre: Research Report (0.64)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.85)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.50)

Un modèle Bayésien de co-clustering de données mixtes (A Bayesian Model for Co-clustering Mixed-Type Data)

Bouchareb, Aichetou, Boullé, Marc, Rossi, Fabrice, Clérot, Fabrice

arXiv.org Machine Learning, Feb-6-2019

We propose a MAP Bayesian approach to perform and evaluate a co-clustering of mixed-type data tables. The proposed model infers an optimal segmentation of all variables then performs a co-clustering by minimizing a Bayesian model selection cost function. One advantage of this approach is that it is user parameter-free. Another main advantage is the proposed criterion which gives an exact measure of the model quality, measured by probability of fitting it to the data. Continuous optimization of this criterion ensures finding better and better models while avoiding data over-fitting. The experiments conducted on real data show the interest of this co-clustering approach in exploratory data analysis of large data sets.

gorielle, inconnu, nombre, (15 more...)

arXiv.org Machine Learning

1902.02056

Country:

North America > United States > New York > New York County > New York City (0.05)
North America > United States > District of Columbia > Washington (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.34)

Explanation in Human-AI Systems: A Literature Meta-Review, Synopsis of Key Ideas and Publications, and Bibliography for Explainable AI

Mueller, Shane T., Hoffman, Robert R., Clancey, William, Emrey, Abigail, Klein, Gary

arXiv.org Artificial Intelligence, Feb-5-2019

This is an integrative review that addresses the question, "What makes for a good explanation?" with reference to AI systems. Pertinent literatures are vast. Thus, this review is necessarily selective. That said, most of the key concepts and issues are expressed in this Report. The Report encapsulates the history of computer science efforts to create systems that explain and instruct (intelligent tutoring systems and expert systems). The Report expresses the explainability issues and challenges in modern AI, and presents capsule views of the leading psychological theories of explanation. Certain articles stand out by virtue of their particular relevance to XAI, and their methods, results, and key points are highlighted. It is recommended that AI/XAI researchers be encouraged to include in their research reports fuller details on their empirical or experimental methods, in the fashion of experimental psychology research reports: details on Participants, Instructions, Procedures, Tasks, Dependent Variables (operational definitions of the measures and metrics), Independent Variables (conditions), and Control Conditions.

information processing and management, machine learning, natural language, (22 more...)

arXiv.org Artificial Intelligence

1902.01876

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > California > Santa Clara County > Palo Alto (0.14)
North America > United States > Georgia > Fulton County > Atlanta (0.13)
(42 more...)

Genre:

Research Report > New Finding (1.00)
Instructional Material > Course Syllabus & Notes (1.00)
Research Report > Experimental Study > Negative Result (0.45)

Industry:

Media (1.00)
Leisure & Entertainment > Games (1.00)
Law (1.00)
(10 more...)

Technology:

Information Technology > Human Computer Interaction > Interfaces (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (1.00)
(10 more...)

Asymptotic Consistency of $\alpha$-Rényi-Approximate Posteriors

Jaiswal, Prateek, Rao, Vinayak A., Honnappa, Harsha

arXiv.org Machine Learning, Feb-5-2019

In this work, we study consistency properties of $\alpha$-Rényi approximate posteriors, a class of variational Bayesian methods that approximate an intractable Bayesian posterior with a member of a tractable family of distributions, the latter chosen to minimize the $\alpha$-Rényi divergence from the true posterior. Unique to our work is that we consider settings with $\alpha > 1$, resulting in approximations that upper-bound the log-likelihood and have a wider spread than traditional variational approaches that minimize the Kullback-Leibler divergence from the posterior. We provide sufficient conditions under which consistency holds, centering around the existence of a 'good' sequence of distributions in the approximating family. We discuss examples where this holds and show how the existence of such a good sequence implies posterior consistency in the limit of an infinite number of observations.
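For readers unfamiliar with the divergence named here, the sketch below evaluates the standard discrete form D_alpha(p || q) = log(sum_i p_i^alpha q_i^(1 - alpha)) / (alpha - 1) and compares it with the Kullback-Leibler divergence. The example distributions are made up, the code assumes q is positive wherever p is, and it says nothing about which argument plays the role of the posterior in the paper.

```python
import math


def renyi_divergence(p, q, alpha):
    """Discrete alpha-Renyi divergence D_alpha(p || q), for alpha > 0, alpha != 1.

    Assumes q_i > 0 wherever p_i > 0.
    """
    if alpha <= 0 or alpha == 1:
        raise ValueError("alpha must be positive and different from 1")
    total = sum(
        pi ** alpha * qi ** (1.0 - alpha)
        for pi, qi in zip(p, q)
        if pi > 0
    )
    return math.log(total) / (alpha - 1.0)


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q), the alpha -> 1 limit."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Made-up distributions over three outcomes.
p = [0.7, 0.2, 0.1]
q = [0.5, 0.3, 0.2]

kl = kl_divergence(p, q)
near_one = renyi_divergence(p, q, 1.0001)  # close to the KL value
d_mid = renyi_divergence(p, q, 1.5)
d_two = renyi_divergence(p, q, 2.0)
```

Because the Rényi divergence is nondecreasing in alpha, the values for alpha > 1 are at least as large as the KL divergence between the same pair of distributions.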

assumption 2, divergence, sequence, (15 more...)

arXiv.org Machine Learning

1902.01902

Country:

North America > United States > New Jersey > Hudson County > Hoboken (0.04)
North America > United States > Massachusetts (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
(2 more...)

Genre: Research Report (0.63)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.87)

Unbiased Smoothing using Particle Independent Metropolis-Hastings

Middleton, Lawrence, Deligiannidis, George, Doucet, Arnaud, Jacob, Pierre E.

arXiv.org Machine Learning, Feb-5-2019

We consider the approximation of expectations with respect to the distribution of a latent Markov process given noisy measurements. This is known as the smoothing problem and is often approached with particle and Markov chain Monte Carlo (MCMC) methods. These methods provide consistent but biased estimators when run for a finite time. We propose a simple way of coupling two MCMC chains built using Particle Independent Metropolis-Hastings (PIMH) to produce unbiased smoothing estimators. Unbiased estimators are appealing in the context of parallel computing, and facilitate the construction of confidence intervals. The proposed scheme only requires access to off-the-shelf Particle Filters (PF) and is thus easier to implement than recently proposed unbiased smoothers. The approach is demonstrated on a Lévy-driven stochastic volatility model and a stochastic kinetic model.

arnaud doucet, estimator, unbiased estimator, (13 more...)

arXiv.org Machine Learning

1902.01781

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
North America > United States > New York (0.04)
Asia > Japan > Kyūshū & Okinawa > Okinawa (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.55)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
