AITopics | Bayesian Inference


Bayesian Inference

Bayes' Theorem allows a program to infer the probabilities of likely causes from observed effects, when what it is given are the probabilities of those effects conditional on each cause.
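As a minimal sketch of the inversion described above, the following Python snippet computes P(cause | effect) from P(effect | cause) and a prior P(cause). The function name `posterior`, the two causes, and all numbers are hypothetical and chosen only for illustration.

```python
def posterior(prior, likelihood):
    """Return P(cause | effect) for every cause.

    prior:      dict mapping cause -> P(cause)
    likelihood: dict mapping cause -> P(effect | cause)
    """
    # Unnormalised posterior: P(effect | cause) * P(cause)
    joint = {c: likelihood[c] * prior[c] for c in prior}
    # Evidence P(effect), obtained by summing over all causes
    evidence = sum(joint.values())
    if evidence == 0:
        raise ValueError("effect has zero probability under every cause")
    return {c: joint[c] / evidence for c in joint}


# Illustrative (made-up) numbers: a rare cause and an imperfect test.
prior = {"disease": 0.01, "healthy": 0.99}
likelihood = {"disease": 0.95, "healthy": 0.05}  # P(positive test | cause)
post = posterior(prior, likelihood)
print(post)
```

Even with a likelihood of 0.95, the posterior probability of the rare cause stays well below one half, because the prior enters the calculation with equal weight.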


A dynamic network model with persistent links and node-specific latent variables, with an application to the interbank market

Mazzarisi, Piero, Barucca, Paolo, Lillo, Fabrizio, Tantari, Daniele

arXiv.org Machine Learning, Dec-30-2017

We propose a dynamic network model where two mechanisms control the probability of a link between two nodes: (i) the existence or absence of this link in the past, and (ii) node-specific latent variables (dynamic fitnesses) describing the propensity of each node to create links. Assuming Markov dynamics for both mechanisms, we propose an Expectation-Maximization algorithm for model estimation and inference of the latent variables. The estimated parameters and fitnesses can be used to forecast the presence of a link in the future. We apply our methodology to the e-MID interbank network, for which the two linkage mechanisms are associated with two different trading behaviors in the process of network formation, namely preferential trading and trading driven by node-specific characteristics. The empirical results allow us to recognise preferential lending in the interbank market and indicate how a method that does not account for time-varying network topologies tends to overestimate preferential linkage.

artificial intelligence, bayesian inference, machine learning, (18 more...)

arXiv.org Machine Learning

1801.00185

Country: Europe > Italy (0.28)

Genre: Research Report (0.40)

Industry: Banking & Finance (1.00)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)


Data-Driven Stochastic Robust Optimization: A General Computational Framework and Algorithm for Optimization under Uncertainty in the Big Data Era

Ning, Chao, You, Fengqi

arXiv.org Artificial Intelligence, Dec-29-2017

A novel data-driven stochastic robust optimization (DDSRO) framework is proposed for optimization under uncertainty leveraging labeled multi-class uncertainty data. Uncertainty data in large datasets are often collected under various conditions, which are encoded by class labels. Machine learning methods, including the Dirichlet process mixture model and maximum likelihood estimation, are employed for uncertainty modeling. A DDSRO framework is further proposed based on the data-driven uncertainty model through a bi-level optimization structure. The outer optimization problem follows a two-stage stochastic programming approach to optimize the expected objective across different data classes; adaptive robust optimization is nested as the inner problem to ensure the robustness of the solution while maintaining computational tractability. A decomposition-based algorithm is further developed to solve the resulting multi-level optimization problem efficiently. Case studies on process network design and planning are presented to demonstrate the applicability of the proposed framework and algorithm.

optimization problem, renewable energy, uncertainty data, (16 more...)

arXiv.org Artificial Intelligence

1707.09198

Country: North America > United States > New York (0.14)

Genre: Research Report (1.00)

Industry:

Energy > Oil & Gas (1.00)
Materials > Chemicals (0.94)
Energy > Renewable (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)


Learning Structural Weight Uncertainty for Sequential Decision-Making

Zhang, Ruiyi, Li, Chunyuan, Chen, Changyou, Carin, Lawrence

arXiv.org Machine Learning, Dec-29-2017

Learning probability distributions on the weights of neural networks (NNs) has recently proven beneficial in many applications. Bayesian methods, such as Stein variational gradient descent (SVGD), offer an elegant framework to reason about NN model uncertainty. However, by assuming independent Gaussian priors for the individual NN weights (as often applied), SVGD does not impose prior knowledge that there is often structural information (dependence) among weights. We propose efficient posterior learning of structural weight uncertainty, within an SVGD framework, by employing matrix variate Gaussian priors on NN parameters. We further investigate the learned structural uncertainty in sequential decision-making problems, including contextual bandits and reinforcement learning. Experiments on several synthetic and real datasets indicate the superiority of our model, compared with state-of-the-art methods.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

arXiv.org Machine Learning

1801.00085

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
(2 more...)


Finite-sample risk bounds for maximum likelihood estimation with arbitrary penalties

Brinda, W. D., Klusowski, Jason M.

arXiv.org Machine Learning, Dec-28-2017

A remarkably general method for bounding the statistical risk of penalized likelihood estimators comes from work on two-part coding, one of the minimum description length (MDL) approaches to statistical inference. Two-part coding MDL prescribes assigning codelengths to a model (or model class) and then selecting the distribution that provides the most efficient description of one's data [1]. The total description length has two parts: the part that specifies a distribution within the model (as well as a model within the model class if necessary) and the part that specifies the data with reference to the specified distribution. If the codelengths are exactly Kraft-valid, this approach is equivalent to Bayesian maximum a posteriori (MAP) estimation, in that the two parts correspond to the log reciprocal of the prior and the log reciprocal of the likelihood, respectively. More generally, one can call the part of the codelength specifying the distribution a penalty term; it is called the complexity in the MDL literature. Let (Θ, L) denote a discrete set indexing distributions along with a complexity function. With X ~ P, the (pointwise) redundancy of any θ ∈ Θ is its two-part codelength minus log(1/p(X)), the codelength one gets by using P as the coding distribution.
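The correspondence between two-part coding and MAP estimation described in this abstract can be written out as a short derivation. This is a sketch under two assumptions not stated in the abstract: codelengths are measured in nats (natural logarithms), and the symbols π (the prior induced by the codelengths), p_θ (the density of the distribution indexed by θ), and red (the redundancy) are notation introduced here for illustration. Θ, L, X, P and p follow the abstract.

```latex
% Exact Kraft validity of the codelengths (in nats) defines a prior on \Theta
\sum_{\theta \in \Theta} e^{-L(\theta)} = 1
\quad\Longrightarrow\quad
\pi(\theta) := e^{-L(\theta)} \ \text{is a probability mass function on } \Theta .

% Minimising the two-part codelength is MAP estimation under \pi
\hat{\theta}
  = \arg\min_{\theta \in \Theta}
    \Big[ \underbrace{L(\theta)}_{\log \frac{1}{\pi(\theta)}}
        + \underbrace{\log \tfrac{1}{p_\theta(X)}}_{\text{log reciprocal likelihood}} \Big]
  = \arg\max_{\theta \in \Theta} \ \pi(\theta)\, p_\theta(X) .

% Pointwise redundancy of \theta when X \sim P with density p
\mathrm{red}(\theta; X)
  = L(\theta) + \log \tfrac{1}{p_\theta(X)} - \log \tfrac{1}{p(X)} .
```

When the Kraft sum is not exactly one, L(θ) no longer corresponds to a normalised prior, which is the sense in which the abstract treats it more generally as a penalty term.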

artificial intelligence, bayesian inference, machine learning, (18 more...)

arXiv.org Machine Learning

1712.10087

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory > Minimum Complexity Machines (0.68)


Declarative Statistics

Rossi, Roberto, Akgün, Özgür, Prestwich, Steven, Tarim, S. Armagan

arXiv.org Artificial Intelligence, Dec-28-2017

In this work we introduce declarative statistics, a suite of declarative modelling tools for statistical analysis. Statistical constraints represent the key building block of declarative statistics. First, we introduce a range of relevant counting and matrix constraints and associated decompositions, some of which are novel, that are instrumental in the design of statistical constraints. Second, we introduce a selection of novel statistical constraints and associated decompositions, which constitute a self-contained toolbox that can be used to tackle a wide range of problems typically encountered by statisticians. Finally, we deploy these statistical constraints in a wide range of application areas drawn from classical statistics and we contrast our framework against established practices.

artificial intelligence, constraint, machine learning, (17 more...)

arXiv.org Artificial Intelligence

1708.01829

Country:

Europe (1.00)
North America > United States > New York > New York County > New York City (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)


Extrapolating Expected Accuracies for Large Multi-Class Problems

Zheng, Charles, Achanta, Rakesh, Benjamini, Yuval

arXiv.org Machine Learning, Dec-27-2017

Many machine learning tasks involve recognizing or identifying an individual instance within a large set of possible candidates. These problems are usually modeled as multi-class classification problems, with a large and possibly complex label set. Leading examples include detecting the speaker from his voice patterns (Togneri and Pullella, 2011), identifying the author from her written text (Stamatatos et al., 2014), or labeling the object category from its image (Duygulu et al., 2002, Deng et al., 2010, Oquab et al., 2014). In all these examples, the algorithm observes an input x, and uses the classifier function h to guess the label y from a large label set S. There are multiple practical challenges in developing classifiers for large label sets. Collecting high quality training data is perhaps the main obstacle, as the costs scale with the number of classes. It can be affordable to first collect data for a small set of classes, even if the long-term goal is to generalize to a larger set. Furthermore, classifier development can be accelerated by training first on fewer classes, as each training cycle may require substantially fewer resources. Indeed, due to interest in how small-set performance generalizes to larger sets, such comparisons can be found in the literature (Oquab et al., 2014, Griffin et al., 2007). A natural question is: how does changing the size of the label set affect the classification accuracy?

accuracy, artificial intelligence, machine learning, (19 more...)

arXiv.org Machine Learning

1712.09713

Country: North America > United States (0.28)

Genre: Research Report (0.50)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
(2 more...)


On Connecting Stochastic Gradient MCMC and Differential Privacy

Li, Bai, Chen, Changyou, Liu, Hao, Carin, Lawrence

arXiv.org Machine Learning, Dec-25-2017

Significant success has been realized recently on applying machine learning to real-world applications. There have also been corresponding concerns on the privacy of training data, which relates to data security and confidentiality issues. Differential privacy provides a principled and rigorous privacy guarantee on machine learning models. While it is common to design a model satisfying a required differential-privacy property by injecting noise, it is generally hard to balance the trade-off between privacy and utility. We show that stochastic gradient Markov chain Monte Carlo (SG-MCMC) -- a class of scalable Bayesian posterior sampling algorithms proposed recently -- satisfies strong differential privacy with carefully chosen step sizes. We develop theory on the performance of the proposed differentially-private SG-MCMC method. We conduct experiments to support our analysis and show that a standard SG-MCMC sampler without any modification (under a default setting) can reach state-of-the-art performance in terms of both privacy and utility on Bayesian learning.
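To make the role of the step size concrete, here is a minimal sketch of stochastic gradient Langevin dynamics (SGLD), one well-known member of the SG-MCMC family, sampling the posterior mean of a unit-variance Gaussian. It is not the paper's method and does not compute any privacy guarantee; the model, the function name `sgld_gaussian_mean`, and all hyperparameter values are assumptions made for illustration. The point is only that the injected Gaussian noise has variance equal to the step size, so the step size governs how much noise each update receives.

```python
import math
import random


def sgld_gaussian_mean(data, n_steps=2000, batch_size=50, step=1e-4,
                       prior_var=100.0, seed=0):
    """Sample the mean of a unit-variance Gaussian with SGLD.

    Assumed model (illustrative): x_i ~ N(theta, 1), theta ~ N(0, prior_var).
    """
    rng = random.Random(seed)
    n_total = len(data)
    theta = 0.0
    samples = []
    for _ in range(n_steps):
        batch = rng.sample(data, batch_size)
        # Gradient of the log prior N(0, prior_var)
        grad_prior = -theta / prior_var
        # Minibatch estimate of the full-data log-likelihood gradient
        grad_lik = (n_total / batch_size) * sum(x - theta for x in batch)
        # Injected Gaussian noise with variance equal to the step size
        noise = rng.gauss(0.0, math.sqrt(step))
        theta += 0.5 * step * (grad_prior + grad_lik) + noise
        samples.append(theta)
    return samples


# Synthetic data with true mean 2.0 (made-up example)
data_rng = random.Random(1)
data = [data_rng.gauss(2.0, 1.0) for _ in range(1000)]

samples = sgld_gaussian_mean(data)
kept = samples[len(samples) // 2:]  # discard the first half as burn-in
posterior_mean_estimate = sum(kept) / len(kept)
print(posterior_mean_estimate)
```

With a fixed step size the chain only approximates the posterior; in this toy setting the retained samples should concentrate near the sample mean of the data.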

artificial intelligence, bayesian inference, machine learning, (18 more...)

arXiv.org Machine Learning

1712.09097

Genre: Research Report > New Finding (0.68)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.86)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)


Bayesian Computational Analyses with R | Udemy

@machinelearnbot, Dec-24-2017, 21:00:39 GMT

Bayesian Computational Analyses with R is an introductory course on the use and implementation of Bayesian modeling using R software. The Bayesian approach is an alternative to the "frequentist" approach, where one simply takes a sample of data and makes inferences about the likely parameters of the population. In contrast, the Bayesian approach combines prior beliefs about the parameters (the 'prior') with the likelihood of a sample of observed data to estimate the most likely values and distributions of the population parameters (the 'posterior'). The course is useful to anyone who wishes to learn about Bayesian concepts and is suited to both novice and intermediate Bayesian students and Bayesian practitioners. It is both a practical, "hands-on" course with many examples using R scripts and software, and a conceptual one, as it explains the Bayesian concepts. All materials, software, R scripts, slides, exercises and solutions are included with the course materials.

artificial intelligence, bayesian computational analysis, machine learning, (11 more...)

@machinelearnbot

Genre: Instructional Material > Course Syllabus & Notes (1.00)

Industry:

Education > Educational Technology > Educational Software > Computer Based Training (0.40)
Education > Educational Setting > Online (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)


On Statistical Optimality of Variational Bayes

Pati, Debdeep, Bhattacharya, Anirban, Yang, Yun

arXiv.org Machine Learning, Dec-24-2017

Variational inference [25, 7, 40] is now a well-established tool to approximate intractable posterior distributions in hierarchical multi-layered Bayesian models. The traditional Markov chain Monte Carlo (MCMC; [17]) approach of approximating distributions with intractable normalizing constants draws (correlated) samples according to a discrete-time Markov chain whose stationary distribution is the target distribution. Despite their success and popularity, MCMC methods can be slow to converge and lack scalability in big data problems and/or problems involving very many latent variables, which has fueled the search for alternatives. In contrast to the sampling approach of MCMC, variational inference approaches the problem from an optimization viewpoint. First, a class of analytically tractable distributions, referred to as the variational family, is identified for the problem at hand. For example, in the mean-field approximation, the set of parameters and latent variables is divided into blocks and the variational distribution is assumed to be independent across blocks.
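The optimization viewpoint and the mean-field family described in this abstract correspond to the standard formulation below. The notation is assumed here rather than taken from the paper: θ collects all parameters and latent variables, X is the observed data, J is the number of blocks, and \mathcal{Q} is the variational family.

```latex
% Mean-field variational family: independence across blocks \theta_1, \dots, \theta_J
q(\theta) = \prod_{j=1}^{J} q_j(\theta_j)

% Decomposition of the log evidence into the ELBO and a KL divergence
\log p(X)
  = \underbrace{\mathbb{E}_{q}\big[\log p(X, \theta)\big]
              - \mathbb{E}_{q}\big[\log q(\theta)\big]}_{\mathrm{ELBO}(q)}
  + \mathrm{KL}\big( q \,\big\|\, p(\cdot \mid X) \big)

% Variational inference as optimisation over the family \mathcal{Q}
\hat{q}
  = \arg\max_{q \in \mathcal{Q}} \mathrm{ELBO}(q)
  = \arg\min_{q \in \mathcal{Q}} \mathrm{KL}\big( q \,\big\|\, p(\cdot \mid X) \big)
```

Because log p(X) does not depend on q, maximising the ELBO and minimising the KL divergence to the posterior select the same member of the family, and the intractable normalizing constant never has to be computed.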

artificial intelligence, bayesian inference, machine learning, (19 more...)

arXiv.org Machine Learning

1712.08983

Country: North America > United States (0.46)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.86)


An Approximate Bayesian Long Short-Term Memory Algorithm for Outlier Detection

Chen, Chao, Lin, Xiao, Terejanu, Gabriel

arXiv.org Machine Learning, Dec-23-2017

Long Short-Term Memory networks trained with gradient descent and back-propagation have achieved great success in various applications. However, point estimation of the weights of the networks is prone to over-fitting problems and lacks important uncertainty information associated with the estimation. However, exact Bayesian neural network methods are intractable and inapplicable to real-world applications. In this study, we propose an approximate estimation of the weight uncertainty using the Ensemble Kalman Filter, which is easily scalable to a large number of weights. To assess the proposed algorithm, we apply it to outlier detection in five real-world events retrieved from the Twitter platform. Introduction: The recent resurgence of neural networks trained with back-propagation has established state-of-the-art results in a wide range of domains. However, backpropagation-based neural networks (NN) are associated with many disadvantages, including but not limited to the lack of uncertainty estimation, a tendency to overfit small data sets, and the tuning of many hyper-parameters.

artificial intelligence, bayesian inference, machine learning, (17 more...)

arXiv.org Machine Learning

1712.08773

Country: North America > United States > South Carolina (0.28)

Genre: Research Report (0.84)

Industry: Leisure & Entertainment > Sports (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.69)
