AITopics | Bayesian Inference

Bayesian Inference

Bayes' theorem lets a program infer the probability of each likely cause of an observed effect when what it is given are the probabilities of the effects given the causes, together with the prior probabilities of the causes.
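As a minimal sketch of this inversion, the Python below computes P(cause | effect) from P(effect | cause) and a prior over causes. The function name `posterior` and the numbers are illustrative assumptions, not taken from any source listed on this page.

```python
def posterior(priors, likelihoods):
    """Return P(cause | effect) for every cause.

    priors:      {cause: P(cause)}
    likelihoods: {cause: P(effect | cause)}
    """
    # Joint probability P(effect, cause) = P(effect | cause) * P(cause).
    joint = {c: likelihoods[c] * priors[c] for c in priors}
    # Evidence P(effect), by the law of total probability.
    evidence = sum(joint.values())
    if evidence == 0:
        raise ValueError("the effect has zero probability under every cause")
    return {c: j / evidence for c, j in joint.items()}


# Hypothetical numbers: a condition with 1% prevalence, a test with a
# 95% hit rate and a 10% false-positive rate.
priors = {"ill": 0.01, "healthy": 0.99}
likelihoods = {"ill": 0.95, "healthy": 0.10}
post = posterior(priors, likelihoods)
print(post)
```

With these assumed numbers, a positive test leaves the probability of illness below 10%, because the low prior outweighs the high hit rate.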

Relevance Vector Machine with Weakly Informative Hyperprior and Extended Predictive Information Criterion

Murayama, Kazuaki, Kawano, Shuichi

arXiv.org Machine Learning, May-7-2020

In the variational relevance vector machine, the gamma distribution is the representative hyperprior over the noise precision of the automatic relevance determination prior. Instead of the gamma hyperprior, we propose to use the inverse gamma hyperprior with a shape parameter close to zero and a scale parameter not necessarily close to zero. This hyperprior is associated with the concept of a weakly informative prior. The effect of this hyperprior is investigated through regression on non-homogeneous data. Because it is difficult to capture the structure of such data with a single kernel function, we apply the multiple kernel method, in which multiple kernel functions with different widths are arranged for the input data. We confirm that the degrees of freedom in a model are controlled by adjusting the scale parameter while keeping the shape parameter close to zero. A candidate for selecting the scale parameter is the predictive information criterion. However, the model estimated using this criterion seems to over-fit. This is because the multiple kernel method puts the model in a situation where its dimension is larger than the data size. To select an appropriate scale parameter even in such a situation, we also propose an extended predictive information criterion. It is confirmed that a multiple kernel relevance vector regression model with good predictive accuracy can be obtained by selecting the scale parameter that minimizes the extended predictive information criterion.

artificial intelligence, hyperprior, machine learning, (17 more...)

2005.03419

Country: Asia (0.28)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(2 more...)

Restricted maximum-likelihood method for learning latent variance components in gene expression data with known and unknown confounders

Malik, Muhammad Ammar, Michoel, Tom

arXiv.org Machine Learning, May-6-2020

Linear mixed modelling is a popular approach for detecting and correcting spurious sample correlations due to hidden confounders in genome-wide gene expression data. In applications where some confounding factors are known, estimating simultaneously the contribution of known and latent variance components in linear mixed models is a challenge that has so far relied on numerical gradient-based optimizers to maximize the likelihood function. This is unsatisfactory because the resulting solution is poorly characterized and the efficiency of the method may be suboptimal. Here we prove analytically that maximum-likelihood latent variables can always be chosen orthogonal to the known confounding factors, in other words, that maximum-likelihood latent variables explain sample covariances not already explained by known factors. Based on this result we propose a restricted maximum-likelihood method which estimates the latent variables by maximizing the likelihood on the restricted subspace orthogonal to the known confounding factors, and show that this reduces to probabilistic PCA on that subspace. The method then estimates the variance-covariance parameters by maximizing the remaining terms in the likelihood function given the latent variables, using a newly derived analytic solution for this problem. Compared to gradient-based optimizers, our method attains equal or higher likelihood values, can be computed using standard matrix operations, results in latent factors that don't overlap with any known factors, and has a runtime reduced by several orders of magnitude. We anticipate that the restricted maximum-likelihood method will facilitate the application of linear mixed modelling strategies for learning latent variance components to much larger gene expression datasets than currently possible.

artificial intelligence, covariate, machine learning, (17 more...)

2005.02921

Country:

North America > United States (0.15)
North America > Panama (0.06)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Norway > Western Norway > Vestland > Bergen (0.04)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Government > Regional Government > North America Government (0.30)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Bayesian Entailment Hypothesis: How Brains Implement Monotonic and Non-monotonic Reasoning

Kido, Hiroyuki

arXiv.org Artificial Intelligence, May-6-2020

Recent success of Bayesian methods in neuroscience and artificial intelligence gives rise to the hypothesis that the brain is a Bayesian machine. Since logic, as the laws of thought, is a product and practice of the human brain, it leads to another hypothesis that there is a Bayesian algorithm and data-structure for logical reasoning. In this paper, we give a Bayesian account of entailment and characterize its abstract inferential properties. The Bayesian entailment is shown to be a monotonic consequence relation in an extreme case. In general, it is a sort of non-monotonic consequence relation without Cautious monotony or Cut. The preferential entailment, which is a representative non-monotonic consequence relation, is shown to be maximum a posteriori entailment, which is an approximation of the Bayesian entailment. We finally discuss merits of our proposals in terms of encoding preferences on defaults, handling change and contradiction, and modeling human entailment.

artificial intelligence, entailment, machine learning, (16 more...)

2005.00961

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > California > Santa Clara County > Stanford (0.04)
(3 more...)

Genre: Research Report (0.50)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (0.89)

Ensuring Fairness under Prior Probability Shifts

Biswas, Arpita, Mukherjee, Suvam

arXiv.org Artificial Intelligence, May-6-2020

In this paper, we study the problem of fair classification in the presence of prior probability shifts, where the training set distribution differs from the test set. This phenomenon can be observed in the yearly records of several real-world datasets, such as recidivism records and medical expenditure surveys. If unaccounted for, such shifts can cause the predictions of a classifier to become unfair towards specific population subgroups. While the fairness notion called Proportional Equality (PE) accounts for such shifts, a procedure to ensure PE-fairness was unknown. In this work, we propose a method, called CAPE, which provides a comprehensive solution to the aforementioned problem. CAPE makes novel use of prevalence estimation techniques, sampling and an ensemble of classifiers to ensure fair predictions under prior probability shifts. We introduce a metric, called prevalence difference (PD), which CAPE attempts to minimize in order to ensure PE-fairness. We theoretically establish that this metric exhibits several desirable properties. We evaluate the efficacy of CAPE via a thorough empirical evaluation on synthetic datasets. We also compare the performance of CAPE with several popular fair classifiers on real-world datasets like COMPAS (criminal risk assessment) and MEPS (medical expenditure panel survey). The results indicate that CAPE ensures PE-fair predictions, while performing well on other performance metrics.
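To make "prior probability shift" concrete, here is a minimal sketch of the standard Bayes re-weighting of a classifier's posteriors when class priors change but the class-conditional distributions are assumed to stay fixed. This is not the CAPE method described above; the function name `adjust_for_prior_shift` and all numbers are illustrative assumptions.

```python
def adjust_for_prior_shift(posteriors, train_priors, test_priors):
    """Re-weight P_train(y | x) to the test-time class priors.

    Assumes P(x | y) is the same in training and test data, so that
    P_test(y | x) is proportional to P_train(y | x) * P_test(y) / P_train(y).
    """
    weighted = {
        y: posteriors[y] * test_priors[y] / train_priors[y]
        for y in posteriors
    }
    total = sum(weighted.values())
    return {y: w / total for y, w in weighted.items()}


# Hypothetical example: a classifier trained on balanced classes is
# undecided about x, but positives make up only 20% of the test population.
adjusted = adjust_for_prior_shift(
    posteriors={"pos": 0.5, "neg": 0.5},
    train_priors={"pos": 0.5, "neg": 0.5},
    test_priors={"pos": 0.2, "neg": 0.8},
)
print(adjusted)
```

In practice the test-time priors are unknown and have to be estimated, which is the role the abstract assigns to prevalence estimation techniques.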

artificial intelligence, classifier, machine learning, (19 more...)

2005.03474

Country: North America > United States > Florida > Broward County (0.04)

Genre: Research Report (0.50)

Industry:

Law (0.46)
Information Technology (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.70)

A Ladder of Causal Distances

Peyrard, Maxime, West, Robert

arXiv.org Artificial Intelligence, May-5-2020

Causal discovery, the task of automatically constructing a causal model from data, is of major significance across the sciences. Evaluating the performance of causal discovery algorithms should ideally involve comparing the inferred models to ground-truth models available for benchmark datasets, which in turn requires a notion of distance between causal models. While such distances have been proposed previously, they are limited by focusing on graphical properties of the causal models being compared. Here, we overcome this limitation by defining distances derived from the causal distributions induced by the models, rather than exclusively from their graphical structure. Pearl and Mackenzie (2018) have arranged the properties of causal models in a hierarchy called the "ladder of causation" spanning three rungs: observational, interventional, and counterfactual. Following this organization, we introduce a hierarchy of three distances, one for each rung of the ladder. Our definitions are intuitively appealing as well as efficient to compute approximately. We put our causal distances to use by benchmarking standard causal discovery systems on both synthetic and real-world datasets for which ground-truth causal models are available. Finally, we highlight the usefulness of our causal distances by briefly discussing further applications beyond the evaluation of causal discovery techniques.

artificial intelligence, bayesian inference, machine learning, (17 more...)

2005.0248

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(2 more...)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)

Variational Bayes In Private Settings (VIPS)

Park, Mijung (Max Planck Institute for Intelligent Systems) | Foulds, James | Chaudhuri, Kamalika | Welling, Max

Journal of Artificial Intelligence Research, May-5-2020

Many applications of Bayesian data analysis involve sensitive information such as personal documents or medical records, motivating methods which ensure that privacy is protected. We introduce a general privacy-preserving framework for Variational Bayes (VB), a widely used optimization-based Bayesian inference method. Our framework respects differential privacy, the gold-standard privacy criterion, and encompasses a large class of probabilistic models, called the Conjugate Exponential (CE) family. We observe that we can straightforwardly privatise VB's approximate posterior distributions for models in the CE family, by perturbing the expected sufficient statistics of the complete-data likelihood. For a broadly-used class of non-CE models, those with binomial likelihoods, we show how to bring such models into the CE family, such that inferences in the modified model resemble the private variational Bayes algorithm as closely as possible, using the Pólya-Gamma data augmentation scheme. The iterative nature of variational Bayes presents a further challenge since iterations increase the amount of noise needed. We overcome this by combining: (1) an improved composition method for differential privacy, called the moments accountant, which provides a tight bound on the privacy cost of multiple VB iterations and thus significantly decreases the amount of additive noise; and (2) the privacy amplification effect of subsampling mini-batches from large-scale data in stochastic learning. We empirically demonstrate the effectiveness of our method in CE and non-CE models including latent Dirichlet allocation, Bayesian logistic regression, and sigmoid belief networks, evaluated on real-world datasets.
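The idea of privatising a posterior by perturbing sufficient statistics can be illustrated on a much simpler model than those in the paper. The sketch below is not the VIPS algorithm: it assumes a Beta-Bernoulli model, a single release, a publicly known dataset size, and the Laplace mechanism rather than the moments accountant. The function names and the data are hypothetical.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_beta_posterior(data, epsilon, a=1.0, b=1.0, rng=None):
    """Return Beta posterior parameters computed from a noisy count.

    data: list of 0/1 records; len(data) is treated as public.
    The count of ones changes by at most 1 when one record is replaced,
    so Laplace noise with scale 1/epsilon gives epsilon-differential privacy
    for this single release.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    n = len(data)
    noisy_ones = sum(data) + laplace_noise(1.0 / epsilon, rng)
    # Clamping is post-processing and does not affect the privacy guarantee.
    noisy_ones = min(max(noisy_ones, 0.0), float(n))
    return a + noisy_ones, b + (n - noisy_ones)


data = [1, 0, 1, 1, 0, 1, 1, 0, 1, 1]  # hypothetical sensitive records
a_post, b_post = private_beta_posterior(data, epsilon=1.0, rng=random.Random(0))
print(a_post, b_post)
```

An iterative method would have to pay this privacy cost at every iteration, which is the difficulty the abstract addresses with tighter composition and subsampling.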

data mining, machine learning, natural language, (18 more...)

doi: 10.1613/jair.1.11763

AI Access Foundation

Country:

Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.14)
North America > United States > Maryland > Baltimore (0.14)
Asia > Middle East > Jordan (0.04)
(7 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine > Health Care Technology > Medical Record (0.34)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(3 more...)

Vocabulary Alignment in Openly Specified Interactions

Chocron, Paula Daniela (Hutoma) | Schorlemmer, Marco

Journal of Artificial Intelligence Research, May-4-2020

The problem of achieving common understanding between agents that use different vocabularies has been mainly addressed by techniques that assume the existence of shared external elements, such as a meta-language or a physical environment. In this article, we consider agents that use different vocabularies and only share knowledge of how to perform a task, given by the specification of an interaction protocol. We present a framework that lets agents learn a vocabulary alignment from the experience of interacting. Unlike previous work in this direction, we use open protocols that constrain possible actions instead of defining procedures, making our approach more general. We present two techniques that can be used either to learn an alignment from scratch or to repair an existent one, and we evaluate their performance experimentally.

artificial intelligence, machine learning, natural language, (19 more...)

doi: 10.1613/jair.1.11497

AI Access Foundation

Country:

Europe > Austria > Vienna (0.14)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
(25 more...)

Genre: Research Report (0.45)

Industry: Education (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Communications (0.93)
(3 more...)

Hierarchical Bayesian Approach for Improving Weights for Solving Multi-Objective Route Optimization Problem

Beed, Romit S, Sarkar, Sunita, Roy, Arindam, Bhattacharya, Durba

arXiv.org Artificial Intelligence, May-3-2020

The weighted sum method is a simple and widely used technique that scalarizes multiple conflicting objectives into a single objective function. Its weakness is the difficulty of determining appropriate weights for the objectives. This paper proposes a novel hierarchical Bayesian model, based on a multinomial distribution with a Dirichlet prior, to refine the weights for solving such multi-objective route optimization problems. The model and methodologies revolve around data obtained from a small-scale pilot survey. The method aims to improve on existing methods of weight determination in the field of Intelligent Transport Systems, since a data-driven choice of weights through appropriate probabilistic modelling gives, on average, more reliable results than non-probabilistic techniques. Application of this model and methodologies to simulated as well as real data sets showed encouraging performance with respect to stabilizing the estimates of the weights.
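As a rough illustration of how a multinomial likelihood with a Dirichlet prior can yield objective weights, the sketch below applies the standard single-level conjugate update to survey counts and plugs the posterior-mean weights into a weighted sum. It is not the paper's hierarchical model; the objectives, counts, and function names are hypothetical.

```python
def dirichlet_posterior_weights(alpha, counts):
    """Posterior-mean weights under a Dirichlet(alpha) prior.

    counts[i] is the number of survey respondents who ranked objective i
    as most important (multinomial data). The posterior is
    Dirichlet(alpha + counts), whose mean is the normalized parameter vector.
    """
    post = [a + c for a, c in zip(alpha, counts)]
    total = sum(post)
    return [p / total for p in post]


def weighted_sum(weights, objectives):
    # Scalarize several objective values into a single cost.
    return sum(w * f for w, f in zip(weights, objectives))


# Hypothetical pilot survey over three objectives: time, distance, toll cost.
alpha = [1.0, 1.0, 1.0]   # uniform prior pseudo-counts (assumed)
counts = [12, 5, 3]       # assumed survey responses
weights = dirichlet_posterior_weights(alpha, counts)

# Normalized objective values for one candidate route (assumed).
route_cost = weighted_sum(weights, [0.4, 0.7, 0.2])
print(weights, route_cost)
```

The prior pseudo-counts keep the weights away from zero when the survey is small, which is one way such a prior can stabilize the estimates.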

artificial intelligence, machine learning, objective, (18 more...)

2005.02811

Country:

Asia > India > West Bengal > Kolkata (0.04)
Asia > India > Assam (0.04)
North America > United States > New Mexico > Los Alamos County > Los Alamos (0.04)
(2 more...)

Genre: Research Report (0.83)

Industry: Transportation (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Large-scale Uncertainty Estimation and Its Application in Revenue Forecast of SMEs

Zhang, Zebang, Zhao, Kui, Huang, Kai, Jia, Quanhui, Fang, Yanming, Yu, Quan

arXiv.org Machine Learning, May-2-2020

The economic and banking importance of the small and medium enterprise (SME) sector is well recognized in contemporary society. Business credit loans are very important for the operation of SMEs, and revenue is a key indicator for credit limit management. Therefore, it is very beneficial to construct a reliable revenue forecasting model. If the uncertainty of an enterprise's revenue forecast can be estimated, a more appropriate credit limit can be granted. The natural gradient boosting approach estimates the uncertainty of a prediction with a multi-parameter boosting algorithm based on the natural gradient. However, its original implementation is not easy to scale to big-data scenarios and is computationally expensive compared to state-of-the-art tree-based models (such as XGBoost). In this paper, we propose a Scalable Natural Gradient Boosting Machine that is simple to implement, readily parallelizable, interpretable, and yields high-quality predictive uncertainty estimates. Based on the characteristics of the revenue distribution, we derive an uncertainty quantification function. We demonstrate that our method can distinguish between accurate and inaccurate samples in revenue forecasting for SMEs. Moreover, interpretability is obtained naturally from the model, satisfying financial requirements.

artificial intelligence, bayesian inference, machine learning, (19 more...)

2005.00718

Genre: Research Report (0.64)

Industry: Banking & Finance > Credit (0.55)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.91)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)

35 Words About Uncertainty, Every AI-Savvy Leader Must Know

#artificialintelligence, May-1-2020, 14:41:20 GMT

Bayes' rule (or Bayes' theorem): one of probability theory's most important rules, describing the probability of an event based on prior knowledge of conditions that might be related to it.
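For reference, the rule in its standard form is given below. The symbols A and B are generic events chosen here for illustration; the excerpt itself does not show a formula.

```latex
P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}, \qquad P(B) > 0
```

Here P(A) is the prior probability of A, P(B | A) is the likelihood of observing B when A holds, and P(A | B) is the posterior probability of A once B has been observed.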

probability, probability information, probability rule, (14 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.95)
