Collaborating Authors

 Vergari, Antonio


Strudel: Learning Structured-Decomposable Probabilistic Circuits

arXiv.org Artificial Intelligence

Probabilistic circuits (PCs) represent a probability distribution as a computational graph. Enforcing structural properties on these graphs guarantees that several inference scenarios become tractable. Among these properties, structured decomposability is a particularly appealing one: it enables efficient and exact computation of the probability of complex logical formulas, and can be used to reason about the expected output of certain predictive models under missing data. This paper proposes Strudel, a simple, fast and accurate learning algorithm for structured-decomposable PCs. Compared to prior work on learning structured-decomposable PCs, Strudel delivers more accurate single PC models in fewer iterations and dramatically scales learning when building ensembles of PCs. It achieves this scalability by exploiting another structural property of PCs, called determinism, and by sharing the same computational graph across mixture components. We show these advantages on standard density estimation benchmarks and challenging inference scenarios. Keywords: probabilistic circuits; structure learning; structured decomposability.
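
To make the computational-graph view concrete, here is a toy bottom-up evaluation of a small smooth and decomposable PC in Python; the node constructors, structure and parameters are illustrative assumptions, not the Strudel algorithm itself:

    import math

    # Leaves are Bernoulli distributions over single variables; sum nodes are
    # mixtures over children with identical scope (smoothness); product nodes
    # combine children with disjoint scopes (decomposability).
    def leaf(var, p):
        return lambda x: p if x[var] == 1 else 1.0 - p

    def product(*children):
        return lambda x: math.prod(c(x) for c in children)

    def weighted_sum(weights, children):
        return lambda x: sum(w * c(x) for w, c in zip(weights, children))

    # P(X0, X1) as a two-component mixture of fully factorized terms.
    pc = weighted_sum([0.3, 0.7],
                      [product(leaf(0, 0.9), leaf(1, 0.2)),
                       product(leaf(0, 0.1), leaf(1, 0.6))])

    print(pc({0: 1, 1: 0}))  # exact likelihood in one feed-forward pass

Determinism, the extra property Strudel exploits, additionally requires that at most one child of each sum node evaluates to a nonzero value on any complete input.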


Handling Missing Data in Decision Trees: A Probabilistic Approach

arXiv.org Artificial Intelligence

Decision trees are a popular family of models due to their attractive properties such as interpretability and ability to handle heterogeneous data. Concurrently, missing data is a prevalent occurrence that hinders performance of machine learning models. As such, handling missing data in decision trees is a well studied problem. In this paper, we tackle this problem by taking a probabilistic approach. At deployment time, we use tractable density estimators to compute the "expected prediction" of our models. At learning time, we fine-tune parameters of already learned trees by minimizing their "expected prediction loss" w.r.t. our density estimators.

However, most of these are heuristics in nature (Twala et al., 2008), tailored towards some specific tree induction algorithm (Chen & Guestrin, 2016; Prokhorenkova et al., 2018), or make strong distributional assumptions about the data, such as the feature distribution factorizing completely (e.g., mean or median imputation (Rubin, 1976)) or according to the tree structure (Quinlan, 1993). As many works have compared the most prominent ones in empirical studies (Batista & Monard, 2003; Saar-Tsechansky & Provost, 2007), there is no clear winner and, ultimately, the adoption of a particular strategy in practice boils down to its availability in the ML libraries employed. In this work, we tackle handling missing data in trees at both learning and deployment time from a principled probabilistic standpoint.
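
The "expected prediction" above can be written compactly; with x^o the observed features, x^m the missing ones, f the decision tree and p a tractable density estimator (notation assumed here for illustration), it is

    \mathbb{E}_{x^m \sim p(x^m \mid x^o)}\left[ f(x^o, x^m) \right],

and the "expected prediction loss" is obtained by replacing f with the loss its prediction incurs.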


On Tractable Computation of Expected Predictions

arXiv.org Artificial Intelligence

Computing expected predictions has many interesting applications in areas such as fairness, handling missing values, and data analysis. Unfortunately, computing expectations of a discriminative model with respect to a probability distribution defined by an arbitrary generative model has been proven to be hard in general. In fact, the task is intractable even for simple models such as logistic regression paired with a naive Bayes distribution. In this paper, we identify a pair of generative and discriminative models that enables tractable computation of expectations of the latter with respect to the former, as well as moments of any order in the case of regression. Specifically, we consider expressive probabilistic circuits with certain structural constraints that support tractable probabilistic inference. Moreover, we exploit the tractable computation of high-order moments to derive an algorithm that approximates the expectations in classification scenarios where exact computation is intractable. We evaluate the effectiveness of our exact and approximate algorithms in handling missing data at prediction time, where they prove competitive with standard imputation techniques on a variety of datasets. Finally, we illustrate how the expected prediction framework can be used to reason about the behaviour of discriminative models.
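
In symbols (notation assumed here for illustration), for a discriminative model f and a generative model p the quantities of interest are the moments

    M_k(f, p) = \mathbb{E}_{x \sim p(x)}\left[ f(x)^k \right],

where k = 1 recovers the expected prediction and the higher-order moments (k > 1) drive the approximation algorithm for the intractable classification case.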


Hybrid Probabilistic Inference with Logical Constraints: Tractability and Message-Passing

arXiv.org Artificial Intelligence

Weighted model integration (WMI) is a very appealing framework for probabilistic inference: it can express the complex dependencies of real-world hybrid scenarios, where variables are heterogeneous in nature (both continuous and discrete), via the language of Satisfiability Modulo Theories (SMT), and it supports probabilistic queries with arbitrarily complex logical constraints. Recent work has shown WMI inference to be reducible, under some assumptions, to a model integration (MI) problem, thus effectively enabling hybrid probabilistic reasoning by volume computations. In this paper, we introduce a novel formulation of MI via a message-passing scheme that efficiently computes the marginal densities and statistical moments of all the variables in linear time. As such, we are able to amortize inference for arbitrarily rich MI queries when they conform to the problem structure, here represented as the primal graph associated with the SMT formula. Furthermore, we theoretically trace the tractability boundaries of exact MI: we prove that the structural requirements on the primal graph that make our MI algorithm tractable, namely bounded diameter and treewidth, are not only sufficient but also necessary for tractable inference via MI.
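
For reference, one common formulation of WMI (notation assumed here) reads: given an SMT formula \Delta over continuous variables x and Boolean variables b, and a weight function w,

    \mathrm{WMI}(\Delta, w) = \sum_{\mu} \int_{\{x \,:\, (x, \mu) \models \Delta\}} w(x, \mu)\, dx,

where \mu ranges over the assignments to b. Model integration (MI) is then the unweighted special case w \equiv 1, i.e., a pure volume computation.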


Conditional Sum-Product Networks: Imposing Structure on Deep Probabilistic Architectures

arXiv.org Machine Learning

Bayesian networks are a central tool in machine learning and artificial intelligence, and make use of conditional independencies to impose structure on joint distributions. However, they are generally not as expressive as deep learning models, and inference in them is hard and slow. In contrast, deep probabilistic models such as sum-product networks (SPNs) capture joint distributions in a tractable fashion, but use little interpretable structure. Here, we extend the notion of SPNs towards conditional distributions, combining simple conditional models into high-dimensional ones. As shown in our experiments, the resulting conditional SPNs can be naturally used to impose structure on deep probabilistic models and allow for mixed data types, while maintaining fast and efficient inference.
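
Schematically, and with notation assumed here rather than taken from the paper, a conditional SPN represents

    p(y \mid x) = \sum_i g_i(x) \prod_j p_{i,j}(y_j \mid x),

where the gating functions g_i(x) \geq 0 sum to one for every x, the leaves p_{i,j} are simple conditional models, and each product splits the target variables y into disjoint parts so that inference remains tractable.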


From Variational to Deterministic Autoencoders

arXiv.org Machine Learning

Variational Autoencoders (VAEs) provide a theoretically-backed framework for deep generative models. However, they often produce "blurry" images, which is linked to their training objective. Sampling in the most popular implementation, the Gaussian VAE, can be interpreted as simply injecting noise into the input of a deterministic decoder. In practice, this simply enforces a smooth latent space structure. We challenge the adoption of the full VAE framework on this specific point in favor of a simpler, deterministic one. Specifically, we investigate how substituting stochasticity with other explicit and implicit regularization schemes can lead to a meaningful latent space without forcing it to conform to an arbitrarily chosen prior. To retrieve a generative mechanism for sampling new data points, we propose an efficient ex-post density estimation step that can be readily adopted both for the proposed deterministic autoencoders and to improve the sample quality of existing VAEs. We show in a rigorous empirical study that regularized deterministic autoencoding achieves state-of-the-art sample quality on the common MNIST, CIFAR-10 and CelebA datasets.
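
The ex-post density estimation step is straightforward to sketch. In the following minimal illustration, the linear encode/decode maps and the data are placeholder assumptions standing in for a trained regularized autoencoder:

    import numpy as np
    from sklearn.mixture import GaussianMixture

    rng = np.random.default_rng(0)
    W = rng.normal(size=(16, 4))  # stand-in for learned encoder weights

    def encode(x):                # data -> latent codes (placeholder linear map)
        return x @ W

    def decode(z):                # latent codes -> reconstructions
        return z @ W.T

    train_x = rng.normal(size=(500, 16))  # placeholder training data
    codes = encode(train_x)

    # Fit a simple density estimator (here a 10-component GMM) over the
    # latent codes of the deterministic autoencoder.
    gmm = GaussianMixture(n_components=10, random_state=0).fit(codes)

    # To sample new data points: draw latent codes from the fitted density,
    # then push them through the decoder.
    z_new, _ = gmm.sample(32)
    samples = decode(z_new)
    print(samples.shape)          # (32, 16)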


Automatic Bayesian Density Analysis

arXiv.org Machine Learning

Making sense of a dataset in an automatic and unsupervised fashion is a challenging problem in statistics and AI. Classical approaches for exploratory data analysis are usually not flexible enough to deal with the uncertainty inherent to real-world data: they are often restricted to fixed latent interaction models and homogeneous likelihoods; they are sensitive to missing, corrupt and anomalous data; moreover, their expressiveness generally comes at the price of intractable inference. As a result, supervision from statisticians is usually needed to find the right model for the data. However, since domain experts are not necessarily also experts in statistics, we propose Automatic Bayesian Density Analysis (ABDA) to make exploratory data analysis accessible at large. Specifically, ABDA allows for automatic and efficient missing value estimation, statistical data type and likelihood discovery, anomaly detection and dependency structure mining, on top of providing accurate density estimation. Extensive empirical evidence shows that ABDA is a suitable tool for automatic exploratory analysis of mixed continuous and discrete tabular data.


SPFlow: An Easy and Extensible Library for Deep Probabilistic Learning using Sum-Product Networks

arXiv.org Machine Learning

We introduce SPFlow, an open-source Python library providing a simple interface to inference, learning and manipulation routines for deep and tractable probabilistic models called Sum-Product Networks (SPNs). The library allows one to quickly create SPNs both from data and through a domain specific language (DSL). It efficiently implements several probabilistic inference routines, such as computing marginals, conditionals and (approximate) most probable explanations (MPEs), along with sampling, as well as utilities for serialization, plotting and structure statistics of an SPN. Moreover, many of the algorithms proposed in the literature to learn the structure and parameters of SPNs are readily available in SPFlow. Furthermore, SPFlow is extremely extensible and customizable, allowing users to promptly distill new inference and learning routines by injecting custom code into a lightweight, functional-oriented API framework. This is achieved in SPFlow by keeping an internal Python representation of the graph structure that also enables practical compilation of an SPN into a TensorFlow graph, C, CUDA or FPGA custom code, significantly speeding up computations.
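
For a flavor of the DSL, the following snippet mirrors the library's documented usage (exact module paths may differ across SPFlow versions):

    import numpy as np
    from spn.structure.leaves.parametric.Parametric import Categorical
    from spn.algorithms.Inference import log_likelihood

    # A small SPN over two binary variables built with SPFlow's DSL:
    # '*' creates product nodes and weighted '+' creates sum nodes.
    spn = 0.4 * (Categorical(p=[0.2, 0.8], scope=0) *
                 Categorical(p=[0.3, 0.7], scope=1)) \
        + 0.6 * (Categorical(p=[0.5, 0.5], scope=0) *
                 Categorical(p=[0.6, 0.4], scope=1))

    data = np.array([[1.0, 0.0]])
    print(log_likelihood(spn, data))  # joint log-likelihood

    # Marginal inference: NaN entries mark variables to marginalize out.
    print(log_likelihood(spn, np.array([[np.nan, 0.0]])))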


Probabilistic Deep Learning using Random Sum-Product Networks

arXiv.org Machine Learning

Probabilistic deep learning is currently receiving increased interest, as consistent treatment of uncertainty is one of the most important goals in machine learning and AI. Most current approaches, however, have severe limitations concerning inference. Sum-Product Networks (SPNs), although having excellent properties in that regard, have so far not been explored as serious deep learning models, likely due to their special structural requirements. In this paper, we make a drastic simplification and use a random structure which is trained in a "classical deep learning manner", i.e., with automatic differentiation, SGD, and GPU support. The resulting models, called RAT-SPNs, yield prediction results comparable to deep neural networks, but maintain well-calibrated uncertainty estimates, which makes them highly robust against missing data. Furthermore, they successfully capture uncertainty over their inputs in a convincing manner, yielding robust outlier and peculiarity detection.
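
To illustrate the "classical deep learning manner" of training, here is a minimal sketch (a tiny two-component mixture rather than RAT-SPNs' actual random region-graph structure; all names and shapes are assumptions) that fits sum-node weights and Bernoulli leaves by SGD on the log-likelihood:

    import torch

    logits_w = torch.zeros(2, requires_grad=True)     # sum-node weights
    logits_p = torch.zeros(2, 2, requires_grad=True)  # leaf Bernoulli params

    def log_prob(x):  # x: (batch, 2) with entries in {0, 1}
        p = torch.sigmoid(logits_p)                   # (components, variables)
        leaf_ll = (x[:, None, :] * torch.log(p)
                   + (1 - x[:, None, :]) * torch.log(1 - p))
        comp_ll = leaf_ll.sum(-1)                     # product nodes
        log_w = torch.log_softmax(logits_w, dim=0)
        return torch.logsumexp(log_w + comp_ll, dim=-1)  # root sum node

    x = torch.randint(0, 2, (64, 2)).float()          # placeholder data
    opt = torch.optim.SGD([logits_w, logits_p], lr=0.1)
    for _ in range(100):                              # maximize likelihood
        opt.zero_grad()
        loss = -log_prob(x).mean()
        loss.backward()
        opt.step()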


Mixed Sum-Product Networks: A Deep Architecture for Hybrid Domains

AAAI Conferences

While all kinds of mixed data, from personal data, through panel and scientific data, to public and commercial data, are collected and stored, building probabilistic graphical models for these hybrid domains becomes more difficult. Users spend significant amounts of time identifying the parametric form of the random variables involved (Gaussian, Poisson, Logit, etc.) and learning the mixed models. To make this difficult task easier, we propose the first trainable probabilistic deep architecture for hybrid domains that features tractable queries. It is based on Sum-Product Networks (SPNs) with piecewise polynomial leaf distributions, together with novel nonparametric decomposition and conditioning steps using the Hirschfeld-Gebelein-Rényi Maximum Correlation Coefficient. This relieves the user from deciding a priori the parametric form of the random variables, yet is still expressive enough to effectively approximate any distribution and permits efficient learning and inference. Our experiments show that the architecture, called Mixed SPNs, can indeed capture complex distributions across a wide range of hybrid domains.