AITopics

2501.13748

Country:

North America > United States > California (0.46)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.14)

Genre: Research Report (0.64)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

arXiv.org Machine LearningJun-4-2024

One-Shot Federated Learning with Bayesian Pseudocoresets

d'Hondt, Tim, Pechenizkiy, Mykola, Peharz, Robert

Optimization-based techniques for federated learning (FL) often come with prohibitive communication cost, as high dimensional model parameters need to be communicated repeatedly between server and clients. In this paper, we follow a Bayesian approach allowing to perform FL with one-shot communication, by solving the global inference problem as a product of local client posteriors. For models with multi-modal likelihoods, such as neural networks, a naive application of this scheme is hampered, since clients will capture different posterior modes, causing a destructive collapse of the posterior on the server side. Consequently, we explore approximate inference in the function-space representation of client posteriors, hence suffering less or not at all from multi-modality. We show that distributed function-space inference is tightly related to learning Bayesian pseudocoresets and develop a tractable Bayesian FL algorithm on this insight. We show that this approach achieves prediction performance competitive to state-of-the-art while showing a striking reduction in communication cost of up to two orders of magnitude. Moreover, due to its Bayesian nature, our method also delivers well-calibrated uncertainty estimates.

artificial intelligence, machine learning, posterior, (15 more...)

2406.02177

Country: Europe (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

arXiv.org Machine LearningFeb-22-2024

Rao-Blackwellising Bayesian Causal Inference

Toth, Christian, Knoll, Christian, Pernkopf, Franz, Peharz, Robert

Bayesian causal inference, i.e., inferring a posterior over causal models for the use in downstream causal reasoning tasks, poses a hard computational inference problem that is little explored in literature. In this work, we combine techniques from order-based MCMC structure learning with recent advances in gradient-based graph learning into an effective Bayesian causal inference framework. Specifically, we decompose the problem of inferring the causal structure into (i) inferring a topological order over variables and (ii) inferring the parent sets for each variable. When limiting the number of parents per variable, we can exactly marginalise over the parent sets in polynomial time. We further use Gaussian processes to model the unknown causal mechanisms, which also allows their exact marginalisation. This introduces a Rao-Blackwellization scheme, where all components are eliminated from the model, except for the causal order, for which we learn a distribution via gradient-based optimisation. The combination of Rao-Blackwellization with our sequential inference procedure for causal orders yields state-of-the-art on linear and non-linear additive noise benchmarks with scale-free and Erdos-Renyi graph structures.

artificial intelligence, causal order, machine learning, (15 more...)

2402.14781

Country: Europe > Austria (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)

arXiv.org Artificial IntelligenceJan-16-2024

How to Turn Your Knowledge Graph Embeddings into Generative Models

Loconte, Lorenzo, Di Mauro, Nicola, Peharz, Robert, Vergari, Antonio

Some of the most successful knowledge graph embedding (KGE) models for link prediction -- CP, RESCAL, TuckER, ComplEx -- can be interpreted as energy-based models. Under this perspective they are not amenable for exact maximum-likelihood estimation (MLE), sampling and struggle to integrate logical constraints. This work re-interprets the score functions of these KGEs as circuits -- constrained computational graphs allowing efficient marginalisation. Then, we design two recipes to obtain efficient generative circuit models by either restricting their activations to be non-negative or squaring their outputs. Our interpretation comes with little or no loss of performance for link prediction, while the circuits framework unlocks exact learning by MLE, efficient sampling of new triples, and guarantee that logical constraints are satisfied by design. Furthermore, our models scale more gracefully than the original KGEs on graphs with millions of entities.

artificial intelligence, logic & formal reasoning, machine learning, (22 more...)

2305.15944

Country:

Europe > Austria (0.14)
Europe > Portugal (0.14)
Europe > Italy (0.14)

Genre: Research Report > New Finding (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.86)
(2 more...)

arXiv.org Artificial IntelligenceOct-25-2023

Probabilistic Integral Circuits

Gala, Gennaro, de Campos, Cassio, Peharz, Robert, Vergari, Antonio, Quaeghebeur, Erik

Continuous latent variables (LVs) are a key ingredient of many generative models, as they allow modelling expressive mixtures with an uncountable number of components. In contrast, probabilistic circuits (PCs) are hierarchical discrete mixtures represented as computational graphs composed of input, sum and product units. Unlike continuous LV models, PCs provide tractable inference but are limited to discrete LVs with categorical (i.e. unordered) states. We bridge these model classes by introducing probabilistic integral circuits (PICs), a new language of computational graphs that extends PCs with integral units representing continuous LVs. In the first place, PICs are symbolic computational graphs and are fully tractable in simple cases where analytical integration is possible. In practice, we parameterise PICs with light-weight neural nets delivering an intractable hierarchical continuous mixture that can be approximated arbitrarily well with large PCs using numerical quadrature. On several distribution estimation benchmarks, we show that such PIC-approximating PCs systematically outperform PCs commonly learned via expectation-maximization or SGD.

artificial intelligence, machine learning, probabilistic integral circuit

2310.16986

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.53)

arXiv.org Artificial IntelligenceMar-24-2023

Continuous Mixtures of Tractable Probabilistic Models

Correia, Alvaro H. C., Gala, Gennaro, Quaeghebeur, Erik, de Campos, Cassio, Peharz, Robert

Probabilistic models based on continuous latent spaces, such as variational autoencoders, can be understood as uncountable mixture models where components depend continuously on the latent code. They have proven to be expressive tools for generative and probabilistic modelling, but are at odds with tractable probabilistic inference, that is, computing marginals and conditionals of the represented probability distribution. Meanwhile, tractable probabilistic models such as probabilistic circuits (PCs) can be understood as hierarchical discrete mixture models, and thus are capable of performing exact inference efficiently but often show subpar performance in comparison to continuous latent-space models. In this paper, we investigate a hybrid approach, namely continuous mixtures of tractable models with a small latent dimension. While these models are analytically intractable, they are well amenable to numerical integration schemes based on a finite set of integration points. With a large enough number of integration points the approximation becomes de-facto exact. Moreover, for a finite set of integration points, the integration method effectively compiles the continuous mixture into a standard PC. In experiments, we show that this simple scheme proves remarkably effective, as PCs learnt this way set new state of the art for tractable models on many standard density estimation benchmarks.

artificial intelligence, deep learning, machine learning, (17 more...)

2209.10584

Country: Europe (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

arXiv.org Artificial IntelligenceFeb-23-2023

Bayesian Structure Scores for Probabilistic Circuits

Yang, Yang, Gala, Gennaro, Peharz, Robert

Probabilistic circuits (PCs) are a prominent representation of probability distributions with tractable inference. While parameter learning in PCs is rigorously studied, structure learning is often more based on heuristics than on principled objectives. In this paper, we develop Bayesian structure scores for deterministic PCs, i.e., the structure likelihood with parameters marginalized out, which are well known as rigorous objectives for structure learning in probabilistic graphical models. When used within a greedy cutset algorithm, our scores effectively protect against overfitting and yield a fast and almost hyper-parameter-free structure learner, distinguishing it from previous approaches. In experiments, we achieve good trade-offs between training time and model fit in terms of log-likelihood. Moreover, the principled nature of Bayesian scores unlocks PCs for accommodating frameworks such as structural expectation-maximization.

artificial intelligence, bayesian inference, machine learning, (14 more...)

2302.1213

Country: Europe (0.68)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

arXiv.org Artificial IntelligenceJul-11-2020

Joints in Random Forests

Correia, Alvaro H. C., Peharz, Robert, de Campos, Cassio P.

Decision Trees (DTs) and Random Forests (RFs) are powerful discriminative learners and tools of central importance to the everyday machine learning practitioner and data scientist. Due to their discriminative nature, however, they lack principled methods to process inputs with missing features or to detect outliers, which requires pairing them with imputation techniques or a separate generative model. In this paper, we demonstrate that DTs and RFs can naturally be interpreted as generative models, by drawing a connection to Probabilistic Circuits, a prominent class of tractable probabilistic models. This reinterpretation equips them with a full joint distribution over the feature space and leads to Generative Decision Trees (GeDTs) and Generative Forests (GeFs), a family of novel hybrid generative-discriminative models. This family of models retains the overall characteristics of DTs and RFs while additionally being able to handle missing features by means of marginalisation. Under certain assumptions, frequently made for Bayes consistency results, we show that consistency in GeDTs and GeFs extend to any pattern of missing input features, if missing at random. Empirically, we show that our models often outperform common routines to treat missing data, such as K-nearest neighbour imputation, and moreover, that our models can naturally detect outliers by monitoring the marginal probability of input features.

decision tree learning, gef gef, health & medicine, (19 more...)

2006.14937

Country: Europe (0.28)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area (0.46)
Consumer Products & Services > Food, Beverage, Tobacco & Cannabis > Beverages (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

arXiv.org Machine LearningJul-11-2020

Towards Robust Classification with Deep Generative Forests

Correia, Alvaro H. C., Peharz, Robert, de Campos, Cassio

Decision Trees and Random Forests are among the most widely used machine learning models, and often achieve state-of-the-art performance in tabular, domain-agnostic datasets. Nonetheless, being primarily discriminative models they lack principled methods to manipulate the uncertainty of predictions. In this paper, we exploit Generative Forests (GeFs), a recent class of deep probabilistic models that addresses these issues by extending Random Forests to generative models representing the full joint distribution over the feature space. We demonstrate that GeFs are uncertainty-aware classifiers, capable of measuring the robustness of each prediction as well as detecting out-of-distribution samples.

artificial intelligence, decision tree learning, generative forest, (18 more...)

2007.05721

Genre: Research Report (0.50)

Industry: Consumer Products & Services > Food, Beverage, Tobacco & Cannabis > Beverages (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.78)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.49)

arXiv.org Machine LearningOct-10-2019

Deep Structured Mixtures of Gaussian Processes

Trapp, Martin, Peharz, Robert, Pernkopf, Franz, Rasmussen, Carl E.

Gaussian Processes (GPs) are powerful non-parametric Bayesian regression models that allow exact posterior inference, but exhibit high computational and memory costs. In order to improve scalability of GPs, approximate posterior inference is frequently employed, where a prominent class of approximation techniques is based on local GP experts. However, the local-expert techniques proposed so far are either not well-principled, come with limited approximation guarantees, or lead to intractable models. In this paper, we introduce deep structured mixtures of GP experts, a stochastic process model which i) allows exact posterior inference, ii) has attractive computational and memory costs, and iii), when used as GP approximation, captures predictive uncertainties consistently better than previous approximations. In a variety of experiments, we show that deep structured mixtures have a low approximation error and outperform existing expert-based approaches.

artificial intelligence, bayesian inference, dsmgp, (17 more...)

1910.04536

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)