Supplementary Material A Proofs

Neural Information Processing Systems

This section provides the full proofs of the theorems stated in the main paper. Specifically, for a "softened" sample … We proceed to show the correctness of the backward pass. … Therefore, the claim at the beginning of this paragraph holds. Plugging in Eq. (14), we have … Finally, we note that Algs. 1 and 2 both run in time O(|p| |D|). A.2 Useful Lemmas. This section provides several useful lemmas that are later used in the proof of Thm. 4. Lemma 1. … Following this decomposition, we construct Alg. 4 that computes the entropy of every node in a … Consider the example PC in Figure 1.
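The snippet above mentions an algorithm (Alg. 4) that computes the entropy of every node in a PC by decomposition. A minimal sketch of how such a bottom-up recursion can look for a deterministic and decomposable PC is given below; the node classes are hypothetical stand-ins, not the paper's actual data structures or its Alg. 4:

```python
import math

# Hypothetical minimal PC node classes (illustrative, not the paper's code).
class Leaf:
    def __init__(self, entropy):
        self.entropy = entropy  # entropy of the leaf distribution, in nats

class Product:
    def __init__(self, children):
        self.children = children

class Sum:
    def __init__(self, weights, children):
        assert abs(sum(weights) - 1.0) < 1e-9
        self.weights, self.children = weights, children

def entropy(node):
    """Entropy of a deterministic, decomposable PC, computed bottom-up.

    For products, the entropies of the disjoint-scope children add up;
    for deterministic sums, the children have disjoint supports, so the
    mixture entropy splits into the weight entropy plus the expected
    child entropy.
    """
    if isinstance(node, Leaf):
        return node.entropy
    if isinstance(node, Product):
        return sum(entropy(c) for c in node.children)
    return sum(w * (entropy(c) - math.log(w))
               for w, c in zip(node.weights, node.children) if w > 0)

# A fair-coin leaf has entropy log 2; a deterministic sum over two disjoint
# such leaves with weights (0.3, 0.7) adds the entropy of the weights.
h = entropy(Sum([0.3, 0.7], [Leaf(math.log(2)), Leaf(math.log(2))]))
```

Because every node is visited once, the recursion runs in time linear in the circuit size, consistent with the linear-time claims in the snippet.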


A Proofs

Neural Information Processing Systems

The proof directly follows from Theorem 3.2 of Vergari et al. [75]. Note that O(|q| |c|) is a loose upper bound and the size of r is in practice smaller [75]. Analogously, the second statement of Theorem 3.1 follows from Proposition A.1 and by recalling … For our experiments we use standard compilation tools to obtain a constraint circuit starting from a propositional logical formula in conjunctive normal form. We now illustrate, step by step, one example of such a compilation for a simple logical formula. Deterministic sum units represent disjoint solutions to the logical formula, meaning there exist distinct assignments, characterized by the children, that satisfy the logical constraint, e.g. …
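The disjoint-solutions property of deterministic sum units can be checked on a toy formula. The sketch below is an illustrative example of ours, not the paper's compilation pipeline: it splits f(x, y) = x ∨ y on x into the two branches x and ¬x ∧ y, which cover the same models but never overlap, so a sum unit over them would be deterministic:

```python
from itertools import product

# Toy formula: f(x, y) = x OR y. A Shannon split on x yields two branches,
# x  and  (NOT x) AND y, whose supports are disjoint -- exactly what a
# deterministic sum unit requires of its children.
f = lambda x, y: x or y
branch_a = lambda x, y: x                    # branch where x is true
branch_b = lambda x, y: (not x) and y        # branch where x is false, y true

for x, y in product([False, True], repeat=2):
    assert f(x, y) == (branch_a(x, y) or branch_b(x, y))   # same models
    assert not (branch_a(x, y) and branch_b(x, y))         # disjoint supports

# Determinism makes model counting a plain sum over the children:
count = sum(branch_a(x, y) + branch_b(x, y)
            for x, y in product([False, True], repeat=2))
```

Here `count` recovers the three satisfying assignments of x ∨ y, since no assignment is counted twice across the disjoint branches.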



Algorithm 1 S

Neural Information Processing Systems

This section introduces the algorithmic construction of gadget circuits that will be adopted in our proofs of tractability as well as hardness. A construction algorithm for the support circuit is provided in Alg. 1. This construction is summarized in Alg. 2. It is a key component in the algorithms for many tractable … We define a circuit representation of the #3SAT problem, following the construction in Khosravi et al. This section formally presents the tractability and hardness results w.r.t. … The hardness of summing two circuits to yield a deterministic circuit has been proven by Shen et al.


Sum of Squares Circuits

Loconte, Lorenzo, Mengel, Stefan, Vergari, Antonio

arXiv.org Artificial Intelligence

Designing expressive generative models that support exact and efficient inference is a core question in probabilistic ML. Probabilistic circuits (PCs) offer a framework where this tractability-vs-expressiveness trade-off can be analyzed theoretically. Recently, squared PCs encoding subtractive mixtures via negative parameters have emerged as tractable models that can be exponentially more expressive than monotonic PCs, i.e., PCs with positive parameters only. In this paper, we provide a more precise theoretical characterization of the expressiveness relationships among these models. First, we prove that squared PCs can be less expressive than monotonic ones. Second, we formalize a novel class of PCs -- sum of squares PCs -- that can be exponentially more expressive than both squared and monotonic PCs. Around sum of squares PCs, we build an expressiveness hierarchy that allows us to precisely unify and separate different tractable model classes such as Born Machines and PSD models, and other recently introduced tractable probabilistic models by using complex parameters. Finally, we empirically show the effectiveness of sum of squares circuits in performing distribution estimation.
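The key mechanism behind squared PCs described in this abstract — a subtractive mixture with negative parameters whose square is still a valid density — can be illustrated numerically. The Gaussian components and weights below are arbitrary choices of ours for the sketch, and the normalizer is approximated by a Riemann sum rather than by the exact circuit-squaring construction the paper relies on:

```python
import numpy as np

# A subtractive mixture with a negative parameter can dip below zero,
# but its square is nonnegative everywhere -- the idea behind squared PCs.
def gauss(x, mu, sigma=1.0):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

xs = np.linspace(-8.0, 8.0, 4001)
dx = xs[1] - xs[0]

f = 1.0 * gauss(xs, -1.0) - 0.6 * gauss(xs, 1.0)   # subtractive mixture
sq = f ** 2                                         # squared-circuit output

assert f.min() < 0        # the raw mixture is not a density
assert sq.min() >= 0      # its square is a valid unnormalized density

# Normalize numerically here; for squared PCs the partition function is
# instead computed exactly and tractably on the squared circuit.
Z = sq.sum() * dx
p = sq / Z
```

The squared density `p` integrates to one on the grid while still exhibiting the "holes" near the negative component that a monotonic (positive-weight) mixture of the same size could not carve out.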


Sum-Product-Set Networks: Deep Tractable Models for Tree-Structured Graphs

Papež, Milan, Rektoris, Martin, Pevný, Tomáš, Šmídl, Václav

arXiv.org Artificial Intelligence

Daily internet communication relies heavily on tree-structured graphs, embodied by popular data formats such as XML and JSON. However, many recent generative (probabilistic) models utilize neural networks to learn a probability distribution over undirected cyclic graphs. This assumption of a generic graph structure brings various computational challenges, and, more importantly, the presence of non-linearities in neural networks does not permit tractable probabilistic inference. We address these problems by proposing sum-product-set networks, an extension of probabilistic circuits from unstructured tensor data to tree-structured graph data. To this end, we use random finite sets to reflect a variable number of nodes and edges in the graph and to allow for exact and efficient inference. We demonstrate that our tractable model performs comparably to various intractable models based on neural networks.


Probabilistic Neural Circuits

Martires, Pedro Zuidberg Dos

arXiv.org Machine Learning

Probabilistic circuits (PCs) have gained prominence in recent years as a versatile framework for discussing probabilistic models that support tractable queries and are yet expressive enough to model complex probability distributions. Nevertheless, tractability comes at a cost: PCs are less expressive than neural networks. In this paper we introduce probabilistic neural circuits (PNCs), which strike a balance between PCs and neural nets in terms of tractability and expressive power. Theoretically, we show that PNCs can be interpreted as deep mixtures of Bayesian networks. Experimentally, we demonstrate that PNCs constitute powerful function approximators.


Tractable Probabilistic Graph Representation Learning with Graph-Induced Sum-Product Networks

Errica, Federico, Niepert, Mathias

arXiv.org Artificial Intelligence

We introduce Graph-Induced Sum-Product Networks (GSPNs), a new probabilistic framework for graph representation learning that can tractably answer probabilistic queries. Inspired by the computational trees induced by vertices in the context of message-passing neural networks, we build hierarchies of sum-product networks (SPNs) where the parameters of a parent SPN are learnable transformations of the a-posteriori mixing probabilities of its children's sum units. Due to weight sharing and the tree-shaped computation graphs of GSPNs, we obtain the efficiency and efficacy of deep graph networks with the additional advantages of a purely probabilistic model. We show the model's competitiveness on scarce supervision scenarios, handling missing data, and graph classification in comparison to popular neural models. We complement the experiments with qualitative analyses on hyper-parameters and the model's ability to answer probabilistic queries.
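The parameter-passing idea in this abstract — a parent SPN's weights derived from a learnable transformation of its children's a-posteriori mixing probabilities — can be sketched for a single sum unit. All shapes, the likelihood values, and the softmax-of-linear-map choice below are our assumptions for illustration, not the GSPN architecture itself:

```python
import numpy as np

rng = np.random.default_rng(0)

w_child = np.array([0.2, 0.5, 0.3])      # child sum-unit mixing weights
lik = np.array([0.9, 0.1, 0.4])          # component likelihoods of evidence

# A-posteriori mixing probabilities of the child sum unit (Bayes' rule).
posterior = w_child * lik / np.dot(w_child, lik)

# A learnable transformation (here: random linear map + softmax) turns the
# child's posterior into the parent sum unit's weights.
W = rng.normal(size=(3, 3))
logits = W @ posterior
w_parent = np.exp(logits) / np.exp(logits).sum()

assert abs(posterior.sum() - 1.0) < 1e-9
assert abs(w_parent.sum() - 1.0) < 1e-9
```

The softmax keeps the parent's weights in the probability simplex, so the hierarchy remains a valid (and thus tractable) probabilistic circuit at every level.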