 product unit

Algorithm 1 S

Neural Information Processing Systems

This section introduces the algorithmic construction of gadget circuits that will be adopted in our proofs of tractability as well as hardness. A construction algorithm for the support circuit is provided in Alg. 1. This construction is summarized in Alg. 2. It is a key component in the algorithms for many tractable We define a circuit representation of the #3SAT problem, following the construction in Khosravi et al. This section formally presents the tractability and hardness results w.r.t. The hardness of the sum of two circuits to yield a deterministic circuit has been proven by Shen et al.



Deep residual learning with product units

arXiv.org Artificial Intelligence

We propose a deep product-unit residual neural network (PURe) that integrates product units into residual blocks to improve the expressiveness and parameter efficiency of deep convolutional networks. Unlike standard summation neurons, product units enable multiplicative feature interactions, potentially offering a more powerful representation of complex patterns. PURe replaces conventional convolutional layers with 2D product units in the second layer of each residual block, eliminating nonlinear activation functions to preserve structural information. We validate PURe on three benchmark datasets. On Galaxy10 DECaLS, PURe34 achieves the highest test accuracy of 84.89%, surpassing the much deeper ResNet152, while converging nearly five times faster and demonstrating strong robustness to Poisson noise. On ImageNet, PURe architectures outperform standard ResNet models at similar depths, with PURe34 achieving a top-1 accuracy of 80.27% and top-5 accuracy of 95.78%, surpassing deeper ResNet variants (ResNet50, ResNet101) while utilizing significantly fewer parameters and computational resources. On CIFAR-10, PURe consistently outperforms ResNet variants across varying depths, with PURe272 reaching 95.01% test accuracy, comparable to ResNet1001 but at less than half the model size. These results demonstrate that PURe achieves a favorable balance between accuracy, efficiency, and robustness. Compared to traditional residual networks, PURe not only achieves competitive classification performance with faster convergence and fewer parameters, but also demonstrates greater robustness to noise. Its effectiveness across diverse datasets highlights the potential of product-unit-based architectures for scalable and reliable deep learning in computer vision.
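To make the contrast between summation neurons and product units concrete, here is a minimal sketch of the classic scalar product unit, which computes a product of inputs raised to learnable exponents. It is an illustration only: the function names are hypothetical, inputs are assumed strictly positive so the log-space form is valid, and the 2D convolutional product units used in PURe may be defined differently.

```python
import math

def summation_unit(x, w, b=0.0):
    """Standard neuron pre-activation: weighted sum of the inputs plus a bias."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def product_unit(x, w):
    """Classic product unit: prod_i x_i ** w_i.

    Computed in log space as exp(sum_i w_i * log(x_i)).
    Assumption: every x_i is strictly positive.
    """
    return math.exp(sum(wi * math.log(xi) for wi, xi in zip(w, x)))

x = [2.0, 3.0]
w = [1.0, 2.0]
print(summation_unit(x, w))  # additive interaction: 1*2 + 2*3
print(product_unit(x, w))    # multiplicative interaction: about 2**1 * 3**2, up to rounding
```

The weights act as exponents rather than coefficients, which is what lets a single unit represent a multiplicative interaction between features.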


Reviews: Linear Time Computation of Moments in Sum-Product Networks

Neural Information Processing Systems

I thank the authors for their response. As mentioned in my review, I also see the contributions of this paper on an algorithmic level. Nevertheless I would encourage the authors to at least comment on the existing empirical results to give readers a more complete picture. I like the contributions of this paper and I think the proposed algorithms are novel (given that the authors are the same as those of an arXiv paper from a couple of months ago) and could turn out to be useful. On the other hand, the contributions of the paper are somewhat limited.


Sum of Squares Circuits

arXiv.org Artificial Intelligence

Designing expressive generative models that support exact and efficient inference is a core question in probabilistic ML. Probabilistic circuits (PCs) offer a framework where this tractability-vs-expressiveness trade-off can be analyzed theoretically. Recently, squared PCs encoding subtractive mixtures via negative parameters have emerged as tractable models that can be exponentially more expressive than monotonic PCs, i.e., PCs with positive parameters only. In this paper, we provide a more precise theoretical characterization of the expressiveness relationships among these models. First, we prove that squared PCs can be less expressive than monotonic ones. Second, we formalize a novel class of PCs -- sum of squares PCs -- that can be exponentially more expressive than both squared and monotonic PCs. Around sum of squares PCs, we build an expressiveness hierarchy that allows us to precisely unify and separate different tractable model classes such as Born Machines and PSD models, and other recently introduced tractable probabilistic models by using complex parameters. Finally, we empirically show the effectiveness of sum of squares circuits in performing distribution estimation.
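As a rough illustration of a subtractive mixture, the sketch below squares a two-component Gaussian mixture with one negative weight and normalizes it in closed form. This is the simplest possible case (a single sum unit over Gaussian leaves), not the paper's construction; the function names and parameter values are hypothetical. It relies on the standard identity that the integral of a product of two Gaussian densities equals a Gaussian density evaluated at one mean, with the variances added.

```python
import math

def gauss_pdf(x, mean, var):
    """Density of a univariate Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def squared_mixture_normalizer(weights, means, variances):
    """Closed-form integral of (sum_i w_i N(x; m_i, v_i))**2 over the real line.

    Uses: integral of N(x; m_i, v_i) * N(x; m_j, v_j) dx = N(m_i; m_j, v_i + v_j).
    """
    z = 0.0
    for wi, mi, vi in zip(weights, means, variances):
        for wj, mj, vj in zip(weights, means, variances):
            z += wi * wj * gauss_pdf(mi, mj, vi + vj)
    return z

def squared_mixture_density(x, weights, means, variances):
    """Normalized squared mixture; weights may be negative."""
    c = sum(w * gauss_pdf(x, m, v) for w, m, v in zip(weights, means, variances))
    return c * c / squared_mixture_normalizer(weights, means, variances)

# Hypothetical parameters: the second component is subtracted.
weights, means, variances = [1.0, -0.5], [0.0, 1.0], [1.0, 0.5]
print(squared_mixture_density(0.0, weights, means, variances))
```

Squaring keeps the output non-negative even though one weight is negative, and the normalizer costs a double loop over components, which is the source of the quadratic overhead usually associated with squared circuits.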


Sum-Product-Set Networks: Deep Tractable Models for Tree-Structured Graphs

arXiv.org Artificial Intelligence

Daily internet communication relies heavily on tree-structured graphs, embodied by popular data formats such as XML and JSON. However, many recent generative (probabilistic) models utilize neural networks to learn a probability distribution over undirected cyclic graphs. This assumption of a generic graph structure brings various computational challenges, and, more importantly, the presence of non-linearities in neural networks does not permit tractable probabilistic inference. We address these problems by proposing sum-product-set networks, an extension of probabilistic circuits from unstructured tensor data to tree-structured graph data. To this end, we use random finite sets to reflect a variable number of nodes and edges in the graph and to allow for exact and efficient inference. We demonstrate that our tractable model performs comparably to various intractable models based on neural networks.