AITopics | variational parameter

The softmax representation of probabilities for categorical variables plays a prominent role in modern machine learning, with numerous applications in areas such as large-scale classification, neural language modeling, and recommendation systems. However, softmax estimation is very expensive for large-scale inference because of the high cost of computing the normalizing constant. Here, we introduce an efficient approximation to softmax probabilities that takes the form of a rigorous lower bound on the exact probability. This bound is expressed as a product over pairwise probabilities, and it leads to scalable estimation based on stochastic optimization. It allows us to perform doubly stochastic estimation by subsampling both training instances and class labels. We show that the new bound has interesting theoretical properties and we demonstrate its use in classification problems.
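
The abstract above describes the bound only as "a product over pairwise probabilities". As a rough illustration, the sketch below assumes each pairwise probability is a sigmoid of the score difference between the target class and one competing class; that form, and the names `softmax_prob` and `pairwise_lower_bound`, are assumptions made for this example and are not taken from the listing.

```python
import math


def softmax_prob(scores, k):
    """Exact softmax probability of class k (max-shifted for stability)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    return exps[k] / sum(exps)


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def pairwise_lower_bound(scores, k):
    """Product over competing classes m != k of sigmoid(f_k - f_m).

    Assumed form of the pairwise bound; each factor involves only two
    classes, so no normalizing constant over all classes is needed.
    """
    bound = 1.0
    for m, s in enumerate(scores):
        if m != k:
            bound *= sigmoid(scores[k] - s)
    return bound


if __name__ == "__main__":
    scores = [2.0, 0.5, -1.0]  # hypothetical class scores
    for k in range(len(scores)):
        print(k, pairwise_lower_bound(scores, k), softmax_prob(scores, k))
```

Under this assumption the product never exceeds the exact softmax probability, because the product of the terms (1 + a_m) is at least 1 plus the sum of the a_m when every a_m is non-negative. Since the log of the bound is a sum over competing classes, that sum could in principle be estimated from a random subset of classes, which is one way to read the "subsampling class labels" remark.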

Scaling Factorial Hidden Markov Models: Stochastic Variational Inference without Messages

Yin Cheng Ng, Pawel M. Chilinski, Ricardo Silva

Neural Information Processing Systems | Mar-23-2026, 13:07:20 GMT

Factorial Hidden Markov Models (FHMMs) are powerful models for sequential data but they do not scale well with long sequences. We propose a scalable inference and learning algorithm for FHMMs that draws on ideas from the stochastic variational inference, neural network and copula literatures. Unlike existing approaches, the proposed algorithm requires no message passing procedure among latent variables and can be distributed to a network of computers to speed up learning. Our experiments corroborate that the proposed algorithm does not introduce further approximation bias compared to the proven structured mean-field algorithm, and achieves better performance with long sequences and large FHMMs.

algorithm, artificial intelligence, machine learning, (17 more...)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine > Therapeutic Area (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Iterative Refinement of the Approximate Posterior for Directed Belief Networks

Devon Hjelm, Russ R. Salakhutdinov, Kyunghyun Cho, Nebojsa Jojic, Vince Calhoun, Junyoung Chung

Neural Information Processing Systems | Mar-23-2026, 03:37:03 GMT

Variational methods that rely on a recognition network to approximate the posterior of directed graphical models offer better inference and learning than previous methods. Recent advances that exploit the capacity and flexibility in this approach have expanded what kinds of models can be trained. However, as a proposal for the posterior, the capacity of the recognition network is limited, which can constrain the representational power of the generative model and increase the variance of Monte Carlo estimates. To address these issues, we introduce an iterative refinement procedure for improving the approximate posterior of the recognition network and show that training with the refined posterior is competitive with state-of-the-art methods. The advantages of refinement are further evident in an increased effective sample size, which implies a lower variance of gradient estimates.
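
The abstract above links a larger effective sample size to lower-variance gradient estimates but does not say how the quantity is computed. A minimal sketch, assuming the common estimator (sum of weights squared divided by the sum of squared weights) applied to importance weights given in log space; the function name and the example inputs are hypothetical.

```python
import math


def effective_sample_size(log_weights):
    """Estimate ESS = (sum w)^2 / sum(w^2) from unnormalized log-weights.

    Assumption: this is the standard estimator, not necessarily the one
    used in the paper. Shifting by the max log-weight avoids overflow and
    does not change the ratio.
    """
    m = max(log_weights)
    w = [math.exp(lw - m) for lw in log_weights]
    total = sum(w)
    return total * total / sum(x * x for x in w)


if __name__ == "__main__":
    # Hypothetical log-weights for samples drawn from a proposal posterior.
    print(effective_sample_size([0.0, 0.0, 0.0, 0.0]))
    print(effective_sample_size([0.0, -5.0, -5.0, -5.0]))
```

With this estimator the value ranges from 1 (one sample carries all the weight) to the number of samples (all weights equal), so a proposal closer to the true posterior gives a larger value.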

artificial intelligence, deep learning, machine learning, (17 more...)

Country: North America > United States (0.14)

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)

Variational Bayes on Monte Carlo Steroids

Aditya Grover, Stefano Ermon

Neural Information Processing Systems | Mar-23-2026, 01:23:19 GMT

Neural Information Processing Systems http://nips.cc/

d9dc5573f7368201d6409e07e882aa77-Supplemental-Conference.pdf

Neural Information Processing Systems | Feb-17-2026, 10:55:27 GMT

artificial intelligence, machine learning, objective, (14 more...)

Country: Asia > Middle East > Jordan (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.97)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.69)

Appendix

Neural Information Processing Systems | Feb-13-2026, 17:56:53 GMT

I{·} is the indicator function. It suffices to prove that the denominator converges to that of softmax at each point. We have shown that softmax is translation invariant w.r.t. [...] Without loss of generality, we use τ = 1 in the following proof. To begin with, we prove the first equation and then give the proof of the second part of Theorem 3.3. We introduce some extra notation that is used throughout the proof.
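
The excerpt above is truncated and does not say what softmax is invariant with respect to. The sketch below assumes the usual reading, namely that adding the same constant to every input leaves the output unchanged; the `softmax` helper and its `tau` argument (standing in for the temperature τ mentioned in the excerpt) are written for this illustration only.

```python
import math


def softmax(logits, tau=1.0):
    """Softmax with temperature tau (tau = 1 matches the excerpt's choice)."""
    scaled = [x / tau for x in logits]
    # Subtracting the max is itself an application of shift invariance:
    # it changes no output value but keeps exp() from overflowing.
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    z = sum(exps)
    return [e / z for e in exps]


if __name__ == "__main__":
    base = [1.0, 2.0, 3.0]                # hypothetical inputs
    shifted = [x + 100.0 for x in base]   # same constant added everywhere
    print(softmax(base))
    print(softmax(shifted))
```

Both calls should agree up to floating-point error, since the common factor introduced by the shift cancels between numerator and denominator.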

artificial intelligence, likelihood, machine learning, (17 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)
