AITopics | log likelihood

The softmax representation of probabilities for categorical variables plays a prominent role in modern machine learning with numerous applications in areas such as large scale classification, neural language modeling and recommendation systems. However, softmax estimation is very expensive for large scale inference because of the high cost associated with computing the normalizing constant. Here, we introduce an efficient approximation to softmax probabilities which takes the form of a rigorous lower bound on the exact probability. This bound is expressed as a product over pairwise probabilities and it leads to scalable estimation based on stochastic optimization. It allows us to perform doubly stochastic estimation by subsampling both training instances and class labels. We show that the new bound has interesting theoretical properties and we demonstrate its use in classification problems.
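As a rough illustration of the kind of bound this abstract describes, here is a minimal Python sketch. It assumes the pairwise probabilities are sigmoids of score differences, which the abstract does not state explicitly. The function names and the `scores` input are hypothetical and are not taken from the paper.

```python
import math
import random


def softmax_prob(scores, k):
    """Exact softmax probability of class k (needs a sum over all classes)."""
    top = max(scores)  # shift scores for numerical stability
    exps = [math.exp(s - top) for s in scores]
    return exps[k] / sum(exps)


def log_sigmoid(d):
    """Numerically stable log(1 / (1 + exp(-d)))."""
    if d >= 0:
        return -math.log1p(math.exp(-d))
    return d - math.log1p(math.exp(d))


def log_pairwise_bound(scores, k):
    """Log of the product over m != k of sigmoid(f_k - f_m).

    Assumed form of the pairwise bound; it never exceeds the exact
    log softmax probability of class k.
    """
    return sum(log_sigmoid(scores[k] - s)
               for m, s in enumerate(scores) if m != k)


def subsampled_log_bound(scores, k, num_samples, rng):
    """Unbiased estimate of the log bound from a random subset of classes."""
    others = [m for m in range(len(scores)) if m != k]
    sampled = rng.sample(others, num_samples)
    scale = len(others) / num_samples  # reweight to cover unsampled classes
    return scale * sum(log_sigmoid(scores[k] - scores[m]) for m in sampled)


scores = [2.0, 1.0, 0.0, -1.0]
exact = softmax_prob(scores, 0)
bound = math.exp(log_pairwise_bound(scores, 0))
estimate = subsampled_log_bound(scores, 0, 2, random.Random(0))
```

Because the log of the bound is a sum over classes, it can be estimated from a random subset of class labels. Combined with minibatches of training instances, this gives the doubly stochastic estimation the abstract mentions.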

Online Bayesian Moment Matching for Topic Modeling with Unknown Number of Topics

Wei-Shou Hsu, Pascal Poupart

Neural Information Processing Systems, Feb-18-2026, 21:24:27 GMT

Neural Information Processing Systems http://nips.cc/

likelihood, online hdp, posterior, (13 more...)

Country:

Asia > Middle East > Jordan (0.04)
South America > Paraguay > Asunción > Asunción (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.95)

Effective Bayesian Heteroscedastic Regression with Deep Neural Networks

Alexander Immer

Neural Information Processing Systems, Feb-16-2026, 10:17:45 GMT

Further, we emphasize the significance of principled regularization of the network parameters and prediction.

approximation, artificial intelligence, machine learning, (18 more...)

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)

Genre: Research Report (0.46)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.65)

Proximal Graphical Event Models

Debarun Bhattacharjya, Dharmashankar Subramanian, Tian Gao

Neural Information Processing Systems, Feb-15-2026, 05:06:21 GMT

Neural Information Processing Systems http://nips.cc/

dataset, event dataset, pgem, (14 more...)

Country:

South America > Brazil (0.04)
North America > Mexico (0.04)
South America > Colombia (0.04)
(6 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.70)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.69)

Projected Stein Variational Newton: A Fast and Scalable Bayesian Inference Method in High Dimensions

Peng Chen, Keyi Wu, Joshua Chen, Tom O'Leary-Roseberry, Omar Ghattas

Neural Information Processing Systems, Feb-15-2026, 00:02:00 GMT

Contributions: In this work, we develop a projected Stein variational Newton method (pSVN) to tackle the challenge of high-dimensional Bayesian inference by exploiting the intrinsic low-dimensional geometric structure of the posterior distribution (where it departs from the prior), as characterized by the dominant spectrum of the prior-preconditioned Hessian of the negative log likelihood.

artificial intelligence, arxivpreprintarxiv, machine learning, (17 more...)

Country: North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Technology: