AITopics | maximum entropy distribution

Approximate maximum entropy principles via Goemans-Williamson with applications to provable variational methods

Andrej Risteski, Yuanzhi Li

Neural Information Processing SystemsMar-23-2026, 00:21:27 GMT

Neural Information Processing Systems http://nips.cc/

entropy, relaxation, variational method, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > New Jersey > Mercer County > Princeton (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
Asia > Middle East > Jordan (0.04)
Asia > Afghanistan > Parwan Province > Charikar (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Maximum Entropy (0.44)

Add feedback

Approximate maximum entropy principles via Goemans-Williamson with applications to provable variational methods

Neural Information Processing SystemsMar-17-2026, 06:33:39 GMT

The well known maximum-entropy principle due to Jaynes, which states that given mean parameters, the maximum entropy distribution matching them is in an exponential family has been very popular in machine learning due to its "Occam's razor" interpretation. Unfortunately, calculating the potentials in the maximum entropy distribution is intractable [BGS14]. We provide computationally efficient versions of this principle when the mean parameters are pairwise moments: we design distributions that approximately match given pairwise moments, while having entropy which is comparable to the maximum entropy distribution matching those moments. We additionally provide surprising applications of the approximate maximum entropy principle to designing provable variational methods for partition function calculations for Ising models without any assumptions on the potentials of the model. More precisely, we show that we can get approximation guarantees for the log-partition function comparable to those in the low-temperature limit, which is the setting of optimization of quadratic forms over the hypercube.

artificial intelligence, machine learning, proceedings, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Approximate maximum entropy principles via Goemans-Williamson with applications to provable variational methods

Neural Information Processing SystemsNov-21-2025, 14:13:00 GMT

The well known maximum-entropy principle due to Jaynes, which states that given mean parameters, the maximum entropy distribution matching them is in an exponential family has been very popular in machine learning due to its "Occam's razor" interpretation. Unfortunately, calculating the potentials in the maximum entropy distribution is intractable [BGS14]. We provide computationally efficient versions of this principle when the mean parameters are pairwise moments: we design distributions that approximately match given pairwise moments, while having entropy which is comparable to the maximum entropy distribution matching those moments. We additionally provide surprising applications of the approximate maximum entropy principle to designing provable variational methods for partition function calculations for Ising models without any assumptions on the potentials of the model. More precisely, we show that we can get approximation guarantees for the log-partition function comparable to those in the low-temperature limit, which is the setting of optimization of quadratic forms over the hypercube.

application, approximate maximum entropy principle, provable variational method, (5 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

3e5190eeb51ebe6c5bbc54ee8950c548-Supplemental.pdf

Neural Information Processing SystemsOct-2-2025, 17:56:58 GMT

artificial intelligence, augmentation, machine learning, (17 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Hyperbolic Coarse-to-Fine Few-Shot Class-Incremental Learning

Dai, Jiaxin, Xiang, Xiang

arXiv.org Machine LearningSep-24-2025

In the field of machine learning, hyperbolic space demonstrates superior representation capabilities for hierarchical data compared to conventional Euclidean space. This work focuses on the Coarse-To-Fine Few-Shot Class-Incremental Learning (C2FSCIL) task. Our study follows the Knowe approach, which contrastively learns coarse class labels and subsequently normalizes and freezes the classifier weights of learned fine classes in the embedding space. To better interpret the "coarse-to-fine" paradigm, we propose embedding the feature extractor into hyperbolic space. Specifically, we employ the Poincaré ball model of hyperbolic space, enabling the feature extractor to transform input images into feature vectors within the Poincaré ball instead of Euclidean space. We further introduce hyperbolic contrastive loss and hyperbolic fully-connected layers to facilitate model optimization and classification in hyperbolic space. Additionally, to enhance performance under few-shot conditions, we implement maximum entropy distribution in hyperbolic space to estimate the probability distribution of fine-class feature vectors. This allows generation of augmented features from the distribution to mitigate overfitting during training with limited samples. Experiments on C2FSCIL benchmarks show that our method effectively improves both coarse and fine class accuracies.

euclidean space, hyperbolic space, learning, (12 more...)

arXiv.org Machine Learning

2509.18504

Country:

Asia > Japan > Honshū > Chūbu > Nagano Prefecture > Nagano (0.04)
Europe > Switzerland (0.04)
Asia > China (0.04)

Genre: Research Report (0.50)

Industry: Transportation (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.56)

Add feedback

Approximate maximum entropy principles via Goemans-Williamson with applications to provable variational methods

Andrej Risteski, Yuanzhi Li

Neural Information Processing SystemsOct-6-2024, 11:13:15 GMT

The well known maximum-entropy principle due to Jaynes, which states that given mean parameters, the maximum entropy distribution matching them is in an exponential family has been very popular in machine learning due to its "Occam's razor" interpretation. Unfortunately, calculating the potentials in the maximumentropy distribution is intractable [BGS14]. We provide computationally efficient versions of this principle when the mean parameters are pairwise moments: we design distributions that approximately match given pairwise moments, while having entropy which is comparable to the maximum entropy distribution matching those moments. We additionally provide surprising applications of the approximate maximum entropy principle to designing provable variational methods for partition function calculations for Ising models without any assumptions on the potentials of the model. More precisely, we show that we can get approximation guarantees for the log-partition function comparable to those in the low-temperature limit, which is the setting of optimization of quadratic forms over the hypercube.

entropy, relaxation, variational method, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > New Jersey > Mercer County > Princeton (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
Asia > Middle East > Jordan (0.04)
Asia > Afghanistan > Parwan Province > Charikar (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Maximum Entropy (1.00)

Add feedback

Out-of-Distribution Detection using Maximum Entropy Coding

Abolfazli, Mojtaba, Amirani, Mohammad Zaeri, Høst-Madsen, Anders, Zhang, June, Bratincsak, Andras

arXiv.org Artificial IntelligenceApr-25-2024

Given a default distribution $P$ and a set of test data $x^M=\{x_1,x_2,\ldots,x_M\}$ this paper seeks to answer the question if it was likely that $x^M$ was generated by $P$. For discrete distributions, the definitive answer is in principle given by Kolmogorov-Martin-L\"{o}f randomness. In this paper we seek to generalize this to continuous distributions. We consider a set of statistics $T_1(x^M),T_2(x^M),\ldots$. To each statistic we associate its maximum entropy distribution and with this a universal source coder. The maximum entropy distributions are subsequently combined to give a total codelength, which is compared with $-\log P(x^M)$. We show that this approach satisfied a number of theoretical properties. For real world data $P$ usually is unknown. We transform data into a standard distribution in the latent space using a bidirectional generate network and use maximum entropy coding there. We compare the resulting method to other methods that also used generative neural networks to detect anomalies. In most cases, our results show better performance.

detection, maximum entropy distribution, statistics, (14 more...)

arXiv.org Artificial Intelligence

2404.17023

Country:

North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
Asia > Japan > Honshū > Tōhoku > Fukushima Prefecture > Fukushima (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.68)

Industry: Health & Medicine (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Maximum Entropy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory > Minimum Complexity Machines (0.68)

Add feedback

Learning from the Wisdom of Crowds by Minimax Entropy

Neural Information Processing SystemsMar-14-2024, 07:05:35 GMT

An important way to make large training sets is to gather noisy labels from crowds of nonexperts. We propose a minimax entropy principle to improve the quality of these labels. Our method assumes that labels are generated by a probability distribution over workers, items, and labels.

equation, ijk, ikl, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > Washington > King County > Redmond (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.63)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)

Add feedback

Towards a Mathematical Theory of Abstraction

Millidge, Beren

arXiv.org Machine LearningJun-3-2021

While the utility of well-chosen abstractions for understanding and predicting the behaviour of complex systems is well appreciated, precisely what an abstraction $\textit{is}$ has so far has largely eluded mathematical formalization. In this paper, we aim to set out a mathematical theory of abstraction. We provide a precise characterisation of what an abstraction is and, perhaps more importantly, suggest how abstractions can be learnt directly from data both for static datasets and for dynamical systems. We define an abstraction to be a small set of `summaries' of a system which can be used to answer a set of queries about the system or its behaviour. The difference between the ground truth behaviour of the system on the queries and the behaviour of the system predicted only by the abstraction provides a measure of the `leakiness' of the abstraction which can be used as a loss function to directly learn abstractions from data. Our approach can be considered a generalization of classical statistics where we are not interested in reconstructing `the data' in full, but are instead only concerned with answering a set of arbitrary queries about the data. While highly theoretical, our results have deep implications for statistical inference and machine learning and could be used to develop explicit methods for learning precise kinds of abstractions directly from data.

abstraction, information, query, (14 more...)

arXiv.org Machine Learning

2106.01826

Country:

Asia > Middle East > Jordan (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > California > Monterey County > Pacific Grove (0.04)
(2 more...)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)

Add feedback

Approximate maximum entropy principles via Goemans-Williamson with applications to provable variational methods

Risteski, Andrej, Li, Yuanzhi

Neural Information Processing SystemsFeb-14-2020, 16:14:23 GMT

The well known maximum-entropy principle due to Jaynes, which states that given mean parameters, the maximum entropy distribution matching them is in an exponential family has been very popular in machine learning due to its "Occam's razor" interpretation. Unfortunately, calculating the potentials in the maximum entropy distribution is intractable [BGS14]. We provide computationally efficient versions of this principle when the mean parameters are pairwise moments: we design distributions that approximately match given pairwise moments, while having entropy which is comparable to the maximum entropy distribution matching those moments. We additionally provide surprising applications of the approximate maximum entropy principle to designing provable variational methods for partition function calculations for Ising models without any assumptions on the potentials of the model. More precisely, we show that we can get approximation guarantees for the log-partition function comparable to those in the low-temperature limit, which is the setting of optimization of quadratic forms over the hypercube.

approximate maximum entropy principle, maximum entropy principle, provable variational method, (4 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Maximum Entropy (1.00)

Add feedback

Filters

Collaborating Authors

maximum entropy distribution

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

Approximate maximum entropy principles via Goemans-Williamson with applications to provable variational methods

Approximate maximum entropy principles via Goemans-Williamson with applications to provable variational methods

Approximate maximum entropy principles via Goemans-Williamson with applications to provable variational methods

3e5190eeb51ebe6c5bbc54ee8950c548-Supplemental.pdf

Hyperbolic Coarse-to-Fine Few-Shot Class-Incremental Learning

Approximate maximum entropy principles via Goemans-Williamson with applications to provable variational methods

Out-of-Distribution Detection using Maximum Entropy Coding

Learning from the Wisdom of Crowds by Minimax Entropy

Towards a Mathematical Theory of Abstraction

Approximate maximum entropy principles via Goemans-Williamson with applications to provable variational methods