AITopics | Maximum Entropy

Collaborating Authors

Maximum Entropy

News Overviews Instructional Materials AI-Alerts Classics

The Maximum Entropy Relaxation Path

Dubiner, Moshe, Gavish, Matan, Singer, Yoram

arXiv.org Machine LearningNov-7-2013

The relaxed maximum entropy problem is concerned with finding a probability distribution on a finite set that minimizes the relative entropy to a given prior distribution, while satisfying relaxed max-norm constraints with respect to a third observed multinomial distribution. We study the entire relaxation path for this problem in detail. We show existence and a geometric description of the relaxation path. Specifically, we show that the maximum entropy relaxation path admits a planar geometric description as an increasing, piecewise linear function in the inverse relaxation parameter. We derive fast algorithms for tracking the path. In various realistic settings, our algorithms require $O(n\log(n))$ operations for probability distributions on $n$ points, making it possible to handle large problems. Once the path has been recovered, we show that given a validation set, the family of admissible models is reduced from an infinite family to a small, discrete set. We demonstrate the merits of our approach in experiments with synthetic data and discuss its potential for the estimation of compact n-gram language models.

artificial intelligence, machine learning, relaxation path, (16 more...)

arXiv.org Machine Learning

1311.1644

Country: North America > United States > California > Santa Clara County (0.28)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Maximum Entropy (0.82)

Add feedback

Multi-View Maximum Entropy Discrimination

Sun, Shiliang (East China Normal University) | Chao, Guoqing (East China Normal University)

AAAI ConferencesAug-3-2013

Maximum entropy discrimination (MED) is a general framework for discriminative estimation based on the well known maximum entropy principle, which embodies the Bayesian integration of prior information with large margin constraints on observations. It is a successful combination of maximum entropy learning and maximum margin learning, and can subsume support vector machines (SVMs) as a special case. In this paper, we present a multi-view maximum entropy discrimination framework that is an extension of MED to the scenario of learning with multiple feature sets. Different from existing approaches to exploiting multiple views, such as co-training style algorithms and co-regularization style algorithms, we propose a new method to make use of the distinct views where classification margins from these views are required to be identical. We give the general form of the solution to the multi-view maximum entropy discrimination, and provide an instantiation under a specific prior formulation which is analogical to a multi-view version of SVMs. Experimental results on real-world data sets show the effectiveness of the proposed multi-view maximum entropy discrimination approach.

multi-view maximum entropy discrimination

AAAI Conferences

Twenty-Third International Joint Conference on Artificial Intelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Maximum Entropy (1.00)

Add feedback

Uncertain Reasoning Using Maximum Entropy Inference

Hunter, Daniel

arXiv.org Artificial IntelligenceMar-27-2013

The use of maximum entropy inference in reasoning with uncertain information is commonly justified by an information-theoretic argument. This paper discusses a possible objection to this information-theoretic justification and shows how it can be met. I then compare maximum entropy inference with certain other currently popular methods for uncertain reasoning. In making such a comparison, one must distinguish between static and dynamic theories of degrees of belief: a static theory concerns the consistency conditions for degrees of belief at a given time; whereas a dynamic theory concerns how one's degrees of belief should change in the light of new information. It is argued that maximum entropy is a dynamic theory and that a complete theory of uncertain reasoning can be gotten by combining maximum entropy inference with probability theory, which is a static theory. This total theory, I argue, is much better grounded than are other theories of uncertain reasoning.

artificial intelligence, information, machine learning, (14 more...)

arXiv.org Artificial Intelligence

1304.342

Country: North America > United States (0.14)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Maximum Entropy (1.00)

Add feedback

MESA: Maximum Entropy by Simulated Annealing

Paaß, Gerhard

arXiv.org Artificial IntelligenceMar-13-2013

Probabilistic reasoning systems combine different probabilistic rules and probabilistic facts to arrive at the desired probability values of consequences. In this paper we describe the MESA-algorithm (Maximum Entropy by Simulated Annealing) that derives a joint distribution of variables or propositions. It takes into account the reliability of probability values and can resolve conflicts between contradictory statements. The joint distribution is represented in terms of marginal distributions and therefore allows to process large inference networks and to determine desired probability values with high precision. The procedure derives a maximum entropy distribution subject to the given constraints. It can be applied to inference networks of arbitrary topology and may be extended into a number of directions.

artificial intelligence, machine learning, probability, (16 more...)

arXiv.org Artificial Intelligence

1303.5422

Country: North America > United States (0.68)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Maximum Entropy (0.85)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.63)

Add feedback

Efficient high dimensional maximum entropy modeling via symmetric partition functions

Vernaza, Paul, Bagnell, Drew

Neural Information Processing SystemsDec-31-2012

The application of the maximum entropy principle to sequence modeling has been popularized by methods such as Conditional Random Fields (CRFs). However, these approaches are generally limited to modeling paths in discrete spaces of low dimensionality. We consider the problem of modeling distributions over paths in continuous spaces of high dimensionality---a problem for which inference is generally intractable. Our main contribution is to show that maximum entropy modeling of high-dimensional, continuous paths is tractable as long as the constrained features possess a certain kind of low dimensional structure. In this case, we show that the associated {\em partition function} is symmetric and that this symmetry can be exploited to compute the partition function efficiently in a compressed form. Empirical results are given showing an application of our method to maximum entropy modeling of high dimensional human motion capture data.

artificial intelligence, machine learning, partition function, (18 more...)

Neural Information Processing Systems

Country: North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Maximum Entropy (1.00)

Add feedback

Boltzmann Machine Learning with the Latent Maximum Entropy Principle

Wang, Shaojun, Schuurmans, Dale, Peng, Fuchun, Zhao, Yunxin

arXiv.org Machine LearningOct-19-2012

We present a new statistical learning paradigm for Boltzmann machines based on a new inference principle we have proposed: the latent maximum entropy principle (LME). LME is different both from Jaynes maximum entropy principle and from standard maximum likelihood estimation.We demonstrate the LME principle BY deriving new algorithms for Boltzmann machine parameter estimation, and show how robust and fast new variant of the EM algorithm can be developed.Our experiments show that estimation based on LME generally yields better results than maximum likelihood estimation, particularly when inferring hidden units from small amounts of data.

algorithm, artificial intelligence, bayesian inference, (17 more...)

arXiv.org Machine Learning

1212.2514

Country:

North America > United States (0.46)
North America > Canada > Ontario > Toronto (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Maximum Entropy (0.84)

Add feedback

Mixture-of-Parents Maximum Entropy Markov Models

Rosenberg, David S., Klein, Dan, Taskar, Ben

arXiv.org Artificial IntelligenceJun-20-2012

We present the mixture-of-parents maximum entropy Markov model (MoP-MEMM), a class of directed graphical models extending MEMMs. The MoP-MEMM allows tractable incorporation of long-range dependencies between nodes by restricting the conditional distribution of each node to be a mixture of distributions given the parents. We show how to efficiently compute the exact marginal posterior node distributions, regardless of the range of the dependencies. This enables us to model non-sequential correlations present within text documents, as well as between interconnected documents, such as hyperlinked web pages. We apply the MoP-MEMM to a named entity recognition task and a web page classification task. In each, our model shows significant improvement over the basic MEMM, and is competitive with other long-range sequence models that use approximate inference.

artificial intelligence, mixture-of-parent maximum entropy markov model, text processing, (2 more...)

arXiv.org Artificial Intelligence

1206.5261

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Maximum Entropy (0.60)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.60)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.53)

Add feedback

A Graphical Model Formulation of Collaborative Filtering Neighbourhood Methods with Fast Maximum Entropy Training

Defazio, Aaron, Caetano, Tiberio

arXiv.org Machine LearningJun-18-2012

Item neighbourhood methods for collaborative filtering learn a weighted graph over the set of items, where each item is connected to those it is most similar to. The prediction of a user's rating on an item is then given by that rating of neighbouring items, weighted by their similarity. This paper presents a new neighbourhood approach which we call item fields, whereby an undirected graphical model is formed over the item graph. The resulting prediction rule is a simple generalization of the classical approaches, which takes into account non-local information in the graph, allowing its best results to be obtained when using drastically fewer edges than other neighbourhood approaches. A fast approximate maximum entropy training method based on the Bethe approximation is presented, which uses a simple gradient ascent procedure. When using precomputed sufficient statistics on the Movielens datasets, our method is faster than maximum likelihood approaches by two orders of magnitude.

artificial intelligence, machine learning, neighbourhood method, (16 more...)

arXiv.org Machine Learning

1206.4622

Country: Asia > Middle East > Israel > Mediterranean Sea (0.27)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Maximum Entropy (0.64)

Add feedback

Asymptotic Maximum Entropy Principle for Utility Elicitation under High Uncertainty and Partial Information

Hadfi, Rafik (Nagoya Institute of Technology) | Ito, Takayuki (Nagoya Institute of Technology)

AAAI ConferencesMay-20-2012

Decision making has proposed multiple methods to help the decision maker in his analysis, by suggesting ways of formalization of the preferences as well as the assessment of the uncertainties. Although these techniques are established and proven to be mathematically sound, experience has shown that in certain situations we tend to avoid the formal approach by acting intuitively. Especially, when the decision involves a large number of attributes and outcomes, and where we need to use pragmatic and heuristic simplifications such as considering only the most important attributes and omitting the others. In this paper, we provide a model for decision making in situations subject to a large predictive uncertainty with a small learning sample. The high predictive uncertainty is concretized by a countably infinite number of prospects, making the preferences assessment more difficult. Our main result is an extension of the Maximum Entropy utility (MEU) principle into an asymptotic maximum entropy utility principle for preferences elicitation. This will allow us to overcome the limits of the existing MEU method to the extend that we focus on utility assessment when the set of the available discrete prospects is countably infinite. Furthermore, our proposed model can be used to analyze situations of high-cognitive load as well as to understand how humans handle these problems under Ceteris Paribus assumption.

artificial intelligence, machine learning, utility increment vector, (15 more...)

AAAI Conferences

Twenty-Fifth International FLAIRS Conference

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Maximum Entropy (0.81)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.68)

Add feedback

How biased are maximum entropy models?

Macke, Jakob H., Murray, Iain, Latham, Peter E.

Neural Information Processing SystemsDec-31-2011

Maximum entropy models have become popular statistical models in neuroscience and other areas in biology, and can be useful tools for obtaining estimates of mu- tual information in biological systems. However, maximum entropy models fit to small data sets can be subject to sampling bias; i.e. the true entropy of the data can be severely underestimated. Here we study the sampling properties of estimates of the entropy obtained from maximum entropy models. We show that if the data is generated by a distribution that lies in the model class, the bias is equal to the number of parameters divided by twice the number of observations. However, in practice, the true distribution is usually outside the model class, and we show here that this misspecification can lead to much larger bias. We provide a perturba- tive approximation of the maximally expected bias when the true model is out of model class, and we illustrate our results using numerical simulations of an Ising model; i.e. the second-order maximum entropy distribution on binary data.

health & medicine, neurology, normalized bias, (19 more...)

Neural Information Processing Systems

Country: North America > United States (0.14)

Genre: Research Report > New Finding (0.35)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.90)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Maximum Entropy (1.00)

Add feedback