AITopics | Maximum Entropy

Maximum Entropy

Can we use the gradient descent method in a maximum entropy model?

#artificialintelligence, Jul-14-2016, 08:30:37 GMT

I see that a lot of implementations use GIS or IIS to train the maximum entropy model. Can we use the gradient descent method instead? If we can, why do most tutorials go straight to GIS or IIS and not show the simpler gradient descent method for training a maximum entropy model? As we know, softmax regression is equivalent to the maxent model, but I have never heard of GIS or IIS being used for softmax. Is there toy code that uses plain gradient descent to train a maxent model?
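A minimal sketch of what such toy code might look like, assuming the conditional maxent model is written as softmax regression over a plain feature vector. The log-likelihood of this model is concave, so batch gradient descent on the negative log-likelihood is a valid training method. The function names, the synthetic three-class data, the learning rate, and the step count are all illustrative assumptions, not taken from any particular tutorial or library.

```python
import numpy as np

def softmax(scores):
    # Subtract the row-wise max for numerical stability before exponentiating.
    shifted = scores - scores.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def train_maxent(X, y, num_classes, lr=0.5, steps=500):
    """Batch gradient descent on the mean negative log-likelihood."""
    n, d = X.shape
    W = np.zeros((d, num_classes))   # one weight column per class
    Y = np.eye(num_classes)[y]       # one-hot labels, shape (n, num_classes)
    for _ in range(steps):
        P = softmax(X @ W)           # model probabilities p(class | x)
        # Gradient = model-expected feature counts minus empirical feature counts.
        grad = X.T @ (P - Y) / n
        W -= lr * grad
    return W

def predict(X, W):
    return softmax(X @ W).argmax(axis=1)

# Hypothetical toy data: three well-separated 2-D clusters plus a bias feature.
rng = np.random.default_rng(0)
centers = np.array([[2.0, 0.0], [-2.0, 0.0], [0.0, 3.0]])
y = rng.integers(0, 3, size=300)
X = centers[y] + 0.5 * rng.normal(size=(300, 2))
X = np.hstack([X, np.ones((300, 1))])

W = train_maxent(X, y, num_classes=3)
accuracy = (predict(X, W) == y).mean()
print("training accuracy:", accuracy)
```

The gradient vanishes exactly when the expected feature counts under the model match the empirical counts, which is the same moment-matching condition that GIS and IIS iterate toward.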

artificial intelligence, gradient descent method, machine learning, (3 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Maximum Entropy (0.99)

Approximate maximum entropy principles via Goemans-Williamson with applications to provable variational methods

Li, Yuanzhi, Risteski, Andrej

arXiv.org Machine Learning, Jul-12-2016

The well-known maximum-entropy principle due to Jaynes, which states that given mean parameters, the maximum entropy distribution matching them is in an exponential family, has been very popular in machine learning due to its "Occam's razor" interpretation. Unfortunately, calculating the potentials in the maximum-entropy distribution is intractable \cite{bresler2014hardness}. We provide computationally efficient versions of this principle when the mean parameters are pairwise moments: we design distributions that approximately match given pairwise moments, while having entropy which is comparable to the maximum entropy distribution matching those moments. We additionally provide surprising applications of the approximate maximum entropy principle to designing provable variational methods for partition function calculations for Ising models without any assumptions on the potentials of the model. More precisely, we show that at every temperature, we can get approximation guarantees for the log-partition function comparable to those in the low-temperature limit, which is the setting of optimization of quadratic forms over the hypercube \cite{alon2006approximating}.

artificial intelligence, machine learning, relaxation, (17 more...)

arXiv.org Machine Learning

1607.0336

Country: Asia (0.15)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Maximum Entropy (1.00)

Propositional Probabilistic Reasoning at Maximum Entropy Modulo Theories

Wilhelm, Marco (Technische Universität Dortmund) | Kern-Isberner, Gabriele (Technische Universität Dortmund) | Ecke, Andreas (Technische Universität Dortmund)

AAAI Conferences, May-8-2016

The principle of maximum entropy (MaxEnt principle) provides a valuable methodology for reasoning with probabilistic conditional knowledge bases realizing an idea of information economy in the sense of adding a minimal amount of assumed information. The conditional structure of such a knowledge base allows for classifying possible worlds regarding their influence on the MaxEnt distribution. In this paper, we present an algorithm that determines these equivalence classes and computes their cardinality by performing satisfiability tests of propositional formulas built upon the premises and conclusions of the conditionals. An example illustrates how the output of our algorithm can be used to simplify calculations when drawing nonmonotonic inferences under maximum entropy. For this, we use a characterization of the MaxEnt distribution in terms of conditional structure that completely abstracts from the propositional logic underlying the conditionals.

maximum entropy modulo theory, propositional probabilistic reasoning

AAAI Conferences

The Twenty-Ninth International Flairs Conference

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Maximum Entropy (0.80)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.69)

Regression, Logistic Regression and Maximum Entropy

#artificialintelligence, Apr-6-2016, 23:30:59 GMT

One of the most important tasks in machine learning is classification. Classification is used to predict the class of entries in a test set (a dataset whose entries have not yet been labelled) using a model constructed from a training set. Think of classifying crime in the field of pre-policing, classifying patients in the health sector, or classifying houses in the real-estate sector. Another field in which classification is big is Natural Language Processing (NLP), the field of science whose goal is to make machines (computers) understand (written) human language.

artificial intelligence, logistic regression, machine learning, (15 more...)

#artificialintelligence

Genre:

Research Report > New Finding (0.58)
Research Report > Experimental Study (0.58)

Industry: Banking & Finance > Real Estate (0.56)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.73)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Maximum Entropy (0.44)

Confidence-Constrained Maximum Entropy Framework for Learning from Multi-Instance Data

Behmardi, Behrouz, Briggs, Forrest, Fern, Xiaoli Z., Raich, Raviv

arXiv.org Machine Learning, Mar-6-2016

Multi-instance data, in which each object (bag) contains a collection of instances, are widespread in machine learning, computer vision, bioinformatics, signal processing, and social sciences. We present a maximum entropy (ME) framework for learning from multi-instance data. In this approach each bag is represented as a distribution using the principle of ME. We introduce the concept of confidence-constrained ME (CME) to simultaneously learn the structure of distribution space and infer each distribution. The shared structure underlying each density is used to learn from instances inside each bag. The proposed CME is free of tuning parameters. We devise a fast optimization algorithm capable of handling large scale multi-instance data. In the experimental section, we evaluate the performance of the proposed approach in terms of exact rank recovery in the space of distributions and compare it with the regularized ME approach. Moreover, we compare the performance of CME with Multi-Instance Learning (MIL) state-of-the-art algorithms and show a comparable performance in terms of accuracy with reduced computational complexity.

health & medicine, oncology, rmde, (18 more...)

arXiv.org Machine Learning

1603.01901

Country: North America > United States > Oregon (0.14)

Genre: Research Report (0.82)

Industry: Health & Medicine > Therapeutic Area > Oncology (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Maximum Entropy (0.63)

Maximum Entropy Kernels for System Identification

Carli, Francesca Paola, Chen, Tianshi, Ljung, Lennart

arXiv.org Machine Learning, Jan-15-2016

A new nonparametric approach for system identification has been recently proposed where the impulse response is modeled as the realization of a zero-mean Gaussian process whose covariance (kernel) has to be estimated from data. In this scheme, quality of the estimates crucially depends on the parametrization of the covariance of the Gaussian process. A family of kernels that have been shown to be particularly effective in the system identification framework is the family of Diagonal/Correlated (DC) kernels. Maximum entropy properties of a related family of kernels, the Tuned/Correlated (TC) kernels, have been recently pointed out in the literature. In this paper we show that maximum entropy properties indeed extend to the whole family of DC kernels. The maximum entropy interpretation can be exploited in conjunction with results on matrix completion problems in the graphical models literature to shed light on the structure of the DC kernel. In particular, we prove that the DC kernel admits a closed-form factorization, inverse and determinant. These results can be exploited both to improve the numerical stability and to reduce the computational complexity associated with the computation of the DC estimator.

artificial intelligence, kernel, machine learning, (18 more...)

arXiv.org Machine Learning

doi: 10.1109/TAC.2016.2582642

1411.562

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.28)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Maximum Entropy (1.00)

Unification of field theory and maximum entropy methods for learning probability densities

Kinney, Justin B.

arXiv.org Machine Learning, Jul-28-2015

The need to estimate smooth probability distributions (a.k.a. probability densities) from finite sampled data is ubiquitous in science. Many approaches to this problem have been described, but none is yet regarded as providing a definitive solution. Maximum entropy estimation and Bayesian field theory are two such approaches. Both have origins in statistical physics, but the relationship between them has remained unclear. Here I unify these two methods by showing that every maximum entropy density estimate can be recovered in the infinite smoothness limit of an appropriate Bayesian field theory. I also show that Bayesian field theory estimation can be performed without imposing any boundary conditions on candidate densities, and that the infinite smoothness limit of these theories recovers the most common types of maximum entropy estimates. Bayesian field theory is thus seen to provide a natural test of the validity of the maximum entropy null hypothesis. Bayesian field theory also returns a lower entropy density estimate when the maximum entropy hypothesis is falsified. The computations necessary for this approach can be performed rapidly for one-dimensional data, and software for doing this is provided. Based on these results, I argue that Bayesian field theory is poised to provide a definitive solution to the density estimation problem in one dimension.

artificial intelligence, boundary condition, machine learning, (18 more...)

arXiv.org Machine Learning

doi: 10.1103/PhysRevE.92.032107

1411.5371

Country: North America > United States (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Maximum Entropy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

On the Maximum Entropy Property of the First-Order Stable Spline Kernel and its Implications

Carli, Francesca Paola

arXiv.org Machine Learning, Sep-21-2014

A new nonparametric approach for system identification has been recently proposed where the impulse response is seen as the realization of a zero-mean Gaussian process whose covariance, the so-called stable spline kernel, guarantees that the impulse response is almost surely stable. Maximum entropy properties of the stable spline kernel have been pointed out in the literature. In this paper we provide an independent proof that relies on the theory of matrix extension problems in the graphical model literature and leads to a closed form expression for the inverse of the first-order stable spline kernel as well as to a new factorization in the form $UWU^\top$ with $U$ upper triangular and $W$ diagonal. Interestingly, all first-order stable spline kernels share the same factor $U$, and $W$ admits a closed form representation in terms of the kernel hyperparameter, making the factorization computationally inexpensive. Maximum likelihood properties of the stable spline kernel are also highlighted. These results can be applied both to improve the stability and to reduce the computational complexity associated with the computation of stable spline estimators.
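As a rough numerical illustration of the kind of factorization the abstract describes, the sketch below assumes the first-order stable spline kernel takes the commonly used form $K_{ij} = \lambda^{\max(i,j)}$ with $0 < \lambda < 1$ (any overall scale factor is omitted). The particular choice of $U$ as the all-ones upper triangular matrix and $W$ with diagonal entries $\lambda^k - \lambda^{k+1}$ (and $\lambda^n$ in the last position) is an assumption that happens to satisfy $K = UWU^\top$; it is not necessarily the exact construction used in the paper, and the function names are hypothetical.

```python
import numpy as np

def stable_spline_kernel(n, lam):
    # Assumed kernel form: K[i, j] = lam ** max(i, j), with 1-based indices.
    idx = np.arange(1, n + 1)
    return lam ** np.maximum.outer(idx, idx)

def uwu_factors(n, lam):
    # U: upper triangular matrix of ones; it does not depend on lam.
    U = np.triu(np.ones((n, n)))
    # W: diagonal, with a closed form in the hyperparameter lam.
    idx = np.arange(1, n + 1)
    w = lam ** idx - lam ** (idx + 1)
    w[-1] = lam ** n  # the last entry has no successor term to subtract
    return U, np.diag(w)

n, lam = 6, 0.8
K = stable_spline_kernel(n, lam)
U, W = uwu_factors(n, lam)

# Entry (i, j) of U W U^T sums w[k] over k >= max(i, j), which telescopes to lam ** max(i, j).
reconstruction_error = np.abs(U @ W @ U.T - K).max()
print("max reconstruction error:", reconstruction_error)
```

Because $U$ is fixed and $W$ is diagonal, changing the hyperparameter only requires recomputing $n$ scalars, which is consistent with the abstract's remark that the factorization is computationally inexpensive.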

artificial intelligence, machine learning, stable spline kernel, (16 more...)

arXiv.org Machine Learning

doi: 10.1109/CCA.2014.6981380

1406.5706

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.28)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Maximum Entropy (0.64)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

A Novel Methodology for Processing Probabilistic Knowledge Bases Under Maximum Entropy

Kern-Isberner, Gabriele (TU Dortmund University) | Wilhelm, Marco (TU Dortmund University) | Beierle, Christoph (University of Hagen)

AAAI Conferences, May-7-2014

Probabilistic reasoning under the so-called principle of maximum entropy is a viable and convenient alternative to Bayesian networks, relieving the user from providing complete (local) probabilistic information and observing rigorous conditional independence assumptions. In this paper, we present a novel approach to performing computational MaxEnt reasoning that makes use of symbolic computations instead of graph-based techniques. Given a probabilistic knowledge base, we encode the MaxEnt optimization problem into a system of polynomial equations, and then apply Gröbner basis theory to find MaxEnt inferences as solutions to the polynomials. We illustrate our approach with an example of a knowledge base that represents findings on fraud detection in enterprises.

maximum entropy, novel methodology, processing probabilistic knowledge base

AAAI Conferences

The Twenty-Seventh International Flairs Conference

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Maximum Entropy (0.60)

Implementation of a Transformation System for Relational Probabilistic Knowledge Bases Simplifying the Maximum Entropy Model Computation

Beierle, Christoph (University of Hagen) | Höhnerbach, Markus (University of Hagen) | Marto, Marcus (University of Hagen)

AAAI Conferences, May-7-2014

The maximum entropy (ME) model of a knowledge base R consisting of relational probabilistic conditionals can be defined referring to the set of all ground instances of the conditionals. The logic FO-PCL employs the notion of parametric uniformity for avoiding the full grounding of R. We present an implementation of a rule system transforming R into a knowledge base that is parametrically uniform and has the same ME model, simplifying the ME model computation. The implementation provides different execution and evaluation modes, including the generation of all possible solutions.

maximum entropy model computation, relational probabilistic knowledge base simplifying, transformation system, (1 more...)

AAAI Conferences

The Twenty-Seventh International Flairs Conference

Technology:

Information Technology > Knowledge Management > Knowledge Engineering (0.80)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (0.80)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Maximum Entropy (0.60)
