Ruozzi, Nicholas


Sparse Approximate Conic Hulls

Neural Information Processing Systems

We consider the problem of computing a restricted nonnegative matrix factorization (NMF) of an $m \times n$ matrix $X$. Specifically, we seek a factorization $X \approx BC$, where the $k$ columns of $B$ are a subset of those from $X$ and $C \in \mathbb{R}_{\geq 0}^{k \times n}$. Equivalently, given the matrix $X$, consider the problem of finding a small subset, $S$, of the columns of $X$ such that the conic hull of $S$ $\epsilon$-approximates the conic hull of the columns of $X$, i.e., the distance of every column of $X$ to the conic hull of the columns of $S$ should be at most an $\epsilon$-fraction of the angular diameter of $X$. If $k$ is the size of the smallest $\epsilon$-approximation, then we produce an $O(k/\epsilon^{2/3})$-sized $O(\epsilon^{1/3})$-approximation, yielding the first provable, polynomial-time $\epsilon$-approximation for this class of NMF problems, where, desirably, the approximation is independent of $n$ and $m$. Furthermore, we prove an approximate conic Carathéodory theorem, a general sparsity result, which shows that any column of $X$ can be $\epsilon$-approximated with an $O(1/\epsilon^2)$-sparse combination from $S$. Our results are facilitated by a reduction to the problem of approximating convex hulls, and we prove that both the convex and conic hull variants are $d$-SUM-hard, resolving an open problem. Finally, we provide experimental results for the convex and conic algorithms on a variety of feature selection tasks.
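
As a concrete (if naive) illustration of the restricted factorization being sought, the sketch below greedily selects columns of $X$ and fits the nonnegative coefficient matrix $C$ by nonnegative least squares. This is a hedged baseline, not the paper's algorithm (which works through a reduction to approximate convex hulls); the function name and parameters are illustrative.

```python
# Hedged sketch: greedy column-subset NMF X ~ B C with B's columns drawn from X.
import numpy as np
from scipy.optimize import nnls

def greedy_conic_subset(X, k):
    """Pick k columns of X whose conic hull approximately covers all columns."""
    m, n = X.shape
    selected = []
    for _ in range(k):
        best_j, best_err = None, np.inf
        for j in range(n):
            if j in selected:
                continue
            B = X[:, selected + [j]]
            # Total squared distance of every column to cone(B), via NNLS.
            err = sum(nnls(B, X[:, i])[1] ** 2 for i in range(n))
            if err < best_err:
                best_j, best_err = j, err
        selected.append(best_j)
    B = X[:, selected]
    C = np.column_stack([nnls(B, X[:, i])[0] for i in range(n)])
    return selected, B, C

X = np.abs(np.random.randn(20, 50))
S, B, C = greedy_conic_subset(X, k=5)
print("relative residual:", np.linalg.norm(X - B @ C) / np.linalg.norm(X))
```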


On Parameter Tying by Quantization

AAAI Conferences

The maximum likelihood estimator (MLE) is generally asymptotically consistent but is susceptible to over-fitting. To combat this problem, regularization methods which reduce the variance at the cost of (slightly) increasing the bias are often employed in practice. In this paper, we present an alternative variance reduction (regularization) technique that quantizes the MLE estimates as a post-processing step, yielding a smoother model having several tied parameters. We provide and prove error bounds for our new technique and demonstrate experimentally that it often yields models having higher test-set log-likelihood than the ones learned using the MLE. We also propose a new importance sampling algorithm for fast approximate inference in models having several tied parameters. Our experiments show that our new inference algorithm is superior to existing approaches such as Gibbs sampling and MC-SAT on models having tied parameters, learned using our quantization-based approach.
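
A minimal sketch of the post-processing idea, assuming a simple 1-D k-means quantizer (the paper's precise quantization scheme and parameter space may differ): cluster the MLE estimates and tie every parameter in a cluster to its cluster mean.

```python
# Quantize learned parameters into k clusters; tie each parameter to its centroid.
import numpy as np

def quantize_parameters(theta, k, iters=100):
    """Tie parameters by 1-D k-means on the MLE estimates theta."""
    centers = np.quantile(theta, np.linspace(0, 1, k))  # spread initial centers
    for _ in range(iters):
        assign = np.argmin(np.abs(theta[:, None] - centers[None, :]), axis=1)
        for c in range(k):
            if np.any(assign == c):
                centers[c] = theta[assign == c].mean()
    return centers[assign], assign  # smoothed parameters + tying pattern

theta_mle = np.random.randn(1000)            # stand-in for MLE log-potentials
theta_tied, groups = quantize_parameters(theta_mle, k=8)
print("distinct values after tying:", len(np.unique(theta_tied)))
```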


Exactness of Approximate MAP Inference in Continuous MRFs

Neural Information Processing Systems

Computing the MAP assignment in graphical models is generally intractable. As a result, for discrete graphical models, the MAP problem is often approximated using linear programming relaxations. Much research has focused on characterizing when these LP relaxations are tight, and while they are relatively well-understood in the discrete case, only a few results are known for their continuous analog. In this work, we use graph covers to provide necessary and sufficient conditions for continuous MAP relaxations to be tight. We use this characterization to give simple proofs that the relaxation is tight for log-concave decomposable and log-supermodular decomposable models. We conclude by exploring the relationship between these two seemingly distinct classes of functions and providing specific conditions under which the MAP relaxation can and cannot be tight.
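For readers unfamiliar with the discrete background, the sketch below is the standard pairwise MAP LP relaxation over the local polytope, for two binary variables, solved with scipy. The potentials are made-up numbers and this is not the paper's continuous construction; it only illustrates what "MAP relaxation" refers to.

```python
# Local-polytope LP relaxation of MAP for two binary variables x1, x2.
import numpy as np
from scipy.optimize import linprog

theta1 = np.array([0.0, 1.0])        # unary potential for x1
theta2 = np.array([0.5, 0.0])        # unary potential for x2
theta12 = np.array([[2.0, 0.0],      # pairwise potential theta12[x1, x2]
                    [0.0, 2.0]])

# Variable order: mu1(0), mu1(1), mu2(0), mu2(1), mu12(00), mu12(01), mu12(10), mu12(11)
c = -np.concatenate([theta1, theta2, theta12.ravel()])  # maximize => negate
A_eq = np.array([
    [1, 1, 0, 0,  0,  0,  0,  0],   # mu1 normalizes
    [0, 0, 1, 1,  0,  0,  0,  0],   # mu2 normalizes
    [1, 0, 0, 0, -1, -1,  0,  0],   # marginalization: sum_b mu12(0,b) = mu1(0)
    [0, 1, 0, 0,  0,  0, -1, -1],   # sum_b mu12(1,b) = mu1(1)
    [0, 0, 1, 0, -1,  0, -1,  0],   # sum_a mu12(a,0) = mu2(0)
    [0, 0, 0, 1,  0, -1,  0, -1],   # sum_a mu12(a,1) = mu2(1)
])
b_eq = np.array([1, 1, 0, 0, 0, 0])
res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=[(0, 1)] * 8)
print("LP value:", -res.fun)            # tight here: integral optimum (x1,x2)=(1,1)
print("pseudomarginals:", np.round(res.x, 3))
```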


Bethe Learning of Conditional Random Fields via MAP Decoding

arXiv.org Machine Learning

Many machine learning tasks can be formulated in terms of predicting structured outputs. In frameworks such as the structured support vector machine (SVM-Struct) and the structured perceptron, discriminative functions are learned by iteratively applying efficient maximum a posteriori (MAP) decoding. However, maximum likelihood estimation (MLE) of probabilistic models over these same structured spaces requires computing partition functions, which is generally intractable. This paper presents a method for learning discrete exponential family models using the Bethe approximation to the MLE. Remarkably, this problem also reduces to iterative MAP decoding. This connection emerges by combining the Bethe approximation with a Frank-Wolfe (FW) algorithm on a convex dual objective, which circumvents the intractable partition function. The result is a new single-loop algorithm, MLE-Struct, which is substantially more efficient than previous double-loop methods for approximate maximum likelihood estimation. Our algorithm outperforms existing methods in experiments involving image segmentation, matching problems from vision, and a new dataset of university roommate assignments.
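
The structural point, that learning reduces to repeated MAP calls, is visible in a generic Frank-Wolfe skeleton: the algorithm touches its feasible set only through a linear minimization oracle, which in MLE-Struct is played by a MAP decoder. The sketch below uses a toy simplex domain as a stand-in; it is not the paper's objective.

```python
# Schematic Frank-Wolfe loop driven by a linear minimization oracle (LMO).
import numpy as np

def frank_wolfe(grad, lmo, x0, iters=200):
    x = x0
    for t in range(iters):
        s = lmo(grad(x))               # oracle call: argmin_{s in D} <g, s>
        gamma = 2.0 / (t + 2.0)        # standard step size
        x = (1 - gamma) * x + gamma * s
    return x

# Toy example: minimize ||x - y||^2 over the simplex; the LMO returns a vertex.
y = np.array([0.2, 0.5, 0.9])
grad = lambda x: 2 * (x - y)
lmo = lambda g: np.eye(len(g))[np.argmin(g)]
x_star = frank_wolfe(grad, lmo, np.ones(3) / 3)
print(np.round(x_star, 3))  # approx. the simplex projection of y: [0, 0.3, 0.7]
```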


Making Pairwise Binary Graphical Models Attractive

Neural Information Processing Systems

Computing the partition function (i.e., the normalizing constant) of a given pairwise binary graphical model is NP-hard in general. As a result, the partition function is typically estimated by approximate inference algorithms such as belief propagation (BP) and tree-reweighted belief propagation (TRBP). The former provides reasonable estimates in practice but has convergence issues. The latter has better convergence properties but typically provides poorer estimates. In this work, we propose a novel scheme that has better convergence properties than BP and provably provides better partition function estimates in many instances than TRBP. In particular, given an arbitrary pairwise binary graphical model, we construct a specific "attractive" 2-cover. We explore the properties of this special cover and show that it can be used to construct an algorithm with the desired properties.
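
A hedged sketch of the crossing idea behind such a cover, for an Ising-style parameterization: duplicate every variable, keep nonnegative couplings parallel across the two copies, and route negative couplings between copies so that every cover coupling is nonnegative. The paper's exact construction may differ in its details.

```python
# Build a 2-cover of an Ising model (J: couplings, h: fields) with all
# cover couplings >= 0, by crossing the repulsive edges between copies.
import numpy as np

def attractive_2cover(J, h):
    """J: symmetric coupling matrix (zero diagonal), h: fields."""
    n = J.shape[0]
    J2 = np.zeros((2 * n, 2 * n))
    for i in range(n):
        for j in range(i + 1, n):
            if J[i, j] >= 0:              # attractive edge: stay within copies
                J2[i, j] = J2[j, i] = J[i, j]
                J2[i + n, j + n] = J2[j + n, i + n] = J[i, j]
            else:                          # repulsive edge: cross the copies
                J2[i, j + n] = J2[j + n, i] = -J[i, j]
                J2[i + n, j] = J2[j, i + n] = -J[i, j]
    h2 = np.concatenate([h, h])            # duplicate the unary fields
    return J2, h2

J = np.array([[0.0, -1.5], [-1.5, 0.0]])
J2, h2 = attractive_2cover(J, np.array([0.3, -0.2]))
print((J2 >= 0).all())  # True: the 2-cover is attractive
```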


The Bethe Partition Function of Log-supermodular Graphical Models

Neural Information Processing Systems

Sudderth, Wainwright, and Willsky conjectured that the Bethe approximation corresponding to any fixed point of the belief propagation algorithm over an attractive, pairwise binary graphical model provides a lower bound on the true partition function. In this work, we resolve this conjecture in the affirmative by demonstrating that, for any graphical model with binary variables whose potential functions (not necessarily pairwise) are all log-supermodular, the Bethe partition function always lower bounds the true partition function. The proof of this result follows from a new variant of the "four functions" theorem that may be of independent interest.
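
Restated in symbols (standard definitions assumed, with $Z_B$ the Bethe partition function and $\wedge$, $\vee$ componentwise min and max):

```latex
% Log-supermodularity of a potential f : \{0,1\}^V \to \mathbb{R}_{\geq 0}:
f(x \wedge y)\, f(x \vee y) \;\geq\; f(x)\, f(y) \qquad \text{for all } x, y .
% The theorem: if every potential of the model is log-supermodular, then
Z_B \;\leq\; Z .
```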


Message-Passing Algorithms for Quadratic Minimization

arXiv.org Machine Learning

Gaussian belief propagation (GaBP) is an iterative algorithm for computing the mean of a multivariate Gaussian distribution, or equivalently, the minimum of a multivariate positive definite quadratic function. Sufficient conditions, such as walk-summability, that guarantee the convergence and correctness of GaBP are known, but GaBP may fail to converge to the correct solution given an arbitrary positive definite quadratic function. As was observed in previous work, the GaBP algorithm fails to converge if the computation trees produced by the algorithm are not positive definite. In this work, we will show that the failure modes of the GaBP algorithm can be understood via graph covers, and we prove that a parameterized generalization of the min-sum algorithm can be used to ensure that the computation trees remain positive definite whenever the input matrix is positive definite. We demonstrate that the resulting algorithm is closely related to other iterative schemes for quadratic minimization such as the Gauss-Seidel and Jacobi algorithms. Finally, we observe, empirically, that there always exists a choice of parameters such that the above generalization of the GaBP algorithm converges.
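For context, minimizing $f(x) = \frac{1}{2}x^T A x - b^T x$ with $A$ positive definite is equivalent to solving $Ax = b$. Below is a minimal Jacobi iteration, one of the classical schemes the paper compares against; like GaBP, it can fail to converge on general positive definite inputs (it is guaranteed to converge, e.g., when $A$ is strictly diagonally dominant, as in the toy matrix here).

```python
# Jacobi iteration for A x = b, i.e., minimizing x'Ax/2 - b'x.
import numpy as np

def jacobi(A, b, iters=200):
    D = np.diag(A)                       # diagonal of A
    x = np.zeros_like(b)
    for _ in range(iters):
        x = (b - (A @ x - D * x)) / D    # x_i <- (b_i - sum_{j != i} A_ij x_j) / A_ii
    return x

A = np.array([[4.0, 1.0, 0.5],
              [1.0, 3.0, 1.0],
              [0.5, 1.0, 5.0]])
b = np.array([1.0, 2.0, 3.0])
print(np.allclose(jacobi(A, b), np.linalg.solve(A, b), atol=1e-6))  # True
```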


Message-Passing Algorithms: Reparameterizations and Splittings

arXiv.org Artificial Intelligence

The max-product algorithm, a local message-passing scheme that attempts to compute the most probable assignment (MAP) of a given probability distribution, has been successfully employed as a method of approximate inference for applications arising in coding theory, computer vision, and machine learning. However, the max-product algorithm is not guaranteed to converge, and even when it does, it is not guaranteed to recover the MAP assignment. Alternative convergent message-passing schemes have been proposed to overcome these difficulties. This work provides a systematic study of such message-passing algorithms that extends the known results by exhibiting new sufficient conditions for convergence to local and/or global optima, providing a combinatorial characterization of these optima based on graph covers, and describing a new convergent and correct message-passing algorithm whose derivation unifies many of the known convergent message-passing algorithms. While convergent and correct message-passing algorithms represent a step forward in the analysis of max-product style message-passing algorithms, the conditions needed to guarantee convergence to a global optimum can be too restrictive in both theory and practice. This limitation of convergent and correct message-passing schemes is characterized by graph covers and illustrated by example.
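
As a reference point for the splittings discussed above, here is a bare-bones synchronous max-product pass on a pairwise model in the log domain. It is a hedged illustration with made-up potentials, not one of the convergent variants; on the toy tree below it recovers the exact MAP, but on loopy graphs it carries none of the guarantees developed in the paper.

```python
# Synchronous max-product (log domain) on a pairwise model.
import numpy as np

def max_product(theta_u, theta_p, edges, iters=50):
    """theta_u[i]: unary log-potentials; theta_p[(i, j)]: pairwise tables."""
    n = len(theta_u)
    msgs = {(i, j): np.zeros(len(theta_u[j])) for (i, j) in edges}
    msgs.update({(j, i): np.zeros(len(theta_u[i])) for (i, j) in edges})
    for _ in range(iters):
        new = {}
        for (i, j) in list(msgs):
            pij = theta_p[(i, j)] if (i, j) in theta_p else theta_p[(j, i)].T
            # Collect messages into i from all neighbors except j.
            incoming = theta_u[i] + sum(m for (a, b), m in msgs.items()
                                        if b == i and a != j)
            new[(i, j)] = (pij + incoming[:, None]).max(axis=0)
        msgs = new
    beliefs = [theta_u[i] + sum(m for (a, b), m in msgs.items() if b == i)
               for i in range(n)]
    return [int(np.argmax(bel)) for bel in beliefs]

# Toy 3-node chain with attractive (identity-favoring) pairwise potentials.
theta_u = [np.array([0.0, 0.5]), np.array([0.2, 0.0]), np.array([0.0, 0.1])]
theta_p = {(0, 1): np.array([[1.0, 0.0], [0.0, 1.0]]),
           (1, 2): np.array([[1.0, 0.0], [0.0, 1.0]])}
print(max_product(theta_u, theta_p, edges=[(0, 1), (1, 2)]))  # [1, 1, 1]
```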