AITopics

Survey propagation (SP) is an exciting new technique that has been remarkably successful at solving very large hard combinatorial problems, such as determining the satisfiability of Boolean formulas. In a promising attempt at understanding the success of SP, it was recently shown that SP can be viewed as a form of belief propagation, computing marginal probabilities over certain objects called covers of a formula. This explanation was, however, shortly dismissed by experiments suggesting that non-trivial covers simply do not exist for large formulas. In this paper, we show that these experiments were misleading: not only do covers exist for large hard random formulas, SP is surprisingly accurate at computing marginals over these covers despite the existence of many cycles in the formulas. This re-opens a potentially simpler line of reasoning for understanding SP, in contrast to some alternative lines of explanation that have been proposed assuming covers do not exist.

artificial intelligence, formula, machine learning, (18 more...)

1206.5273

Country:

North America > United States (0.67)
North America > Canada (0.14)
Europe > Austria > Vienna (0.14)

Genre: Research Report (1.00)

Industry: Energy > Oil & Gas (1.00)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.46)

Accuracy Bounds for Belief Propagation

Ihler, Alexander T.

The belief propagation (BP) algorithm is widely applied to perform approximate inference on arbitrary graphical models, in part due to its excellent empirical properties and performance. However, little is known theoretically about when this algorithm will perform well. Using recent analysis of convergence and stability properties in BP and new results on approximations in binary systems, we derive a bound on the error in BP's estimates for pairwise Markov random fields over discrete valued random variables. Our bound is relatively simple to compute, and compares favorably with a previous method of bounding the accuracy of BP.

artificial intelligence, belief revision, graph, (17 more...)

1206.5277

Genre: Research Report > New Finding (0.34)

Industry: Energy > Oil & Gas (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Belief Revision (0.63)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.48)

Weiss, Yair, Yanover, Chen, Meltzer, Talya

MAP Estimation, Linear Programming and Belief Propagation with Convex Free Energies

Finding the most probable assignment (MAP) in a general graphical model is known to be NP hard but good approximations have been attained with max-product belief propagation (BP) and its variants. In particular, it is known that using BP on a single-cycle graph or tree reweighted BP on an arbitrary graph will give the MAP solution if the beliefs have no ties. In this paper we extend the setting under which BP can be used to provably extract the MAP. We define Convex BP as BP algorithms based on a convex free energy approximation and show that this class includes ordinary BP with single-cycle, tree reweighted BP and many other BP variants. We show that when there are no ties, fixed-points of convex max-product BP will provably give the MAP solution. We also show that convex sum-product BP at sufficiently small temperatures can be used to solve linear programs that arise from relaxing the MAP problem. Finally, we derive a novel condition that allows us to derive the MAP solution even if some of the convex BP beliefs have ties. In experiments, we show that our theorems allow us to find the MAP in many real-world instances of graphical models where exact inference using junction-tree is impossible.

belief revision, linear programming and belief propagation, optimization problem, (3 more...)

1206.5286

Genre: Research Report (0.40)

Industry: Energy > Oil & Gas (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Belief Revision (0.60)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.40)

Shpitser, Ilya, Pearl, Judea

What Counterfactuals Can Be Tested

Counterfactual statements, e.g., "my headache would be gone had I taken an aspirin" are central to scientific discourse, and are formally interpreted as statements derived from "alternative worlds". However, since they invoke hypothetical states of affairs, often incompatible with what is actually known or observed, testing counterfactuals is fraught with conceptual and practical difficulties. In this paper, we provide a complete characterization of "testable counterfactuals," namely, counterfactual statements whose probabilities can be inferred from physical experiments. We provide complete procedures for discerning whether a given counterfactual is testable and, if so, expressing its probability in terms of experimental data.

counterfactual, graph, pearl, (16 more...)

1206.5294

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.28)
North America > United States > New York (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Japan (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.54)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.46)

Marlin, Benjamin, Zemel, Richard S., Roweis, Sam, Slaney, Malcolm

Collaborative Filtering and the Missing at Random Assumption

Rating prediction is an important application, and a popular research topic in collaborative filtering. However, both the validity of learning algorithms, and the validity of standard testing procedures rest on the assumption that missing ratings are missing at random (MAR). In this paper we present the results of a user study in which we collect a random sample of ratings from current users of an online radio service. An analysis of the rating data collected in the study shows that the sample of random ratings has markedly different properties than ratings of user-selected songs. When asked to report on their own rating behaviour, a large number of users indicate they believe their opinion of a song does affect whether they choose to rate that song, a violation of the MAR condition. Finally, we present experimental results showing that incorporating an explicit model of the missing data mechanism can lead to significant improvements in prediction performance on the random sample of ratings.

artificial intelligence, machine learning, multinomial mixture model, (16 more...)

1206.5267

Country: North America > Canada > Ontario > Toronto (0.29)

Genre:

Research Report > New Finding (1.00)
Questionnaire & Opinion Survey (1.00)

Industry:

Leisure & Entertainment (1.00)
Media > Radio (0.88)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Kapoor, Ashish, Horvitz, Eric J.

On Discarding, Caching, and Recalling Samples in Active Learning

We address challenges of active learning under scarce informational resources in non-stationary environments. In real-world settings, data labeled and integrated into a predictive model may become invalid over time. However, the data can become informative again with switches in context and such changes may indicate unmodeled cyclic or other temporal dynamics. We explore principles for discarding, caching, and recalling labeled data points in active learning based on computations of value of information. We review key concepts and study the value of the methods via investigations of predictive performance and costs of acquiring data for simulated and real-world data sets.

artificial intelligence, machine learning, survey article, (15 more...)

1206.5274

Genre:

Research Report (0.50)
Overview (0.48)

Industry: Education > Educational Setting (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Nonparametric Bayes Pachinko Allocation

Li, Wei, Blei, David, McCallum, Andrew

Recent advances in topic models have explored complicated structured distributions to represent topic correlation. For example, the pachinko allocation model (PAM) captures arbitrary, nested, and possibly sparse correlations between topics using a directed acyclic graph (DAG). While PAM provides more flexibility and greater expressive power than previous models like latent Dirichlet allocation (LDA), it is also more difficult to determine the appropriate topic structure for a specific dataset. In this paper, we propose a nonparametric Bayesian prior for PAM based on a variant of the hierarchical Dirichlet process (HDP). Although the HDP can capture topic correlations defined by nested data structure, it does not automatically discover such correlations from unstructured data. By assuming an HDP-based prior for PAM, we are able to learn both the number of topics and how the topics are correlated. We evaluate our model on synthetic and real-world text datasets, and show that nonparametric PAM achieves performance matching the best of PAM without manually tuning the number of topics.

dirichlet process, machine learning, natural language, (20 more...)

1206.527

Country: North America > United States > Massachusetts > Hampshire County > Amherst (0.14)

Genre: Research Report (0.82)

Industry: Government > Regional Government > North America Government > United States Government (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.56)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.34)

Eaton, Daniel, Murphy, Kevin

Bayesian structure learning using dynamic programming and MCMC

MCMC methods for sampling from the space of DAGs can mix poorly due to the local nature of the proposals that are commonly used. It has been shown that sampling from the space of node orders yields better results [FK03, EW06]. Recently, Koivisto and Sood showed how one can analytically marginalize over orders using dynamic programming (DP) [KS04, Koi06]. Their method computes the exact marginal posterior edge probabilities, thus avoiding the need for MCMC. Unfortunately, there are four drawbacks to the DP technique: it can only use modular priors, it can only compute posteriors over modular features, it is difficult to compute a predictive density, and it takes exponential time and space. We show how to overcome the first three of these problems by using the DP algorithm as a proposal distribution for MCMC in DAG space. We show that this hybrid technique converges to the posterior faster than other methods, resulting in more accurate structure learning and higher predictive likelihoods on test data.

artificial intelligence, bayesian inference, machine learning, (20 more...)

1206.5247

Genre: Research Report (0.82)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Shift-Invariance Sparse Coding for Audio Classification

Grosse, Roger, Raina, Rajat, Kwong, Helen, Ng, Andrew Y.

Sparse coding is an unsupervised learning algorithm that learns a succinct high-level representation of the inputs given only unlabeled data; it represents each input as a sparse linear combination of a set of basis functions. Originally applied to modeling the human visual cortex, sparse coding has also been shown to be useful for self-taught learning, in which the goal is to solve a supervised classification task given access to additional unlabeled data drawn from different classes than that in the supervised learning problem. Shift-invariant sparse coding (SISC) is an extension of sparse coding which reconstructs a (usually time-series) input using all of the basis functions in all possible shifts. In this paper, we present an efficient algorithm for learning SISC bases. Our method is based on iteratively solving two large convex optimization problems: The first, which computes the linear coefficients, is an L1-regularized linear least squares problem with potentially hundreds of thousands of variables. Existing methods typically use a heuristic to select a small subset of the variables to optimize, but we present a way to efficiently compute the exact solution. The second, which solves for bases, is a constrained linear least squares problem. By optimizing over complex-valued variables in the Fourier domain, we reduce the coupling between the different variables, allowing the problem to be solved efficiently. We show that SISC's learned high-level representations of speech and music provide useful features for classification tasks within those domains. When applied to classification, under certain conditions the learned features outperform state of the art spectral and cepstral features.

artificial intelligence, inductive learning, machine learning, (17 more...)

1206.5241

Country: North America > United States (0.14)

Genre: Research Report (0.64)

Industry:

Education (0.48)
Media (0.46)
Leisure & Entertainment (0.46)
Health & Medicine > Therapeutic Area > Neurology (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.90)

Vorobeychik, Yevgeniy, Reeves, Daniel, Wellman, Michael P.

Constrained Automated Mechanism Design for Infinite Games of Incomplete Information

We present a functional framework for automated mechanism design based on a two-stage game model of strategic interaction between the designer and the mechanism participants, and apply it to several classes of two-player infinite games of incomplete information. At the core of our framework is a black-box optimization algorithm which guides the selection process of candidate mechanisms. Our approach yields optimal or nearly optimal mechanisms in several application domains using various objective functions. By comparing our results with known optimal mechanisms, and in some cases improving on the best known mechanisms, we provide evidence that ours is a promising approach to parametric design of indirect mechanisms.

artificial intelligence, mechanism, optimization problem, (19 more...)

1206.5288

Country: North America > United States > Michigan (0.14)

Genre: Research Report > New Finding (0.34)

Industry: Government (0.46)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)