McAllester, David A.


Object Detection with Grammar Models

Neural Information Processing Systems

Compositional models provide an elegant formalism for representing the visual appearance of highly variable objects. While such models are appealing from a theoretical point of view, it has been difficult to demonstrate that they lead to performance advantages on challenging datasets. Here we develop a grammar model for person detection and show that it outperforms previous high-performance systems on the PASCAL benchmark. Our model represents people using a hierarchy of deformable parts, variable structure and an explicit model of occlusion for partially visible objects. To train the model, we introduce a new discriminative framework for learning structured prediction models from weakly-labeled data.
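
A minimal sketch of how a single deformable-part hypothesis with occluded parts might be scored; the part scores, quadratic deformation cost, and fixed occlusion penalty below are illustrative assumptions, not the paper's grammar or parameters:

    # Hedged sketch: scoring one hypothesis under a toy deformable-part model.
    # Part scores, displacements, deformation weights and the occlusion penalty
    # are illustrative placeholders, not values or structure from the paper.
    def score_hypothesis(root_score, parts, visible, occlusion_penalty=-0.5):
        """parts: dicts with 'appearance', 'dx', 'dy', 'def_weights'.
        visible: which parts the occlusion model keeps."""
        total = root_score
        for part, vis in zip(parts, visible):
            if vis:
                wx, wy = part['def_weights']
                # appearance score minus a quadratic penalty for displacement
                # of the part from its anchor position
                total += part['appearance'] - (wx * part['dx'] ** 2 + wy * part['dy'] ** 2)
            else:
                # an occluded part contributes a fixed occlusion score instead
                total += occlusion_penalty
        return total

    parts = [
        {'appearance': 1.2, 'dx': 0.3, 'dy': -0.1, 'def_weights': (0.5, 0.5)},
        {'appearance': 0.8, 'dx': 1.0, 'dy': 0.4, 'def_weights': (0.5, 0.5)},
    ]
    print(score_hypothesis(root_score=2.0, parts=parts, visible=[True, False]))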


Approximate Planning for Factored POMDPs using Belief State Simplification

arXiv.org Artificial Intelligence

We are interested in the problem of planning for factored POMDPs. Building on the recent results of Kearns, Mansour and Ng, we provide a planning algorithm for factored POMDPs that exploits the accuracy-efficiency tradeoff in the belief state simplification introduced by Boyen and Koller.
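
As a rough illustration of the belief state simplification (a toy two-variable version; the projection in the paper operates on the factored POMDP's full state representation), the Boyen-Koller step replaces the exact joint belief with the product of its marginals:

    # Hedged sketch: a Boyen-Koller style projection of a joint belief over two
    # binary state variables onto the product of its marginals. The example
    # belief is arbitrary and only meant to show the projection step.
    import numpy as np

    def project_to_marginals(joint):
        """joint[a, b] = P(X1 = a, X2 = b); returns the factored approximation."""
        p_x1 = joint.sum(axis=1)           # marginal over X1
        p_x2 = joint.sum(axis=0)           # marginal over X2
        return np.outer(p_x1, p_x2)        # simplified (factored) belief state

    joint = np.array([[0.40, 0.10],
                      [0.05, 0.45]])
    approx = project_to_marginals(joint)
    print(approx)
    print(np.abs(joint - approx).max())    # per-step projection error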


Case-Factor Diagrams for Structured Probabilistic Modeling

arXiv.org Artificial Intelligence

We introduce a probabilistic formalism subsuming Markov random fields of bounded tree width and probabilistic context-free grammars. Our models are based on a representation of Boolean formulas that we call case-factor diagrams (CFDs). CFDs are similar to binary decision diagrams (BDDs) but are concise for circuits of bounded tree width (unlike BDDs) and can concisely represent the set of parse trees over a given string under a given context-free grammar (also unlike BDDs). A probabilistic model consists of a CFD defining a feasible set of Boolean assignments and a weight (or cost) for each individual Boolean variable. We give an inside-outside algorithm for simultaneously computing the marginal of each Boolean variable, and a Viterbi algorithm for finding the minimum cost variable assignment. Both algorithms run in time proportional to the size of the CFD.
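
For intuition, a brute-force reference computation of the quantity the Viterbi-style pass returns (the minimum-cost feasible Boolean assignment); this enumeration is only a specification of the output, not the CFD algorithm, which runs in time proportional to the diagram size. The feasible set and costs below are toy choices:

    # Hedged sketch: brute-force reference for the minimum-cost feasible Boolean
    # assignment. The feasible set and per-variable costs are toy examples; the
    # CFD Viterbi algorithm computes the same quantity without enumeration.
    from itertools import product

    def min_cost_assignment(n_vars, feasible, cost):
        """cost[i] is paid when variable i is set to True."""
        best = None
        for assignment in product([False, True], repeat=n_vars):
            if feasible(assignment):
                total = sum(c for bit, c in zip(assignment, cost) if bit)
                if best is None or total < best[0]:
                    best = (total, assignment)
        return best

    # toy feasible set: exactly one of three variables may be True
    feasible = lambda a: sum(a) == 1
    print(min_cost_assignment(3, feasible, cost=[2.0, 0.5, 1.0]))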


Generalization Bounds and Consistency for Latent Structural Probit and Ramp Loss

Neural Information Processing Systems

We consider latent structural versions of probit loss and ramp loss. We show that these surrogate loss functions are consistent in the strong sense that for any feature map (finite or infinite dimensional) they yield predictors approaching the infimum task loss achievable by any linear predictor over the given features. We also give finite sample generalization bounds (convergence rates) for these loss functions. These bounds suggest that probit loss converges more rapidly. However, ramp loss is more easily optimized and may ultimately be more practical.
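
For reference, one common way to write these two surrogates (standard forms consistent with the abstract, not necessarily the paper's exact notation), using the latent-variable score s_w(x, y) = \max_h w \cdot \Phi(x, y, h):

    \mathrm{ramp}(w; x, y)   = \max_{\hat y}\,\big[\, s_w(x, \hat y) + L(y, \hat y) \,\big] - \max_{\hat y}\, s_w(x, \hat y)
    \mathrm{probit}(w; x, y) = \mathbb{E}_{\epsilon \sim \mathcal{N}(0,\, \sigma^2 I)}\big[\, L\big(y,\ \arg\max_{\hat y}\, s_{w+\epsilon}(x, \hat y)\big) \,\big]

In this form the ramp loss upper-bounds the task loss of the predicted label, and the probit loss is the expected task loss under a Gaussian perturbation of the weights.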


Direct Loss Minimization for Structured Prediction

Neural Information Processing Systems

In discriminative machine learning one is interested in training a system to optimize a certain desired measure of performance, or loss. In binary classification one typically tries to minimize the error rate. But in structured prediction each task often has its own measure of performance such as the BLEU score in machine translation or the intersection-over-union score in PASCAL segmentation. The most common approaches to structured prediction, structural SVMs and CRFs, do not minimize the task loss: the former minimizes a surrogate loss with no guarantees for task loss and the latter minimizes log loss independent of task loss. The main contribution of this paper is a theorem stating that a certain perceptron-like learning rule, involving feature vectors derived from loss-adjusted inference, directly corresponds to the gradient of task loss. We give empirical results on phonetic alignment of a standard test set from the TIMIT corpus, surpassing all previously reported results on this problem.
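
A hedged sketch of a perceptron-like update of this kind; the explicit label list, toy feature map, step size eta and adjustment scale eps are illustrative assumptions rather than the paper's setup:

    # Hedged sketch: one loss-adjusted update. y_hat is ordinary inference under
    # w; y_adj is inference with the task loss mixed into the score. Moving the
    # weights toward phi(x, y_hat) and away from phi(x, y_adj) approximates a
    # descent step on task loss. All names and constants are illustrative.
    import numpy as np

    def direct_loss_update(w, phi, labels, x, y_true, task_loss, eta=0.1, eps=0.1):
        scores = {y: w @ phi(x, y) for y in labels}
        y_hat = max(labels, key=lambda y: scores[y])                # standard inference
        y_adj = max(labels, key=lambda y: scores[y] + eps * task_loss(y, y_true))
        return w + (eta / eps) * (phi(x, y_hat) - phi(x, y_adj))

    labels = [0, 1, 2]
    phi = lambda x, y: np.eye(3)[y] * x                             # toy joint feature map
    task_loss = lambda y, y_true: float(y != y_true)                # 0-1 task loss
    w = np.array([0.0, 0.0, 0.05])                                  # correct label 2 barely ahead
    w = direct_loss_update(w, phi, labels, x=1.0, y_true=2, task_loss=task_loss)
    print(w)   # the correct label's margin over nearby high-loss labels grows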


Exponentiated Gradient Algorithms for Large-margin Structured Classification

Neural Information Processing Systems

We consider the problem of structured classification, where the task is to predict a label y from an input x, and y has meaningful internal structure. Our framework includes supervised training of Markov random fields and weighted context-free grammars as special cases. We describe an algorithm that solves the large-margin optimization problem defined in [12], using an exponential-family (Gibbs distribution) representation of structured objects. The algorithm is efficient, even in cases where the number of labels y is exponential in size, provided that certain expectations under Gibbs distributions can be calculated efficiently. The method for structured labels relies on a more general result, specifically the application of exponentiated gradient updates [7, 8] to quadratic programs.
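
As a rough illustration of the update family (not the paper's dual formulation of the large-margin QP), an exponentiated gradient step keeps a distribution over structures and reweights it multiplicatively before renormalizing:

    # Hedged sketch: a generic exponentiated gradient step on the probability
    # simplex. The quadratic objective and step size are placeholders; the paper
    # applies updates of this multiplicative form to the large-margin dual QP,
    # with expectations computed under Gibbs distributions when the label set is
    # exponentially large.
    import numpy as np

    def eg_step(alpha, grad, eta=0.2):
        """Multiplicative update on the simplex: alpha_y *= exp(-eta * grad_y)."""
        new = alpha * np.exp(-eta * grad)
        return new / new.sum()

    # toy quadratic objective Q(alpha) = 0.5 * alpha^T A alpha - b^T alpha
    A = np.array([[2.0, 0.2, 0.0],
                  [0.2, 1.0, 0.1],
                  [0.0, 0.1, 3.0]])
    b = np.array([1.0, 0.5, 0.2])
    alpha = np.ones(3) / 3
    for _ in range(300):
        alpha = eg_step(alpha, A @ alpha - b)
    print(alpha)   # converges toward the simplex-constrained minimizer of Q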


Concentration Inequalities for the Missing Mass and for Histogram Rule Error

Neural Information Processing Systems

This paper gives distribution-free concentration inequalities for the missing mass and the error rate of histogram rules. Negative association methods can be used to reduce these concentration problems to concentration questions about independent sums. Although the sums are independent, they are highly heterogeneous. Such highly heterogeneous independent sums cannot be analyzed using standard concentration inequalities such as Hoeffding's inequality, the Angluin-Valiant bound, Bernstein's inequality, Bennett's inequality, or McDiarmid's theorem.
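
To make the quantity concrete (a simulation sketch, not anything from the paper), the missing mass of a sample is the total probability of symbols that never appeared; repeating the draw shows how tightly it concentrates:

    # Hedged sketch: Monte Carlo illustration of the missing mass, i.e. the total
    # probability of outcomes not seen in a sample. The distribution and sample
    # size are arbitrary choices made for illustration.
    import numpy as np

    rng = np.random.default_rng(0)
    p = np.array([0.4, 0.3, 0.1] + [0.02] * 10)   # a skewed distribution over 13 symbols
    n = 50                                         # sample size

    def missing_mass(sample, p):
        seen = np.unique(sample)
        return p.sum() - p[seen].sum()             # probability of the unseen symbols

    draws = [missing_mass(rng.choice(len(p), size=n, p=p), p) for _ in range(2000)]
    print(np.mean(draws), np.std(draws))           # tight spread around the mean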


PAC Generalization Bounds for Co-training

Neural Information Processing Systems

In this paper, we study bootstrapping algorithms for learning from unlabeled data. The general idea in bootstrapping is to use some initial labeled data to build a (possibly partial) predictive labeling procedure; then use the labeling procedure to label more data; then use the newly labeled data to build a new predictive procedure and so on. This process can be iterated until a fixed point is reached or some other stopping criterion is met. Here we give PAC-style bounds on generalization error which can be used to formally justify certain bootstrapping algorithms. One well-known form of bootstrapping is the EM algorithm (Dempster, Laird and Rubin, 1977).
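
A minimal sketch of the generic bootstrapping loop such bounds are meant to justify; the base classifier, confidence threshold, and synthetic data are assumptions, and co-training proper would train two predictors on two separate views, each labeling data for the other:

    # Hedged sketch: a generic self-training / bootstrapping loop of the kind the
    # abstract describes. The learner, threshold and synthetic data are
    # illustrative only.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    X_lab = rng.normal(size=(20, 2)) + np.array([[2, 2]] * 10 + [[-2, -2]] * 10)
    y_lab = np.array([1] * 10 + [0] * 10)
    X_unlab = rng.normal(size=(200, 2)) + np.where(rng.random((200, 1)) < 0.5, 2, -2)

    for _ in range(5):                              # bootstrapping iterations
        clf = LogisticRegression().fit(X_lab, y_lab)
        if len(X_unlab) == 0:
            break
        proba = clf.predict_proba(X_unlab)
        confident = proba.max(axis=1) > 0.95        # keep only high-confidence labels
        if not confident.any():
            break
        X_lab = np.vstack([X_lab, X_unlab[confident]])
        y_lab = np.concatenate([y_lab, clf.classes_[proba[confident].argmax(axis=1)]])
        X_unlab = X_unlab[~confident]
    print(len(y_lab), "labeled examples after bootstrapping")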