Statistical Learning
Attractive People: Assembling Loose-Limbed Models using Non-parametric Belief Propagation
Sigal, Leonid, Isard, Michael, Sigelman, Benjamin H., Black, Michael J.
The detection and pose estimation of people in images and video is made challenging by the variability of human appearance, the complexity of natural scenes, and the high dimensionality of articulated body models. Tocope with these problems we represent the 3D human body as a graphical model in which the relationships between the body parts are represented by conditional probability distributions. We formulate the pose estimation problem as one of probabilistic inference over a graphical modelwhere the random variables correspond to the individual limb parameters (position and orientation). Because the limbs are described by 6-dimensional vectors encoding pose in 3-space, discretization is impractical andthe random variables in our model must be continuousvalued. To approximate belief propagation in such a graph we exploit a recently introduced generalization of the particle filter. This framework facilitates the automatic initialization of the body-model from low level cues and is robust to occlusion of body parts and scene clutter.
Discriminative Fields for Modeling Spatial Dependencies in Natural Images
Kumar, Sanjiv, Hebert, Martial
In this paper we present Discriminative Random Fields (DRF), a discriminative frameworkfor the classification of natural image regions by incorporating neighborhoodspatial dependencies in the labels as well as the observed data. The proposed model exploits local discriminative models and allows to relax the assumption of conditional independence of the observed data given the labels, commonly used in the Markov Random Field (MRF) framework. The parameters of the DRF model are learned using penalized maximum pseudo-likelihood method. Furthermore, the form of the DRF model allows the MAP inference for binary classification problemsusing the graph min-cut algorithms. The performance of the model was verified on the synthetic as well as the real-world images. The DRF model outperforms the MRF model in the experiments.
Factorization with Uncertainty and Missing Data: Exploiting Temporal Coherence
The problem of "Structure From Motion" is a central problem in vision: given the 2D locations of certain points we wish to recover the camera motion and the 3D coordinates of the points. Under simplifiedcamera models, the problem reduces to factorizing a measurement matrix into the product of two low rank matrices. Each element of the measurement matrix contains the position of a point in a particular image. When all elements are observed, the problem can be solved trivially using SVD, but in any realistic situation manyelements of the matrix are missing and the ones that are observed have a different directional uncertainty. Under these conditions, most existing factorization algorithms fail while human perception is relatively unchanged. In this paper we use the well known EM algorithm for factor analysis toperform factorization. This allows us to easily handle missing data and measurement uncertainty and more importantly allows us to place a prior on the temporal trajectory of the latent variables (the camera position). We show that incorporating this prior gives a significant improvement in performance in challenging image sequences.
Discriminating Deformable Shape Classes
Ruiz-correa, Salvador, Shapiro, Linda G., Meila, Marina, Berson, Gabriel
We present and empirically test a novel approach for categorizing 3-D free form object shapesrepresented by range data . In contrast to traditional surface-signature based systems that use alignment to match specific objects, we adapted the newly introduced symbolic-signature representation to classify deformable shapes [10]. Our approach constructs anabstract description of shape classes using an ensemble of classifiers that learn object class parts and their corresponding geometrical relationships from a set of numeric and symbolic descriptors. We used our classification engine in a series of large scale discrimination experimentson two well-defined classes that share many common distinctive features. The experimental results suggest that our method outperforms traditional numeric signature-based methodologies.
A Kullback-Leibler Divergence Based Kernel for SVM Classification in Multimedia Applications
Moreno, Pedro J., Ho, Purdy P., Vasconcelos, Nuno
Over the last years significant efforts have been made to develop kernels that can be applied to sequence data such as DNA, text, speech, video and images. The Fisher Kernel and similar variants have been suggested as good ways to combine an underlying generative model in the feature space and discriminant classifiers such as SVM's. In this paper we suggest analternative procedure to the Fisher kernel for systematically finding kernel functions that naturally handle variable length sequence data in multimedia domains. In particular for domains such as speech and images we explore the use of kernel functions that take full advantage of well known probabilistic models such as Gaussian Mixtures and single fullcovariance Gaussian models. We derive a kernel distance based on the Kullback-Leibler (KL) divergence between generative models. In effect our approach combines the best of both generative and discriminative methodsand replaces the standard SVM kernels. We perform experiments on speaker identification/verification and image classification tasksand show that these new kernels have the best performance in speaker verification and mostly outperform the Fisher kernel based SVM's and the generative classifiers in speaker identification and image classification.
Approximate Analytical Bootstrap Averages for Support Vector Classifiers
Malzahn, Dörthe, Opper, Manfred
We compute approximate analytical bootstrap averages for support vector classificationusing a combination of the replica method of statistical physics and the TAP approach for approximate inference. We test our method on a few datasets and compare it with exact averages obtained by extensive Monte-Carlo sampling.
Geometric Clustering Using the Information Bottleneck Method
Still, Susanne, Bialek, William, Bottou, Léon
We argue that K-means and deterministic annealing algorithms for geometric clusteringcan be derived from the more general Information Bottleneck approach.If we cluster the identities of data points to preserve information about their location, the set of optimal solutions is massively degenerate. But if we treat the equations that define the optimal solution as an iterative algorithm, then a set of "smooth" initial conditions selects solutions with the desired geometrical properties. In addition to conceptual unification,we argue that this approach can be more efficient and robust than classic algorithms.