Conditional Visual Tracking in Kernel Space
Sminchisescu, Cristian; Kanaujia, Atul; Li, Zhiguo; Metaxas, Dimitris
Neural Information Processing Systems
We present a conditional temporal probabilistic framework for reconstructing 3D human motion in monocular video based on descriptors encoding image silhouette observations. For computational efficiency we restrict visual inference to low-dimensional, kernel-induced nonlinear state spaces. Our methodology (kBME) combines kernel PCA-based nonlinear dimensionality reduction (kPCA) and conditional Bayesian Mixtures of Experts (BME) in order to learn complex multivalued predictors between observations and model hidden states. This is necessary for accurate inference in inverse visual perception problems, where several probable, distant 3D solutions exist due to noise or the uncertainty of monocular perspective projection. Low-dimensional models are appropriate because many visual processes exhibit strong nonlinear correlations in both the image observations and the target, hidden state variables. The learned predictors are temporally combined within a conditional graphical model in order to allow a principled propagation of uncertainty. We study several predictors and empirically show that the proposed algorithm compares favorably with techniques based on regression, Kernel Dependency Estimation (KDE), or PCA alone, and gives results competitive to those of high-dimensional mixture predictors at a fraction of their computational cost. We show that the method successfully reconstructs the complex 3D motion of humans in real monocular video sequences.
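The two ingredients the abstract names — a kernel PCA embedding of the observations and a mixture of experts that returns several weighted, well-separated state hypotheses — can be sketched in NumPy. This is a toy illustration under stated assumptions, not the paper's kBME implementation: a one-dimensional observation with a sign ambiguity (y = ±√x) stands in for monocular depth ambiguity, and the known branch labels replace the EM-style training of the gating network.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy multivalued problem: each observation x has TWO plausible states y,
# mimicking the distant 3D solutions caused by monocular ambiguity.
x = rng.uniform(0.5, 2.0, size=200)
branch = rng.integers(0, 2, size=200)            # hidden ambiguity branch
y = np.where(branch == 0, np.sqrt(x), -np.sqrt(x)) + 0.01 * rng.normal(size=200)
X = x[:, None]

# --- kernel PCA: a nonlinear low-dimensional state space -------------------
def rbf_kernel(A, B, gamma=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def kernel_pca(X, n_components=1, gamma=1.0):
    n = X.shape[0]
    K = rbf_kernel(X, X, gamma)
    J = np.full((n, n), 1.0 / n)
    Kc = K - J @ K - K @ J + J @ K @ J           # center in feature space
    vals, vecs = np.linalg.eigh(Kc)              # ascending eigenvalues
    idx = np.argsort(vals)[::-1][:n_components]  # keep the top components
    alphas = vecs[:, idx] / np.sqrt(np.maximum(vals[idx], 1e-12))
    return Kc @ alphas                           # (n, n_components) embedding

Z = kernel_pca(X, n_components=1)

# --- two linear experts plus a gate (branch labels stand in for EM) --------
def fit_linear(Zs, ys):
    A = np.hstack([Zs, np.ones((len(Zs), 1))])
    w, *_ = np.linalg.lstsq(A, ys, rcond=None)
    return w

experts = [fit_linear(Z[branch == b], y[branch == b]) for b in (0, 1)]
gate = np.array([np.mean(branch == 0), np.mean(branch == 1)])  # mixing weights

# Multivalued prediction: each expert contributes one hypothesis per input;
# the gate assigns each hypothesis a probability, as in a BME predictor.
A = np.hstack([Z, np.ones((len(Z), 1))])
hyps = np.stack([A @ w for w in experts], axis=1)  # (n, 2) state hypotheses
```

A single regressor fit to the same data would average the two branches and predict near zero everywhere; the mixture keeps both distant hypotheses alive, which is exactly why the paper argues multivalued predictors are needed for inverse perception.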
Dec-31-2006