Country
Conditional Visual Tracking in Kernel Space
Sminchisescu, Cristian, Kanujia, Atul, Li, Zhiguo, Metaxas, Dimitris
We present a conditional temporal probabilistic framework for reconstructing 3D human motion in monocular video based on descriptors encoding image silhouette observations. For computational efficiency we restrict visual inference to low-dimensional kernel induced nonlinear state spaces. Our methodology (kBME) combines kernel PCA-based nonlinear dimensionality reduction (kPCA) and Conditional Bayesian Mixture of Experts (BME) in order to learn complex multivalued predictors between observations and model hidden states. This is necessary for accurate, inverse, visual perception inferences, where several probable, distant 3D solutions exist due to noise or the uncertainty of monocular perspective projection. Low-dimensional models are appropriate because many visual processes exhibit strong nonlinear correlations in both the image observations and the target, hidden state variables. The learned predictors are temporally combined within a conditional graphical model in order to allow a principled propagation of uncertainty. We study several predictors and empirically show that the proposed algorithm positively compares with techniques based on regression, Kernel Dependency Estimation (KDE) or PCA alone, and gives results competitive to those of high-dimensional mixture predictors at a fraction of their computational cost. We show that the method successfully reconstructs the complex 3D motion of humans in real monocular video sequences.
Selecting Landmark Points for Sparse Manifold Learning
Silva, Jorge, Marques, Jorge, Lemos, João
There has been a surge of interest in learning nonlinear manifold models to approximate high-dimensional data. Both for computational complexity reasons and for generalization capability, sparsity is a desired feature in such models. This usually means dimensionality reduction, which naturally implies estimating the intrinsic dimension, but it can also mean selecting a subset of the data to use as landmarks, which is especially important because many existing algorithms have quadratic complexity in the number of observations.
Learning Shared Latent Structure for Image Synthesis and Robotic Imitation
Shon, Aaron, Grochow, Keith, Hertzmann, Aaron, Rao, Rajesh PN
We propose an algorithm that uses Gaussian process regression to learn common hidden structure shared between corresponding sets of heterogenous observations. The observation spaces are linked via a single, reduced-dimensionality latent variable space. We present results from two datasets demonstrating the algorithms's ability to synthesize novel data from learned correspondences. We first show that the method can learn the nonlinear mapping between corresponding views of objects, filling in missing data as needed to synthesize novel views. We then show that the method can learn a mapping between human degrees of freedom and robotic degrees of freedom for a humanoid robot, allowing robotic imitation of human poses from motion capture data.
AER Building Blocks for Multi-Layer Multi-Chip Neuromorphic Vision Systems
Serrano-Gotarredona, R., Oster, M., Lichtsteiner, P., Linares-Barranco, A., Paz-Vicente, R., Gomez-Rodriguez, F., Riis, H. Kolle, Delbruck, T., Liu, S. C., Zahnd, S., Whatley, A. M., Douglas, R., Hafliger, P., Jimenez-Moreno, G., Civit, A., Serrano-Gotarredona, T., Acosta-Jimenez, A., Linares-Barranco, B.
A 5-layer neuromorphic vision processor whose components communicate spike events asychronously using the address-eventrepresentation (AER) is demonstrated. The system includes a retina chip, two convolution chips, a 2D winner-take-all chip, a delay line chip, a learning classifier chip, and a set of PCBs for computer interfacing and address space remappings. The components use a mixture of analog and digital computation and will learn to classify trajectories of a moving object. A complete experimental setup and measurements results are shown.
Learning Minimum Volume Sets
Given a probability measure P and a reference measure µ, one is often interested in the minimum µ-measure set with P-measure at least α. Minimum volume sets of this type summarize the regions of greatest probability mass of P, and are useful for detecting anomalies and constructing confidence regions. This paper addresses the problem of estimating minimum volume sets based on independent samples distributed according to P. Other than these samples, no other information is available regarding P, but the reference measure µ is assumed to be known. We introduce rules for estimating minimum volume sets that parallel the empirical risk minimization and structural risk minimization principles in classification. As in classification, we show that the performances of our estimators are controlled by the rate of uniform convergence of empirical to true probabilities over the class from which the estimator is drawn. Thus we obtain finite sample size performance bounds in terms of VC dimension and related quantities. We also demonstrate strong universal consistency and an oracle inequality. Estimators based on histograms and dyadic partitions illustrate the proposed rules.
A Bayesian Framework for Tilt Perception and Confidence
Schwartz, Odelia, Dayan, Peter, Sejnowski, Terrence J.
The misjudgement of tilt in images lies at the heart of entertaining visual illusions and rigorous perceptual psychophysics. A wealth of findings has attracted many mechanistic models, but few clear computational principles. We adopt a Bayesian approach to perceptual tilt estimation, showing how a smoothness prior offers a powerful way of addressing much confusing data. In particular, we faithfully model recent results showing that confidence in estimation can be systematically affected by the same aspects of images that affect bias. Confidence is central to Bayesian modeling approaches, and is applicable in many other perceptual domains. Perceptual anomalies and illusions, such as the misjudgements of motion and tilt evident in so many psychophysical experiments, have intrigued researchers for decades.
Fast Online Policy Gradient Learning with SMD Gain Vector Adaptation
Yu, Jin, Aberdeen, Douglas, Schraudolph, Nicol N.
Reinforcement learning by direct policy gradient estimation is attractive in theory but in practice leads to notoriously ill-behaved optimization problems. We improve its robustness and speed of convergence with stochastic meta-descent, a gain vector adaptation method that employs fast Hessian-vector products. In our experiments the resulting algorithms outperform previously employed online stochastic, offline conjugate, and natural policy gradient methods.
On the Accuracy of Bounded Rationality: How Far from Optimal Is Fast and Frugal?
Schmitt, Michael, Martignon, Laura
Fast and frugal heuristics are well studied models of bounded rationality. Psychological research has proposed the take-the-best heuristic as a successful strategy in decision making with limited resources. Take-thebest searches for a sufficiently good ordering of cues (features) in a task where objects are to be compared lexicographically. We investigate the complexity of the problem of approximating optimal cue permutations for lexicographic strategies. We show that no efficient algorithm can approximate the optimum to within any constant factor, if P NP. We further consider a greedy approach for building lexicographic strategies and derive tight bounds for the performance ratio of a new and simple algorithm. This algorithm is proven to perform better than take-the-best.
Identifying Distributed Object Representations in Human Extrastriate Visual Cortex
Sayres, Rory, Ress, David, Grill-spector, Kalanit
The category of visual stimuli has been reliably decoded from patterns of neural activity in extrastriate visual cortex [1]. It has yet to be seen whether object identity can be inferred from this activity. We present fMRI data measuring responses in human extrastriate cortex to a set of 12 distinct object images. We use a simple winner-take-all classifier, using half the data from each recording session as a training set, to evaluate encoding of object identity across fMRI voxels. Since this approach is sensitive to the inclusion of noisy voxels, we describe two methods for identifying subsets of voxels in the data which optimally distinguish object identity. One method characterizes the reliability of each voxel within subsets of the data, while another estimates the mutual information of each voxel with the stimulus set. We find that both metrics can identify subsets of the data which reliably encode object identity, even when noisy measurements are artificially added to the data. The mutual information metric is less efficient at this task, likely due to constraints in fMRI data.