Collaborating Authors

 Cadieu, Charles


Hierarchical Modular Optimization of Convolutional Networks Achieves Representations Similar to Macaque IT and Human Ventral Stream

Neural Information Processing Systems

Humans recognize visually presented objects rapidly and accurately. To understand this ability, we seek to construct models of the ventral stream, the series of cortical areas thought to subserve object recognition. One tool to assess the quality of a model of the ventral stream is the Representational Dissimilarity Matrix (RDM), which takes a set of visual stimuli and measures the pairwise distances between the responses they produce, either in the brain (e.g., fMRI voxel responses, neural firing rates) or in models (features). Previous work has shown that all known models of the ventral stream fail to capture the RDM pattern observed either in IT cortex, the highest ventral area, or in the human ventral stream. In this work, we construct models of the ventral stream using a novel optimization procedure for category-level object recognition problems, and produce RDMs resembling those of both macaque IT and the human ventral stream. The model, while novel in its optimization procedure, further develops a long-standing functional hypothesis that the ventral visual stream is a hierarchically arranged series of processing stages optimized for visual object recognition.
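
As a rough illustration of the RDM idea described in this abstract, the sketch below builds an RDM from a stimulus-by-feature response array and compares two RDMs. The function names `rdm` and `rdm_similarity`, the choice of 1 minus Pearson correlation as the distance, and the use of Pearson correlation over upper-triangle entries to compare RDMs are all assumptions made for illustration; the abstract does not specify which distance or comparison measure the authors used.

```python
import numpy as np

def rdm(responses):
    """Build an RDM from an (n_stimuli, n_features) response array.

    Assumption: dissimilarity is 1 - Pearson correlation between the
    response vectors of each pair of stimuli.
    """
    return 1.0 - np.corrcoef(responses)

def rdm_similarity(rdm_a, rdm_b):
    """Compare two RDMs of equal shape.

    Assumption: similarity is the Pearson correlation of the
    off-diagonal upper-triangle entries (the RDM is symmetric, so the
    lower triangle carries no extra information).
    """
    iu = np.triu_indices_from(rdm_a, k=1)
    return np.corrcoef(rdm_a[iu], rdm_b[iu])[0, 1]

# Hypothetical data: 5 stimuli, 20 recorded units vs. 30 model features.
rng = np.random.default_rng(0)
neural = rng.normal(size=(5, 20))
model = rng.normal(size=(5, 30))
score = rdm_similarity(rdm(neural), rdm(model))
```

Because both RDMs are indexed by the same stimuli, the brain data and the model features may have different dimensionality; only the number of stimuli has to match.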


Learning Transformational Invariants from Natural Movies

Neural Information Processing Systems

We describe a hierarchical, probabilistic model that learns to extract complex motion from movies of the natural environment. The model consists of two hidden layers: the first layer produces a sparse representation of the image that is expressed in terms of local amplitude and phase variables. The second layer learns the higher-order structure among the time-varying phase variables. After training on natural movies, the top layer units discover the structure of phase-shifts within the first layer.
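
To make the amplitude and phase variables in this abstract concrete, here is a minimal sketch of how a first-layer response could be split into amplitude and phase, and how frame-to-frame phase shifts could be computed as input for a second layer. It assumes the first-layer responses are available as complex coefficients, one per unit and frame; that representation and the names `amplitude_phase` and `phase_shifts` are illustrative assumptions, and the sketch does not implement the sparse inference or the learning described in the paper.

```python
import numpy as np

def amplitude_phase(z):
    """Split complex coefficients into local amplitude and phase."""
    return np.abs(z), np.angle(z)

def phase_shifts(z_t):
    """Phase change between consecutive frames.

    z_t: complex array of shape (n_frames, n_units) (assumed layout).
    Returns an (n_frames - 1, n_units) array of phase differences,
    wrapped into (-pi, pi].
    """
    phase = np.angle(z_t)
    # Re-wrap the raw differences so a jump across +/-pi is not
    # mistaken for a large shift.
    return np.angle(np.exp(1j * np.diff(phase, axis=0)))

# Hypothetical unit whose phase advances by 2.0 radians per frame.
frames = np.arange(5)
z = 1.5 * np.exp(1j * 2.0 * frames)[:, None]
amp, _ = amplitude_phase(z)
shifts = phase_shifts(z)
```

In this toy case the amplitude stays constant while the phase advances steadily, which is the kind of regularity a second layer operating on phase shifts could pick up.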

