3D Object Recognition Using Unsupervised Feature Extraction
Intrator, Nathan, Gold, Joshua I., Bülthoff, Heinrich H., Edelman, Shimon
Gold, Center for Neural Science, Brown University, Providence, RI 02912, USA; Shimon Edelman, Dept. of Applied Mathematics and Computer Science, Weizmann Institute of Science, Rehovot 76100, Israel. Abstract: Intrator (1990) proposed a feature extraction method that is related to recent statistical theory (Huber, 1985; Friedman, 1987) and is based on a biologically motivated model of neuronal plasticity (Bienenstock et al., 1982). This method has recently been applied to feature extraction in the context of recognizing 3D objects from single 2D views (Intrator and Gold, 1991). Here we describe experiments designed to analyze the nature of the extracted features and their relevance to the theory and psychophysics of object recognition. 1 Introduction: Results of recent computational studies of visual recognition (e.g., Poggio and Edelman, 1990) indicate that the problem of recognition of 3D objects can be effectively reformulated in terms of standard pattern classification theory. According to this approach, an object is represented by a few of its 2D views, encoded as clusters in multidimensional space. Recognition of a novel view is then carried out by interpolating among the stored views in the representation space.
Linear Operator for Object Recognition
Visual object recognition involves the identification of images of 3-D objects seen from arbitrary viewpoints. We suggest an approach to object recognition in which a view is represented as a collection of points given by their location in the image. An object is modeled by a set of 2-D views together with the correspondence between the views. We show that any novel view of the object can be expressed as a linear combination of the stored views. Consequently, we build a linear operator that distinguishes between views of a specific object and views of other objects.
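The linear-combination claim above can be illustrated with a small numerical sketch. It is not the authors' implementation: it assumes a rigid object, orthographic projection, known point correspondence across views, and looks only at the x-coordinates of the image points. The helper names `rotation` and `project_x`, the angles, and the random point set are all hypothetical choices made for the illustration.

```python
import numpy as np


def rotation(ax, ay, az):
    """Rotation matrix built from three angles in radians (Rz @ Ry @ Rx)."""
    cx, sx = np.cos(ax), np.sin(ax)
    cy, sy = np.cos(ay), np.sin(ay)
    cz, sz = np.cos(az), np.sin(az)
    rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])
    ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    rz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])
    return rz @ ry @ rx


def project_x(points, rot):
    """x-coordinates of the orthographic projection of the rotated points."""
    return (rot @ points.T)[0]


rng = np.random.default_rng(0)
points = rng.normal(size=(20, 3))  # hypothetical 3-D object: 20 feature points

# Three stored views of the object, taken from different orientations.
stored = [rotation(0.0, 0.0, 0.0),
          rotation(0.1, 0.3, 0.05),
          rotation(-0.2, 0.5, 0.1)]
basis = np.stack([project_x(points, r) for r in stored], axis=1)  # shape (20, 3)

# A novel view of the same object: fit it with the stored views.
novel = project_x(points, rotation(0.3, -0.4, 0.2))
coeffs, *_ = np.linalg.lstsq(basis, novel, rcond=None)
residual_same = np.linalg.norm(basis @ coeffs - novel)

# x-coordinates of an unrelated point set are not spanned by the stored views.
other = rng.normal(size=20)
coeffs_other, *_ = np.linalg.lstsq(basis, other, rcond=None)
residual_other = np.linalg.norm(basis @ coeffs_other - other)

print(residual_same, residual_other)
```

Under these assumptions the residual for the novel view of the same object is at the level of floating-point error, while the residual for the unrelated point set is not, which is the property a distinguishing linear operator can exploit.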
Combined Neural Network and Rule-Based Framework for Probabilistic Pattern Recognition and Discovery
Greenspan, Hayit K., Goodman, Rodney, Chellappa, Rama
A combined neural network and rule-based approach is suggested as a general framework for pattern recognition. This approach enables unsupervised and supervised learning, respectively, while providing probability estimates for the output classes. The probability maps are utilized for higher level analysis such as a feedback for smoothing over the output label maps and the identification of unknown patterns (pattern "discovery"). The suggested approach is presented and demonstrated in the texture-analysis task. A correct classification rate in the 90 percentile is achieved for both unstructured and structured natural texture mosaics. The advantages of the probabilistic approach to pattern analysis are demonstrated.
Learning to Segment Images Using Dynamic Feature Binding
Mozer, Michael C., Zemel, Richard S., Behrmann, Marlene
Despite the fact that complex visual scenes contain multiple, overlapping objects, people perform object recognition with ease and accuracy. One operation that facilitates recognition is an early segmentation process in which features of objects are grouped and labeled according to which object they belong. Current computational systems that perform this operation are based on predefined grouping heuristics.
Illumination and View Position in 3D Visual Recognition
It is shown that both changes in viewing position and illumination conditions can be compensated for, prior to recognition, using combinations of images taken from different viewing positions and different illumination conditions. It is also shown that, in agreement with psychophysical findings, the computation requires at least a sign-bit image as input; contours alone are not sufficient. 1 Introduction: The task of visual recognition is natural and effortless for biological systems, yet the problem of recognition has proven to be very difficult to analyze from a computational point of view. The fundamental reason is that novel images of familiar objects are often not sufficiently similar to previously seen images of that object. Assuming a rigid and isolated object in the scene, there are two major sources for this variability: geometric and photometric. The geometric source of variability comes from changes of view position. A 3D object can be viewed from a variety of directions, each resulting in a different 2D projection. The difference is significant, even for modest changes in viewing position, and can be demonstrated by superimposing those projections (see Figure 1, first row second image). Much attention has been given to this problem in the visual recognition literature ([9], and references therein), and recent results show that one can compensate for changes in viewing position by generating novel views from a small number of model views of the object [10, 4, 8].
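For the photometric source of variability, the idea of compensating with combinations of images can be sketched under a simplifying model that the abstract does not spell out and that is assumed here: a Lambertian surface with no shadows, so that each pixel's intensity is the albedo times the dot product of the surface normal and the light direction. The function `render`, the random normals, and the light directions are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n_pixels = 500

# Hypothetical surface: one unit normal and one albedo value per pixel.
normals = rng.normal(size=(n_pixels, 3))
normals /= np.linalg.norm(normals, axis=1, keepdims=True)
albedo = rng.uniform(0.2, 1.0, size=n_pixels)


def render(light):
    """Shadow-free Lambertian image: albedo * (normal . light) at each pixel."""
    return albedo * (normals @ light)


# Three stored images taken under linearly independent light directions.
stored_lights = np.array([[1.0, 0.0, 0.5],
                          [0.0, 1.0, 0.5],
                          [-0.5, -0.5, 1.0]])
image_basis = np.stack([render(s) for s in stored_lights], axis=1)  # (500, 3)

# An image under a new light direction is fitted by the stored images.
novel_light = np.array([0.3, -0.2, 0.9])
novel_image = render(novel_light)
weights, *_ = np.linalg.lstsq(image_basis, novel_image, rcond=None)
fit_error = np.linalg.norm(image_basis @ weights - novel_image)

# The same weights combine the stored light directions into the new one.
recovered_light = weights @ stored_lights

print(fit_error, recovered_light)
```

In this idealized setting three stored images suffice to reproduce the image under any new light direction. Real images include shadows and non-Lambertian effects, which this sketch deliberately ignores.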
Markov Random Fields Can Bridge Levels of Abstraction
Cooper, Paul R., Prokopowicz, Peter N.
Network vision systems must make inferences from evidential information across levels of representational abstraction, from low level invariants, through intermediate scene segments, to high level behaviorally relevant object descriptions. This paper shows that such networks can be realized as Markov Random Fields (MRFs). We show first how to construct an MRF functionally equivalent to a Hough transform parameter network, thus establishing a principled probabilistic basis for visual networks. Second, we show that these MRF parameter networks are more capable and flexible than traditional methods. In particular, they have a well-defined probabilistic interpretation, intrinsically incorporate feedback, and offer richer representations and decision capabilities.
Learning How to Teach or Selecting Minimal Surface Data
Geiger, Davi, Pereira, Ricardo A. Marques
Marques Pereira, Dipartimento di Informatica, Universita di Trento, Via Inama 7, Trento, TN 38100, Italy. Abstract: Learning a map from an input set to an output set is similar to the problem of reconstructing hypersurfaces from sparse data (Poggio and Girosi, 1990). In this framework, we discuss the problem of automatically selecting "minimal" surface data. The objective is to be able to approximately reconstruct the surface from the selected sparse data. We show that this problem is equivalent to that of compressing information by data removal and to that of learning how to teach. Our key step is to introduce a process that statistically selects the data according to the model. During the process of data selection (learning how to teach), our system (the teacher) is capable of predicting the new surface, that is, the approximated one provided by the selected data.
Operators and curried functions: Training and analysis of simple recurrent networks
Wiles, Janet, Bloesch, Anthony
We present a framework for programming the hidden unit representations of simple recurrent networks based on the use of hint units (additional targets at the output layer). We present two ways of analysing a network trained within this framework: input patterns act as operators on the information encoded by the context units; symmetrically, patterns of activation over the context units act as curried functions of the input sequences. Simulations demonstrate that a network can learn to represent three different functions simultaneously, and canonical discriminant analysis is used to investigate how operators and curried functions are represented in the space of hidden unit activations.
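The two readings described above (inputs as operators on the context, and the context as a curried function of the remaining inputs) can be shown with a toy state machine. This is a plain-Python analogy rather than a simple recurrent network, and the integer state, the symbol set in `OPERATORS`, and the helper names are hypothetical.

```python
from functools import reduce
from typing import Callable, Dict, Sequence

State = int  # stand-in for the pattern of activation over the context units

# Each input symbol is read as an operator that transforms the context state.
OPERATORS: Dict[str, Callable[[State], State]] = {
    "inc": lambda s: s + 1,
    "dec": lambda s: s - 1,
    "reset": lambda s: 0,
}


def step(state: State, symbol: str) -> State:
    """Apply the operator selected by one input symbol to the current state."""
    return OPERATORS[symbol](state)


def run(state: State, symbols: Sequence[str]) -> State:
    """Fold a whole input sequence over an initial state."""
    return reduce(step, symbols, state)


def curried(state: State) -> Callable[[Sequence[str]], State]:
    """Read a context state as a function still waiting for the rest of the input."""
    return lambda symbols: run(state, symbols)


prefix = ["inc", "inc"]
suffix = ["dec", "inc", "inc"]

after_prefix = run(0, prefix)      # state reached after consuming the prefix
remaining = curried(after_prefix)  # that state, viewed as a function of the suffix

print(remaining(suffix), run(0, prefix + suffix))
```

Both printed values agree: processing the prefix first and then applying the resulting curried function to the suffix gives the same state as processing the whole sequence at once.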