Vision
Lipreading by neural networks: Visual preprocessing, learning, and sensory integration
Wolff, Gregory J., Prasad, K. Venkatesh, Stork, David G., Hennecke, Marcus
Automated speech recognition is notoriously hard, and thus any predictive source of information and constraints that could be incorporated into a computer speech recognition system would be desirable. Humans, especially the hearing impaired, can utilize visual information - "speech reading" - for improved accuracy (Dodd & Campbell, 1987; Sanders & Goodrich, 1971). Speech reading can provide direct information about segments, phonemes, rate, speaker gender and identity, and subtle information for segmenting speech from background noise or multiple speakers (De Filippo & Sims, 1988; Green & Miller, 1985). Fundamental support for the use of visual information comes from the complementary nature of the visual and acoustic speech signals. Utterances that are difficult to distinguish acoustically are the easiest to distinguish visually.
An Analog VLSI Saccadic Eye Movement System
Horiuchi, Timothy K., Bishofberger, Brooks, Koch, Christof
In an effort to understand saccadic eye movements and their relation to visual attention and other forms of eye movements, we - in collaboration with a number of other laboratories - are carrying out a large-scale effort to design and build a complete primate oculomotor system using analog CMOS VLSI technology. Using this technology, a low power, compact, multi-chip system has been built which works in real-time using real-world visual inputs. We describe in this paper the performance of an early version of such a system including a 1-D array of photoreceptors mimicking the retina, a circuit computing the mean location of activity representing the superior colliculus, a saccadic burst generator, and a one degree-of-freedom rotational platform which models the dynamic properties of the primate oculomotor plant.
1 Introduction
When we look around our environment, we move our eyes to center and stabilize objects of interest onto our fovea. In order to achieve this, our eyes move in quick jumps with short pauses in between. These quick jumps (up to 750 deg/sec in humans) are known as saccades and are seen in both exploratory eye movements and as reflexive eye movements in response to sudden visual, auditory, or somatosensory stimuli. Since the intent of the saccade is to bring new objects of interest onto the fovea, it can be considered a primitive attentional mechanism.
Learning Complex Boolean Functions: Algorithms and Applications
Oliveira, Arlindo L., Sangiovanni-Vincentelli, Alberto
The most commonly used neural network models are not well suited to direct digital implementations because each node needs to perform a large number of operations between floating point values. Fortunately, the ability to learn from examples and to generalize is not restricted to networks of this type. Indeed, networks where each node implements a simple Boolean function (Boolean networks) can be designed in such a way as to exhibit similar properties. Two algorithms that generate Boolean networks from examples are presented. The results show that these algorithms generalize very well in a class of problems that accept compact Boolean network descriptions. The techniques described are general and can be applied to tasks that are not known to have that characteristic. Two examples of applications are presented: image reconstruction and handwritten character recognition.
Feature Densities are Required for Computing Feature Correspondences
The feature correspondence problem is a classic hurdle in visual object-recognition concerned with determining the correct mapping between the features measured from the image and the features expected by the model. In this paper we show that determining good correspondences requires information about the joint probability density over the image features. We propose "likelihood based correspondence matching" as a general principle for selecting optimal correspondences. The approach is applicable to nonrigid models, allows nonlinear perspective transformations, and can optimally deal with occlusions and missing features.
Learning in Computer Vision and Image Understanding
There is an increasing interest in the area of Learning in Computer Vision and Image Understanding, both from researchers in the learning community and from researchers involved with the computer vision world. The field is characterized by a shift away from the classical, purely model-based, computer vision techniques, towards data-driven learning paradigms for solving real-world vision problems. Using learning in segmentation or recognition tasks has several advantages over classical model-based techniques. These include adaptivity to noise and changing environments, as well as in many cases, a simplified system generation procedure. Yet, learning from examples introduces a new challenge - getting a representative data set of examples from which to learn.
Resolving motion ambiguities
Diamantaras, K. I., Geiger, D.
We address the problem of optical flow reconstruction and in particular the problem of resolving ambiguities near edges. These ambiguities occur due to (i) the aperture problem and (ii) the occlusion problem, where pixels on both sides of an intensity edge are assigned the same velocity estimates (and confidence). However, these measurements are correct for just one side of the edge (the non-occluded one). Our approach is to introduce an uncertainty field with respect to the estimates and confidence measures. We note that the confidence measures are large at intensity edges and larger at the convex sides of the edges, i.e. inside corners, than at the concave side. We resolve the ambiguities through local interactions via coupled Markov random fields (MRF). The result is the detection of motion for regions of images with large global convexity.
A Comparison of Dynamic Reposing and Tangent Distance for Drug Activity Prediction
Dietterich, Thomas G., Jain, Ajay N., Lathrop, Richard H., Lozano-Pérez, Tomás
The task of drug activity prediction is to predict the activity of proposed drug compounds by learning from the observed activity of previously synthesized drug compounds. Accurate drug activity prediction can save substantial time and money by focusing the efforts of chemists and biologists on the synthesis and testing of compounds whose predicted activity is high. If the requirements for highly active binding can be displayed in three dimensions, chemists can work from such displays to design new compounds having high predicted activity. Drug molecules usually act by binding to localized sites on large receptor molecules or large enzyme molecules. One reasonable way to represent drug molecules is to capture the location of their surface in the (fixed) frame of reference of the (hypothesized) binding site.