AITopics

We have developed a foveated gesture recognition system that runs in an unconstrained office environment with an active camera. Using vision routines previously implemented for an interactive environment, we determine the spatial location of salient body parts of a user and guide an active camera to obtain images of gestures or expressions. A hidden-state reinforcement learning paradigm is used to implement visual attention. The attention module selects targets to foveate based on the goal of successful recognition, and uses a new multiple-model Q-learning formulation. Given a set of target and distractor gestures, our system can learn where to foveate to maximally discriminate a particular gesture.
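For orientation, the following is a minimal sketch of the standard single-model tabular Q-learning update that formulations like the one above build on. It is not the multiple-model, hidden-state variant the abstract describes; the state names, the two foveation actions, and the values of `alpha` and `gamma` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, alpha=0.5, gamma=0.9):
    """Apply one tabular Q-learning update in place and return the new value."""
    # Value of the best action available from the next state.
    best_next = max(q[(next_state, a)] for a in actions)
    # Temporal-difference target: immediate reward plus discounted future value.
    target = reward + gamma * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]


# Hypothetical foveation targets; the abstract does not name the action set.
ACTIONS = ["look_at_hand", "look_at_face"]

q_table = defaultdict(float)  # unseen (state, action) pairs start at 0.0
new_value = q_update(q_table, "s0", "look_at_hand", 1.0, "s1", ACTIONS)
```

In a recognition setting the reward would plausibly be tied to whether the gesture was classified correctly after foveating, but that mapping is an assumption here.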

active gesture recognition, gesture recognition, recognition, (12 more...)

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)
Asia > Middle East > Jordan (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Vision > Gesture Recognition (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.70)

Viola, Paul A., Schraudolph, Nicol N., Sejnowski, Terrence J.

Empirical Entropy Manipulation for Real-World Problems

No finite sample is sufficient to determine the density, and therefore the entropy, of a signal directly. Some assumption about either the functional form of the density or about its smoothness is necessary.
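One common way to encode a smoothness assumption is a Gaussian kernel (Parzen window) density estimate, whose kernel width sets how smooth the estimated density is. The sketch below assumes that estimator for illustration; this excerpt does not state which estimator the authors use, and the function names and the width `sigma` are hypothetical.

```python
import math


def parzen_density(x, samples, sigma):
    """Gaussian-kernel density estimate at x; sigma encodes the smoothness assumption."""
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    total = sum(norm * math.exp(-0.5 * ((x - s) / sigma) ** 2) for s in samples)
    return total / len(samples)


def empirical_entropy(eval_points, kernel_points, sigma):
    """Estimate entropy as the mean negative log density over eval_points."""
    logs = [math.log(parzen_density(x, kernel_points, sigma)) for x in eval_points]
    return -sum(logs) / len(logs)


# Illustrative data: a tightly clustered sample and a widely spread one.
tight = [0.0, 0.01, -0.01, 0.02]
wide = [0.0, 1.0, -1.0, 2.0]
h_tight = empirical_entropy(tight, tight, sigma=0.5)
h_wide = empirical_entropy(wide, wide, sigma=0.5)
```

With the same kernel width, the spread-out sample yields the larger entropy estimate. Reusing the same points for the kernels and the evaluation biases the estimate; it is done here only to keep the sketch short.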

empirical entropy manipulation, entropy, projection, (15 more...)

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.05)
North America > United States > Washington > King County > Seattle (0.04)
North America > United States > New York (0.04)
North America > United States > California > San Diego County > La Jolla (0.04)

Industry: Health & Medicine (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Pessoa, Luiz, Ross, William D.

A Neural Network Model of 3-D Lightness Perception

A neural network model of 3-D lightness perception is presented which builds upon the FACADE Theory Boundary Contour System/Feature Contour System of Grossberg and colleagues. Early ratio encoding by retinal ganglion neurons as well as psychophysical results on constancy across different backgrounds (background constancy) are used to provide functional constraints to the theory and suggest a contrast negation hypothesis which states that ratio measures between coplanar regions are given more weight in the determination of lightness of the respective regions.

background, lightness, perception, (13 more...)

Country:

South America > Brazil > Rio de Janeiro > Rio de Janeiro (0.05)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

McCabe, Susan L., Denham, Michael J.

A Model of Auditory Streaming

The formation of associations between signals, which are considered to arise from the same external source, allows the organism to recognise significant patterns and relationships within the signals from each source without being confused by accidental coincidences between unrelated signals (Bregman, 1990). The intrinsically temporal nature of sound means that the ability to predict how the signal of interest is expected to progress is perhaps as significant as the ability to focus on that signal; such expectations can then be used to facilitate further processing of the signal. It is important to remember that perception is a creative act (Luria, 1980). The organism creates its interpretation of the world in response to the current stimuli, within the context of its current state of alertness, attention, and previous experience. The creative aspects of perception are exemplified in the auditory system where peripheral processing decomposes acoustic stimuli.

bregman, frequency, tone presentation rate, (14 more...)

Country:

Europe > United Kingdom (0.04)
Europe > Netherlands > North Brabant > Eindhoven (0.04)

Genre: Research Report (0.47)

Technology: Information Technology > Artificial Intelligence (0.47)

Rao, Rajesh P. N., Zelinsky, Gregory J., Hayhoe, Mary M., Ballard, Dana H.

Modeling Saccadic Targeting in Visual Search

Visual cognition depends critically on the ability to make rapid eye movements known as saccades that orient the fovea over targets of interest in a visual scene. Saccades are known to be ballistic: the pattern of muscle activation for foveating a prespecified target location is computed prior to the movement and visual feedback is precluded. Despite these distinctive properties, there has been no general model of the saccadic targeting strategy employed by the human visual system during visual search in natural scenes. This paper proposes a model for saccadic targeting that uses iconic scene representations derived from oriented spatial filters at multiple scales. Visual search proceeds in a coarse-to-fine fashion with the largest scale filter responses being compared first. The model was empirically tested by comparing its performance with actual eye movement data from human subjects in a natural visual search task; preliminary results indicate substantial agreement between eye movements predicted by the model and those recorded from human subjects.

eye movement, representation, saccade, (14 more...)

Country:

North America > United States > New York > Monroe County > Rochester (0.05)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.05)
North America > United States > Illinois > Cook County > Chicago (0.04)

Industry: Health & Medicine > Therapeutic Area (0.49)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Vision (0.69)
Information Technology > Artificial Intelligence > Natural Language (0.68)
Information Technology > Artificial Intelligence > Cognitive Science (0.67)

Bartlett, Marian Stewart, Viola, Paul A., Sejnowski, Terrence J., Golomb, Beatrice A., Larsen, Jan, Hager, Joseph C., Ekman, Paul

Classifying Facial Action

The Facial Action Coding System (FACS), devised by Ekman and Friesen (1978), provides an objective means for measuring the facial muscle contractions involved in a facial expression. In this paper, we approach automated facial expression analysis by detecting and classifying facial actions. We generated a database of over 1100 image sequences of 24 subjects performing over 150 distinct facial actions or action combinations. We compare three different approaches to classifying the facial actions in these images: holistic spatial analysis based on principal components of graylevel images; explicit measurement of local image features such as wrinkles; and template matching with motion flow fields. On a dataset containing six individual actions and 20 subjects, these methods achieved 89%, 57%, and 85% performance, respectively, in generalizing to novel subjects. When combined, performance improved to 92%.

expression, facial action, facial expression, (13 more...)

Country:

North America > United States > California > San Francisco County > San Francisco (0.29)
North America > United States > California > San Diego County > La Jolla (0.05)
Europe > Switzerland > Zürich > Zürich (0.05)
(4 more...)

Genre: Research Report (0.47)

Industry: Health & Medicine (0.69)

Technology: Information Technology > Artificial Intelligence > Vision > Face Recognition (1.00)

Marshall, Jonathan A., Alley, Richard K., Hubbard, Robert S.

Learning to Predict Visibility and Invisibility from Occlusion Events

This paper presents a self-organizing neural network that learns to detect, represent, and predict the visibility and invisibility relationships that arise during occlusion events, after a period of exposure to motion sequences containing occlusion and disocclusion events. The network develops two parallel opponent channels or "chains" of lateral excitatory connections for every resolvable motion trajectory. One channel, the "On" chain or "visible" chain, is activated when a moving stimulus is visible. The other channel, the "Off" chain or "invisible" chain, carries a persistent, amodal representation that predicts the motion of a formerly visible stimulus that becomes invisible due to occlusion. The learning rule uses disinhibition from the On chain to trigger learning in the Off chain.

neuron, representation, stimulus, (13 more...)

Country:

North America > United States > North Carolina > Orange County > Chapel Hill (0.14)
North America > United States > California > San Diego County > San Diego (0.05)
North America > United States > Minnesota (0.04)
(5 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Softky, William R.

Unsupervised Pixel-prediction

When a sensory system constructs a model of the environment from its input, it might need to verify the model's accuracy. One method of verification is multivariate time-series prediction: a good model could predict the near-future activity of its inputs, much as a good scientific theory predicts future data. Such a predicting model would require copious top-down connections to compare the predictions with the input. That feedback could improve the model's performance in two ways: by biasing internal activity toward expected patterns, and by generating specific error signals if the predictions fail. A proof-of-concept model (an event-driven, computationally efficient layered network incorporating "cortical" features like all-excitatory synapses and local inhibition) was constructed to make near-future predictions of a simple, moving stimulus. After unsupervised learning, the network contained units not only tuned to obvious features of the stimulus like contour orientation and motion, but also to contour discontinuity ("end-stopping") and illusory contours.

cortex, prediction, spike, (15 more...)

Country:

North America > United States > Wisconsin (0.04)
North America > United States > Maryland > Montgomery County > Bethesda (0.04)

Industry: Health & Medicine (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.70)

Niebur, Ernst, Koch, Christof

Control of Selective Visual Attention: Modeling the "Where" Pathway

Intermediate and higher vision processes require selection of a subset of the available sensory information before further processing. Usually, this selection is implemented in the form of a spatially circumscribed region of the visual field, the so-called "focus of attention" which scans the visual scene dependent on the input and on the attentional state of the subject. We here present a model for the control of the focus of attention in primates, based on a saliency map. This mechanism is not only expected to model the functionality of biological vision but also to be essential for the understanding of complex scenes in machine vision.
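To make the idea of a focus of attention scanning a saliency map concrete, here is a small sketch. It assumes two mechanisms that this excerpt does not spell out: a winner-take-all choice of the most salient location, and suppression of each visited neighborhood so the focus moves on. The function name, the grid values, and `inhibit_radius` are hypothetical.

```python
def scan_saliency(saliency, n_fixations, inhibit_radius=1):
    """Return fixation (row, col) pairs in the order a winner-take-all scan visits them."""
    grid = [row[:] for row in saliency]  # copy so the caller's map is untouched
    n_rows, n_cols = len(grid), len(grid[0])
    fixations = []
    for _ in range(n_fixations):
        # Winner-take-all: pick the currently most salient cell.
        cells = ((i, j) for i in range(n_rows) for j in range(n_cols))
        r, c = max(cells, key=lambda p: grid[p[0]][p[1]])
        fixations.append((r, c))
        # Assumed suppression step: block the visited neighborhood from winning again.
        for i in range(max(0, r - inhibit_radius), min(n_rows, r + inhibit_radius + 1)):
            for j in range(max(0, c - inhibit_radius), min(n_cols, c + inhibit_radius + 1)):
                grid[i][j] = float("-inf")
    return fixations


# Illustrative 3x3 saliency map.
saliency_map = [
    [0.1, 0.2, 0.9],
    [0.3, 0.8, 0.1],
    [0.7, 0.1, 0.1],
]
fixations = scan_saliency(saliency_map, 2)
```

Here the first fixation lands on the top-right cell; because its neighborhood is then suppressed, the second fixation skips the nearby 0.8 cell and moves to the bottom-left cell.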

mechanism, saliency map, selective visual attention, (15 more...)

Country:

North America > United States > California (0.05)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Maryland > Baltimore (0.04)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.69)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (0.90)

Pappu, Suguna, Gold, Steven, Rangarajan, Anand

A Framework for Non-rigid Matching and Correspondence

Matching feature point sets lies at the core of many approaches to object recognition. We present a framework for nonrigid matching that begins with a skeleton module, affine point matching, and then integrates multiple features to improve correspondence and develops an object representation based on spatial regions to model local transformations.
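As a rough illustration of the affine step alone, the sketch below fits a 2-D affine map by least squares under the simplifying assumption that point correspondences are already known. The framework described above also has to recover the correspondences, which this sketch does not attempt; the helper names `solve3` and `fit_affine` are hypothetical.

```python
def solve3(a, b):
    """Solve a 3x3 linear system by Gauss-Jordan elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][3] / m[i][i] for i in range(3)]


def fit_affine(src, dst):
    """Least-squares affine fit; returns [[a, b, tx], [c, d, ty]].

    Assumes src[i] corresponds to dst[i] and that the src points are not
    all collinear (otherwise the normal equations are singular).
    """
    rows = [(x, y, 1.0) for x, y in src]
    # Normal equations: (A^T A) p = A^T b, solved once per output coordinate.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    params = []
    for k in range(2):
        atb = [sum(r[i] * d[k] for r, d in zip(rows, dst)) for i in range(3)]
        params.append(solve3(ata, atb))
    return params


# Illustrative data generated by x' = 2x + 1, y' = 3y - 1.
src_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
dst_points = [(1.0, -1.0), (3.0, -1.0), (1.0, 2.0), (3.0, 2.0)]
affine = fit_affine(src_points, dst_points)
```

For this noise-free data the fit recovers the generating parameters up to floating-point error.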

affine transformation, correspondence, transformation, (14 more...)

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Connecticut > New Haven County > New Haven (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)