AITopics

We explore in this study the statistical properties of this normalization in the presence of noise. Using simulations, we show that divisive normalization is a close approximation to a maximum likelihood estimator, which, in the context of population coding, is the same as an ideal observer. We also demonstrate analytically that this is a general property of a large class of nonlinear recurrent networks with line attractors. Our work suggests that divisive normalization plays a critical role in noise filtering, and that every cortical layer may be an ideal observer of the activity in the preceding layer. Information processing in the cortex is often formalized as a sequence of a linear stages followed by a nonlinearity.

noise, normalization, spatial frequency, (10 more...)

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > District of Columbia > Washington (0.04)

Genre: Research Report > New Finding (0.34)

Industry: Health & Medicine > Therapeutic Area (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.38)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.38)

Bayesian Modeling of Human Concept Learning

Tenenbaum, Joshua B.

I consider the problem of learning concepts from small numbers of positive examples, a feat which humans perform routinely but which computers are rarely capable of. Bridging machine learning and cognitive science perspectives, I present both theoretical analysis and an empirical study with human subjects for the simple task oflearning concepts corresponding to axis-aligned rectangles in a multidimensional feature space. Existing learning models, when applied to this task, cannot explain how subjects generalize from only a few examples of the concept. I propose a principled Bayesian model based on the assumption that the examples are a random sample from the concept to be learned. The model gives precise fits to human behavior on this simple task and provides qualitati ve insights into more complex, realistic cases of concept learning.

generalization, hypothesis, rectangle, (13 more...)

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > United States > New York (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.47)

Industry: Health & Medicine > Therapeutic Area (0.30)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

An Entropic Estimator for Structure Discovery

Brand, Matthew

We introduce a novel framework for simultaneous structure and parameter learning in hidden-variable conditional probability models, based on an entropic prior and a solution for its maximum a posteriori (MAP) estimator. The MAP estimate minimizes uncertainty in all respects: cross-entropy between model and data; entropy of the model; entropy of the data's descriptive statistics. Iterative estimation extinguishes weakly supported parameters, compressing and sparsifying the model. Trimming operators accelerate this process by removing excess parameters and, unlike most pruning schemes, guarantee an increase in posterior probability. Entropic estimation takes a overcomplete random model and simplifies it, inducing the structure of relations between hidden and observed variables. Applied to hidden Markov models (HMMs), it finds a concise finite-state machine representing the hidden structure of a signal. We entropically model music, handwriting, and video time-series, and show that the resulting models are highly concise, structured, predictive, and interpretable: Surviving states tend to be highly correlated with meaningful partitions of the data, while surviving transitions provide a low-perplexity model of the signal dynamics.

entropy, estimator, probability, (16 more...)

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Bristol (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.90)

Williams, Christopher K. I., Adams, Nicholas J.

DTs: Dynamic Trees

A dynamic tree model specifies a prior over a large number of trees, each one of which is a tree-structured belief net (TSBN) . Our aim is to retain the advantages of tree-structured belief networks, namely the hierarchical structure of the model and (in part) the efficient inference algorithms, while avoiding the "blocky" artifacts that derive from a single, fixed TSBN structure. One use for DTs is as prior models over labellings for image segmentation problems.

artificial intelligence, machine learning, node, (18 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)

Kearns, Michael J., Saul, Lawrence K.

Inference in Multilayer Networks via Large Deviation Bounds

Arguably oneabilities of the most important types of information processing is the capacity for probabilistic reasoning. The properties of undirectedproDabilistic models represented as symmetric networks ethave been studied extensively using methods from statistical mechanics (Hertz aI, 1991). Detailed analyses of these models are possible by exploiting averaging that occur in the thermodynamic limit of large networks.phenomena In this paper, we analyze the limit of large, multilayer networks for probabilistic models represented as directed acyclic graphs. These models are known as Bayesian networks (Pearl, 1988; Neal, 1992), and they have different probabilistic semantics than symmetric neural networks (such as Hopfield models or Boltzmann machines). We show that the intractability of exact inference in multilayer Bayesian networks 261 Inference in Multilayer Networks via Large Deviation Bounds does not preclude their effective use. Our work builds on earlier studies of variational methods (Jordan et aI, 1997).

artificial intelligence, bayesian inference, machine learning, (17 more...)

Country:

Asia > Middle East > Jordan (0.25)
North America > United States > California > San Mateo County (0.14)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Learning a Hierarchical Belief Network of Independent Factor Analyzers

Attias, Hagai

The model parameters are learned in an unsupervised manner by maximizing the likelihood that these data are generated by the model.

algorithm, artificial intelligence, machine learning, (17 more...)

Country: North America > United States > California > San Francisco County > San Francisco (0.14)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.70)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.53)

Denève, Sophie, Pouget, Alexandre, Latham, Peter E.

Divisive Normalization, Line Attractor Networks and Ideal Observers

Using simulations, we show that divisive normalization is a close approximation to a maximum likelihood estimator, which, in the context of population coding, is the same as an ideal observer. We also demonstrate analytically thatthis is a general property of a large class of nonlinear recurrent networks with line attractors. Our work suggests that divisive normalization plays a critical role in noise filtering, and that every cortical layer may be an ideal observer of the activity in the preceding layer. Information processing in the cortex is often formalized as a sequence of a linear stages followed by a nonlinearity. In the visual cortex, the nonlinearity is best described bysquaring combined with a divisive pooling of local activities.

artificial intelligence, machine learning, normalization, (13 more...)

Country: North America > United States > California > Los Angeles County > Los Angeles (0.14)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.49)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.38)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.38)

Oliver, Nuria, Rosario, Barbara, Pentland, Alex

Graphical Models for Recognizing Human Interactions

We describe a real-time computer vision and machine learning system for modeling and recognizing human behaviors in two different scenarios: (1) complex, twohanded actionrecognition in the martial art of Tai Chi and (2) detection and recognition of individual human behaviors and multiple-person interactions in a visual surveillance task. In the latter case, the system is particularly concerned with detecting when interactions between people occur, and classifying them. Graphical models, such as Hidden Markov Models (HMMs) [6] and Coupled Hidden MarkovModels (CHMMs) [3, 2], seem appropriate for modeling and, classifying human behaviors because they offer dynamic time warping, a well-understood training algorithm, and a clear Bayesian semantics for both individual (HMMs) and interacting or coupled (CHMMs) generative processes. A major problem with this data-driven statistical approach, especially when modeling rare or anomalous behaviors, is the limited number of training examples. A major emphasis of our work, therefore, is on efficient Bayesian integration of both prior knowledge with evidence from data.

artificial intelligence, bayesian inference, machine learning, (17 more...)

Country: North America > United States > Massachusetts > Middlesex County > Cambridge (0.15)

Industry: Leisure & Entertainment > Sports (0.35)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Moghaddam, Baback, Jebara, Tony, Pentland, Alex

Bayesian Modeling of Facial Similarity

In previous work [6, 9, 10], we advanced a new technique for direct visual matching of images for the purposes of face recognition and image retrieval, using a probabilistic measure of similarity based primarily on a Bayesian (MAP) analysis of image differences, leadingto a "dual" basis similar to eigenfaces [13]. The performance advantage of this probabilistic matching technique over standard Euclidean nearest-neighbor eigenface matching was recently demonstrated using results from DARPA's 1996 "FERET" face recognition competition, in which this probabilistic matching algorithm was found to be the top performer. We have further developed a simple method of replacing the costly compution of nonlinear (online) Bayesian similarity measures by the relatively inexpensive computation of linear (offline) subspace projections and simple (online) Euclidean norms, thus resulting in a significant computational speedup for implementation with very large image databases as typically encountered in real-world applications.

artificial intelligence, machine learning, similarity measure, (16 more...)

Country: North America > United States > Massachusetts > Middlesex County > Cambridge (0.15)

Industry:

Government > Military (0.69)
Government > Regional Government > North America Government > United States Government (0.54)

Technology:

Information Technology > Artificial Intelligence > Vision > Face Recognition (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Probabilistic Modeling for Face Orientation Discrimination: Learning from Labeled and Unlabeled Data

Baluja, Shumeet

This paper presents probabilistic modeling methods to solve the problem of discriminating betweenfive facial orientations with very little labeled data.

artificial intelligence, dependency, machine learning, (13 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)