AITopics

The Facial Action Coding System (FACS), devised by Ekman and Friesen (1978), provides an objective means for measuring the facial muscle contractions involved in a facial expression. In this paper, we approach automated facial expression analysis by detecting and classifying facial actions. We generated a database of over 1100 image sequences of 24 subjects performing over 150 distinct facial actions or action combinations. We compare three different approaches to classifying the facial actions in these images: holistic spatial analysis based on principal components of graylevel images; explicit measurement of local image features such as wrinkles; and template matching with motion flow fields. On a dataset containing six individual actions and 20 subjects, these methods achieved performances of 89%, 57%, and 85%, respectively, for generalization to novel subjects. When combined, performance improved to 92%.

expression, facial action, facial expression, (13 more...)

Country:

North America > United States > California > San Francisco County > San Francisco (0.29)
North America > United States > California > San Diego County > La Jolla (0.05)
Europe > Switzerland > Zürich > Zürich (0.05)
(4 more...)

Genre: Research Report (0.47)

Industry: Health & Medicine (0.69)

Technology: Information Technology > Artificial Intelligence > Vision > Face Recognition (1.00)

Frey, Brendan J., Hinton, Geoffrey E., Dayan, Peter

Does the Wake-sleep Algorithm Produce Good Density Estimators?

The wake-sleep algorithm (Hinton, Dayan, Frey and Neal 1995) is a relatively efficient method of fitting a multilayer stochastic generative model to high-dimensional data. In addition to the top-down connections in the generative model, it makes use of bottom-up connections for approximating the probability distribution over the hidden units given the data, and it trains these bottom-up connections using a simple delta rule. We use a variety of synthetic and real data sets to compare the performance of the wake-sleep algorithm with Monte Carlo and mean field methods for fitting the same generative model and also compare it with other models that are less powerful but easier to fit.

1 INTRODUCTION

Neural networks are often used as bottom-up recognition devices that transform input vectors into representations of those vectors in one or more hidden layers. But multilayer networks of stochastic neurons can also be used as top-down generative models that produce patterns with complicated correlational structure in the bottom visible layer. In this paper we consider generative models composed of layers of stochastic binary logistic units. Given a generative model parameterized by top-down weights, there is an obvious way to perform unsupervised learning. The generative weights are adjusted to maximize the probability that the visible vectors generated by the model would match the observed data.
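The two phases described in this abstract can be sketched roughly as follows. This is a minimal illustration only, assuming a single hidden layer of stochastic binary logistic units, zero-initialized weights, and a fixed learning rate; the class name `TinyHelmholtz`, the toy data, and all parameter values are hypothetical and are not taken from the paper, which considers multilayer models.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sample(probs, rng):
    """Draw one binary value per unit from its firing probability."""
    return [1 if rng.random() < p else 0 for p in probs]

class TinyHelmholtz:
    """Hypothetical one-hidden-layer model with top-down and bottom-up weights."""

    def __init__(self, n_vis, n_hid, seed=0):
        self.rng = random.Random(seed)
        self.n_vis, self.n_hid = n_vis, n_hid
        self.gen_bh = [0.0] * n_hid                         # generative hidden biases
        self.gen_w = [[0.0] * n_hid for _ in range(n_vis)]  # top-down weights
        self.gen_bv = [0.0] * n_vis                         # generative visible biases
        self.rec_w = [[0.0] * n_vis for _ in range(n_hid)]  # bottom-up weights
        self.rec_bh = [0.0] * n_hid                         # recognition hidden biases

    def recognize(self, v):
        """Bottom-up probabilities of the hidden units given a visible vector."""
        return [sigmoid(self.rec_bh[j] + sum(w * x for w, x in zip(self.rec_w[j], v)))
                for j in range(self.n_hid)]

    def generate(self, h):
        """Top-down probabilities of the visible units given a hidden vector."""
        return [sigmoid(self.gen_bv[i] + sum(w * x for w, x in zip(self.gen_w[i], h)))
                for i in range(self.n_vis)]

    def wake(self, v, lr):
        """Wake phase: sample hidden states bottom-up, then apply the delta
        rule to the generative (top-down) parameters."""
        h = sample(self.recognize(v), self.rng)
        p_v = self.generate(h)
        for i in range(self.n_vis):
            err = v[i] - p_v[i]
            self.gen_bv[i] += lr * err
            for j in range(self.n_hid):
                self.gen_w[i][j] += lr * err * h[j]
        for j in range(self.n_hid):
            self.gen_bh[j] += lr * (h[j] - sigmoid(self.gen_bh[j]))

    def sleep(self, lr):
        """Sleep phase: dream a (hidden, visible) pair top-down, then apply
        the delta rule to the recognition (bottom-up) parameters."""
        h = sample([sigmoid(b) for b in self.gen_bh], self.rng)
        v = sample(self.generate(h), self.rng)
        q_h = self.recognize(v)
        for j in range(self.n_hid):
            err = h[j] - q_h[j]
            self.rec_bh[j] += lr * err
            for i in range(self.n_vis):
                self.rec_w[j][i] += lr * err * v[i]

# Toy usage: alternate wake and sleep steps on two made-up binary patterns.
data = [[1, 1, 0, 0], [0, 0, 1, 1]]
model = TinyHelmholtz(n_vis=4, n_hid=2)
for step in range(200):
    model.wake(data[step % 2], lr=0.1)
    model.sleep(lr=0.1)
```

Each update uses only locally available quantities (a unit's sampled state, its predicted probability, and its input), which is the sense in which both phases reduce to a simple delta rule.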

algorithm produce good density estimator, helmholtz machine, probability, (10 more...)

Country:

North America > Canada > Ontario > Toronto (0.15)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.82)

Jackson, Jeffrey C., Craven, Mark

Learning Sparse Perceptrons

We introduce a new algorithm designed to learn sparse perceptrons over input representations which include high-order features. Our algorithm, which is based on a hypothesis-boosting method, is able to PAC-learn a relatively natural class of target concepts. Moreover, the algorithm appears to work well in practice: on a set of three problem domains, the algorithm produces classifiers that utilize small numbers of features yet exhibit good generalization performance. Perhaps most importantly, our algorithm generates concept descriptions that are easy for humans to understand. However, in many applications, such as those that may involve scientific discovery, it is crucial to be able to explain predictions.

algorithm, hypothesis, perceptron, (16 more...)

Country:

North America > United States > Wisconsin > Dane County > Madison (0.14)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > Michigan > Washtenaw County > Ann Arbor (0.04)
(2 more...)

Genre:

Research Report > New Finding (0.47)
Research Report > Experimental Study (0.47)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (1.00)

Using Unlabeled Data for Supervised Learning

Towell, Geoffrey G.

For example, it is trivial to record hours of heartbeats from hundreds of patients. However, it is expensive to hire cardiologists to label each of the recorded beats. One response to the expense of class labels is to squeeze the most information possible out of each labeled example. Regularization and cross-validation both have this goal. A second response is to start with a small set of labeled examples and request labels of only those currently unlabeled examples that are expected to provide a significant improvement in the behavior of the classifier (Lewis & Catlett, 1994; Freund et al., 1993). A third response is to tap into a largely ignored potential source of information; namely, unlabeled examples. This response is supported by the theoretical work of Castelli and Cover (1995) which suggests that unlabeled examples have value in learning classification problems.

information, sulu, unlabeled example, (15 more...)

Country:

Asia > Middle East > Jordan (0.05)
North America > United States > New Jersey > Mercer County > Princeton (0.04)

Genre: Research Report > New Finding (0.69)

Industry: Health & Medicine (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.42)

Is Learning The n-th Thing Any Easier Than Learning The First?

Thrun, Sebastian

This paper investigates learning in a lifelong context. Lifelong learning addresses situations in which a learner faces a whole stream of learning tasks. Such scenarios provide the opportunity to transfer knowledge across multiple learning tasks, in order to generalize more accurately from less training data. In this paper, several different approaches to lifelong learning are described, and applied in an object recognition domain. It is shown that across the board, lifelong learning approaches generalize consistently more accurately from less training data, by their ability to transfer knowledge across learning tasks.

knowledge, neural network, representation, (15 more...)

Country:

North America > United States > California > San Mateo County > San Mateo (0.05)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.05)
Asia > Middle East > Israel (0.05)
(4 more...)

Genre:

Overview (0.74)
Research Report (0.54)

Industry: Education (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.77)

A Practical Monte Carlo Implementation of Bayesian Learning

Rasmussen, Carl Edward

A practical method for Bayesian training of feed-forward neural networks using sophisticated Monte Carlo methods is presented and evaluated. In reasonably small amounts of computer time this approach outperforms other state-of-the-art methods on 5 data-limited tasks from real world domains.

carlo method, hyperparameter, monte carlo method, (12 more...)

Country:

North America > Canada > Ontario > Toronto (0.16)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)

Genre: Research Report (0.66)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Opitz, David W., Shavlik, Jude W.

Generating Accurate and Diverse Members of a Neural-Network Ensemble

In particular, combining separately trained neural networks (commonly referred to as a neural-network ensemble) has been demonstrated to be particularly successful (Alpaydin, 1993; Drucker et al., 1994; Hansen and Salamon, 1990; Hashem et al., 1994; Krogh and Vedelsby, 1995; Maclin and Shavlik, 1995; Perrone, 1992). Both theoretical (Hansen and Salamon, 1990; Krogh and Vedelsby, 1995) and empirical (Hashem et al., 1994; Maclin and Shavlik, 1995) work has shown that a good ensemble is one where the individual networks are both accurate and make their errors on different parts of the input space; however, most previous work has either focussed on combining the output of multiple trained networks or only indirectly addressed how we should generate a good set of networks.
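The claim that ensemble members should be accurate and err on different inputs can be illustrated with a toy majority vote. This is only a sketch under stated assumptions: the labels and error patterns are made up, the helper names (`majority_vote`, `accuracy`, `flip`) are hypothetical, and plain voting stands in for whatever combination scheme a given ensemble method actually uses.

```python
from collections import Counter

def majority_vote(predictions):
    """Combine per-classifier label lists into one list of voted labels."""
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]

def accuracy(pred, truth):
    return sum(p == t for p, t in zip(pred, truth)) / len(truth)

def flip(labels, wrong_at):
    """Copy of binary labels with errors injected at the given positions."""
    return [1 - y if i in wrong_at else y for i, y in enumerate(labels)]

truth = [1, 0, 1, 0, 1, 0]  # made-up ground truth

# Three members, each 4/6 accurate, with errors on disjoint inputs.
diverse = [flip(truth, {0, 1}), flip(truth, {2, 3}), flip(truth, {4, 5})]

# Three members, each 4/6 accurate, all wrong on the same inputs.
correlated = [flip(truth, {0, 1})] * 3

diverse_acc = accuracy(majority_vote(diverse), truth)
correlated_acc = accuracy(majority_vote(correlated), truth)
```

In this toy case every member has the same individual accuracy, yet the vote over the diverse members recovers every label while the vote over the correlated members is no better than a single member.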

algorithm, ensemble, neural network, (13 more...)

Country:

North America > United States > Wisconsin > Dane County > Madison (0.14)
North America > United States > New Jersey > Middlesex County > New Brunswick (0.05)
North America > United States > Minnesota > St. Louis County > Duluth (0.04)
(6 more...)

Genre: Research Report (0.69)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Hinton, Geoffrey E., Revow, Michael

Using Pairs of Data-Points to Define Splits for Decision Trees

CART either split the data using axis-aligned hyperplanes or they perform a computationally expensive search in the continuous space of hyperplanes with unrestricted orientations. We show that the limitations of the former can be overcome without resorting to the latter. For every pair of training data-points, there is one hyperplane that is orthogonal to the line joining the data-points and bisects this line. Such hyperplanes are plausible candidates for splits. In a comparison on a suite of 12 datasets we found that this method of generating candidate splits outperformed the standard methods, particularly when the training sets were small.

1 Introduction

Binary decision trees come in many flavours, but they all rely on splitting the set of k-dimensional data-points at each internal node into two disjoint sets.
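The candidate split defined by a pair of data-points can be written down directly: the normal vector is the difference of the two points, and the offset is chosen so that the hyperplane passes through their midpoint. The sketch below assumes points are given as plain tuples of floats; the function names are hypothetical, and how the candidates are scored and selected is not shown.

```python
from itertools import combinations

def bisecting_hyperplane(a, b):
    """Return (w, c) for the hyperplane w.x = c that is orthogonal to the
    segment joining a and b and passes through its midpoint."""
    w = [bi - ai for ai, bi in zip(a, b)]          # normal: direction from a to b
    mid = [(ai + bi) / 2 for ai, bi in zip(a, b)]  # midpoint of the segment
    c = sum(wi * mi for wi, mi in zip(w, mid))     # offset so the plane contains mid
    return w, c

def side(x, w, c):
    """True if x lies strictly on b's side of the hyperplane, else False."""
    return sum(wi * xi for wi, xi in zip(w, x)) > c

def candidate_splits(points):
    """One candidate split per unordered pair of distinct training points."""
    return [bisecting_hyperplane(a, b)
            for a, b in combinations(points, 2) if a != b]

points = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0)]  # made-up training points
w, c = bisecting_hyperplane(points[0], points[1])
splits = candidate_splits(points)
```

With n distinct training points this yields n(n-1)/2 candidate hyperplanes, so the candidate set is finite and its orientations are determined by the data rather than fixed to the coordinate axes.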

dataset, hyperplane, projection, (15 more...)

Country:

North America > Canada > Ontario > Toronto (0.48)
North America > United States > California (0.05)
North America > United States > Wisconsin (0.04)

Genre: Research Report (0.48)

Industry: Health & Medicine (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.87)
Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (0.63)

Konig, Yochai, Bourlard, Hervé, Morgan, Nelson

REMAP: Recursive Estimation and Maximization of A Posteriori Probabilities - Application to Transition-Based Connectionist Speech Recognition

In this paper, we introduce REMAP, an approach for the training and estimation of posterior probabilities using a recursive algorithm that is reminiscent of the EM-based Forward-Backward (Liporace 1982) algorithm for the estimation of sequence likelihoods. Although very general, the method is developed in the context of a statistical model for transition-based speech recognition using Artificial Neural Networks (ANN) to generate probabilities for Hidden Markov Models (HMMs). In the new approach, we use local conditional posterior probabilities of transitions to estimate global posterior probabilities of word sequences. Although we still use ANNs to estimate posterior probabilities, the network is trained with targets that are themselves estimates of local posterior probabilities. An initial experimental result shows a significant decrease in error-rate in comparison to a baseline system.

1 INTRODUCTION

The ultimate goal in speech recognition is to determine the sequence of words that has been uttered.

algorithm, posterior probability, probability, (11 more...)

Country:

Asia > Japan > Honshū > Kantō > Kanagawa Prefecture > Yokohama (0.05)
North America > United States > California > Alameda County > Berkeley (0.05)
North America > United States > Oregon (0.04)
Europe > Belgium (0.04)

Genre: Research Report > New Finding (0.89)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.91)

Quadratic-Type Lyapunov Functions for Competitive Neural Networks with Different Time-Scales

Meyer-Bäse, Anke

The dynamics of complex neural networks modelling the self-organization process in cortical maps must include the aspects of long and short-term memory. The behaviour of the network is thus characterized by an equation of neural activity as a fast phenomenon and an equation of synaptic modification as a slow part of the neural system. We present a quadratic-type Lyapunov function for the flow of a competitive neural system with fast and slow dynamic variables. We also show the consequences of the stability analysis on the neural net parameters.

1 INTRODUCTION

This paper investigates a special class of laterally inhibited neural networks. In particular, we have examined the dynamics of a restricted class of laterally inhibited neural networks from a rigorous analytic standpoint.

equilibrium point, lyapunov function, neural network, (11 more...)

Country: Europe > Germany > Hesse > Darmstadt Region > Darmstadt (0.05)

Genre: Research Report (0.55)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)