Controlling the Complexity of HMM Systems by Regularization

Neural Information Processing Systems

This paper introduces a method for regularization of HMM systems that avoids the parameter overfitting caused by insufficient training data. Regularization is done by augmenting the EM training method with a penalty term that favors simple and smooth HMM systems. The penalty term is constructed as a mixture model of negative exponential distributions that is assumed to generate the state-dependent emission probabilities of the HMMs. This new method is a successful transfer of a well-known regularization approach for neural networks to the HMM domain and can be interpreted as a generalization of traditional state tying for HMM systems. The effect of regularization is demonstrated on continuous speech recognition tasks by improving overfitted triphone models and by speaker adaptation with limited training data.

1 Introduction

One general problem when constructing statistical pattern recognition systems is to ensure the capability to generalize well, i.e. the system must be able to classify data that is not contained in the training data set.
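As a rough illustration of the abstract's central idea, the sketch below shows a penalized M-step in which each state's emission distribution is shrunk toward a mixture of prototype distributions, a soft, generalized form of state tying. The shrinkage form, the L1 distance, and all names are assumptions made for illustration, not the paper's actual formulation.

    import numpy as np

    # Hedged sketch of a penalized M-step for HMM emission probabilities.
    # expected_counts: (n_states, n_symbols) expected emission counts from
    # the E-step; prototypes: (n_proto, n_symbols) prototype distributions;
    # lam: regularization weight. All of this is illustrative notation.

    def penalized_m_step(expected_counts, prototypes, lam):
        """Blend each state's ML emission estimate with a responsibility-
        weighted prototype, mimicking soft, generalized state tying."""
        B = expected_counts / expected_counts.sum(axis=1, keepdims=True)
        out = np.empty_like(B)
        for i, b in enumerate(B):
            # negative-exponential weights favor the nearest prototypes
            d = np.abs(prototypes - b).sum(axis=1)    # L1 distances
            w = np.exp(-d / lam); w /= w.sum()        # responsibilities
            target = w @ prototypes                   # blended prototype
            n_i = expected_counts[i].sum()
            out[i] = (n_i * b + lam * target) / (n_i + lam)  # shrinkage
        return out

Each output row still sums to one, so the result remains a valid emission distribution; as lam grows, states collapse toward shared prototypes, recovering hard state tying in the limit.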


Viewing Classifier Systems as Model Free Learning in POMDPs

Neural Information Processing Systems

Classifier systems are now viewed as disappointing because of problems such as the rule strength vs. rule set performance problem and the credit assignment problem. In order to solve these problems, we have developed a hybrid classifier system: GLS (Generalization Learning System). In designing GLS, we view CSs as model-free learning in POMDPs and take a hybrid approach to finding the best generalization, given the total number of rules. GLS uses the policy improvement procedure of Jaakkola et al. to obtain a locally optimal stochastic policy when a set of rule conditions is given, and uses a GA to search for the best set of rule conditions.

1 INTRODUCTION

Classifier systems (CSs) (Holland 1986) have been among the most widely used approaches in reinforcement learning.
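The hybrid structure described above can be pictured with a toy sketch: an outer GA searches over rule-condition bit masks, while an inner evaluation scores each rule set. The fitness function below is only a stub standing in for the Jaakkola et al. policy-improvement step; all names and the toy objective are invented.

    import random

    N_RULES, COND_BITS, POP, GENS = 4, 8, 20, 30

    def random_individual():
        return [[random.randint(0, 1) for _ in range(COND_BITS)]
                for _ in range(N_RULES)]

    def fitness(ind):
        # Stub for the inner step: fix the rule conditions, run policy
        # improvement to a locally optimal stochastic policy, and return
        # its performance. Here: a meaningless toy objective.
        return sum(sum(rule) for rule in ind)

    def crossover(a, b):
        cut = random.randrange(1, N_RULES)
        return a[:cut] + b[cut:]

    def mutate(ind, p=0.05):
        return [[bit ^ (random.random() < p) for bit in rule]
                for rule in ind]

    pop = [random_individual() for _ in range(POP)]
    for _ in range(GENS):
        pop.sort(key=fitness, reverse=True)
        elite = pop[:POP // 2]
        pop = elite + [mutate(crossover(random.choice(elite),
                                        random.choice(elite)))
                       for _ in range(POP - len(elite))]
    best = max(pop, key=fitness)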



Multi-Electrode Spike Sorting by Clustering Transfer Functions

Neural Information Processing Systems

Since every electrode is in a different position, it will measure a different contribution from each of the different neurons. Simply stated, the problem is this: how can these complex signals be untangled to determine when each individual cell fired? This problem is difficult because, a) the objects being classified are very similar and often noisy, b) spikes coming from the same cell can …
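One way to make the "transfer function" intuition concrete: summarize each spike by its peak amplitude on every electrode, normalize away overall spike size, and cluster the remaining ratio patterns, since that per-cell pattern is fixed by geometry. The sketch below does this on synthetic data with k-means as a generic stand-in; it is not the paper's actual algorithm.

    import numpy as np
    from sklearn.cluster import KMeans

    rng = np.random.default_rng(0)
    n_spikes, n_electrodes, n_cells = 300, 4, 3

    # synthetic ground truth: each cell has a fixed gain per electrode
    true_gain = rng.uniform(0.2, 1.0, size=(n_cells, n_electrodes))
    labels = rng.integers(n_cells, size=n_spikes)
    amps = (true_gain[labels] * rng.uniform(0.8, 1.2, (n_spikes, 1))
            + 0.05 * rng.normal(size=(n_spikes, n_electrodes)))

    # keep only the ratio pattern across electrodes, then cluster it
    patterns = amps / np.linalg.norm(amps, axis=1, keepdims=True)
    assignments = KMeans(n_clusters=n_cells, n_init=10).fit_predict(patterns)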




Learning Multi-Class Dynamics

Neural Information Processing Systems

Standard techniques (e.g. Yule-Walker) are available for learning Auto-Regressive process models of simple, directly observable, dynamical processes. When sensor noise means that dynamics are observed only approximately, learning can still be achieved via Expectation-Maximisation (EM) together with Kalman Filtering. However, this does not handle more complex dynamics involving multiple classes of motion.
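For the fully observable baseline the abstract mentions, a Yule-Walker fit of an AR(p) model takes only a few lines. This is a generic textbook implementation, not the paper's multi-class learner.

    import numpy as np

    def yule_walker(x, p):
        """Fit AR(p) coefficients by solving the Yule-Walker equations."""
        x = x - x.mean()
        r = np.array([np.dot(x[:len(x)-k], x[k:]) / len(x)
                      for k in range(p + 1)])          # autocovariances
        R = np.array([[r[abs(i - j)] for j in range(p)] for i in range(p)])
        a = np.linalg.solve(R, r[1:])                  # AR coefficients
        sigma2 = r[0] - a @ r[1:]                      # noise variance
        return a, sigma2

    rng = np.random.default_rng(1)
    x = np.zeros(2000)
    for t in range(2, len(x)):                         # simulate AR(2)
        x[t] = 0.6 * x[t-1] - 0.3 * x[t-2] + rng.normal()
    a_hat, s2 = yule_walker(x, p=2)                    # near (0.6, -0.3)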


Multiple Paired Forward-Inverse Models for Human Motor Learning and Control

Neural Information Processing Systems

Humans demonstrate a remarkable ability to generate accurate and appropriate motor behavior under many different and often uncertain environmental conditions. This paper describes a new modular approach to human motor learning and control, based on multiple pairs of inverse (controller) and forward (predictor) models. This architecture simultaneously learns the multiple inverse models necessary for control as well as how to select the inverse models appropriate for a given environment. Simulations of object manipulation demonstrate the ability to learn multiple objects, appropriate generalization to novel objects, and the inappropriate activation of motor programs based on visual cues, followed by online correction, seen in the "size-weight illusion".
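The selection mechanism can be sketched as follows: each forward model predicts the sensory consequence of the last motor command, and a soft-max over prediction errors gates how much each paired inverse model contributes to the next command. The scalar dynamics and all names below are invented for illustration; this is not the paper's simulation.

    import numpy as np

    def responsibilities(predictions, observed, sigma=0.1):
        """Soft-max weights from forward-model prediction errors."""
        err2 = (predictions - observed) ** 2
        w = np.exp(-err2 / (2 * sigma ** 2))
        return w / w.sum()

    # two hypothetical "objects" with different (toy) dynamics
    forward_models = [lambda u: 1.0 * u, lambda u: 0.4 * u]   # predictors
    inverse_models = [lambda x: x / 1.0, lambda x: x / 0.4]   # controllers

    u_prev, x_obs, x_des = 1.0, 0.42, 0.5   # last command, sensed, goal
    preds = np.array([f(u_prev) for f in forward_models])
    lam = responsibilities(preds, x_obs)    # second model should dominate
    u = sum(l * inv(x_des) for l, inv in zip(lam, inverse_models))

Because the observed state (0.42) is close to the second forward model's prediction (0.4), that pair receives nearly all of the responsibility, so its inverse model dominates the blended command.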


Classification on Pairwise Proximity Data

Neural Information Processing Systems

We investigate the problem of learning a classification task on data represented in terms of their pairwise proximities. This representation does not refer to an explicit feature representation of the data items and is thus more general than the standard approach of using Euclidean feature vectors, from which pairwise proximities can always be calculated. Our first approach is based on a combined linear embedding and classification procedure, resulting in an extension of the Optimal Hyperplane algorithm to pseudo-Euclidean data. As an alternative, we present another approach based on a linear threshold model in the proximity values themselves, which is optimized using Structural Risk Minimization. We show that prior knowledge about the problem can be incorporated by the choice of distance measures and examine different metrics w.r.t.
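The embedding route can be sketched generically: double-center the squared proximity matrix, eigendecompose it, and keep coordinates scaled by the square roots of the absolute eigenvalues, yielding a pseudo-Euclidean embedding in which an ordinary linear classifier can then be trained. This is the standard construction, not necessarily the paper's exact procedure; the toy data and names are assumptions.

    import numpy as np

    def pseudo_euclidean_embedding(D, dim):
        """Embed a symmetric proximity matrix D into `dim` coordinates."""
        n = D.shape[0]
        J = np.eye(n) - np.ones((n, n)) / n
        B = -0.5 * J @ (D ** 2) @ J              # Gram-like matrix
        vals, vecs = np.linalg.eigh(B)
        order = np.argsort(-np.abs(vals))[:dim]  # largest |eigenvalues|
        return vecs[:, order] * np.sqrt(np.abs(vals[order]))

    # toy proximity data: distances between points from two classes
    rng = np.random.default_rng(2)
    X = np.vstack([rng.normal(0, 1, (20, 3)), rng.normal(3, 1, (20, 3))])
    D = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
    Z = pseudo_euclidean_embedding(D, dim=2)     # fit any linear model on Z

Negative eigenvalues, which arise when the proximities are not Euclidean, are what make the embedding pseudo-Euclidean rather than a plain classical MDS.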