AITopics

Country:

North America > United States > Oregon > Multnomah County > Portland (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.90)

Price, David, Knerr, Stefan, Personnaz, Léon, Dreyfus, Gérard

Pairwise Neural Network Classifiers with Probabilistic Outputs

Multi-class classification problems can be efficiently solved by partitioning the original problem into sub-problems involving only two classes: for each pair of classes, a (potentially small) neural network is trained using only the data of these two classes. We show how to combine the outputs of the two-class neural networks in order to obtain posterior probabilities for the class decisions. The resulting probabilistic pairwise classifier is part of a handwriting recognition system which is currently applied to check reading. We present results on real world data bases and show that, from a practical point of view, these results compare favorably to other neural network approaches.

classifier, posterior probability, probability, (10 more...)

Country: Europe > France > Île-de-France > Paris > Paris (0.05)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.86)

Price, David, Knerr, Stefan, Personnaz, Léon, Dreyfus, Gérard

Pairwise Neural Network Classifiers with Probabilistic Outputs

Multi-class classification problems can be efficiently solved by partitioning the original problem into sub-problems involving only two classes: for each pair of classes, a (potentially small) neural network is trained using only the data of these two classes. We show how to combine the outputs of the two-class neural networks in order to obtain posterior probabilities for the class decisions. The resulting probabilistic pairwise classifier is part of a handwriting recognition system which is currently applied to check reading. We present results on real world data bases and show that, from a practical point of view, these results compare favorably to other neural network approaches.

classifier, posterior probability, probability, (10 more...)

Country: Europe > France > Île-de-France > Paris > Paris (0.05)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.86)

Smyth, Padhraic, Fayyad, Usama M., Burl, Michael C., Perona, Pietro, Baldi, Pierre

Inferring Ground Truth from Subjective Labelling of Venus Images

Instead of "ground truth" one may only have the subjective opinion(s) of one or more experts. For example, medical data or image data may be collected off-line and some time later a set of experts analyze the data and produce a set of class labels. The central problem is that of trying to infer the "ground truth" given the noisy subjective estimates of the experts. When one wishes to apply a supervised learning algorithm to the data, the problem is primarily twofold: (i) how to evaluate the relative performance of experts and algorithms, and (ii) how to train a pattern recognition system in the absence of absolute ground truth. In this paper we focus on problem (i), namely the performance evaluation issue, and in particular we discuss the application of a particular modelling technique to the problem of counting volcanoes on the surface of Venus.

algorithm, probability, volcano, (13 more...)

Country:

North America > United States > California > Los Angeles County > Pasadena (0.05)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Technology:

Information Technology > Sensing and Signal Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.49)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.47)
(2 more...)

Stensmo, Magnus, Sejnowski, Terrence J.

A Mixture Model System for Medical and Machine Diagnosis

Diagnosis of human disease or machine fault is a missing data problem since many variables are initially unknown. Additional information needs to be obtained. The j oint probability distribution of the data can be used to solve this problem. We model this with mixture models whose parameters are estimated by the EM algorithm. This gives the benefit that missing data in the database itself can also be handled correctly. The request for new information to refine the diagnosis is performed using the maximum utility principle. Since the system is based on learning it is domain independent and less labor intensive than expert systems or probabilistic networks. An example using a heart disease database is presented.

diagnosis, mixture model system, probability, (14 more...)

Country:

Asia > Middle East > Jordan (0.05)
North America > United States > California > San Mateo County > San Mateo (0.05)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
(5 more...)

Industry: Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.74)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.47)

Kambhatla, Nanda, Leen, Todd K.

Classifying with Gaussian Mixtures and Clusters

In this paper, we derive classifiers which are winner-take-all (WTA) approximations to a Bayes classifier with Gaussian mixtures for class conditional densities. The derived classifiers include clustering based algorithms like LVQ and k-Means. We propose a constrained rank Gaussian mixtures model and derive a WTA algorithm for it. Our experiments with two speech classification tasks indicate that the constrained rank model and the WTA approximations improve the performance over the unconstrained models. 1 Introduction A classifier assigns vectors from Rn (n dimensional feature space) to one of K classes, partitioning the feature space into a set of K disjoint regions. A Bayesian classifier builds the partition based on a model of the class conditional probability densities of the inputs (the partition is optimal for the given model).

algorithm, classifier, discriminant function, (15 more...)

Country:

North America > United States > Oregon > Multnomah County > Portland (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.90)

Bishop, Chris M., Legleye, Claire

Estimating Conditional Probability Densities for Periodic Variables

Many applications of neural networks can be formulated in terms of a multivariate nonlinear mapping from an input vector x to a target vector t. A conventional neural network approach, based on least squares for example, leads to a network mapping which approximates the regression of t on x. A more complete description of the data can be obtained by estimating the conditional probability density of t, conditioned on x, which we write as p(tlx). Various techniques exist for modelling such densities when the target variables live in a Euclidean space. However, a number of potential applications involve angle-like output variables which are periodic on some finite interval (usually chosen to be (0,271")).

conditional probability density, kernel function, wind direction, (12 more...)

Country:

North America > United States > California > San Diego County > San Diego (0.04)
Europe > United Kingdom > England > West Midlands > Birmingham (0.04)
Asia > Middle East > Jordan (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.73)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.73)

Factorial Learning and the EM Algorithm

Ghahramani, Zoubin

Many real world learning problems are best characterized by an interaction of multiple independent causes or factors. Discovering such causal structure from the data is the focus of this paper. Based on Zemel and Hinton's cooperative vector quantizer (CVQ) architecture, an unsupervised learning algorithm is derived from the Expectation-Maximization (EM) framework. Due to the combinatorial nature of the data generation process, the exact E-step is computationally intractable. Two alternative methods for computing the E-step are proposed: Gibbs sampling and mean-field approximation, and some promising empirical results are presented.

algorithm, e-step, mean-field approximation, (13 more...)

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.97)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.69)

Sung, Kah Kay, Niyogi, Partha

Active Learning for Function Approximation

We develop a principled strategy to sample a function optimally for function approximation tasks within a Bayesian framework. Using ideas from optimal experiment design, we introduce an objective function (incorporating both bias and variance) to measure the degree of approximation, and the potential utility of the data points towards optimizing this objective. We show how the general strategy can be used to derive precise algorithms to select data for two cases: learning unit step functions and polynomial functions. In particular, we investigate whether such active algorithms can learn the target with fewer examples. We obtain theoretical and empirical results to suggest that this is the case. 1 INTRODUCTION AND MOTIVATION Learning from examples is a common supervised learning paradigm that hypothesizes a target concept given a stream of training examples that describes the concept. In function approximation, example-based learning can be formulated as synthesizing an approximation function for data sampled from an unknown target function (Poggio and Girosi, 1990). Active learning describes a class of example-based learning paradigms that seeks out new training examples from specific regions of the input space, instead of passively accepting examples from some data generating source.

active learning, learner, target function, (14 more...)

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.15)
North America > United States > New York (0.04)
North America > United States > California > San Mateo County > San Mateo (0.04)
North America > United States > California > San Diego County > San Diego (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Fuzzy Logic (0.83)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)

Ueda, Naonori, Nakano, Ryohei

Deterministic Annealing Variant of the EM Algorithm

We present a deterministic annealing variant of the EM algorithm for maximum likelihood parameter estimation problems. In our approach, the EM process is reformulated as the problem of minimizing the thermodynamic free energy by using the principle of maximum entropy and statistical mechanics analogy. Unlike simulated annealing approaches, this minimization is deterministically performed. Moreover, the derived algorithm, unlike the conventional EM algorithm, can obtain better estimates free of the initial parameter values.

algorithm, deterministic annealing variant, estimation problem, (14 more...)