AITopics

The prior can be obtained by placing prior distributions on the weights in a neural 494 P. W Goldberg, C. K. L Williams and C. M. Bishop network, although we would argue that it is perhaps more natural to place priors directly overfunctions. One tractable way of doing this is to create a Gaussian process prior. This has the advantage that predictions can be made from the posterior using only matrix multiplication for fixed hyperparameters and a global noise level. In contrast, for neural networks (with fixed hyperparameters and a global noise level) it is necessary to use approximations or Markov chain Monte Carlo (MCMC) methods. Rasmussen(1996) has demonstrated that predictions obtained with Gaussian processes are as good as or better than other state-of-the art predictors. In much of the work on regression problems in the statistical and neural networks literatures, it is assumed that there is a global noise level, independent of the input vector x. The book by Bishop (1995) and the papers by Bishop (1994), MacKay (1995) and Bishop and Qazaz (1997) have examined the case of input-dependent noise for parametric models such as neural networks.

artificial intelligence, gaussian process, machine learning, (14 more...)

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > United Kingdom > England (0.14)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Hastie, Trevor, Tibshirani, Robert

Classification by Pairwise Coupling

Friedman's combination rule is quite intuitive: assign to the class that wins the most pairwise comparisons.

artificial intelligence, machine learning, procedure, (16 more...)

Country:

North America > Canada > Ontario > Toronto (0.15)
North America > United States > California > Santa Clara County (0.14)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.71)

Sykacek, Peter, Dorffner, Georg, Rappelsberger, Peter, Zeitlhofer, Josef

Experiences with Bayesian Learning in a Real World Application

Sleep staging is usually based on rules defined by Rechtschaffen and Kales (see [8]). Rechtschaffen and Kales rules define 4 sleep stages, stage one to four, as well as rapid eye movement (REM) and wakefulness. In [1] J. Bentrup and S. Ray report that every year nearly one million US citizens consulted their physicians concerning their sleep. Since sleep staging is a tedious task (one all night recording on average takes abou t 3 hours to score manually), much effort was spent in designing automatic sleep stagers. Sleep staging is a classification problem which was solved using classical statistical t.echniques or techniques emerged from the field of artificial intelligence (AI) . Among classical techniques especially the k nearest neighbor technique was used. In [1] J. Bentrup and S. Ray report that the classical technique outperformed their AI approaches. Among techniques from the field of AI, researchers used inductive learning to build tree based classifiers (e.g.

artificial intelligence, classifier, machine learning, (16 more...)

Country:

North America > United States (0.68)
North America > Canada > Ontario > Toronto (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)

Genre: Research Report (0.47)

Industry: Health & Medicine > Therapeutic Area (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.91)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.85)

Cross, Andrew D. J., Hancock, Edwin R.

Recovering Perspective Pose with a Dual Step EM Algorithm

This paper describes a new approach to extracting 3D perspective structure from 2D point-sets. The novel feature is to unify the tasks of estimating transformation geometry and identifying pointcorrespondence matches.Unification is realised by constructing a mixture model over the bipartite graph representing the correspondence matchand by effecting optimisation using the EM algorithm. According to our EM framework the probabilities of structural correspondence gatecontributions to the expected likelihood function used to estimate maximum likelihood perspective pose parameters. This provides a means of rejecting structural outliers.

artificial intelligence, bayesian inference, machine learning, (16 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.91)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.91)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.75)

Bonet, Jeremy S. De, Viola, Paul A.

A Non-Parametric Multi-Scale Statistical Model for Natural Images

The observed distribution of natural images is far from uniform. On the contrary, real images have complex and important structure thatcan be exploited for image processing, recognition and analysis. There have been many proposed approaches to the principled statisticalmodeling of images, but each has been limited in either the complexity of the models or the complexity of the images. Wepresent a nonparametric multi-scale statistical model for images that can be used for recognition, image de-noising, and in a "generative mode" to synthesize high quality textures.

artificial intelligence, machine learning, texture, (14 more...)

Country:

Europe (0.28)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.15)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.51)

Saul, Lawrence K., Rahim, Mazin G.

Modeling Acoustic Correlations by Factor Analysis

Hidden Markov models (HMMs) for automatic speech recognition rely on high dimensional feature vectors to summarize the shorttime propertiesof speech. Correlations between features can arise when the speech signal is non-stationary or corrupted by noise. We investigate how to model these correlations using factor analysis, a statistical method for dimensionality reduction. Factor analysis uses a small number of parameters to model the covariance structure ofhigh dimensional data. These parameters are estimated by an Expectation-Maximization (EM) algorithm that can be embedded inthe training procedures for HMMs.

artificial intelligence, factor analysis, machine learning, (12 more...)

Country: North America > Canada > Ontario > Toronto (0.15)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.91)

Mapping a Manifold of Perceptual Observations

Tenenbaum, Joshua B.

Nonlinear dimensionality reduction is formulated here as the problem of trying to find a Euclidean feature-space embedding of a set of observations that preserves as closely as possible their intrinsic metric structure - the distances between points on the observation manifold as measured along geodesic paths. Our isometric feature mapping procedure, or isomap, is able to reliably recover low-dimensional nonlinear structure in realistic perceptual data sets, such as a manifold of face images, where conventional global mapping methods find only local minima. The recovered map provides a canonical set of globally meaningful features, which allows perceptual transformations such as interpolation, extrapolation, and analogy - highly nonlinear transformations in the original observation space - to be computed with simple linear operations in feature space.

artificial intelligence, machine learning, manifold, (18 more...)

Country: North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Schwenk, Holger, Bengio, Yoshua

Training Methods for Adaptive Boosting of Neural Networks

"Boosting" is a general method for improving the performance of any learning algorithm that consistently generates classifiers which need to perform only slightly better than random guessing. A recently proposed and very promising boosting algorithm is AdaBoost [5]. It has been applied withgreat success to several benchmark machine learning problems using rather simple learning algorithms [4], and decision trees [1, 2, 6]. In this paper we use AdaBoost to improve the performances of neural networks. We compare training methods based on sampling the training set and weighting the cost function. Our system achieves about 1.4% error on a data base of online handwritten digits from more than 200 writers. Adaptive boosting of a multi-layer network achieved 1.5% error on the UCI Letters and 8.1 % error on the UCI satellite data set.

artificial intelligence, classifier, machine learning, (18 more...)

Country: North America > Canada (0.15)

Industry: Education (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.87)

Schölkopf, Bernhard, Simard, Patrice, Smola, Alex J., Vapnik, Vladimir

Prior Knowledge in Support Vector Kernels

We explore methods for incorporating prior knowledge about a problem at hand in Support Vector learning machines. We show that both invariances undergroup transfonnations and prior knowledge about locality in images can be incorporated by constructing appropriate kernel functions.

artificial intelligence, invariance, machine learning, (18 more...)

Country: North America > United States (0.69)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.63)

EM Algorithms for PCA and SPCA

Roweis, Sam T.

I present an expectation-maximization (EM) algorithm for principal component analysis (PCA). The algorithm allows a few eigenvectors and eigenvalues to be extracted from large collections of high dimensional data. It is computationally very efficient in space and time.

algorithm, artificial intelligence, machine learning, (16 more...)

Country: North America > United States (0.28)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)