AITopics

Yuansong Liao and John Moody Department of Computer Science, Oregon Graduate Institute, P.O.Box 91000, Portland, OR 97291-1000 Abstract The committee approach has been proposed for reducing model uncertainty and improving generalization performance. The advantage of committees depends on (1) the performance of individual members and (2) the correlational structure of errors between members. This paper presents an input grouping technique for designing a heterogeneous committee. With this technique, all input variables are first grouped based on their mutual information. Statistically similar variables are assigned to the same group.

committee member, information, input feature, (14 more...)

Country:

North America > United States > Oregon > Multnomah County > Portland (0.24)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > Netherlands (0.04)

Genre: Research Report (0.69)

Industry: Banking & Finance > Economy (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Forecasting (0.42)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.30)

Learning the Similarity of Documents: An Information-Geometric Approach to Document Retrieval and Categorization

Hofmann, Thomas

The project pursued in this paper is to develop from first information-geometric principles a general method for learning the similarity between text documents. Each individual document is modeled as a memoryless information source. Based on a latent class decomposition of the term-document matrix, a lowdimensional (curved) multinomial subfamily is learned. From this model a canonical similarity function - known as the Fisher kernel - is derived. Our approach can be applied for unsupervised and supervised learning problems alike.

Country:

North America > United States > New York (0.05)
South America > Brazil (0.04)
North America > United States > Rhode Island > Providence County > Providence (0.04)
(2 more...)

Industry: Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

III, John W. Fisher, Ihler, Alexander T., Viola, Paul A.

Learning Informative Statistics: A Nonparametnic Approach

We discuss an information theoretic approach for categorizing and modeling dynamic processes. The approach can learn a compact and informative statistic which summarizes past states to predict future observations. Furthermore, the uncertainty of the prediction is characterized nonparametrically by a joint density over the learned statistic and present observation. We discuss the application of the technique to both noise driven dynamical systems and random processes sampled from a density which is conditioned on the past. In the first case we show results in which both the dynamics of random walk and the statistics of the driving noise are captured. In the second case we present results in which a summarizing statistic is learned on noisy random telegraph waves with differing dependencies on past states. In both cases the algorithm yields a principled approach for discriminating processes with differing dynamics and/or dependencies. The method is grounded in ideas from information theory and nonparametric statistics.

learning informative statistics, statistic, statistics, (15 more...)

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > United States > New York (0.05)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.70)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.69)

Bakker, Rembrandt, Schouten, Jaap C., Coppens, Marc-Olivier, Takens, Floris, Giles, C. Lee, Bleek, Cor M. van den

Robust Learning of Chaotic Attractors

A fundamental problem with the modeling of chaotic time series data is that minimizing short-term prediction errors does not guarantee a match between the reconstructed attractors of model and experiments. We introduce a modeling paradigm that simultaneously learns to short-tenn predict and to locate the outlines of the attractor by a new way of nonlinear principal component analysis. Closed-loop predictions are constrained to stay within these outlines, to prevent divergence from the attractor. Learning is exceptionally fast: parameter estimation for the 1000 sample laser data from the 1991 Santa Fe time series competition took less than a minute on a 166 MHz Pentium PC.

algorithm, attractor, subspace, (12 more...)

Country:

Europe > Netherlands > South Holland > Delft (0.05)
Asia > Middle East > Jordan (0.05)
North America > United States > New Mexico > Los Alamos County > Los Alamos (0.04)
Europe > Netherlands > North Brabant > Eindhoven (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Yang, Ming-Hsuan, Roth, Dan, Ahuja, Narendra

A SNoW-Based Face Detector

A novel learning approach for human face detection using a network of linear units is presented. The SNoW learning architecture is a sparse network of linear functions over a predefined or incrementally learned feature space and is specifically tailored for learning in the presence of a very large number of features. A wide range of face images in different poses, with different expressions and under different lighting conditions are used as a training set to capture the variations of human faces. Experimental results on commonly used benchmark data sets of a wide range of face images show that the SNoW-based approach outperforms methods that use neural networks, Bayesian methods, support vector machines and others. Furthermore, learning and evaluation using the SNoW-based method are significantly more efficient than with other methods. 1 Introduction Growing interest in intelligent human computer interactions has motivated a recent surge in research on problems such as face tracking, pose estimation, face expression and gesture recognition. Most methods, however, assume human faces in their input images have been detected and localized.

architecture, detection, face detection, (12 more...)

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > Illinois > Champaign County > Urbana (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.47)

Technology:

Information Technology > Artificial Intelligence > Vision > Face Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.56)

Smith, Gavin, Freitas, João F. G. de, Robinson, Tony, Niranjan, Mahesan

Speech Modelling Using Subspace and EM Techniques

The speech waveform can be modelled as a piecewise-stationary linear stochastic state space system, and its parameters can be estimated using an expectation-maximisation (EM) algorithm. One problem is the initialisation of the EM algorithm. Standard initialisation schemes can lead to poor formant trajectories. But these trajectories however are important for vowel intelligibility. The aim of this paper is to investigate the suitability of subspace identification methods to initialise EM. The paper compares the subspace state space system identification (4SID) method with the EM algorithm. The 4SID and EM methods are similar in that they both estimate a state sequence (but using Kalman ters fil and Kalman smoothers respectively), and then estimate parameters (but using least-squares and maximum likelihood respectively).

algorithm, formant trajectory, speech modelling, (12 more...)

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.05)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > Canada > Ontario > Toronto (0.04)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.79)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.77)

Schraudolph, Nicol N., Giannakopoulos, Xavier

Online Independent Component Analysis with Local Learning Rate Adaptation

Stochastic meta-descent (SMD) is a new technique for online adaptation of local learning rates in arbitrary twice-differentiable systems. Like matrix momentum it uses full second-order information while retaining O(n) computational complexity by exploiting the efficient computation of Hessian-vector products. Here we apply SMD to independent component analysis, and employ the resulting algorithm for the blind separation of time-varying mixtures. By matching individual learning rates to the rate of change in each source signal's mixture coefficients, our technique is capable of simultaneously tracking sources that move at very different, a priori unknown speeds.

algorithm, learning rate, neural network, (10 more...)

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.05)
North America > United States > Texas > Harris County > Houston (0.05)
(8 more...)

Industry:

Education (0.88)
Government (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.32)

Yang, Howard Hua, Moody, John

Data Visualization and Feature Selection: New Algorithms for Nongaussian Data

Visualization of input data and feature selection are intimately related. A good feature selection algorithm can identify meaningful coordinate projections for low dimensional data visualization. Conversely, a good visualization technique can suggest meaningful features to include in a model. Input variable selection is the most important step in the model selection process. Given a target variable, a set of input variables can be selected as explanatory variables by some prior knowledge.

conditional mi, information, mutual information, (14 more...)

Country:

North America > United States > Oregon (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Colorado (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

Williams, Christopher K. I.

A MCMC Approach to Hierarchical Mixture Modelling

There are many hierarchical clustering algorithms available, but these lack a firm statistical basis. Here we set up a hierarchical probabilistic mixture model, where data is generated in a hierarchical tree-structured manner. Markov chain Monte Carlo (MCMC) methods are demonstrated which can be used to sample from the posterior distribution over trees containing variable numbers of hidden units.

configuration, node, probability, (17 more...)

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.15)
Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States > California > San Mateo County > San Mateo (0.04)
Europe > United Kingdom > Scotland (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.70)

Wan, Eric A., Merwe, Rudolph van der, Nelson, Alex T.

Dual Estimation and the Unscented Transformation

Dual estimation refers to the problem of simultaneously estimating the state of a dynamic system and the model which gives rise to the dynamics. Algorithms include expectation-maximization (EM), dual Kalman filtering, and joint Kalman methods. These methods have recently been explored in the context of nonlinear modeling, where a neural network is used as the functional form of the unknown model. Typically, an extended Kalman filter (EKF) or smoother is used for the part of the algorithm that estimates the clean state given the current estimated model. An EKF may also be used to estimate the weights of the network. This paper points out the flaws in using the EKF, and proposes an improvement based on a new approach called the unscented transformation (UT) [3]. A substantial performance gain is achieved with the same order of computational complexity as that of the standard EKF. The approach is illustrated on several dual estimation methods.

algorithm, ekf, estimation, (16 more...)

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > Oregon > Washington County > Beaverton (0.04)
North America > United States > Florida > Orange County > Orlando (0.04)
North America > United States > California > San Mateo County > San Mateo (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)