AITopics | Statistical Learning

Collaborating Authors

Statistical Learning

News Overviews Instructional Materials AI-Alerts Classics

Improved Heterogeneous Distance Functions

Journal of Artificial Intelligence ResearchJan-1-1997

Instance-based learning techniques typically handle continuous and linear input values well, but often do not handle nominal input attributes appropriately. The Value Difference Metric (VDM) was designed to find reasonable distance values between nominal attribute values, but it largely ignores continuous attributes, requiring discretization to map continuous values into nominal values. This paper proposes three new heterogeneous distance functions, called the Heterogeneous Value Difference Metric (HVDM), the Interpolated Value Difference Metric (IVDM), and the Windowed Value Difference Metric (WVDM). These new distance functions are designed to handle applications with nominal attributes, continuous attributes, or both. In experiments on 48 applications the new distance metrics achieve higher classification accuracy on average than three previous distance functions on those datasets that have both nominal and continuous attributes.

accuracy, dataset, distance function, (14 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.346

AI Access Foundation

10182

Journal of Artificial Intelligence Research

Country:

North America > United States > California > Orange County > Irvine (0.14)
Europe > Austria > Vienna (0.14)
North America > United States > California > San Mateo County > San Mateo (0.04)
(9 more...)

Genre: Research Report > New Finding (0.67)

Industry: Health & Medicine > Therapeutic Area (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.34)

Add feedback

EM Optimization of Latent-Variable Density Models

Bishop, Christopher M., Svensén, Markus, Williams, Christopher K. I.

Neural Information Processing SystemsDec-31-1996

There is currently considerable interest in developing general nonlinear densitymodels based on latent, or hidden, variables. Such models have the ability to discover the presence of a relatively small number of underlying'causes' which, acting in combination, give rise to the apparent complexity of the observed data set. Unfortunately, totrain such models generally requires large computational effort. In this paper we introduce a novel latent variable algorithm which retains the general nonlinear capabilities of previous models but which uses a training procedure based on the EM algorithm. We demonstrate the performance of the model on a toy problem and on data from flow diagnostics for a multiphase oil pipeline.

algorithm, artificial intelligence, midstream oil & gas, (18 more...)

Neural Information Processing Systems

Country: Europe > United Kingdom (0.14)

Industry: Energy > Oil & Gas > Midstream (0.74)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.70)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.48)

Add feedback

EM Optimization of Latent-Variable Density Models

Bishop, Christopher M., Svensén, Markus, Williams, Christopher K. I.

Neural Information Processing SystemsDec-31-1996

There is currently considerable interest in developing general nonlinear density models based on latent, or hidden, variables. Such models have the ability to discover the presence of a relatively small number of underlying'causes' which, acting in combination, give rise to the apparent complexity of the observed data set. Unfortunately, to train such models generally requires large computational effort. In this paper we introduce a novel latent variable algorithm which retains the general nonlinear capabilities of previous models but which uses a training procedure based on the EM algorithm. We demonstrate the performance of the model on a toy problem and on data from flow diagnostics for a multiphase oil pipeline.

algorithm, artificial intelligence, midstream oil & gas, (18 more...)

Neural Information Processing Systems

Country: Europe > United Kingdom (0.14)

Industry: Energy > Oil & Gas > Midstream (0.74)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.70)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.48)

Add feedback

EM Optimization of Latent-Variable Density Models

Bishop, Christopher M., Svensén, Markus, Williams, Christopher K. I.

Neural Information Processing SystemsDec-31-1996

There is currently considerable interest in developing general nonlinear density models based on latent, or hidden, variables. Such models have the ability to discover the presence of a relatively small number of underlying'causes' which, acting in combination, give rise to the apparent complexity of the observed data set. Unfortunately, to train such models generally requires large computational effort. In this paper we introduce a novel latent variable algorithm which retains the general nonlinear capabilities of previous models but which uses a training procedure based on the EM algorithm. We demonstrate the performance of the model on a toy problem and on data from flow diagnostics for a multiphase oil pipeline.

algorithm, artificial intelligence, midstream oil & gas, (18 more...)

Neural Information Processing Systems

Country: Europe > United Kingdom (0.14)

Industry: Energy > Oil & Gas > Midstream (0.74)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.70)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.48)

Add feedback

Estimating the Bayes Risk from Sample Data

Snapp, Robert R., Xu, Tong

Neural Information Processing SystemsDec-31-1996

In this setting, each pattern, represented as an n-dimensional feature vector, is associated with a discrete pattern class, or state of nature (Duda and Hart, 1973). Using available information, (e.g., a statistically representative set of labeled feature vectors

artificial intelligence, classifier, machine learning, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > Vermont > Chittenden County > Burlington (0.14)
North America > United States > New York > New York County > New York City (0.05)
North America > United States > Texas (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.62)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)

Add feedback

Gradient and Hamiltonian Dynamics Applied to Learning in Neural Networks

Howse, James W., Abdallah, Chaouki T., Heileman, Gregory L.

Neural Information Processing SystemsDec-31-1996

James W. Howse Chaouki T. Abdallah Gregory L. Heileman Department of Electrical and Computer Engineering University of New Mexico Albuquerque, NM 87131 Abstract The process of machine learning can be considered in two stages: model selection and parameter estimation. In this paper a technique is presented for constructing dynamical systems with desired qualitative properties. The approach is based on the fact that an n-dimensional nonlinear dynamical system can be decomposed into one gradient and (n - 1) Hamiltonian systems. Thus, the model selection stage consists of choosing the gradient and Hamiltonian portions appropriately so that a certain behavior is obtainable. To estimate the parameters, a stably convergent learning rule is presented.

equation, level surface, trajectory, (12 more...)

Neural Information Processing Systems

Country:

North America > United States > New Mexico > Bernalillo County > Albuquerque (0.24)
North America > United States > California > San Diego County > San Diego (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.74)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)

Add feedback

Recursive Estimation of Dynamic Modular RBF Networks

Kadirkamanathan, Visakan, Kadirkamanathan, Maha

Neural Information Processing SystemsDec-31-1996

In this paper, recursive estimation algorithms for dynamic modular networks are developed. The models are based on Gaussian RBF networks and the gating network is considered in two stages: At first, it is simply a time-varying scalar and in the second, it is based on the state, as in the mixture of local experts scheme. The resulting algorithm uses Kalman filter estimation for the model estimation and the gating probability estimation. Both, 'hard' and'soft' competition based estimation schemes are developed where in the former, the most probable network is adapted and in the latter all networks are adapted by appropriate weighting of the data. 1 INTRODUCTION The problem of learning multiple modes in a complex nonlinear system is increasingly being studied by various researchers [2, 3, 4, 5, 6], The use of a mixture of local experts [5, 6], and a conditional mixture density network [3] have been developed to model various modes of a system. The development has mainly been on model estimation from a given set of block data, with the model likelihood dependent on the input to the networks.

algorithm, estimation, probability, (14 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Jordan (0.05)
North America > United States > New York (0.04)
Europe > United Kingdom (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.51)

Add feedback

Memory-based Stochastic Optimization

Moore, Andrew W., Schneider, Jeff G.

Neural Information Processing SystemsDec-31-1996

In this paper we introduce new algorithms for optimizing noisy plants in which each experiment is very expensive. The algorithms build a global nonlinear model of the expected output at the same time as using Bayesian linear regression analysis of locally weighted polynomial models. The local model answers queries about confidence, noise, gradient and Hessians, and use them to make automated decisions similar to those made by a practitioner of Response Surface Methodology. The global and local models are combined naturally as a locally weighted regression. We examine the question of whether the global model can really help optimization, and we extend it to the case of time-varying functions. We compare the new algorithms with a highly tuned higher-order stochastic optimization algorithm on randomly-generated functions and a simulated manufacturing task. We note significant improvements in total regret, time to converge, and final solution quality. 1 INTRODUCTION In a stochastic optimization problem, noisy samples are taken from a plant. A sample consists of a chosen control u (a vector ofreal numbers) and a noisy observed response y.

algorithm, experiment, optimization, (15 more...)

Neural Information Processing Systems

Country: North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.69)

Add feedback

Boosting Decision Trees

Drucker, Harris, Cortes, Corinna

Neural Information Processing SystemsDec-31-1996

We introduce a constructive, incremental learning system for regression problems that models data by means of locally linear experts. In contrast to other approaches, the experts are trained independently and do not compete for data during learning. Only when a prediction for a query is required do the experts cooperate by blending their individual predictions. Each expert is trained by minimizing a penalized local cross validation error using second order methods. In this way, an expert is able to find a local distance metric by adjusting the size and shape of the receptive field in which its predictions are valid, and also to detect relevant input features by adjusting its bias on the importance of individual input dimensions. We derive asymptotic results for our method. In a variety of simulations the properties of the algorithm are demonstrated with respect to interference, learning speed, prediction accuracy, feature detection, and task oriented incremental learning.

algorithm, ensemble, weak learner, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Alameda County > Berkeley (0.14)
Asia > Middle East > Jordan (0.05)
North America > United States > New York (0.04)
(5 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (0.68)

Add feedback

Constructive Algorithms for Hierarchical Mixtures of Experts

Waterhouse, Steve R., Robinson, Anthony J.

Neural Information Processing SystemsDec-31-1996

By applying a likelihood splitting criteria to each expert in the HME we "grow" the tree adaptively during training. Secondly, by considering only the most probable path through the tree we may "prune" branches away, either temporarily, or permanently if they become redundant. We demonstrate results for the growing and path pruning algorithms which show significant speed ups and more efficient use of parameters over the standard fixed structure in discriminating between two interlocking spirals and classifying 8-bit parity patterns. INTRODUCTION The HME (Jordan & Jacobs 1994) is a tree structured network whose terminal nodes are simple function approximators in the case of regression or classifiers in the case of classification. The outputs of the terminal nodes or experts are recursively combined upwards towards the root node, to form the overall output of the network, by "gates" which are situated at the non-terminal nodes.

constructive algorithm, hierarchical mixture, node, (15 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Jordan (0.25)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > New Mexico > Santa Fe County > Santa Fe (0.04)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)

Add feedback