AITopics

Bumptrees are geometric data structures introduced by Omohundro (1991) to provide efficient access to a collection of functions on a Euclidean space of interest. We describe a modified bumptree structure that has been employed as a neural network classifier, and compare its performance on several classification tasks against that of radial basis function networks and the standard mutIi-Iayer perceptron. 1 INTRODUCTION A number of neural network studies have demonstrated the utility of the multi-layer perceptron (MLP) and shown it to be a highly effective paradigm. Studies have also shown, however, that the MLP is not without its problems, in particular it requires an extensive training time, is susceptible to local minima problems and its perfonnance is dependent upon its internal network architecture. In an attempt to improve upon the generalisation performance and computational efficiency a number of studies have been undertaken principally concerned with investigating the parametrisation of the MLP. It is well known, for example, that the generalisation performance of the MLP is affected by the number of hidden units in the network, which have to be determined empirically since theory provides no guidance.

bumptree, problem space, rbf network, (12 more...)

Country:

Europe > United Kingdom > England (0.05)
Europe > Netherlands > North Brabant > Eindhoven (0.05)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (1.00)

Ginzburg, Iris, Horn, David

Combined Neural Networks for Time Series Analysis

We propose a method for improving the performance of any network designed to predict the next value of a time series. Vve advocate analyzing the deviations of the network's predictions from the data in the training set. This can be carried out by a secondary network trained on the time series of these residuals. The combined system of the two networks is viewed as the new predictor. We demonstrate the simplicity and success of this method, by applying it to the sunspots data. The small corrections of the secondary network can be regarded as resulting from a Taylor expansion of a complex network which includes the combined system.

prediction, primary network, secondary network, (14 more...)

Country: Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Time Series Analysis (0.42)

Dietterich, Thomas G., Jain, Ajay N., Lathrop, Richard H., Lozano-Pérez, Tomás

A Comparison of Dynamic Reposing and Tangent Distance for Drug Activity Prediction

The task of drug activity prediction is to predict the activity of proposed drug compounds by learning from the observed activity of previously-synthesized drug compounds. Accurate drug activity prediction can save substantial time and money by focusing the efforts of chemists and biologists on the synthesis and testing of compounds whose predicted activity is high. If the requirements for highly active binding can be displayed in three dimensions, chemists can work from such displays to design new compounds having high predicted activity. Drug molecules usually act by binding to localized sites on large receptor molecules or large enyzme molecules. One reasonable way to represent drug molecules is to capture the location of their surface in the (fixed) frame of reference of the (hypothesized) binding site.

drug activity prediction, dynamic reposing and tangent distance, molecule, (9 more...)

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > California > San Mateo County > San Mateo (0.05)
North America > United States > Oregon > Benton County > Corvallis (0.04)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.33)
Information Technology > Artificial Intelligence > Vision > Optical Character Recognition (0.31)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.31)

Bayesian Backprop in Action: Pruning, Committees, Error Bars and an Application to Spectroscopy

Thodberg, Hans Henrik

MacKay's Bayesian framework for backpropagation is conceptually appealing as well as practical. It automatically adjusts the weight decay parameters during training, and computes the evidence for each trained network. The evidence is proportional to our belief in the model. In this paper, the framework is extended to pruned nets, leading to an Ockham Factor for "tuning the architecture to the data". A committee of networks, selected by their high evidence, is a natural Bayesian construction.

bayesian backprop, ockham factor, weight decay parameter, (12 more...)

Country: North America > United States > Ohio (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.90)

Robust Parameter Estimation and Model Selection for Neural Network Regression

Liu, Yong

In this paper, it is shown that the conventional back-propagation (BPP) algorithm for neural network regression is robust to leverages (data with:n corrupted), but not to outliers (data with y corrupted). A robust model is to model the error as a mixture of normal distribution. The influence function for this mixture model is calculated and the condition for the model to be robust to outliers is given. EM algorithm [5] is used to estimate the parameter. The usefulness of model selection criteria is also discussed.

algorithm, influence function, outlier, (9 more...)

Country:

North America > United States > Rhode Island > Providence County > Providence (0.04)
North America > United States > New York (0.04)
North America > United States > California (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Wettschereck, Dietrich, Dietterich, Thomas G.

Locally Adaptive Nearest Neighbor Algorithms

Four versions of a k-nearest neighbor algorithm with locally adaptive k are introduced and compared to the basic k-nearest neighbor algorithm (kNN). Locally adaptive kNN algorithms choose the value of k that should be used to classify a query by consulting the results of cross-validation computations in the local neighborhood of the query. Local kNN methods are shown to perform similar to kNN in experiments with twelve commonly used data sets. Encouraging results in three constructed tasks show that local methods can significantly outperform kNN in specific applications. Local methods can be recommended for online learning and for applications where different regions of the input space are covered by patterns solving different sub-tasks.

algorithm, experiment, knn, (15 more...)

Country:

North America > United States > California > Orange County > Irvine (0.05)
North America > United States > Oregon > Benton County > Corvallis (0.04)
North America > United States > California > San Mateo County > San Mateo (0.04)

Genre: Research Report > Experimental Study (0.47)

Industry: Education (0.35)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (1.00)

Ron, Dana, Singer, Yoram, Tishby, Naftali

The Power of Amnesia

We propose a learning algorithm for a variable memory length Markov process. Human communication, whether given as text, handwriting, or speech, has multi characteristic time scales. On short scales it is characterized mostly by the dynamics that generate the process, whereas on large scales, more syntactic and semantic information is carried. For that reason the conventionally used fixed memory Markov models cannot capture effectively the complexity of such structures. On the other hand using long memory models uniformly is not practical even for as short memory as four.

algorithm, automaton, probability, (14 more...)

Country:

North America > United States > New York (0.04)
Asia > Middle East > Israel > Jerusalem District > Jerusalem (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Efficient Computation of Complex Distance Metrics Using Hierarchical Filtering

Simard, Patrice Y.

By their very nature, memory based algorithms such as KNN or Parzen windows require a computationally expensive search of a large database of prototypes. In this paper we optimize the searching process for tangent distance (Simard, LeCun and Denker, 1993) to improve speed performance. The closest prototypes are found by recursively searching included subset.s of the database using distances of increasing complexit.y. This is done by using a hierarchy of tangent distances (increasing the Humber of tangent.

prototype, tangent distance, tangent vector, (13 more...)

Country: North America > United States > California (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Schaal, Stefan, Atkeson, Christopher G.

Assessing the Quality of Learned Local Models

An approach is presented to learning high dimensional functions in the case where the learning algorithm can affect the generation of new data. A local modeling algorithm, locally weighted regression, is used to represent the learned function. Architectural parameters of the approach, such as distance metrics, are also localized and become a function of the query point instead of being global. Statistical tests are given for when a local model is good enough and sampling should be moved to a new area. Our methods explicitly deal with the case where prediction accuracy requirements exist during exploration: By gradually shifting a "center of exploration" and controlling the speed of the shift with local prediction accuracy, a goal-directed exploration of state space takes place along the fringes of the current data support until the task goal is achieved.

algorithm, prediction interval, regression, (16 more...)

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > California > San Mateo County > Redwood City (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.30)

Kambhatla, Nanda, Leen, Todd K.

Fast Non-Linear Dimension Reduction

Dimension reduction provides compact representations for storage, transmission, and classification. Dimension reduction algorithms operate by identifying and eliminating statistical redundancies in the data. The optimal linear technique for dimension reduction is principal component analysis (PCA).

algorithm, dimension reduction, reduction, (13 more...)

Country:

North America > United States > Oregon > Multnomah County > Portland (0.04)
North America > United States > New York (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning in High Dimensional Spaces (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.89)