The Capacity of a Bump

Neural Information Processing Systems

Recently, several researchers have reported encouraging experimental results when using Gaussian or bump-like activation functions in multilayer perceptrons. Networks of this type usually require fewer hidden layers and units and often learn much faster than typical sigmoidal networks. To explain these results we consider a hyper-ridge network, which is a simple perceptron with no hidden units and a ridge activation function. If we are interested in partitioning p points in d dimensions into two classes, then in the limit as d approaches infinity the capacity of a hyper-ridge and a perceptron is identical.
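
A minimal sketch of a hyper-ridge unit under one natural reading of the abstract: the ridge activation fires when the projection of the input falls inside a band around the threshold, i.e. a bump over the projected value. All names and parameter values here are illustrative, not taken from the paper.

```python
import numpy as np

def hyper_ridge(x, w, theta, r):
    """Hyper-ridge unit: fires when the projection x.w lies within
    distance r of the threshold theta (a 'bump' over the projection)."""
    return np.abs(x @ w - theta) < r

# A hyper-ridge realizes dichotomies a plain linear threshold cannot,
# e.g. separating the two middle points of four collinear points:
X = np.array([[0.0], [1.0], [2.0], [3.0]])
w, theta, r = np.array([1.0]), 1.5, 1.0
print(hyper_ridge(X, w, theta, r))  # [False  True  True False]
```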


Boosting Decision Trees

Neural Information Processing Systems

We introduce a constructive, incremental learning system for regression problems that models data by means of locally linear experts. In contrast to other approaches, the experts are trained independently and do not compete for data during learning. Only when a prediction for a query is required do the experts cooperate by blending their individual predictions. Each expert is trained by minimizing a penalized local cross-validation error using second-order methods. In this way, an expert is able to find a local distance metric by adjusting the size and shape of the receptive field in which its predictions are valid, and also to detect relevant input features by adjusting its bias on the importance of individual input dimensions. We derive asymptotic results for our method. In a variety of simulations the properties of the algorithm are demonstrated with respect to interference, learning speed, prediction accuracy, feature detection, and task-oriented incremental learning.
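
A minimal sketch of the query-time blending step, assuming Gaussian receptive fields; the class layout and names are illustrative, and training (which the paper does by penalized local cross-validation) is omitted.

```python
import numpy as np

class LocalLinearExpert:
    """One locally linear expert: a linear model valid inside a
    Gaussian receptive field centered at c with distance metric D."""
    def __init__(self, c, D, beta):
        self.c, self.D, self.beta = c, D, beta  # center, metric, coefficients

    def weight(self, x):
        d = x - self.c
        return np.exp(-0.5 * d @ self.D @ d)    # receptive-field activation

    def predict(self, x):
        return self.beta @ np.append(x - self.c, 1.0)  # local linear fit

def blended_prediction(experts, x):
    """Experts cooperate only at query time: their predictions are
    blended, weighted by each expert's receptive-field activation."""
    w = np.array([e.weight(x) for e in experts])
    y = np.array([e.predict(x) for e in experts])
    return (w @ y) / w.sum()
```

Normalizing by the summed activations makes the output a convex combination of expert predictions, which is what lets the experts train independently without competing for data.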


Predictive Q-Routing: A Memory-based Reinforcement Learning Approach to Adaptive Traffic Control

Neural Information Processing Systems

The controllers usually have little or no prior knowledge of the environment. While only local communication between controllers is allowed, the controllers must cooperate among themselves to achieve the common, global objective. Finding the optimal routing policy in such a distributed manner is very difficult. Moreover, since the environment is non-stationary, the optimal policy varies with time as a result of changes in network traffic and topology.
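
For reference, a minimal sketch of the underlying Q-routing update (Boyan and Littman) that predictive Q-routing extends; the memory-based "predictive" part, which additionally tracks how quickly congested paths recover, is omitted. The nested-dict layout and the learning rate default are illustrative assumptions.

```python
def q_routing_update(Q, x, y, d, q_time, s_time, eta=0.5):
    """Base Q-routing update. Q[node][dest][neighbor] estimates the total
    delivery time for a packet at `node`, bound for `dest`, if forwarded
    via `neighbor`. After x forwards a packet for d to neighbor y, y
    reports its best remaining estimate, and x nudges its own estimate
    toward the observed total (queueing + transmission + remainder).
    """
    t = 0.0 if y == d else min(Q[y][d].values())  # y's best remaining estimate
    target = q_time + s_time + t
    Q[x][d][y] += eta * (target - Q[x][d][y])
    return Q[x][d][y]
```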


Family Discovery

Neural Information Processing Systems

"Family discovery" is the task of learning the dimension and structure ofa parameterized family of stochastic models. It is especially appropriatewhen the training examples are partitioned into "episodes" of samples drawn from a single parameter value. We present three family discovery algorithms based on surface learning andshow that they significantly improve performance over two alternatives on a parameterized classification task. 1 INTRODUCTION Human listeners improve their ability to recognize speech by identifying the accent of the speaker. "Might" in an American accent is similar to "mate" in an Australian accent. By first identifying the accent, discrimination between these two words is improved.


An Information-theoretic Learning Algorithm for Neural Network Classification

Neural Information Processing Systems

A new learning algorithm is developed for the design of statistical classifiers minimizing the rate of misclassification. The method, which is based on ideas from information theory and analogies to statistical physics, assigns data to classes in probability. The distributions are chosen to minimize the expected classification error while simultaneously enforcing the classifier's structure and a level of "randomness" measured by Shannon's entropy. Achievement of the classifier structure is quantified by an associated cost. The constrained optimization problem is equivalent to the minimization of a Helmholtz free energy, and the resulting optimization method is a basic extension of the deterministic annealing algorithm that explicitly enforces structural constraints on assignments while reducing the entropy and expected cost with temperature. In the limit of low temperature, the error rate is minimized directly and a hard classifier with the requisite structure is obtained. This learning algorithm can be used to design a variety of classifier structures. The approach is compared with standard methods for radial basis function design and is demonstrated to substantially outperform other design methods on several benchmark examples, while often retaining design complexity comparable to, or only moderately greater than, that of strict descent-based methods.
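
A minimal sketch of the Gibbs assignment step at the heart of deterministic annealing, assuming a precomputed per-point, per-class cost matrix; the structural constraints and the annealing schedule from the paper are omitted.

```python
import numpy as np

def da_assignments(cost, T):
    """Probabilistic class assignments at temperature T: the Gibbs
    distribution minimizing the free energy F = <cost> - T * H, where H
    is the Shannon entropy of the assignments. `cost` has shape
    (n_points, n_classes). As T -> 0 the assignments harden to the
    minimum-cost class, minimizing the error rate directly.
    """
    logits = -cost / T
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)
```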


Independent Component Analysis of Electroencephalographic Data

Neural Information Processing Systems

Recent efforts to identify EEG sources have focused mostly on performing spatial segregation and localization of source activity [4]. By applying the ICA algorithm of Bell and Sejnowski [1], we attempt to completely separate the twin problems of source identification (What) and source localization (Where). The ICA algorithm derives independent sources from highly correlated EEG signals statistically and without regard to the physical location or configuration of the source generators. Rather than modeling the EEG as a unitary output of a multidimensional dynamical system, or as "the roar of the crowd" of independent microscopic generators, we suppose that the EEG is the output of a number of statistically independent but spatially fixed potential-generating systems which may either be spatially restricted or widely distributed.
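
A minimal sketch of the cited Bell and Sejnowski Infomax rule in its batch, natural-gradient form; the logistic nonlinearity matches the original algorithm, while the learning rate and batch handling here are illustrative choices.

```python
import numpy as np

def infomax_ica_step(W, X, lr=0.01):
    """One natural-gradient Infomax ICA update (Bell & Sejnowski 1995)
    on a batch X of shape (n_channels, n_samples), as applied to unmix
    EEG channels into statistically independent source activations."""
    n = X.shape[1]
    U = W @ X                       # estimated source activations
    Y = 1.0 / (1.0 + np.exp(-U))    # logistic nonlinearity
    dW = (np.eye(W.shape[0]) + (1.0 - 2.0 * Y) @ U.T / n) @ W
    return W + lr * dW
```

Iterating this step from a random (or identity) unmixing matrix W until convergence yields sources defined purely statistically, with no reference to electrode geometry, which is exactly the What/Where separation the abstract describes.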


Optimization Principles for the Neural Code

Neural Information Processing Systems

Recent experiments show that the neural codes at work in a wide range of creatures share some common features. At first sight, these observations seem unrelated. However, we show that these features arise naturally in a linear filtered threshold crossing (LFTC) model when we set the threshold to maximize the transmitted information. This maximization process requires neural adaptation not only to the DC signal level, as in conventional light and dark adaptation, but also to the statistical structure of the signal and noise distributions. We also present a new approach for calculating the mutual information between a neuron's output spike train and any aspect of its input signal which does not require reconstruction of the input signal. This formulation is valid provided the correlations in the spike train are small, and we provide a procedure for checking this assumption.
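
A minimal sketch of the threshold-setting idea, assuming a binary stimulus and a plug-in histogram estimate of mutual information; function names and the grid scan are illustrative, not the paper's procedure.

```python
import numpy as np

def mutual_info_binary(x, y):
    """Mutual information (bits) between two binary sequences,
    estimated from their joint histogram."""
    joint = np.histogram2d(x, y, bins=2)[0] / len(x)
    px, py = joint.sum(1, keepdims=True), joint.sum(0, keepdims=True)
    nz = joint > 0
    return np.sum(joint[nz] * np.log2(joint[nz] / (px @ py)[nz]))

def best_threshold(stimulus, filtered, thresholds):
    """LFTC-style threshold setting: the unit spikes when the linearly
    filtered input crosses theta; pick the theta that maximizes the
    information the spikes carry about the (binary) stimulus."""
    mis = [mutual_info_binary(stimulus, filtered > t) for t in thresholds]
    return thresholds[int(np.argmax(mis))]
```

Because the optimal theta depends on the joint distribution of `filtered` (signal plus noise), any change in signal statistics shifts it, which is the adaptation beyond DC level that the abstract argues for.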


