AITopics

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.05)
North America > United States > New Mexico > Bernalillo County > Albuquerque (0.04)
North America > United States > New Jersey > Mercer County > Princeton (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.55)

Neural Information Processing SystemsDec-31-2000

The Relaxed Online Maximum Margin Algorithm

Li, Yi, Long, Philip M.

We describe a new incremental algorithm for training linear threshold functions:the Relaxed Online Maximum Margin Algorithm, or ROMMA. ROMMA can be viewed as an approximation to the algorithm that repeatedly chooses the hyperplane that classifies previously seen examples correctlywith the maximum margin. It is known that such a maximum-margin hypothesis can be computed by minimizing the length of the weight vector subject to a number of linear constraints. ROMMA works by maintaining a relatively simple relaxation of these constraints that can be efficiently updated. We prove a mistake bound for ROMMA that is the same as that proved for the perceptron algorithm. Our analysis implies that the more computationally intensive maximum-margin algorithm alsosatisfies this mistake bound; this is the first worst-case performance guaranteefor this algorithm. We describe some experiments using ROMMA and a variant that updates its hypothesis more aggressively as batch algorithms to recognize handwritten digits. The computational complexity and simplicity of these algorithms is similar to that of perceptron algorithm,but their generalization is much better. We describe a sense in which the performance of ROMMA converges to that of SVM in the limit if bias isn't considered.

algorithm, perceptron algorithm, romma, (9 more...)

Country:

North America > United States > District of Columbia > Washington (0.04)
Asia > Singapore > Central Region > Singapore (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (1.00)

Carreira-Perpiñán, Miguel Á.

Reconstruction of Sequential Data with Probabilistic Models and Continuity Constraints

Neural Information Processing SystemsDec-31-2000

We consider the problem of reconstructing a temporal discrete sequence of multidimensional real vectors when part of the data is missing, under the assumption that the sequence was generated by a continuous process. Aparticular case of this problem is multivariate regression, which is very difficult when the underlying mapping is one-to-many. We propose analgorithm based on a joint probability model of the variables of interest, implemented using a nonlinear latent variable model. Each point in the sequence is potentially reconstructed as any of the modes of the conditional distribution of the missing variables given the present variables (computed using an exhaustive mode search in a Gaussian mixture). Modeselection is determined by a dynamic programming search that minimises a geometric measure of the reconstructed sequence, derived fromcontinuity constraints. We illustrate the algorithm with a toy example and apply it to a real-world inverse problem, the acoustic-toarticulatory mapping.The results show that the algorithm outperforms conditional mean imputation and multilayer perceptrons. 1 Definition of the problem

artificial intelligence, machine learning, sequence, (17 more...)

Country: Europe > United Kingdom > England (0.14)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.55)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)

Gentile, Claudio, Warmuth, Manfred K. K.

Linear Hinge Loss and Average Margin

Neural Information Processing SystemsDec-31-1999

We describe a unifying method for proving relative loss bounds for online linear threshold classification algorithms, such as the Perceptron and the Winnow algorithms. For classification problems the discrete loss is used, i.e., the total number of prediction mistakes. We introduce a continuous loss function, called the "linear hinge loss", that can be employed to derive the updates of the algorithms. We first prove bounds w.r.t. the linear hinge loss and then convert them to the discrete loss. We introduce a notion of "average margin" of a set of examples. We show how relative loss bounds based on the linear hinge loss can be converted to relative loss bounds i.t.o. the discrete loss using the average margin.

algorithm, classification algorithm, perceptron algorithm, (13 more...)

Country:

North America > United States > California > Santa Cruz County > Santa Cruz (0.04)
Europe > Russia (0.04)
Europe > Italy > Lombardy > Milan (0.04)
Asia > Russia (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.43)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.31)

Gentile, Claudio, Warmuth, Manfred K.

Linear Hinge Loss and Average Margin

Neural Information Processing SystemsDec-31-1999

We describe a unifying method for proving relative loss bounds for online linearthreshold classification algorithms, such as the Perceptron and the Winnow algorithms. For classification problems the discrete loss is used, i.e., the total number of prediction mistakes. We introduce a continuous lossfunction, called the "linear hinge loss", that can be employed to derive the updates of the algorithms. We first prove bounds w.r.t. the linear hinge loss and then convert them to the discrete loss. We introduce anotion of "average margin" of a set of examples . We show how relative loss bounds based on the linear hinge loss can be converted to relative loss bounds i.t.o. the discrete loss using the average margin.

algorithm, artificial intelligence, machine learning, (15 more...)

Country:

Europe (0.28)
North America > United States > California (0.14)

Industry: Education > Educational Setting > Online (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.43)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.31)

Freitas, João F. G. de, Niranjan, Mahesan, Gee, Andrew H.

Regularisation in Sequential Learning Algorithms

In this paper, we discuss regularisation in online/sequential learning algorithms. In environments where data arrives sequentially, techniques such as cross-validation to achieve regularisation or model selection are not possible. Further, bootstrapping to determine a confidence level is not practical. To surmount these problems, a minimum variance estimation approach that makes use of the extended Kalman algorithm for training multi-layer perceptrons is employed. The novel contribution of this paper is to show the theoretical links between extended Kalman filtering, Sutton's variable learning rate algorithms and Mackay's Bayesian estimation framework. In doing so, we propose algorithms to overcome the need for heuristic choices of the initial conditions and noise covariance matrices in the Kalman approach.

artificial intelligence, bayesian inference, machine learning, (18 more...)

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.06)
North America > United States > California > San Mateo County > San Mateo (0.04)
Africa > South Africa (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.91)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.91)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.70)

Verrelst, Herman, Moreau, Yves, Vandewalle, Joos, Timmerman, Dirk

Use of a Multi-Layer Perceptron to Predict Malignancy in Ovarian Tumors

Here we sudy the continuous time, continuous state-space stochastic case, which covers a wide variety of control problems including target, viability, optimization problems (see [FS93], [KP95])}or which a formalism is the following.

ca 125, prediction, roccurve, (15 more...)

Country:

Europe > Belgium > Flanders > Flemish Brabant > Leuven (0.05)
North America > United States > New York (0.05)
Europe > France (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Industry:

Health & Medicine > Therapeutic Area > Oncology (0.47)
Health & Medicine > Therapeutic Area > Obstetrics/Gynecology (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.86)

Yang, Howard Hua, Amari, Shun-ichi

The Efficiency and the Robustness of Natural Gradient Descent Learning Rule

The inverse of the Fisher information matrix is used in the natural gradient descent algorithm to train single-layer and multi-layer perceptrons. We have discovered a new scheme to represent the Fisher information matrix of a stochastic multi-layer perceptron. Based on this scheme, we have designed an algorithm to compute the natural gradient. When the input dimension n is much larger than the number of hidden neurons, the complexity of this algorithm is of order O(n). It is confirmed by simulations that the natural gradient descent learning rule is not only efficient but also robust.

algorithm, fisher information matrix, gd algorithm, (12 more...)

Country:

North America > United States > Oregon > Multnomah County > Portland (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)
Asia > Japan > Honshū > Kantō > Saitama Prefecture > Saitama (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.95)

Xiong, Yuansheng, Kwon, Chulan, Oh, Jong-Hoon

The Storage Capacity of a Fully-Connected Committee Machine

We study the storage capacity of a fully-connected committee machine with a large number K of hidden nodes. The storage capacity is obtained by analyzing the geometrical structure of the weight space related to the internal representation.

committee machine, internal representation, storage capacity, (14 more...)

Country:

Asia > South Korea > Gyeongsangbuk-do > Pohang (0.05)
North America > United States (0.04)
Asia > Singapore (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.30)

Shawe-Taylor, John, Cristianini, Nello

Data-Dependent Structural Risk Minimization for Perceptron Decision Trees

This paper presents a neural-model of pre-attentive visual processing. The model explains why certain displays can be processed very fast, "in parallel", while others require slower, "serial" processing, in subsequent attentional systems. Our approach stems from the observation that the visual environment is overflowing with diverse information, but the biological information-processing systems analyzing it have a limited capacity [1]. This apparent mismatch suggests that data compression should be performed at an early stage of perception, and that via an accompanying process of dimension reduction, only a few essential features of the visual display should be retained. We propose that only parallel displays incorporate global features that enable fast target detection, and hence they can be processed pre-attentively, with all items (target and dis tractors) examined at once.

node, principal axis, similarity, (16 more...)

Country:

Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.05)
North America > United States > New York (0.04)
North America > United States > Indiana (0.04)
North America > United States > California > Orange County > Irvine (0.04)

Genre: Research Report (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.43)