AITopics

asymptotically stationary, shortcut connection, time series, (10 more...)

Country:

North America > United States > New York (0.05)
Europe > Austria (0.05)
North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.96)

Karakoulas, Grigoris I., Shawe-Taylor, John

Optimizing Classifiers for Imbalanced Training Sets

Following recent results [9, 8] showing the importance of the fat-shattering dimension in explaining the beneficial effect of a large margin on generalization performance, the current paper investigates the implications of these results for the case of imbalanced datasets and develops two approaches to setting the threshold. The approaches are incorporated into ThetaBoost, a boosting algorithm for dealing with unequal loss functions. The performance of ThetaBoost and of the two approaches is tested experimentally.

algorithm, loss function, threshold, (16 more...)

Country:

North America > Canada > Ontario > Toronto (0.04)
Europe > United Kingdom > England (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.50)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning (0.31)

Herschkowitz, Didier, Nadal, Jean-Pierre

Unsupervised and Supervised Clustering: The Mutual Information between Parameters and Observations

Recent works in parameter estimation and neural coding have demonstrated that optimal performance is related to the mutual information between parameters and data. We consider the mutual information in the case where the dependency in the parameter (a vector θ) of the conditional p.d.f. of each observation (a vector

calculation, estimator, mutual information, (11 more...)

Country:

Asia > Brunei (0.06)
North America > United States > New York (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.36)

Gentile, Claudio, Warmuth, Manfred K.

Linear Hinge Loss and Average Margin

We describe a unifying method for proving relative loss bounds for online linear threshold classification algorithms, such as the Perceptron and the Winnow algorithms. For classification problems the discrete loss is used, i.e., the total number of prediction mistakes. We introduce a continuous loss function, called the "linear hinge loss", that can be employed to derive the updates of the algorithms. We first prove bounds with respect to the linear hinge loss and then convert them to the discrete loss. We introduce a notion of "average margin" of a set of examples. We show how relative loss bounds based on the linear hinge loss can be converted to relative loss bounds in terms of the discrete loss using the average margin.
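The discrete loss described above (the total number of prediction mistakes) can be illustrated with a minimal sketch of the classic online Perceptron. This is the standard textbook update, not the paper's derivation via the linear hinge loss; the function name `perceptron_mistakes` and the toy data are hypothetical.

```python
def perceptron_mistakes(examples, epochs=1):
    """Run the classic Perceptron online and count prediction mistakes.

    examples: list of (x, y) pairs, x a tuple of floats, y in {-1, +1}.
    Returns the final weight vector and the discrete loss (mistake count).
    """
    dim = len(examples[0][0])
    w = [0.0] * dim          # start from the zero weight vector
    mistakes = 0
    for _ in range(epochs):
        for x, y in examples:
            score = sum(wi * xi for wi, xi in zip(w, x))
            y_hat = 1 if score > 0 else -1   # linear threshold prediction
            if y_hat != y:
                mistakes += 1                # discrete loss: one unit per mistake
                # additive update, applied only on a mistake
                w = [wi + y * xi for wi, xi in zip(w, x)]
    return w, mistakes


# Toy, linearly separable data (hypothetical).
toy = [((1.0, 1.0), 1), ((-1.0, -1.0), -1), ((2.0, 0.5), 1), ((-0.5, -2.0), -1)]
weights, n_mistakes = perceptron_mistakes(toy)
```

Relative loss bounds of the kind the abstract describes compare a mistake count like `n_mistakes` against the loss of the best fixed linear classifier in hindsight.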

algorithm, classification algorithm, perceptron algorithm, (13 more...)

Country:

North America > United States > California > Santa Cruz County > Santa Cruz (0.04)
Europe > Russia (0.04)
Europe > Italy > Lombardy > Milan (0.04)
Asia > Russia (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.43)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.31)

Ferrari-Trecate, Giancarlo, Williams, Christopher K. I., Opper, Manfred

Finite-Dimensional Approximation of Gaussian Processes

Gaussian process (GP) prediction suffers from O(n^3) scaling with the data set size n. By using a finite-dimensional basis to approximate the GP predictor, the computational complexity can be reduced. We derive optimal finite-dimensional predictors under a number of assumptions, and show the superiority of these predictors over the Projected Bayes Regression method (which is asymptotically optimal). We also show how to calculate the minimal model size for a given n. The calculations are backed up by numerical experiments.

eigenfunction, gaussian process, predictor, (15 more...)

Country:

Europe > United Kingdom > England > West Midlands > Birmingham (0.05)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Italy (0.04)
Asia > Middle East > Jordan (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.49)

Cristianini, Nello, Campbell, Colin, Shawe-Taylor, John

Dynamically Adapting Kernels in Support Vector Machines

The kernel parameter is one of the few tunable parameters in Support Vector Machines, controlling the complexity of the resulting hypothesis. Its choice amounts to model selection and its value is usually found by means of a validation set. We present an algorithm which can automatically perform model selection with little additional computational cost and with no need of a validation set. In this procedure model selection and learning are not separate, but kernels are dynamically adjusted during the learning process to find the kernel parameter which provides the best possible upper bound on the generalisation error. Theoretical results motivating the approach and experimental results confirming its validity are presented.

generalisation error, kernel parameter, support vector machine, (13 more...)

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Wisconsin (0.05)
Europe > United Kingdom > England > Bristol (0.05)

Genre: Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)

Leisch, Friedrich, Trapletti, Adrian, Hornik, Kurt

Stationarity and Stability of Autoregressive Neural Network Processes

AR-NNs are a natural generalization of the classic linear autoregressive AR(p) process. See, e.g., Brockwell & Davis (1987) for a comprehensive introduction to AR and ARMA (autoregressive moving average) models. One of the most central questions in linear time series theory is the stationarity of the model, i.e., whether the probabilistic structure of the series is constant over time or at least asymptotically constant (when not started in equilibrium). Surprisingly, this question has not gained much interest in the NN literature; in particular there are, to our knowledge, no results giving conditions for the stationarity of AR-NN models. There are results on the stationarity of Hopfield nets (Wang & Sheng, 1996), but these nets cannot be used to estimate conditional expectations for time series prediction. The rest of this paper is organized as follows: In Section 2 we recall some results from time series analysis and Markov chain theory defining the relationship between a time series and its associated Markov chain. In Section 3 we use these results to establish that standard AR-NN models without shortcut connections are stationary. We also give conditions for AR-NN models with shortcut connections to be stationary. Section 4 examines the NN modeling of an important class of non-stationary time series [...] to the appendix.
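For reference, the linear AR(p) process mentioned above is conventionally written as below. The symbols (coefficients \alpha_i, noise term \epsilon_t) follow common textbook notation and are an assumption here, not the paper's own equation.

```latex
% Linear autoregressive process of order p (textbook form, notation assumed)
x_t = \alpha_0 + \sum_{i=1}^{p} \alpha_i \, x_{t-i} + \epsilon_t ,
\qquad \epsilon_t \ \text{i.i.d. with zero mean.}

% Standard condition for a (causal) stationary solution:
% all roots of the characteristic polynomial lie outside the unit circle.
1 - \alpha_1 z - \alpha_2 z^2 - \dots - \alpha_p z^p \neq 0
\quad \text{for all } |z| \le 1 .
```

For p = 1 this reduces to the familiar requirement |\alpha_1| < 1.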

asymptotically stationary, shortcut connection, time series, (11 more...)

Country:

North America > United States > New York (0.05)
Europe > Austria (0.05)
North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Darrell, Trevor

Example-Based Image Synthesis of Articulated Figures

We present a method for learning complex appearance mappings.

convex hull, example-based image synthesis, interpolation, (14 more...)

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.05)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > United States > California > Santa Barbara County > Santa Barbara (0.04)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.48)

Smola, Alex J., Frieß, Thilo-Thomas, Schölkopf, Bernhard

Semiparametric Support Vector and Linear Programming Machines

In fact, for many of the kernels used (not the polynomial kernels) like Gaussian RBF kernels it can be shown [6] that SV machines are universal approximators. While this is advantageous in general, parametric models are useful techniques in their own right. Especially if one happens to have additional knowledge about the problem, it would be unwise not to take advantage of it. For instance it might be the case that the major properties of the data are described by a combination of a small set of linearly independent basis functions {φ_1(·), ..., φ_n(·)}. Or one may want to correct the data for some (e.g.

semiparametric model, semiparametric support vector, sv machine, (13 more...)

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Europe > United Kingdom > England > South Yorkshire > Sheffield (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.53)

Kégl, Balázs, Krzyzak, Adam, Linder, Tamás, Zeger, Kenneth

A Polygonal Line Algorithm for Constructing Principal Curves

Principal curves have been defined as "self consistent" smooth curves which pass through the "middle" of a d-dimensional probability distribution or data cloud. Recently, we [1] have offered a new approach by defining principal curves as continuous curves of a given length which minimize the expected squared distance between the curve and points of the space randomly chosen according to a given distribution. The new definition made it possible to carry out a theoretical analysis of learning principal curves from training data. In this paper we propose a practical construction based on the new definition. Simulation results demonstrate that the new algorithm compares favorably with previous methods both in terms of performance and computational complexity.
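The criterion in this definition, the expected squared distance between a curve and random points, has a simple empirical counterpart when the curve is a polygonal line. The sketch below only evaluates that quantity for a fixed polyline in two dimensions; it is not the authors' construction algorithm, and the function names and sample data are hypothetical.

```python
def sq_dist_to_segment(p, a, b):
    """Squared Euclidean distance from 2-D point p to the segment from a to b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        t = 0.0                      # degenerate segment: a and b coincide
    else:
        # projection parameter of p onto the line, clamped to the segment
        t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
        t = max(0.0, min(1.0, t))
    cx, cy = ax + t * dx, ay + t * dy   # closest point on the segment
    return (px - cx) ** 2 + (py - cy) ** 2


def mean_sq_dist_to_polyline(points, vertices):
    """Average squared distance from each point to its nearest polyline segment."""
    total = 0.0
    for p in points:
        total += min(
            sq_dist_to_segment(p, vertices[i], vertices[i + 1])
            for i in range(len(vertices) - 1)
        )
    return total / len(points)


# Hypothetical data: an L-shaped polyline and three sample points.
polyline = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]
sample = [(0.5, 0.5), (2.0, 1.0), (0.0, -1.0)]
loss = mean_sq_dist_to_polyline(sample, polyline)
```

A fitting procedure would search over vertex positions, subject to a length constraint, to make a quantity like `loss` small; that search is outside this sketch.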

algorithm, generating curve, principal curve, (10 more...)

Country:

Europe > Germany > Saxony > Leipzig (0.05)
North America > United States > New York (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
(3 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)