Constructing Heterogeneous Committees Using Input Feature Grouping: Application to Economic Forecasting
Liao, Yuansong, Moody, John E.
The committee approach has been proposed for reducing model uncertainty and improving generalization performance. The advantage of committees depends on (1) the performance of individual members and (2) the correlational structure of errors between members. This paper presents an input grouping technique for designing a heterogeneous committee. With this technique, all input variables are first grouped based on their mutual information. Statistically similar variables are assigned to the same group.
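To make the grouping step concrete, here is a minimal sketch of mutual-information-based input grouping, assuming a histogram MI estimate and off-the-shelf hierarchical clustering. The function names, bin count, and group count are illustrative choices, not taken from the paper.

```python
# Sketch of the input-grouping step described above: estimate pairwise
# mutual information between input variables, then cluster variables so
# that statistically similar ones land in the same group.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def pairwise_mutual_info(X, bins=10):
    """Histogram-based MI estimate between every pair of columns of X."""
    n, d = X.shape
    mi = np.zeros((d, d))
    for i in range(d):
        for j in range(i + 1, d):
            joint, _, _ = np.histogram2d(X[:, i], X[:, j], bins=bins)
            pxy = joint / n
            px = pxy.sum(axis=1, keepdims=True)
            py = pxy.sum(axis=0, keepdims=True)
            nz = pxy > 0
            mi[i, j] = mi[j, i] = np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz]))
    return mi

def group_inputs(X, n_groups=3):
    """Cluster input variables; high mutual information -> same group."""
    mi = pairwise_mutual_info(X)
    dist = mi.max() - mi              # turn similarity into a dissimilarity
    np.fill_diagonal(dist, 0.0)
    iu = np.triu_indices_from(dist, k=1)
    Z = linkage(dist[iu], method="average")   # condensed distances for scipy
    return fcluster(Z, t=n_groups, criterion="maxclust")

X = np.random.randn(500, 6)
print(group_inputs(X))                # group label for each input variable
```

Committee members can then be specialized, e.g., by training each on a different group or combination of groups of inputs.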
The Infinite Gaussian Mixture Model
In a Bayesian mixture model it is not necessary a priori to limit the number of components to be finite. In this paper an infinite Gaussian mixture model is presented which neatly sidesteps the difficult problem of finding the "right" number of mixture components. Inference in the model is done using an efficient parameter-free Markov Chain that relies entirely on Gibbs sampling.
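The infinite limit corresponds to a Dirichlet-process mixture, for which the Chinese restaurant process gives a simple Gibbs sampler. Below is a toy collapsed Gibbs sweep for 1-D data with known observation variance and a conjugate normal prior on means; it is deliberately minimal and is not the paper's full sampler, which also samples component variances and hyperparameters.

```python
# Toy collapsed Gibbs sweep for a Dirichlet-process Gaussian mixture
# (the CRP view of the infinite limit); illustrative values throughout.
import numpy as np

rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-3, 1, 50), rng.normal(3, 1, 50)])
alpha, mu0, tau2, sigma2 = 1.0, 0.0, 10.0, 1.0   # DP and prior hyperparameters
z = np.zeros(len(x), dtype=int)                   # start with one cluster

def normal_pdf(v, m, var):
    return np.exp(-0.5 * (v - m) ** 2 / var) / np.sqrt(2 * np.pi * var)

for sweep in range(50):
    for i in range(len(x)):
        z[i] = -1                                 # remove point i
        labels = [k for k in np.unique(z) if k >= 0]
        probs = []
        for k in labels:
            members = x[z == k]
            n_c, s_c = len(members), members.sum()
            prec = n_c / sigma2 + 1.0 / tau2      # posterior precision of mean
            m = (s_c / sigma2 + mu0 / tau2) / prec
            probs.append(n_c * normal_pdf(x[i], m, sigma2 + 1.0 / prec))
        probs.append(alpha * normal_pdf(x[i], mu0, sigma2 + tau2))  # new cluster
        probs = np.array(probs) / np.sum(probs)
        choice = rng.choice(len(probs), p=probs)
        z[i] = labels[choice] if choice < len(labels) else max(labels, default=-1) + 1

print("clusters found:", len(np.unique(z)))
```

The key property is visible even in this toy: the number of occupied components is inferred from the data rather than fixed in advance.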
Uniqueness of the SVM Solution
Burges, Christopher J. C., Crisp, David J.
We give necessary and sufficient conditions for uniqueness of the support vector solution for the problems of pattern recognition and regression estimation, for a general class of cost functions. We show that if the solution is not unique, all support vectors are necessarily at bound, and we give some simple examples of non-unique solutions. We note that uniqueness of the primal (dual) solution does not necessarily imply uniqueness of the dual (primal) solution. We show how to compute the threshold b when the solution is unique but all support vectors are at bound, in which case the usual method for determining b does not work.

1 Introduction

Support vector machines (SVMs) have attracted wide interest as a means to implement structural risk minimization for the problems of classification and regression estimation. The fact that training an SVM amounts to solving a convex quadratic programming problem means that the solution found is global, and that if it is not unique, then the set of global solutions is itself convex; furthermore, if the objective function is strictly convex, the solution is guaranteed to be unique [1].
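A hedged sketch of the threshold computation the abstract alludes to: with a free support vector (0 < alpha_i < C) the KKT conditions pin b exactly, but when every support vector is at bound, b is only constrained to an interval, which is what the usual averaging recipe misses. The function name and the midpoint tie-break below are illustrative choices, not the paper's construction.

```python
import numpy as np

def svm_threshold(alpha, y, K, C, tol=1e-8):
    """Return b (or a point in the feasible interval for b) from dual variables."""
    f_wo_b = (alpha * y) @ K                 # decision values without threshold
    free = (alpha > tol) & (alpha < C - tol)
    if free.any():                           # unique: KKT gives y_i (f + b) = 1
        return np.mean(y[free] - f_wo_b[free])
    # All SVs at bound: collect the KKT inequalities constraining b.
    lo, hi = -np.inf, np.inf
    for i in range(len(y)):
        g = y[i] - f_wo_b[i]                 # b = g puts point i exactly on margin
        if alpha[i] <= tol:                  # require y_i (f + b) >= 1
            lo, hi = (max(lo, g), hi) if y[i] > 0 else (lo, min(hi, g))
        else:                                # alpha_i = C: require y_i (f + b) <= 1
            lo, hi = (lo, min(hi, g)) if y[i] > 0 else (max(lo, g), hi)
    return (lo + hi) / 2.0                   # any b in [lo, hi] is consistent
```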
Boosting Algorithms as Gradient Descent
Mason, Llew, Baxter, Jonathan, Bartlett, Peter L., Frean, Marcus R.
Recent theoretical results suggest that the effectiveness of boosting-style voting algorithms is due to their tendency to produce large margin classifiers [1, 18]. Loosely speaking, if a combination of classifiers correctly classifies most of the training data with a large margin, then its error probability is small. In [14] we gave improved upper bounds on the misclassification probability of a combined classifier in terms of the average over the training data of a certain cost function of the margins.
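The gradient-descent reading yields a generic recipe: treat the average margin cost as a functional of the combined classifier and pick each new base classifier along the negative functional gradient. Here is a hedged sketch of that template with an exponential margin cost (which recovers AdaBoost-style example weights); the decision-stump learner and fixed step size are simplifications, not the paper's exact algorithm.

```python
import numpy as np

def fit_stump(X, y, w):
    """Pick the threshold stump maximizing the weighted margin sum_i w_i y_i f(x_i)."""
    best, best_score = None, -np.inf
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            for s in (1.0, -1.0):
                pred = s * np.sign(X[:, j] - t + 1e-12)
                score = np.sum(w * y * pred)
                if score > best_score:
                    best, best_score = (j, t, s), score
    j, t, s = best
    return lambda Z: s * np.sign(Z[:, j] - t + 1e-12)

def margin_cost_boost(X, y, rounds=20, cost_grad=lambda m: -np.exp(-m)):
    F = np.zeros(len(y))
    ensemble = []
    for _ in range(rounds):
        w = -cost_grad(y * F)           # weights = -c'(margin); here exp(-m)
        w /= w.sum()
        h = fit_stump(X, y, w)
        step = 0.1                      # fixed step; a line search is also common
        ensemble.append((step, h))
        F += step * h(X)
    return lambda Z: np.sign(sum(a * h(Z) for a, h in ensemble))
```

Swapping in a different differentiable cost of the margin changes only the weight computation, which is the point of the gradient-descent view.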
Some Theoretical Results Concerning the Convergence of Compositions of Regularized Linear Functions
Recently, sample complexity bounds have been derived for problems involving linear functions, such as neural networks and support vector machines. In this paper, we extend some theoretical results in this area by deriving dimension-independent covering number bounds for regularized linear functions under certain regularization conditions. We show that such bounds lead to a class of new methods for training linear classifiers with theoretical advantages similar to those of the support vector machine. Furthermore, we present a theoretical analysis of these new methods from the asymptotic statistical point of view. This analysis gives a better description of the large-sample behavior of these algorithms.
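For context, one representative dimension-independent covering number bound of this kind from the related literature (the paper's precise conditions and constants may differ):

```latex
% For the 2-norm-regularized class $F = \{\, x \mapsto w \cdot x : \|w\|_2 \le a \,\}$
% on inputs with $\|x\|_2 \le b$, the empirical covering number satisfies
\[
  \log_2 \mathcal{N}(F, \epsilon, n)
  \;\le\; \left\lceil \frac{a^2 b^2}{\epsilon^2} \right\rceil \log_2 (2n + 1),
\]
% which depends on the norms $a$, $b$ but not on the input dimension --
% the key property the abstract highlights.
```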
Manifold Stochastic Dynamics for Bayesian Learning
We propose a new Markov Chain Monte Carlo algorithm which is a generalization of the stochastic dynamics method. The algorithm performs exploration of the state space using its intrinsic geometric structure, facilitating efficient sampling of complex distributions. Applied to Bayesian learning in neural networks, our algorithm was found to perform at least as well as the best state-of-the-art method while consuming considerably less time.
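As a rough illustration of exploiting geometry while sampling, here is a single metric-preconditioned Langevin-style update. This is a simplified caricature of the idea (it omits, for instance, metric-derivative correction terms and any Metropolis step), not the paper's algorithm; grad_log_post and metric are assumed to be supplied by the model.

```python
import numpy as np

def manifold_langevin_step(theta, grad_log_post, metric, eps=1e-2, rng=np.random):
    """One Langevin update preconditioned by a position-dependent metric G(theta)."""
    G = metric(theta)                          # local metric tensor
    G_inv = np.linalg.inv(G)
    drift = 0.5 * eps**2 * G_inv @ grad_log_post(theta)
    L = np.linalg.cholesky(G_inv)              # noise covariance becomes eps^2 G^-1
    return theta + drift + eps * L @ rng.standard_normal(theta.shape)
```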
Topographic Transformation as a Discrete Latent Variable
Jojic, Nebojsa, Frey, Brendan J.
A very small amount of shearing will move the point only slightly, so deforming the object by shearing will trace a continuous curve in the space of pixel intensities. As illustrated in Fig. 1a, extensive levels of shearing will produce a highly nonlinear curve (consider shearing a thin vertical line), although the curve can be approximated by a straight line locally. Linear approximations of the transformation manifold have been used to significantly improve the performance of feedforward discriminative classifiers such as nearest neighbors (Simard et al., 1993) and multilayer perceptrons (Simard et al., 1992). Linear generative models (factor analysis, mixtures of factor analyzers) have also been modified using linear approximations of the transformation manifold to build in some degree of transformation invariance (Hinton et al., 1997). In general, the linear approximation is accurate for transformations that couple neighboring pixels, but is inaccurate for transformations that couple nonneighboring pixels. In some applications (e.g., handwritten digit recognition), the input can be blurred so that the linear approximation becomes more robust. For significant levels of transformation, the nonlinear manifold can be better modeled using a discrete approximation, for example by representing the curve in Fig. 1a with a discrete set of points along it.
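To illustrate the discrete-latent-variable move in miniature: treat the transformation as a latent variable ranging over a finite set (here, integer shifts of a 1-D signal) and score each candidate under a Gaussian observation model. This is a toy illustration of the modeling idea, not the paper's full generative model.

```python
import numpy as np

def posterior_over_shifts(x, template, shifts, sigma2=0.1):
    """P(shift | x) for a uniform prior over the discrete shift set."""
    log_lik = np.array([
        -0.5 * np.sum((x - np.roll(template, s)) ** 2) / sigma2
        for s in shifts
    ])
    log_lik -= log_lik.max()                   # stabilize the softmax
    p = np.exp(log_lik)
    return p / p.sum()

template = np.zeros(32); template[10:14] = 1.0
observed = np.roll(template, 5) + 0.1 * np.random.randn(32)
shifts = list(range(-8, 9))
print(max(zip(posterior_over_shifts(observed, template, shifts), shifts)))
```

Because the latent variable is discrete, the posterior over transformations is exact and cheap; no local linearization of the manifold is required.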
Broadband Direction-Of-Arrival Estimation Based on Second Order Statistics
Rosca, Justinian P., Ruanaidh, Joseph Ó, Jourjine, Alexander, Rickard, Scott
N wideband sources recorded using N closely spaced receivers can feasibly be separated based only on second order statistics when using a physical model of the mixing process. In this case we show that the parameter estimation problem can be essentially reduced to considering directions of arrival and attenuations of each signal. The paper presents two demixing methods operating in the time and frequency domain and experimentally shows that it is always possible to demix signals arriving at different angles. Moreover, one can use spatial cues to solve the channel selection problem and a post-processing Wiener filter to ameliorate the artifacts caused by demixing.
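A sketch of the frequency-domain demixing step for the two-sensor case: under an anechoic delay-and-attenuation mixing model, each frequency bin mixes the sources through a 2x2 matrix built from the estimated attenuations a_k and delays d_k (which encode the directions of arrival), so demixing reduces to a per-bin matrix inversion. The estimation of a_k and d_k themselves, and the post-processing Wiener filter, are omitted; names and conventions here are illustrative.

```python
import numpy as np

def demix_bins(X, a, d, fs):
    """X: (2, n_bins) spectra of the two sensors; a, d: per-source parameters."""
    n_bins = X.shape[1]
    freqs = np.fft.rfftfreq(2 * (n_bins - 1), 1.0 / fs)
    S = np.empty((2, n_bins), dtype=complex)
    for i, f in enumerate(freqs):
        # column k: how source k reaches the two sensors (unit gain at sensor 1)
        A = np.array([[1.0, 1.0],
                      [a[0] * np.exp(-2j * np.pi * f * d[0]),
                       a[1] * np.exp(-2j * np.pi * f * d[1])]])
        S[:, i] = np.linalg.solve(A, X[:, i])
    return S
```

The per-bin matrix is well conditioned when the delays (hence arrival angles) differ, which matches the claim that sources arriving at different angles can always be demixed.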
A Multi-class Linear Learning Algorithm Related to Winnow
In this paper, we present Committee, a new multi-class learning algorithm related to the Winnow family of algorithms. Committee is an algorithm for combining the predictions of a set of sub-experts in the online mistake-bounded model of learning. A sub-expert is a special type of attribute that predicts with a distribution over a finite number of classes. Committee learns a linear function of sub-experts and uses this function to make class predictions. We provide bounds for Committee that show it performs well when the target can be represented by a few relevant sub-experts. We also show how Committee can be used to solve more traditional problems composed of attributes. This leads to a natural extension that learns on multi-class problems that contain both traditional attributes and sub-experts.
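A sketch of the Winnow-flavored mechanics the abstract describes: each sub-expert outputs a distribution over classes, the learner predicts with a weighted vote, and on a mistake it multiplicatively promotes sub-experts that favored the true class over the predicted one. The specific update below is a generic multiplicative-weights stand-in, not the paper's exact Committee rule or the one its mistake bound covers.

```python
import numpy as np

class MultiplicativeCommittee:
    def __init__(self, n_experts, n_classes, eta=0.5):
        self.w = np.ones(n_experts)
        self.n_classes, self.eta = n_classes, eta

    def predict(self, P):            # P: (n_experts, n_classes) sub-expert distributions
        return int(np.argmax(self.w @ P))

    def update(self, P, y_true):
        y_hat = self.predict(P)
        if y_hat != y_true:          # promote experts that preferred the true class
            self.w *= (1.0 + self.eta) ** (P[:, y_true] - P[:, y_hat])
        return y_hat
```

A traditional attribute fits this interface by emitting a (possibly degenerate) distribution over the classes, which is how the extension to mixed attribute/sub-expert problems can be read.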