AITopics

Country: Europe > United Kingdom (0.15)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.92)

Direct Classification with Indirect Data

Brown, Timothy X.

Suppose there exists an unknown real-valued property of the feature space, p(¢), that maps from the feature space, ¢ ERn, to R. The property function and a positive set A c

artificial intelligence, classifier, télécommunications, (17 more...)

Country: North America > United States > Colorado (0.14)

Industry: Telecommunications (0.30)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.33)

Ben-Hur, Asa, Horn, David, Siegelmann, Hava T., Vapnik, Vladimir

A Support Vector Method for Clustering

We present a novel method for clustering using the support vector machine approach. Data points are mapped to a high dimensional feature space, where support vectors are used to define a sphere enclosing them. The boundary of the sphere forms in data space a set of closed contours containing the data. Data points enclosed by each contour are defined as a cluster. As the width parameter of the Gaussian kernel is decreased, these contours fit the data more tightly and splitting of contours occurs.

algorithm, artificial intelligence, machine learning, (15 more...)

Country:

North America > United States (0.29)
Asia > Middle East > Israel (0.29)

Genre: Research Report (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)

Smola, Alex J., Óvári, Zoltán L., Williamson, Robert C.

Regularization with Dot-Product Kernels

In this paper we give necessary and sufficient conditions under which kernels of dot product type k(x, y) k(x.

artificial intelligence, kernel, machine learning, (15 more...)

Country: North America > United States (0.15)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.48)

Arcas, Blaise Agüera y, Fairhall, Adrienne L., Bialek, William

What Can a Single Neuron Compute?

What can a single neuron compute? Abstract In this paper we formulate a description of the computation performed by a neuron as a combination of dimensional reduction and nonlinearity. We implement this description for the Hodgkin Huxley model, identify the most relevant dimensions and find the nonlinearity. A two dimensional description already captures a significant fraction of the information that spikes carry about dynamic inputs. This description also shows that computation in the Hodgkin-Huxley model is more complex than a simple integrateand-fire or perceptron model. 1 Introduction Classical neural network models approximate neurons as devices that sum their inputs and generate a nonzero output if the sum exceeds a threshold.

artificial intelligence, neural network, spike, (19 more...)

Country: North America > United States > New Jersey (0.15)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.89)

Smola, Alex J., Bartlett, Peter L.

Sparse Greedy Gaussian Process Regression

We present a simple sparse greedy technique to approximate the maximum a posteriori estimate of Gaussian Processes with much improved scaling behaviour in the sample size m.

Weak Learners and Improved Rates of Convergence in Boosting

Mannor, Shie, Meir, Ron

The problem of constructing weak classifiers for boosting algorithms is studied. We present an algorithm that produces a linear classifier that is guaranteed to achieve an error better than random guessing for any distribution on the data. While this weak learner is not useful for learning in general, we show that under reasonable conditions on the distribution it yields an effective weak learner for one-dimensional problems. Preliminary simulations suggest that similar behavior can be expected in higher dimensions, a result which is corroborated by some recent theoretical bounds. Additionally, we provide improved convergence rate bounds for the generalization error in situations where the empirical error can be made small, which is exactly the situation that occurs if weak learners with guaranteed performance that is better than random guessing can be established.

artificial intelligence, classifier, machine learning, (12 more...)

Country: Asia > Middle East > Israel (0.14)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

The Kernel Trick for Distances

Schölkopf, Bernhard

A method is described which, like the kernel trick in support vector machines (SVMs), lets us generalize distance-based algorithms to operate in feature spaces, usually nonlinearly related to the input space. This is done by identifying a class of kernels which can be represented as norm-based distances in Hilbert spaces. It turns out that common kernel algorithms, such as SVMs and kernel PCA, are actually really distance based algorithms and can be run with that class of kernels, too. As well as providing a useful new insight into how these algorithms work, the present work can form the basis for conceiving new algorithms. 1 Introduction One of the crucial ingredients of SVMs is the so-called kernel trick for the computation of dot products in high-dimensional feature spaces using simple functions defined on pairs of input patterns. This trick allows the formulation of nonlinear variants of any algorithm that can be cast in terms of dot products, SVMs being but the most prominent example [13, 8]. Although the mathematical result underlying the kernel trick is almost a century old [6], it was only much later [1, 3,13] that it was made fruitful for the machine learning community. Kernel methods have since led to interesting generalizations of learning algorithms and to successful real-world applications. The present paper attempts to extend the utility of the kernel trick by looking at the problem of which kernels can be used to compute distances in feature spaces. Again, the underlying mathematical results, mainly due to Schoenberg, have been known for a while [7]; some of them have already attracted interest in the kernel methods community in various contexts [11, 5, 15].

artificial intelligence, kernel, machine learning, (14 more...)

Country: North America > United States (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.55)

Moghaddam, Baback, Yang, Ming-Hsuan

Sex with Support Vector Machines

These include face detection [14], face pose discrimination [12] and face recognition [16]. Although facial sex classification has attracted much attention in the psychological literature [1, 4, 8, 15], relatively few computatinal learning methods have been proposed. We will briefly review and summarize the prior art in facial sex classification.

artificial intelligence, classification, machine learning, (18 more...)

Country: North America > United States > Illinois (0.14)

Genre: Research Report (0.47)

Technology:

Information Technology > Artificial Intelligence > Vision > Face Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)

Nemenman, Ilya, Bialek, William

Learning Continuous Distributions: Simulations With Field Theoretic Priors

Learning of a smooth but nonparametric probability density can be regularized using methods of Quantum Field Theory. We implement a field theoretic prior numerically, test its efficacy, and show that the free parameter of the theory (,smoothness scale') can be determined self consistently by the data; this forms an infinite dimensional generalization of the MDL principle. Finally, we study the implications of one's choice of the prior and the parameterization and conclude that the smoothness scale determination makes density estimation very weakly sensitive to the choice of the prior, and that even wrong choices can be advantageous for small data sets. One of the central problems in learning is to balance'goodness of fit' criteria against the complexity of models. An important development in the Bayesian approach was thus the realization that there does not need to be any extra penalty for model complexity: if we compute the total probability that data are generated by a model, there is a factor from the volume in parameter space-the'Occam factor' -that discriminates against models with more parameters [1, 2].