AITopics

I. Williams, School of Informatics, University of Edinburgh, c.k.i.williams ed.ac.uk. Abstract: In this paper we analyze the relationships between the eigenvalues of the m × m Gram matrix K for a kernel k(·, ·) ... We bound the differences between the two spectra and provide a performance bound on kernel PCA. 1 Introduction: Over recent years there has been a considerable amount of interest in kernel methods for supervised learning (e.g. Support Vector Machines and Gaussian Process prediction) and for unsupervised learning (e.g. ...). In this paper we study the stability of the subspace of feature space extracted by kernel PCA with respect to the sample of size m, and relate this to the feature space that would be extracted in the infinite sample-size limit. This analysis essentially "lifts" into (a potentially infinite dimensional) feature space an analysis which can also be carried out for PCA, comparing the k-dimensional eigenspace extracted from a sample covariance matrix and the k-dimensional eigenspace extracted from the population covariance matrix, and comparing the residuals from the k-dimensional compression for the m-sample and the population.
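As a rough illustration of the objects this abstract refers to, the sketch below builds the m × m Gram matrix K for a small sample and estimates its largest eigenvalue by power iteration. The RBF kernel, the sample points, the `gamma` value, and all function names are assumptions made for illustration and are not taken from the paper. Dividing the eigenvalue by m is the usual normalization when comparing with an infinite-sample limit, and is likewise assumed here because the abstract text is truncated.

```python
import math

def rbf_kernel(x, y, gamma=1.0):
    """Gaussian (RBF) kernel; an illustrative choice, not the paper's."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def gram_matrix(points, kernel):
    """m x m matrix with entries K[i][j] = kernel(points[i], points[j])."""
    return [[kernel(p, q) for q in points] for p in points]

def top_eigenpair(matrix, iterations=500):
    """Largest eigenvalue and eigenvector of a symmetric PSD matrix
    by plain power iteration, started from the uniform unit vector."""
    n = len(matrix)
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    kv = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
    eigenvalue = sum(a * b for a, b in zip(v, kv))  # Rayleigh quotient
    return eigenvalue, v

# Hypothetical one-dimensional sample of size m = 4.
sample = [(0.0,), (0.5,), (1.0,), (2.0,)]
K = gram_matrix(sample, rbf_kernel)
lam, vec = top_eigenpair(K)
scaled = lam / len(sample)  # eigenvalue of (1/m) K
print(lam, scaled)
```

Power iteration is used only to keep the sketch free of third-party dependencies; a library eigensolver would be the more usual choice in practice.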

artificial intelligence, eigenvalue, machine learning, (17 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.41)

Scott, Clayton, Nowak, Robert

Dyadic Classification Trees via Structural Risk Minimization

Classification trees are one of the most popular types of classifiers, with ease of implementation and interpretation being among their attractive features. Despite the widespread use of classification trees, theoretical analysis of their performance is scarce. In this paper, we show that a new family of classification trees, called dyadic classification trees (DCTs), are near optimal (in a minimax sense) for a very broad range of classification problems. This demonstrates that other schemes (e.g., neural networks, support vector machines) cannot perform significantly better than DCTs in many cases. We also show that this near optimal performance is attained with linear (in the number of training data) complexity growing and pruning algorithms. Moreover, the performance of DCTs on benchmark datasets compares favorably to that of standard CART, which is generally more computationally intensive and which does not possess similar near optimality properties. Our analysis stems from theoretical results on structural risk minimization, on which the pruning rule for DCTs is based.

artificial intelligence, classification tree, machine learning, (15 more...)

Country:

North America > United States (0.47)
Europe > United Kingdom > England (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.55)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.31)

Slonim, Noam, Weiss, Yair

Maximum Likelihood and the Information Bottleneck

The information bottleneck (IB) method is an information-theoretic formulation for clustering problems.

artificial intelligence, bayesian inference, machine learning, (14 more...)

Country: Asia > Middle East (0.28)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.41)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.41)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.34)

Bray, Alistair, Martinez, Dominique

Kernel-Based Extraction of Slow Features: Complex Cells Learn Disparity and Translation Invariance from Natural Images

In Slow Feature Analysis (SFA [1]), it has been demonstrated that high-order invariant properties can be extracted by projecting inputs into a nonlinear space and computing the slowest changing features in this space; this has been proposed as a simple general model for learning nonlinear invariances in the visual system. However, this method is highly constrained by the curse of dimensionality which limits it to simple theoretical simulations. This paper demonstrates that by using a different but closely-related objective function for extracting slowly varying features ([2, 3]), and then exploiting the kernel trick, this curse can be avoided. Using this new method we show that both the complex cell properties of translation invariance and disparity coding can be learnt simultaneously from natural images when complex cells are driven by simple cells also learnt from the image. The notion of maximising an objective function based upon the temporal predictability of output has been progressively applied in modelling the development of invariances in the visual system.

artificial intelligence, machine learning, objective function, (16 more...)

Country: Europe > France (0.14)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.66)

Spikernels: Embedding Spiking Neurons in Inner-Product Spaces

Shpigelman, Lavi, Singer, Yoram, Paz, Rony, Vaadia, Eilon

Inner-product operators, often referred to as kernels in statistical learning, define a mapping from some input space into a feature space. The focus of this paper is the construction of biologically-motivated kernels for cortical activities. The kernels we derive, termed Spikernels, map spike count sequences into an abstract vector space in which we can perform various prediction tasks. We discuss in detail the derivation of Spikernels and describe an efficient algorithm for computing their value on any two sequences of neural population spike counts. We demonstrate the merits of our modeling approach using the Spikernel and various standard kernels for the task of predicting hand movement velocities from cortical recordings. In all of our experiments, all the kernels we tested outperform the standard scalar product used in regression, with the Spikernel consistently achieving the best performance.

artificial intelligence, kernel, machine learning, (16 more...)

Country: Asia > Middle East > Israel (0.28)

Genre: Research Report (0.46)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.35)

Sahani, Maneesh, Linden, Jennifer F.

How Linear are Auditory Cortical Responses?

By comparison to some other sensory cortices, the functional properties of cells in the primary auditory cortex are not yet well understood. Recent attempts to obtain a generalized description of auditory cortical responses have often relied upon characterization of the spectrotemporal receptive field (STRF), which amounts to a model of the stimulus-response function (SRF) that is linear in the spectrogram of the stimulus.

artificial intelligence, machine learning, predictive power, (18 more...)

Country: North America > United States > California (0.46)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.31)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)

Sanjana, Neville E., Tenenbaum, Joshua B.

Bayesian Models of Inductive Generalization

We argue that human inductive generalization is best explained in a Bayesian framework, rather than by traditional models based on similarity computations. We go beyond previous work on Bayesian concept learning by introducing an unsupervised method for constructing flexible hypothesis spaces, and we propose a version of the Bayesian Occam's razor that trades off priors and likelihoods to prevent under- or over-generalization in these flexible spaces. We analyze two published data sets on inductive reasoning as well as the results of a new behavioral study that we have carried out.

artificial intelligence, generalization, machine learning, (19 more...)

Country: North America > United States > Massachusetts (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.94)

Chen, Yiling, Councill, Isaac G.

An Introduction to Support Vector Machines: A Review

AI Magazine, Jun-15-2003

machine learning, management and information, support vector machine, (3 more...)

AI Magazine

Genre: Overview (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Chen, Yiling, Councill, Isaac G.

An Introduction to Support Vector Machines: A Review

AI Magazine, Jun-15-2003

Kernel functions can implicitly combine these two steps (nonlinear mapping and linear learning) into one step in constructing a nonlinear learning machine.
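A minimal sketch of this idea, assuming a degree-2 polynomial kernel on two-dimensional inputs (the kernel, the feature map, and the function names are illustrative choices, not taken from the reviewed book): the kernel value computed directly in input space equals the inner product of the explicitly mapped points, so the nonlinear mapping never has to be carried out.

```python
import math

def dot(u, v):
    """Ordinary inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def feature_map(x):
    """Explicit nonlinear map for the kernel k(x, y) = (x . y)^2 in 2-D."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)

def poly_kernel(x, y):
    """Same inner product, computed without building the feature vectors."""
    return dot(x, y) ** 2

# Hypothetical input points.
x, y = (1.0, 2.0), (3.0, -1.0)
explicit = dot(feature_map(x), feature_map(y))  # map first, then inner product
implicit = poly_kernel(x, y)                    # one step, via the kernel
print(explicit, implicit)
```

A linear learner that touches the data only through inner products can therefore swap `dot` for `poly_kernel` and operate in the mapped space at the cost of an input-space computation.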

artificial intelligence, linear learning machine, machine learning, (14 more...)

AI Magazine

Country: North America > United States (0.16)

Genre: Summary/Review (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)