AITopics

We present a simple sparse greedy technique to approximate the maximum a posteriori estimate of Gaussian Processes with much improved scaling behaviour in the sample size m.

approximation, log posterior, matrix, (12 more...)

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Oceania > Australia > Australian Capital Territory > Canberra (0.05)
North America > United States > New York (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

Shriki, Oren, Sompolinsky, Haim, Lee, Daniel D.

An Information Maximization Approach to Overcomplete and Recurrent Representations

The principle of maximizing mutual information is applied to learning overcomplete and recurrent representations. The underlying model consists of a network of input units driving a larger number of output units with recurrent interactions. In the limit of zero noise, the network is deterministic and the mutual information can be related to the entropy of the output units.

information, mutual information, representation, (13 more...)

Country:

North America > United States > New York (0.04)
Asia > Middle East > Israel > Jerusalem District > Jerusalem (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Minka, Thomas P.

Automatic Choice of Dimensionality for PCA

A central issue in principal component analysis (PCA) is choosing the number of principal components to be retained. By interpreting PCA as density estimation, we show how to use Bayesian model selection to estimate the true dimensionality of the data. The resulting estimate is simple to compute yet guaranteed to pick the correct dimensionality, given enough data. The estimate involves an integral over the Stiefel manifold of k-frames, which is difficult to compute exactly. But after choosing an appropriate parameterization and applying Laplace's method, an accurate and practical estimator is obtained. In simulations, it is convincingly better than cross-validation and other proposed algorithms, plus it runs much faster.

dimensionality, laplace, matrix, (13 more...)

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > Canada > Ontario > Toronto (0.14)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.92)

Merwe, Rudolph van der, Doucet, Arnaud, Freitas, Nando de, Wan, Eric A.

The Unscented Particle Filter

In this paper, we propose a new particle filter based on sequential importance sampling. The algorithm uses a bank of unscented filters to obtain the importance proposal distribution. This proposal has two very "nice" properties. Firstly, it makes efficient use of the latest available information and, secondly, it can have heavy tails. As a result, we find that the algorithm outperforms standard particle filtering and other nonlinear filtering methods very substantially.
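The unscented filters mentioned in the abstract above rest on the unscented transform, which pushes a small set of deterministically chosen "sigma points" through a nonlinear function and re-estimates mean and covariance from the results. The following is a minimal sketch of that building block only, not the authors' full particle filter. The function names, the pure-Python Cholesky helper, and the default value of kappa are assumptions made for illustration.

```python
import math


def cholesky(a):
    """Return lower-triangular L with L L^T = a (a: symmetric positive-definite list of lists)."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def unscented_transform(mean, cov, f, kappa=1.0):
    """Approximate the mean and covariance of f(x) for x ~ N(mean, cov).

    kappa is the usual scaling parameter; its default here is an
    illustrative assumption, not a value taken from the paper.
    """
    n = len(mean)
    scale = n + kappa
    # Matrix square root of (n + kappa) * cov.
    low = cholesky([[scale * c for c in row] for row in cov])

    # Sigma points: the mean itself, plus mean +/- each column of the square root.
    points = [list(mean)]
    for j in range(n):
        col = [low[i][j] for i in range(n)]
        points.append([m + c for m, c in zip(mean, col)])
        points.append([m - c for m, c in zip(mean, col)])
    weights = [kappa / scale] + [0.5 / scale] * (2 * n)

    # Propagate every sigma point through f and take weighted moments.
    ys = [f(p) for p in points]
    d = len(ys[0])
    y_mean = [sum(w * y[i] for w, y in zip(weights, ys)) for i in range(d)]
    y_cov = [
        [
            sum(w * (y[i] - y_mean[i]) * (y[j] - y_mean[j]) for w, y in zip(weights, ys))
            for j in range(d)
        ]
        for i in range(d)
    ]
    return y_mean, y_cov


if __name__ == "__main__":
    # Hypothetical range-and-bearing style measurement of a 2-D position.
    def measure(x):
        return [math.hypot(x[0], x[1]), math.atan2(x[1], x[0])]

    print(unscented_transform([1.0, 2.0], [[0.1, 0.0], [0.0, 0.1]], measure))
```

For an affine function the transform reproduces the exact mean and covariance, which is a convenient sanity check. In a particle filter of the kind described above, one such filter per particle would supply the mean and covariance of that particle's proposal distribution.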

algorithm, particle filter, proposal distribution, (14 more...)

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.15)
North America > United States > Oregon > Multnomah County > Portland (0.05)
North America > United States > New Jersey (0.04)
North America > United States > Florida > Orange County > Orlando (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)

Mangasarian, Olvi L., Musicant, David R.

Active Support Vector Machine Classification

Classification is achieved by a linear or nonlinear separating surface in the input space of the dataset. In this work we propose a very fast simple algorithm, based on an active set strategy for solving quadratic programs with bounds [18]. The algorithm is capable of accurately solving problems with millions of points and requires nothing more complicated than a commonly available linear equation solver [17, 1, 6] for a typically small (100) dimensional input space of the problem. Key to our approach are the following two changes to the standard linear SVM: 1. Maximize the margin (distance) between the parallel separating planes with respect to both orientation (w) as well as location relative to the origin (b).

algorithm, matrix, support vector machine, (13 more...)

Country:

North America > United States > Wisconsin > Dane County > Madison (0.28)
North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.05)
(8 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)

Lu, Wei, Rajapakse, Jagath C.

Constrained Independent Component Analysis

The paper presents a novel technique of constrained independent component analysis (CICA) to introduce constraints into the classical ICA and solve the constrained optimization problem by using Lagrange multiplier methods. This paper shows that CICA can be used to order the resulting independent components in a specific manner and normalize the demixing matrix in the signal separation procedure. It can systematically eliminate the ICA's indeterminacy on permutation and dilation. The experiments demonstrate the use of CICA in ordering of independent components while providing normalized demixing processes.

Keywords: Independent component analysis, constrained independent component analysis, constrained optimization, Lagrange multiplier methods

1 Introduction

Independent component analysis (ICA) is a technique to transform a multivariate random signal into a signal with components that are mutually independent in the complete statistical sense [1].

constraint, independent component, lagrange multiplier method, (13 more...)

Country:

North America > United States > New York (0.04)
Asia > Singapore (0.04)

Genre: Research Report (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.77)

Lodhi, Huma, Shawe-Taylor, John, Cristianini, Nello, Watkins, Christopher J. C. H.

Text Classification using String Kernels

We introduce a novel kernel for comparing two text documents. The kernel is an inner product in the feature space consisting of all subsequences of length k. A subsequence is any ordered sequence of k characters occurring in the text, though not necessarily contiguously. The subsequences are weighted by an exponentially decaying factor of their full length in the text, hence emphasising those occurrences which are close to contiguous. A direct computation of this feature vector would involve a prohibitive amount of computation even for modest values of k, since the dimension of the feature space grows exponentially with k. The paper describes how, despite this fact, the inner product can be efficiently evaluated by a dynamic programming technique.
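The dynamic programming idea in the abstract above can be sketched as follows. This is a minimal illustration, assuming the standard prefix recursion for subsequence kernels with a decay factor lam in (0, 1]. The function name, argument names, and table layout are choices made here and are not taken from the paper, and the kernel is left unnormalised.

```python
def subsequence_kernel(s, t, k, lam):
    """Unnormalised string subsequence kernel of order k with decay lam.

    Each common subsequence u of length k contributes
    lam ** (span of u in s + span of u in t), where the span is the length
    of the stretch of text from the first to the last matched character.
    Runs in O(k * len(s) * len(t)) time.
    """
    n, m = len(s), len(t)
    # kp[i][a][b] holds the auxiliary kernel of order i on prefixes s[:a], t[:b];
    # it additionally charges lam for every character between a match and the
    # end of each prefix. Order 0 is 1 for every pair of prefixes.
    kp = [[[0.0] * (m + 1) for _ in range(n + 1)] for _ in range(k)]
    for a in range(n + 1):
        for b in range(m + 1):
            kp[0][a][b] = 1.0

    for i in range(1, k):
        for a in range(i, n + 1):
            running = 0.0  # partial sum over positions in t, decayed as b advances
            for b in range(i, m + 1):
                running *= lam
                if s[a - 1] == t[b - 1]:
                    running += lam * lam * kp[i - 1][a - 1][b - 1]
                kp[i][a][b] = lam * kp[i][a - 1][b] + running

    # Close every length-k subsequence at a matching pair of final characters.
    total = 0.0
    for a in range(k, n + 1):
        for b in range(k, m + 1):
            if s[a - 1] == t[b - 1]:
                total += lam * lam * kp[k - 1][a - 1][b - 1]
    return total


if __name__ == "__main__":
    print(subsequence_kernel("cat", "car", 2, 0.5))
    print(subsequence_kernel("cat", "cat", 2, 0.5))
```

With k = 2, the strings "cat" and "car" share only the subsequence "ca", which spans two characters in each string, so the kernel value is lam ** 4. In practice the value is often divided by the square root of the product of the two self-kernels so that documents of different lengths are comparable.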

computation, feature space, kernel, (12 more...)

Country:

North America > United States > California (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Industry: Health & Medicine (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Kjems, Ulrik, Hansen, Lars Kai, Strother, Stephen C.

Generalizable Singular Value Decomposition for Ill-posed Datasets

So which of the two variances is "correct"? From a modelling point of view, the variance from the test example tells us the true story, so the training set variance should be regarded as biased.

projection, singular value decomposition, variance, (13 more...)

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.04)
Europe > Germany (0.04)
Europe > Denmark > Capital Region > Kongens Lyngby (0.04)
Asia > Middle East > Jordan (0.04)

Industry: Health & Medicine (0.96)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.31)

Højen-Sørensen, Pedro A. d. F. R., Winther, Ole, Hansen, Lars Kai

Ensemble Learning and Linear Response Theory for ICA

The naive mean-field approach fails in this case, whereas linear response theory, which gives an improved estimate of covariances, is very efficient. The examples given are for sources without temporal correlations.

equation, noise level, temporal correlation, (10 more...)

Country:

North America > United States > Massachusetts > Middlesex County > Reading (0.04)
Europe > Sweden > Skåne County > Lund (0.04)
Europe > Denmark > Capital Region > Kongens Lyngby (0.04)
Asia > Middle East > Jordan (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.49)

Hochreiter, Sepp, Mozer, Michael C.

Beyond Maximum Likelihood and Density Estimation: A Sample-Based Criterion for Unsupervised Learning of Complex Models

Two well-known classes of unsupervised procedures that can be cast in this manner are generative and recoding models. In a generative unsupervised framework, the environment generates training examples, which we will refer to as observations, by sampling from one distribution; the other distribution is embodied in the model. Examples of generative frameworks are mixtures of Gaussians (MoG) [2], factor analysis [4], and Boltzmann machines [8]. In the recoding unsupervised framework, the model transforms points from an observation space to an output space, and the output distribution is compared either to a reference distribution or to a distribution derived from the output distribution. An example is independent component analysis (ICA) [11], a method that discovers a representation of vector-valued observations in which the statistical dependence among the vector elements in the output space is minimized.

nonlinear model, particle, sample-based approach, (11 more...)

Country:

North America > United States > Colorado > Boulder County > Boulder (0.14)
North America > Canada > Ontario > Toronto (0.14)
Europe > France (0.05)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.41)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.41)