AITopics

Guided by an initial idea of building a complex (nonlinear) decision surface with maximal local margin in input space, we give a possible geometrical intuition as to why K-Nearest Neighbor (KNN) algorithms often perform more poorly than SVMs on classification tasks. We then propose modified K-Nearest Neighbor algorithms to overcome the perceived problem. The approach is similar in spirit to Tangent Distance, but with invariances inferred from the local neighborhood rather than prior knowledge. Experimental results on real-world classification tasks suggest that the modified KNN algorithms often give a dramatic improvement over standard KNN and perform as well as or better than SVMs.
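As a rough illustration of the idea, the sketch below classifies a query point by its distance to a local hyperplane (the affine hull of the K nearest training points of each class) rather than by its distance to individual neighbors. The abstract does not spell out the modification, so this construction and the names `hyperplane_distance` and `local_hyperplane_knn` are assumptions made for illustration, not the authors' implementation.

```python
import math


def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def hyperplane_distance(x, neighbors):
    """Distance from x to the affine hull of `neighbors` (assumed construction)."""
    origin = neighbors[0]
    basis = []
    # Gram-Schmidt: build an orthonormal basis of the directions spanned
    # by the neighbors, measured from the first neighbor.
    for nb in neighbors[1:]:
        v = [a - b for a, b in zip(nb, origin)]
        for u in basis:
            c = _dot(v, u)
            v = [vi - c * ui for vi, ui in zip(v, u)]
        norm = math.sqrt(_dot(v, v))
        if norm > 1e-12:  # skip directions that are already spanned
            basis.append([vi / norm for vi in v])
    # Remove from (x - origin) everything that lies inside the hull.
    residual = [a - b for a, b in zip(x, origin)]
    for u in basis:
        c = _dot(residual, u)
        residual = [ri - c * ui for ri, ui in zip(residual, u)]
    return math.sqrt(_dot(residual, residual))


def local_hyperplane_knn(x, points, labels, k=2):
    """Return the class whose local hyperplane lies closest to x."""
    best_label, best_dist = None, math.inf
    for label in sorted(set(labels)):
        same_class = [p for p, y in zip(points, labels) if y == label]
        same_class.sort(key=lambda p: math.dist(p, x))
        d = hyperplane_distance(x, same_class[:k])
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label
```

Using the hull of the neighbors instead of the neighbors themselves fills in the "holes" between sparse training points, which is one plausible reading of the geometric intuition described above.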

algorithm, decision surface, hyperplane

Country:

North America > Canada > Quebec > Montreal (0.05)
North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > New York (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (1.00)

Tsuda, Koji, Kawanabe, Motoaki, Rätsch, Gunnar, Sonnenburg, Sören, Müller, Klaus-Robert

A New Discriminative Kernel From Probabilistic Models

Recently, Jaakkola and Haussler proposed a method for constructing kernel functions from probabilistic models. Their so-called "Fisher kernel" has been combined with discriminative classifiers such as SVM and applied successfully in e.g.

fisher kernel, kernel, top kernel

Country:

Europe > Germany > Brandenburg > Potsdam (0.04)
Oceania > Australia > Australian Capital Territory > Canberra (0.04)
North America > United States > California > San Diego County > San Diego (0.04)

Genre:

Research Report > New Finding (0.68)
Research Report > Experimental Study (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.95)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.69)

Learning Discriminative Feature Transforms to Low Dimensions in Low Dimensions

Torkkola, Kari

The marriage of Renyi entropy with Parzen density estimation has been shown to be a viable tool in learning discriminative feature transforms. However, it suffers from computational complexity proportional to the square of the number of samples in the training data. This sets a practical limit to using large databases. We suggest immediate divorce of the two methods and remarriage of Renyi entropy with a semi-parametric density estimation method, such as a Gaussian Mixture Model (GMM). This allows all of the computation to take place in the low-dimensional target space, and it reduces the computational complexity to be proportional to the square of the number of components in the mixtures. Furthermore, a convenient extension to Hidden Markov Models as commonly used in speech recognition becomes possible.
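The quadratic cost in the number of mixture components can be seen from the closed form of Renyi's quadratic entropy for a Gaussian mixture: the integral of the squared density reduces to a double sum over component pairs of w_i w_j N(mu_i - mu_j; 0, Sigma_i + Sigma_j). The following is a minimal sketch of that entropy term only, assuming diagonal covariances to keep the code short. It is not the full transform-learning procedure, and the function names are illustrative.

```python
import math


def gaussian_at(diff, variances):
    """Zero-mean Gaussian density with diagonal covariance, evaluated at `diff`."""
    log_p = 0.0
    for d, var in zip(diff, variances):
        log_p += -0.5 * (math.log(2.0 * math.pi * var) + d * d / var)
    return math.exp(log_p)


def renyi_quadratic_entropy(weights, means, variances):
    """H2 = -log of the integral of p(x)^2 for a diagonal-covariance mixture.

    Assumption for this sketch: `variances[i]` holds the per-dimension
    variances of component i. The double loop visits every component pair,
    which is where the squared dependence on the number of components
    comes from.
    """
    total = 0.0
    for wi, mi, vi in zip(weights, means, variances):
        for wj, mj, vj in zip(weights, means, variances):
            diff = [a - b for a, b in zip(mi, mj)]
            summed = [a + b for a, b in zip(vi, vj)]
            total += wi * wj * gaussian_at(diff, summed)
    return -math.log(total)
```

Because the sum runs over components rather than training samples, its cost does not grow with the size of the database once the mixture has been fitted.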

feature transform, mutual information, output space

Country:

North America > United States > California > Santa Clara County > Stanford (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > New York > Monroe County > Rochester (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.90)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.57)

Thrun, Sebastian, Langford, John, Verma, Vandi

Risk Sensitive Particle Filters

We propose a new particle filter that incorporates a model of costs when generating particles. The approach is motivated by the observation that the costs of accidentally not tracking hypotheses might be significant in some areas of state space, and next to irrelevant in others. By incorporating a cost model into particle filtering, states that are more critical to the system performance are more likely to be tracked. Automatic calculation of the cost model is implemented using an MDP value function calculation that estimates the value of tracking a particular state. Experiments in two mobile robot domains illustrate the appropriateness of the approach.
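One way to read this is as a change to the resampling step: particles are drawn in proportion to their posterior weight times a risk value, and the drawn particles are reweighted by the inverse risk so that weighted estimates still reflect the posterior. The sketch below follows that assumption. The `risk` function is supplied by hand here, whereas the abstract derives it from an MDP value function, and all names are illustrative.

```python
import random


def risk_sensitive_resample(particles, weights, risk, rng):
    """Resample so that high-risk states keep more particles.

    Assumptions for this sketch: `risk(state)` returns a strictly positive
    number, and `weights` are the current importance weights.
    """
    risks = [risk(p) for p in particles]
    # Sampling distribution: posterior weight scaled by risk.
    proposal = [w * r for w, r in zip(weights, risks)]
    chosen = rng.choices(range(len(particles)), weights=proposal, k=len(particles))
    new_particles = [particles[i] for i in chosen]
    # Undo the risk bias in the weights so expectations stay consistent.
    inverse = [1.0 / risks[i] for i in chosen]
    total = sum(inverse)
    new_weights = [v / total for v in inverse]
    return new_particles, new_weights
```

With a risk of 10 for a critical state and 1 elsewhere, most particles end up in the critical state, while the corrected weights keep the weighted state estimate close to its pre-resampling value.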

particle, particle filter, risk function

Country: North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.79)

Szummer, Martin, Jaakkola, Tommi

Partially labeled classification with Markov random walks

To classify a large number of unlabeled examples we combine a limited number of labeled examples with a Markov random walk representation over the unlabeled examples.
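A simplified sketch of such a representation, assuming Gaussian edge weights between points and a fixed number of steps t: each point receives the class whose labeled points send the most t-step probability mass to it. The paper estimates label parameters more carefully. The voting rule, the parameters `sigma` and `t`, and the function names here are assumptions for illustration.

```python
import math


def transition_matrix(points, sigma):
    """Row-normalized Gaussian affinities: one-step walk probabilities."""
    rows = []
    for p in points:
        w = [math.exp(-math.dist(p, q) ** 2 / (2.0 * sigma ** 2)) for q in points]
        s = sum(w)
        rows.append([v / s for v in w])
    return rows


def mat_mul(a, b):
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def random_walk_classify(points, labeled, t=3, sigma=0.5):
    """`labeled` maps a point index to its class label (assumed interface)."""
    one_step = transition_matrix(points, sigma)
    t_step = one_step
    for _ in range(t - 1):
        t_step = mat_mul(t_step, one_step)
    predictions = []
    for k in range(len(points)):
        scores = {}
        # t_step[i][k] is the probability of reaching k from i in t steps.
        for i, label in labeled.items():
            scores[label] = scores.get(label, 0.0) + t_step[i][k]
        predictions.append(max(scores, key=scores.get))
    return predictions
```

The number of steps controls how far label information spreads: a small t keeps it within tight clusters, while a large t lets the walk mix across the whole data set.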

markov random walk, probability, representation

Country: North America > United States > Massachusetts > Middlesex County > Cambridge (0.15)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Sykacek, Peter, Roberts, Stephen J.

Bayesian time series classification

This paper proposes an approach to classifying adjacent segments of a time series as belonging to one of several classes. We use a hierarchical model that consists of a feature extraction stage and a generative classifier built on top of these features. Such two-stage approaches are often used in signal and image processing. The novel part of our work is that we link these stages probabilistically by using a latent feature space. Using one joint model is a Bayesian requirement, and it has the advantage of fusing information according to its certainty.

coefficient, model order, probability

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
North America > United States > New York (0.05)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
Europe > Germany > Baden-Württemberg > Karlsruhe Region > Weinheim (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.95)

Slonim, Noam, Friedman, Nir, Tishby, Naftali

Agglomerative Multivariate Information Bottleneck

The information bottleneck method is an unsupervised model independent data organization technique. Given a joint distribution p(A, B), this method constructs a new variable T that extracts partitions, or clusters, over the values of A that are informative about B. In a recent paper, we introduced a general principled framework for multivariate extensions of the information bottleneck method that allows us to consider multiple systems of data partitions that are interrelated. In this paper, we present a new family of simple agglomerative algorithms to construct such systems of interrelated clusters. We analyze the behavior of these algorithms and apply them to several real-life datasets.
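For the basic single-variable case, a greedy agglomerative procedure can be sketched as follows: start with one cluster per value of A and repeatedly merge the pair whose merger loses the least information about B. The merge cost used below, the merged cluster mass times a weighted Jensen-Shannon divergence between the two conditional distributions, equals the resulting drop in I(T; B). This covers only the single-variable setting, not the multivariate framework of the paper. Function names are illustrative, and every value of A is assumed to have nonzero probability.

```python
import math


def _kl(p, q):
    return sum(a * math.log(a / b) for a, b in zip(p, q) if a > 0)


def merge_cost(mass_i, cond_i, mass_j, cond_j):
    """Information about B lost by merging two clusters of A-values."""
    mass = mass_i + mass_j
    mixed = [(mass_i * a + mass_j * b) / mass for a, b in zip(cond_i, cond_j)]
    return mass_i * _kl(cond_i, mixed) + mass_j * _kl(cond_j, mixed)


def agglomerative_ib(joint, n_clusters):
    """`joint[a][b]` holds p(A=a, B=b); returns clusters as lists of A indices."""
    clusters = [[a] for a in range(len(joint))]
    masses = [sum(row) for row in joint]
    conds = [[v / m for v in row] for row, m in zip(joint, masses)]
    while len(clusters) > n_clusters:
        pairs = [(i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        # Greedy step: pick the cheapest pair to merge.
        i, j = min(pairs, key=lambda ij: merge_cost(
            masses[ij[0]], conds[ij[0]], masses[ij[1]], conds[ij[1]]))
        mass = masses[i] + masses[j]
        conds[i] = [(masses[i] * a + masses[j] * b) / mass
                    for a, b in zip(conds[i], conds[j])]
        masses[i] = mass
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j], masses[j], conds[j]
    return clusters
```

Values of A with identical conditionals p(B | a) merge at zero cost, so they are always grouped first.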

algorithm, information, procedure

Country:

North America > United States (0.14)
Asia > Middle East > Israel > Jerusalem District > Jerusalem (0.04)

Industry: Health & Medicine (0.70)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.68)

Shimodaira, Hiroshi, Noma, Ken-ichi, Nakai, Mitsuru, Sagayama, Shigeki

Dynamic Time-Alignment Kernel in Support Vector Machine

A new class of Support Vector Machine (SVM) that is applicable to sequential-pattern recognition such as speech recognition is developed by incorporating the idea of nonlinear time alignment into the kernel function. Since the time-alignment operation on sequential patterns is embedded in the new kernel function, standard SVM training and classification algorithms can be employed without further modifications. The proposed SVM (DTAK-SVM) is evaluated in speaker-dependent speech recognition experiments of hand-segmented phoneme recognition. Preliminary experimental results show recognition performance comparable to that of hidden Markov models (HMMs).
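Dynamic-programming time alignment inside a kernel can be sketched as below: a frame-level kernel is accumulated along the best alignment path and normalized by the total path weight. The RBF frame kernel, the step weights (1 for horizontal and vertical moves, 2 for diagonal moves), and the normalization by the sum of the two sequence lengths are assumptions chosen for illustration and may differ from the paper's exact formulation.

```python
import math


def frame_kernel(x, v, gamma=1.0):
    """RBF similarity between two frames (assumed choice of frame kernel)."""
    return math.exp(-gamma * math.dist(x, v) ** 2)


def time_alignment_kernel(seq_x, seq_v, gamma=1.0):
    """Best-path accumulated frame similarity, normalized by path weight."""
    n, m = len(seq_x), len(seq_v)
    g = [[-math.inf] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            k = frame_kernel(seq_x[i], seq_v[j], gamma)
            if i == 0 and j == 0:
                g[i][j] = 2.0 * k
                continue
            candidates = []
            if i > 0:  # advance in seq_x only
                candidates.append(g[i - 1][j] + k)
            if j > 0:  # advance in seq_v only
                candidates.append(g[i][j - 1] + k)
            if i > 0 and j > 0:  # advance in both sequences
                candidates.append(g[i - 1][j - 1] + 2.0 * k)
            g[i][j] = max(candidates)
    # With these step weights every path has total weight n + m.
    return g[n - 1][m - 1] / (n + m)
```

The returned value can be passed to an SVM as a precomputed kernel entry for a pair of variable-length sequences. Whether the resulting Gram matrix is positive semidefinite is not checked in this sketch.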

kernel, recognition, svm

Country: Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.05)

Genre: Research Report (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.90)

Segal, Eran, Koller, Daphne, Ormoneit, Dirk

Probabilistic Abstraction Hierarchies

Many domains are naturally organized in an abstraction hierarchy or taxonomy, where the instances in "nearby" classes in the taxonomy are similar. In this paper, we provide a general probabilistic framework for clustering data into a set of classes organized as a taxonomy, where each class is associated with a probabilistic model from which the data was generated. The clustering algorithm simultaneously optimizes three things: the assignment of data instances to clusters, the models associated with the clusters, and the structure of the abstraction hierarchy. A unique feature of our approach is that it utilizes global optimization algorithms for both of the last two steps, reducing the sensitivity to noise and the propensity to local maxima that are characteristic of algorithms such as hierarchical agglomerative clustering that only take local steps. We provide a theoretical analysis for our algorithm, showing that it converges to a local maximum of the joint likelihood of model and data.

algorithm, cpm, hierarchy

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.05)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.68)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.95)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Covariance Kernels from Bayesian Generative Models

Seeger, Matthias

We propose the framework of mutual information kernels for learning covariance kernels, as used in Support Vector Machines and Gaussian process classifiers, from unlabeled task data using Bayesian techniques. We describe an implementation of this framework which uses variational Bayesian mixtures of factor analyzers in order to attack classification problems in high-dimensional spaces where labeled data is sparse, but unlabeled data is abundant.

approximation, information, kernel

Country:

North America > United States > California > Santa Cruz County > Santa Cruz (0.04)
Europe > United Kingdom (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.54)