AITopics

We present a powerful meta-clustering technique called Iterative Double Clustering (IDC). The IDC method is a natural extension of the recent Double Clustering (DC) method of Slonim and Tishby that exhibited impressive performance on text categorization tasks [12]. Using synthetically generated data we empirically find that whenever the DC procedure is successful in recovering some of the structure hidden in the data, the extended IDC procedure can incrementally compute a significantly more accurate classification. IDC is especially advantageous when the data exhibits high attribute noise. Our simulation results also show the effectiveness of IDC in text categorization problems. Surprisingly, this unsupervised procedure can be competitive with a (supervised) SVM trained with a small training set. Finally, we propose a simple and natural extension of IDC for semi-supervised and transductive learning where we are given both labeled and unlabeled examples.

accuracy, algorithm, idc, (14 more...)

Country:

North America > United States > New Jersey (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Asia > Middle East > Israel (0.04)

Genre: Research Report (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Learning Lateral Interactions for Feature Binding and Sensory Segmentation

Wersing, Heiko

We present a new approach to the supervised learning of lateral interactions for the competitive layer model (CLM) dynamic feature binding architecture. The method is based on consistency conditions, which were recently shown to characterize the attractor states of this linear threshold recurrent network. For a given set of training examples the learning problem is formulated as a convex quadratic optimization problem in the lateral interaction weights. An efficient dimension reduction of the learning problem can be achieved by using a linear superposition of basis interactions. We show the successful application of the method to a medical image segmentation problem of fluorescence microscope cell images.

interaction, lateral interaction, segmentation, (15 more...)

Country:

North America > United States > New York (0.04)
Europe > Germany > Lower Saxony > Gottingen (0.04)

Industry:

Health & Medicine > Therapeutic Area (0.47)
Education > Focused Education > Special Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Vincent, Pascal, Bengio, Yoshua

K-Local Hyperplane and Convex Distance Nearest Neighbor Algorithms

Guided by an initial idea of building a complex (non linear) decision surface with maximal local margin in input space, we give a possible geometrical intuition as to why K-Nearest Neighbor (KNN) algorithms often perform more poorly than SVMs on classification tasks. We then propose modified K-Nearest Neighbor algorithms to overcome the perceived problem. The approach is similar in spirit to Tangent Distance, but with invariances inferred from the local neighborhood rather than prior knowledge. Experimental results on real world classification tasks suggest that the modified KNN algorithms often give a dramatic improvement over standard KNN and perform as well or better than SVMs.

algorithm, decision surface, hyperplane, (13 more...)

Country:

North America > Canada > Quebec > Montreal (0.05)
North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > New York (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (1.00)

Thrun, Sebastian, Langford, John, Verma, Vandi

Risk Sensitive Particle Filters

We propose a new particle filter that incorporates a model of costs when generating particles. The approach is motivated by the observation that the costs of accidentally not tracking hypotheses might be significant in some areas of state space, and next to irrelevant in others. By incorporating a cost model into particle filtering, states that are more critical to the system performance are more likely to be tracked. Automatic calculation of the cost model is implemented using an MDP value function calculation that estimates the value of tracking a particular state. Experiments in two mobile robot domains illustrate the appropriateness of the approach.

particle, particle filter, risk function, (14 more...)

Country: North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.79)

Szummer, Martin, Jaakkola, Tommi

Partially labeled classification with Markov random walks

To classify a large number of unlabeled examples we combine a limited number of labeled examples with a Markov random walk representation over the unlabeled examples.

markov random walk, probability, representation, (15 more...)

Country: North America > United States > Massachusetts > Middlesex County > Cambridge (0.15)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Shimodaira, Hiroshi, Noma, Ken-ichi, Nakai, Mitsuru, Sagayama, Shigeki

Dynamic Time-Alignment Kernel in Support Vector Machine

A new class of Support Vector Machine (SVM) that is applicable to sequential-pattern recognition such as speech recognition is developed by incorporating an idea of nonlinear time alignment into the kernel function. Since the time-alignment operation of sequential pattern is embedded in the new kernel function, standard SVM training and classification algorithms can be employed without further modifications. The proposed SVM (DTAK-SVM) is evaluated in speaker-dependent speech recognition experiments of hand-segmented phoneme recognition. Preliminary experimental results show comparable recognition performance with hidden Markov models (HMMs).

kernel, recognition, svm, (14 more...)

Country: Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.05)

Genre: Research Report (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.90)

Segal, Eran, Koller, Daphne, Ormoneit, Dirk

Probabilistic Abstraction Hierarchies

Many domains are naturally organized in an abstraction hierarchy or taxonomy, where the instances in "nearby" classes in the taxonomy are similar. In this paper, we provide a general probabilistic framework for clustering data into a set of classes organized as a taxonomy, where each class is associated with a probabilistic model from which the data was generated. The clustering algorithm simultaneously optimizes three things: the assignment of data instances to clusters, the models associated with the clusters, and the structure of the abstraction hierarchy. A unique feature of our approach is that it utilizes global optimization algorithms for both of the last two steps, reducing the sensitivity to noise and the propensity to local maxima that are characteristic of algorithms such as hierarchical agglomerative clustering that only take local steps. We provide a theoretical analysis for our algorithm, showing that it converges to a local maximum of the joint likelihood of model and data.

algorithm, cpm, hierarchy, (16 more...)

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.05)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.68)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.95)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Covariance Kernels from Bayesian Generative Models

Seeger, Matthias

We propose the framework of mutual information kernels for learning covariance kernels, as used in Support Vector machines and Gaussian process classifiers, from unlabeled task data using Bayesian techniques. We describe an implementation of this framework which uses variational Bayesian mixtures of factor analyzers in order to attack classification problems in high-dimensional spaces where labeled data is sparse, but unlabeled data is abundant.

approximation, information, kernel, (13 more...)

Country:

North America > United States > California > Santa Cruz County > Santa Cruz (0.04)
Europe > United Kingdom (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.54)

Saul, Lawrence K., Lee, Daniel D.

Multiplicative Updates for Classification by Mixture Models

We investigate a learning algorithm for the classification of nonnegative data by mixture models. Multiplicative update rules are derived that directly optimize the performance of these models as classifiers. The update rules have a simple closed form and an intuitive appeal. Our algorithm retains the main virtues of the Expectation-Maximization (EM) algorithm--its guarantee of monotonic improvement, and its absence of tuning parameters--with the added advantage of optimizing a discriminative objective function. The algorithm reduces as a special case to the method of generalized iterative scaling for log-linear models. The learning rate of the algorithm is controlled by the sparseness of the training data. We use the method of nonnegative matrix factorization (NMF) to discover sparse distributed representations of the data. This form of feature selection greatly accelerates learning and makes the algorithm practical on large problems. Experiments show that discriminatively trained mixture models lead to much better classification than comparably sized models trained by EM.

algorithm, conditional log likelihood, nullnullnull, (14 more...)

Country:

North America > Canada > Quebec > Montreal (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
North America > United States > Massachusetts > Plymouth County > Norwell (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.47)

Rasmussen, Carl E., Ghahramani, Zoubin

Infinite Mixtures of Gaussian Process Experts

We present an extension to the Mixture of Experts (ME) model, where the individual experts are Gaussian Process (GP) regression models. Using an input-dependent adaptation of the Dirichlet Process, we implement a gating network for an infinite number of Experts. Inference in this model may be done efficiently using a Markov Chain relying on Gibbs sampling. The model allows the effective covariance function to vary with the inputs, and may handle large datasets - thus potentially overcoming two of the biggest hurdles with GP models.

covariance function, hyperparameter, indicator variable, (15 more...)

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.50)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.50)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)