AITopics

This paper investigates a new machine learning strategy called translated learning. Unlike many previous learning tasks, we focus on how to use labeled data from one feature space to enhance the classification of other entirely different learning spaces. For example, we might wish to use labeled text data to help learn a model for classifying image data, when the labeled images are difficult to obtain. An important aspect of translated learning is to build a "bridge" to link one feature space (known as the "source space") to another space (known as the "target space") through a translator in order to migrate the knowledge from source to target. The translated learning solution uses a language model to link the class labels to the features in the source spaces, which in turn is translated to the features in the target spaces. Finally, this chain of linkages is completed by tracing back to the instances in the target spaces. We show that this path of linkage can be modeled using a Markov chain and risk minimization. Through experiments on the text-aided image classification and cross-language classification tasks, we demonstrate that our translated learning framework can greatly outperform many state-of-the-art baseline methods.

classification, different feature space, feature space, (17 more...)

Country:

Asia > China > Shanghai > Shanghai (0.04)
North America > United States > Wisconsin > Dane County > Madison (0.04)
Asia > China > Hong Kong > Kowloon (0.04)

Genre:

Research Report (0.87)
Overview (0.68)

Industry: Education (0.87)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.52)
Information Technology > Artificial Intelligence > Vision > Image Understanding (0.36)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.35)

Shpigelman, Lavi, Lalazar, Hagai, Vaadia, Eilon

Kernel-ARMA for Hand Tracking and Brain-Machine interfacing During 3D Motor Control

Using machine learning algorithms to decode intended behavior from neural activity serves a dual purpose. First, these tools can be used to allow patients to interact with their environment through a Brain-Machine Interface (BMI). Second, analysis of the characteristics of such methods can reveal the significance of various features of neural activity, stimuli and responses to the encoding-decoding task. In this study we adapted, implemented and tested a machine learning method, called Kernel Auto-Regressive Moving Average (KARMA), for the task of inferring movements from neural activity in primary motor cortex. Our version of this algorithm is used in an on-line learning setting and is updated when feedback from the last inferred sequence become available. We first used it to track real hand movements executed by a monkey in a standard 3D motor control task. We then applied it in a closed-loop BMI setting to infer intended movement, while arms were restrained, allowing a monkey to perform the task using the BMI alone. KARMA is a recurrent method that learns a nonlinear model of output dynamics. It uses similarity functions (termed kernels) to compare between inputs. These kernels can be structured to incorporate domain knowledge into the method. We compare KARMA to various state-of-the-art methods by evaluating tracking performance and present results from the KARMA based BMI experiments.

algorithm, karma, neural activity, (16 more...)

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Asia > Middle East > Israel > Jerusalem District > Jerusalem (0.04)

Genre: Research Report > New Finding (0.66)

Industry:

Education > Educational Setting > Online (0.54)
Health & Medicine > Therapeutic Area (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

A mixture model for the evolution of gene expression in non-homogeneous datasets

Quon, Gerald, Teh, Yee W., Chan, Esther, Hughes, Timothy, Brudno, Michael, Morris, Quaid D.

We address the challenge of assessing conservation of gene expression in complex, non-homogeneous datasets. Recent studies have demonstrated the success of probabilistic models in studying the evolution of gene expression in simple eukaryotic organisms such as yeast, for which measurements are typically scalar and independent. Models capable of studying expression evolution in much more complex organisms such as vertebrates are particularly important given the medical and scientific interest in species such as human and mouse. We present a statistical model that makes a number of significant extensions to previous models to enable characterization of changes in expression among highly complex organisms. We demonstrate the efficacy of our method on a microarray dataset containing diverse tissues from multiple vertebrate species. We anticipate that the model will be invaluable in the study of gene expression patterns in other diverse organisms as well, such as worms and insects.

evolution, expression, expression level, (17 more...)

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > Massachusetts (0.04)
Europe > United Kingdom > England > Tyne and Wear > Sunderland (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)

Genre: Research Report > New Finding (0.94)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Biomedical Informatics > Translational Bioinformatics (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.67)

Ponzi, Adam, Wickens, Jeff

Cell Assemblies in Large Sparse Inhibitory Networks of Biologically Realistic Spiking Neurons

Cell assemblies exhibiting episodes of recurrent coherent activity have been observed in several brain regions including the striatum and hippocampus CA3. Here we address the question of how coherent dynamically switching assemblies appear in large networks of biologically realistic spiking neurons interacting deterministically. We show by numerical simulations of large asymmetric inhibitory networks with fixed external excitatory drive that if the network has intermediate to sparse connectivity, the individual cells are in the vicinity of a bifurcation between a quiescent and firing state and the network inhibition varies slowly on the spiking timescale, then cells form assemblies whose members show strong positive correlation, while members of different assemblies show strong negative correlation. We show that cells and assemblies switch between firing and quiescent states with time durations consistent with a power-law. Our results are in good qualitative agreement with the experimental studies. The deterministic dynamical behaviour is related to winner-less competition shown in small closed loop inhibitory networks with heteroclinic cycles connecting saddle-points.

assembly, cell assembly, isi distribution, (14 more...)

Country:

Asia > Japan > Kyūshū & Okinawa > Okinawa (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.86)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.87)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.72)
Information Technology > Communications > Networks (0.67)

Onken, Arno, Grünewälder, Steffen, Munk, Matthias, Obermayer, Klaus

Modeling Short-term Noise Dependence of Spike Counts in Macaque Prefrontal Cortex

Correlations between spike counts are often used to analyze neural coding. The noise is typically assumed to be Gaussian. Yet, this assumption is often inappropriate, especially for low spike counts. In this study, we present copulas as an alternative approach. With copulas it is possible to use arbitrary marginal distributions such as Poisson or negative binomial that are better suited for modeling noise distributions of spike counts. Furthermore, copulas place a wide range of dependence structures at the disposal and can be used to analyze higher order interactions. We develop a framework to analyze spike count data by means of copulas. Methods for parameter inference based on maximum likelihood estimates and for computation of Shannon entropy are provided. We apply the method to our data recorded from macaque prefrontal cortex. The data analysis leads to three significant findings: (1) copula-based distributions provide better fits than discretized multivariate normal distributions; (2) negative binomial margins fit the data better than Poisson margins; and (3) a dependence model that includes only pairwise interactions overestimates the information entropy by at least 19% compared to the model with higher order interactions.

information, negative binomial margin, spike count, (14 more...)

Country:

South America > Colombia (0.04)
North America > United States > New York (0.04)
North America > United States > Illinois (0.04)
(3 more...)

Genre: Research Report > New Finding (0.67)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.35)

Nair, Vinod, Hinton, Geoffrey E.

Implicit Mixtures of Restricted Boltzmann Machines

We present a mixture model whose components are Restricted Boltzmann Machines (RBMs). This possibility has not been considered before because computing the partition function of an RBM is intractable, which appears to make learning a mixture of RBMs intractable as well. Surprisingly, when formulated as a third-order Boltzmann machine, such a mixture model can be learned tractably using contrastive divergence. The energy function of the model captures three-way interactions among visible units, hidden units, and a single hidden multinomial unit that represents the cluster labels. The distinguishing feature of this model is that, unlike other mixture models, the mixing proportions are not explicitly parameterized. Instead, they are defined implicitly via the energy function and depend on all the parameters in the model. We present results for the MNIST and NORB datasets showing that the implicit mixture of RBMs learns clusters that reflect the class structure in the data.

component rbm, mixture model, rbm, (14 more...)

Country:

North America > Canada > Ontario > Toronto (0.15)
North America > United States > District of Columbia > Washington (0.04)
Europe > Finland > Uusimaa > Helsinki (0.04)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.82)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.71)

Lacoste-Julien, Simon, Sha, Fei, Jordan, Michael I.

DiscLDA: Discriminative Learning for Dimensionality Reduction and Classification

Probabilistic topic models (and their extensions) have become popular as models of latent structures in collections of text documents or images. These models are usually treated as generative models and trained using maximum likelihood estimation, an approach which may be suboptimal in the context of an overall classification problem. In this paper, we describe DiscLDA, a discriminative learning framework for such models as Latent Dirichlet Allocation (LDA) in the setting of dimensionality reduction with supervised side information. In DiscLDA, a class-dependent linear transformation is introduced on the topic mixture proportions. This parameter is estimated by maximizing the conditional likelihood using Monte Carlo EM. By using the transformed topic mixture proportions as a new representation of documents, we obtain a supervised dimensionality reduction algorithm that uncovers the latent structure in a document collection while preserving predictive power for the task of classification. We compare the predictive power of the latent structure of DiscLDA with unsupervised LDA on the 20 Newsgroup ocument classification task.

class label, disclda, representation, (14 more...)

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.28)
Asia > Middle East > Republic of Türkiye (0.14)
Asia > Middle East > Jordan (0.05)
(8 more...)

Genre: Research Report (0.46)

Industry: Government > Regional Government > North America Government > United States Government (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Translated Learning: Transfer Learning across Different Feature Spaces

Dai, Wenyuan, Chen, Yuqiang, Xue, Gui-rong, Yang, Qiang, Yu, Yong

This paper investigates a new machine learning strategy called translated learning. Unlikemany previous learning tasks, we focus on how to use labeled data from one feature space to enhance the classification of other entirely different learning spaces. For example, we might wish to use labeled text data to help learn a model for classifying image data, when the labeled images are difficult to obtain. Animportant aspect of translated learning is to build a "bridge" to link one feature space (known as the "source space") to another space (known as the "target space")through a translator in order to migrate the knowledge from source to target. The translated learning solution uses a language model to link the class labels to the features in the source spaces, which in turn is translated to the features inthe target spaces. Finally, this chain of linkages is completed by tracing back to the instances in the target spaces. We show that this path of linkage can be modeled using a Markov chain and risk minimization. Through experiments on the text-aided image classification and cross-language classification tasks, we demonstrate that our translated learning framework can greatly outperform many state-of-the-art baseline methods.

classification, feature space, tlrisk, (17 more...)

Country:

Asia > China > Shanghai > Shanghai (0.04)
North America > United States > Wisconsin > Dane County > Madison (0.04)
Asia > China > Hong Kong > Kowloon (0.04)

Genre:

Research Report (0.87)
Overview (0.68)

Industry: Education (0.87)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.52)
Information Technology > Artificial Intelligence > Vision > Image Understanding (0.36)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.35)

Amini, Massih, Usunier, Nicolas, Laviolette, François

A Transductive Bound for the Voted Classifier with an Application to Semi-supervised Learning

In this paper we present two transductive bounds on the risk of the majority vote estimated over partially labeled training sets. Our first bound is tight when the additional unlabeled training data are used in the cases where the voted classifier makes its errors on low margin observations and where the errors of the associated Gibbs classifier can accurately be estimated. In semi-supervised learning, considering the margin as an indicator of confidence constitutes the working hypothesis of algorithms which search the decision boundary on low density regions. In this case, we propose a second bound on the joint probability that the voted classifier makes an error over an example having its margin over a fixed threshold. As an application we are interested on self-learning algorithms which assign iteratively pseudo-labels to unlabeled training examples having margin above a threshold obtained from this bound. Empirical results on different datasets show the effectiveness of our approach compared to the same algorithm and the TSVM in which the threshold is fixed manually.

algorithm, classifier, threshold, (16 more...)

Country:

Europe > France > Île-de-France > Paris > Paris (0.04)
North America > United States > New York (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(3 more...)

Genre:

Research Report (0.46)
Workflow (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.31)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.31)

Robust Principal Component Analysis: Exact Recovery of Corrupted Low-Rank Matrices via Convex Optimization

Wright, John, Ganesh, Arvind, Rao, Shankar, Peng, Yigang, Ma, Yi

Principal component analysis is a fundamental operation in computational data analysis, with myriad applications ranging from web search to bioinformatics to computer vision and image analysis. However, its performance and applicability in real scenarios are limited by a lack of robustness to outlying or corrupted observations. Thispaper considers the idealized "robust principal component analysis" problem of recovering a low rank matrix A from corrupted observations D A E. Here, the corrupted entries E are unknown and the errors can be arbitrarily large (modeling grossly corrupted observations common in visual and bioinformatic data), but are assumed to be sparse. We prove that most matrices A can be efficiently and exactly recovered from most error sign-and-support patterns bysolving a simple convex program, for which we give a fast and provably convergent algorithm. Our result holds even when the rank of A grows nearly proportionally (up to a logarithmic factor) to the dimensionality of the observation spaceand the number of errors E grows in proportion to the total number of entries in the matrix. A byproduct of our analysis is the first proportional growth results for the related problem of completing a low-rank matrix from a small fraction ofits entries. Simulations and real-data examples corroborate the theoretical results, and suggest potential applications in computer vision.

algorithm, low-rank matrix, matrix, (15 more...)

Country:

North America > United States > New York > New York County > New York City (0.05)
North America > United States > Illinois (0.04)
Asia (0.04)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Principal Component Analysis (0.81)