AITopics

Amino acid profiles, which capture position-specific mutation probabilities, are a richer encoding of biological sequences than the individual sequences themselves. However, profile comparisons are much more computationally expensive than discrete symbol comparisons, making profiles impractical for many large datasets. Furthermore, because they are such a rich representation, profiles can be difficult to visualize. To overcome these problems, we propose a discretization for profiles using an expanded alphabet that represents not just individual amino acids but also common profiles. Using an extension of the information bottleneck (IB) method that incorporates constraints and priors on the class distributions, we find an informationally optimal alphabet. This discretization yields a concise, informative textual representation for profile sequences. Moreover, alignments between these sequences, while nearly as accurate as full profile-profile alignments, can be computed almost as quickly as those between individual or consensus sequences. A full pairwise alignment of SwissProt would take years using profiles but less than 3 days using a discrete IB encoding, illustrating how discrete encoding can expand the range of sequence problems to which profile information can be applied.
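As a rough illustration of the discretization idea in the abstract above, the sketch below maps each profile column to the symbol of the closest prototype distribution under KL divergence. The three-residue toy alphabet, the prototype values, and the names `kl_divergence` and `encode_profile` are assumptions made for illustration. The paper learns its alphabet with a constrained information bottleneck procedure, which is not reproduced here.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions given as equal-length lists.

    Assumes q[i] > 0 wherever p[i] > 0.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def encode_profile(columns, prototypes):
    """Replace each profile column by the symbol of its nearest prototype."""
    return "".join(
        min(prototypes, key=lambda symbol: kl_divergence(col, prototypes[symbol]))
        for col in columns
    )


# Hypothetical toy alphabet over three residues: two "pure" symbols and one
# symbol ("x") standing for a common mixed profile.
prototypes = {
    "A": [0.80, 0.10, 0.10],
    "C": [0.10, 0.80, 0.10],
    "x": [0.45, 0.45, 0.10],
}

# A toy profile with three columns (each column sums to 1).
profile = [
    [0.90, 0.05, 0.05],
    [0.50, 0.40, 0.10],
    [0.05, 0.90, 0.05],
]

encoded = encode_profile(profile, prototypes)
print(encoded)
```

Once a profile is reduced to a string like this, it can be compared with ordinary discrete-symbol alignment tools, which is the source of the speedup the abstract describes.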

alignment, information, sequence

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
Asia > Middle East > Israel > Jerusalem District > Jerusalem (0.04)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Biomedical Informatics (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.48)

Huang, Tzu-kuo, Lin, Chih-jen, Weng, Ruby C.

A Generalized Bradley-Terry Model: From Group Competition to Individual Skill

The Bradley-Terry model for paired comparison has been popular in many areas. We propose a generalized version in which paired individual comparisons are extended to paired team comparisons. We introduce a simple algorithm, with convergence proofs, that solves the model and yields individual skill estimates. A useful application to multi-class probability estimates using error-correcting codes is demonstrated.
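For orientation, here is a minimal sketch of the classic pairwise Bradley-Terry model, in which P(i beats j) = p_i / (p_i + p_j), fitted with the standard iterative minorization-maximization update. It covers only the individual-versus-individual case, not the generalized team model proposed in the paper. The function name `bradley_terry` and the win-count matrix layout are assumptions made for illustration.

```python
def bradley_terry(wins, iters=500, tol=1e-12):
    """Estimate skills from a square matrix where wins[i][j] counts i beating j.

    Assumes every player has at least one win, so that no skill collapses to
    zero. Returns a list of skills normalized to sum to 1.
    """
    n = len(wins)
    p = [1.0 / n] * n
    for _ in range(iters):
        new_p = []
        for i in range(n):
            total_wins = sum(wins[i][j] for j in range(n) if j != i)
            # Games played against each opponent, weighted by current skills.
            denom = sum(
                (wins[i][j] + wins[j][i]) / (p[i] + p[j])
                for j in range(n)
                if j != i
            )
            new_p.append(total_wins / denom if denom > 0 else p[i])
        scale = sum(new_p)
        new_p = [x / scale for x in new_p]
        converged = max(abs(a - b) for a, b in zip(p, new_p)) < tol
        p = new_p
        if converged:
            break
    return p


# Two players: player 0 won 3 of 4 games, so its estimated skill is 0.75.
two_player = bradley_terry([[0, 3], [1, 0]])

# Three players with 6 games per pair; player 0 wins most often.
three_player = bradley_terry([[0, 4, 5], [2, 0, 4], [1, 2, 0]])
print(two_player, three_player)
```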

algorithm, bradley-terry model, probability estimate

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.05)
Asia > Taiwan > Taiwan Province > Taipei (0.04)
North America > United States > New York (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Object Classification from a Single Example Utilizing Class Relevance Metrics

Fink, Michael

We describe a framework for learning an object classifier from a single example. This goal is achieved by emphasizing the relevant dimensions for classification using available examples of related classes. Learning to accurately classify objects from a single training example is often infeasible due to overfitting effects. However, if the instance representation guarantees that the distance between any two instances of the same class is smaller than the distance between any two instances from different classes, then a nearest neighbor classifier could achieve perfect performance with a single training example. We therefore suggest a two-stage strategy.
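The nearest neighbor argument above can be made concrete with a small sketch: a classifier that holds one exemplar per class and compares instances with a per-dimension weighted distance. The weights stand in for "emphasizing the relevant dimensions". Here they are set by hand, whereas the paper learns the emphasis from related classes. All names and values are hypothetical.

```python
def weighted_sq_dist(a, b, weights):
    """Squared Euclidean distance with a non-negative weight per dimension."""
    return sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights))


def classify_one_shot(query, exemplars, weights):
    """Return the label whose single exemplar is closest to the query."""
    return min(
        exemplars,
        key=lambda label: weighted_sq_dist(query, exemplars[label], weights),
    )


# One training example per class. Dimension 0 carries the class signal;
# dimension 1 is irrelevant variation that happens to differ between exemplars.
exemplars = {"cup": [0.0, 5.0], "bowl": [1.0, -5.0]}
query = [0.1, -4.0]  # a "cup" whose irrelevant dimension resembles the bowl

unweighted = classify_one_shot(query, exemplars, weights=[1.0, 1.0])
reweighted = classify_one_shot(query, exemplars, weights=[1.0, 0.0])
print(unweighted, reweighted)
```

With uniform weights the irrelevant dimension dominates and the query is assigned to the wrong class. Once that dimension is down-weighted, the single exemplar is enough.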

dimension, representation, single example

Country:

North America > United States > California > San Diego County > San Diego (0.04)
Asia > Middle East > Israel > Jerusalem District > Jerusalem (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.97)

Courville, Aaron C., Daw, Nathaniel D., Touretzky, David S.

Similarity and Discrimination in Classical Conditioning: A Latent Variable Account

We propose a probabilistic, generative account of configural learning phenomena in classical conditioning. Configural learning experiments probe how animals discriminate and generalize between patterns of simultaneously presented stimuli (such as tones and lights) that are differentially predictive of reinforcement. Previous models of these issues have been successful more on a phenomenological than an explanatory level: they reproduce experimental findings but, lacking formal foundations, provide scant basis for understanding why animals behave as they do. We present a theory that clarifies seemingly arbitrary aspects of previous models while also capturing a broader set of data.

compound, configural unit, latent variable

Country: North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.94)

Anguelov, Dragomir, Srinivasan, Praveen, Pang, Hoi-cheung, Koller, Daphne, Thrun, Sebastian, Davis, James

The Correlated Correspondence Algorithm for Unsupervised Registration of Nonrigid Surfaces

We present an unsupervised algorithm for registering 3D surface scans of an object undergoing significant deformations. Our algorithm does not need markers, nor does it assume prior knowledge about object shape, the dynamics of its deformation, or scan alignment.

algorithm, mesh, registration

Country:

North America > United States > California > Santa Cruz County > Santa Cruz (0.14)
North America > United States > California > Santa Clara County > Palo Alto (0.05)
North America > United States > Massachusetts (0.04)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.48)

Zwald, Laurent, Blanchard, Gilles, Massart, Pascal, Vert, Régis

Kernel Projection Machine: a New Tool for Pattern Recognition

This paper investigates the effect of Kernel Principal Component Analysis (KPCA) within the classification framework, specifically the regularization properties of this dimensionality reduction method. KPCA has previously been used as a pre-processing step before applying an SVM, but we point out that this combination is somewhat redundant from a regularization point of view. To avoid this redundancy, we propose a new algorithm called Kernel Projection Machine, based on an analogy with the statistical framework of regression for a Gaussian white noise model. Preliminary experimental results show that this algorithm matches the performance of an SVM.

algorithm, kernel projection machine, regularization

Country:

North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > Germany > Berlin (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)

Genre: Research Report (0.88)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Zhu, Jerry, Kandola, Jaz, Ghahramani, Zoubin, Lafferty, John D.

Nonparametric Transforms of Graph Kernels for Semi-Supervised Learning

We present an algorithm based on convex optimization for constructing kernels for semi-supervised learning. The kernel matrices are derived from the spectral decomposition of graph Laplacians, and combine labeled and unlabeled data in a systematic fashion. Unlike previous work using diffusion kernels and Gaussian random field kernels, a nonparametric kernel approach is presented that incorporates order constraints during optimization. This results in flexible kernels and avoids the need to choose among different parametric forms. Our approach relies on a quadratically constrained quadratic program (QCQP), and is computationally feasible for large datasets. We evaluate the kernels on real datasets using support vector machines, with encouraging results.

constraint, kernel, spectral transformation

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Asia > Middle East > Jordan (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.55)

Zhou, Dengyong, Hofmann, Thomas, Schölkopf, Bernhard

Semi-supervised Learning on Directed Graphs

Given a directed graph in which some of the nodes are labeled, we investigate the question of how to exploit the link structure of the graph to infer the labels of the remaining unlabeled nodes. To that end, we propose a regularization framework for functions defined over nodes of a directed graph that forces the classification function to change slowly on densely linked subgraphs. A powerful, yet computationally simple classification algorithm is derived within the proposed framework. The experimental evaluation on real-world Web classification problems demonstrates encouraging results that validate our approach.
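To give a feel for inferring labels from link structure, the sketch below runs a simple iterative label propagation over out-links: each node's score is pulled toward the average score of the nodes it links to, while labeled nodes are also pulled toward their known label. This is a generic simplification chosen for illustration, not the regularization framework derived in the paper. The names `propagate_labels`, `out_links`, and `seeds`, and the damping value `alpha`, are assumptions.

```python
def propagate_labels(out_links, seeds, alpha=0.8, iters=200):
    """Spread +1/-1 seed labels along directed links.

    out_links: dict mapping each node to the list of nodes it links to
               (every target must also appear as a key).
    seeds:     dict mapping labeled nodes to +1.0 or -1.0.
    Returns a dict of real-valued scores; the sign is the predicted class.
    """
    scores = {node: float(seeds.get(node, 0.0)) for node in out_links}
    for _ in range(iters):
        updated = {}
        for node, targets in out_links.items():
            if targets:
                neighbor_avg = sum(scores[t] for t in targets) / len(targets)
            else:
                neighbor_avg = 0.0
            updated[node] = alpha * neighbor_avg + (1 - alpha) * seeds.get(node, 0.0)
        scores = updated
    return scores


# Two small linked groups; only "a" and "d" are labeled.
out_links = {
    "a": ["b"],
    "b": ["a", "c"],
    "c": ["b"],
    "d": ["e"],
    "e": ["d"],
}
seeds = {"a": 1.0, "d": -1.0}

scores = propagate_labels(out_links, seeds)
predicted = {node: (1 if s > 0 else -1) for node, s in scores.items()}
print(predicted)
```

Because `alpha` is below 1, each sweep is a contraction and the scores settle to a fixed point. Unlabeled nodes inherit the sign of the labeled node in their densely linked group.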

bipartite graph, graph, vertex

Country:

North America > United States > Wisconsin (0.05)
North America > United States > Texas (0.05)
Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)
North America > United States > Rhode Island > Providence County > Providence (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.83)

Class-size Independent Generalization Analysis of Some Discriminative Multi-Category Classification

Zhang, Tong

We consider the problem of deriving class-size independent generalization bounds for some regularized discriminative multi-category classification methods. In particular, we obtain an expected generalization bound for a standard formulation of multi-category support vector machines. Based on the theoretical result, we argue that the formulation over-penalizes misclassification error, which in theory may lead to poor generalization performance. A remedy, based on a generalization of multi-category logistic regression (conditional maximum entropy), is then proposed, and its theoretical properties are examined.

classification error, gen, generalization

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > United States > New Jersey > Mercer County > Princeton (0.04)

Genre: Research Report > New Finding (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.71)

Zhang, Jian, Ghahramani, Zoubin, Yang, Yiming

A Probabilistic Model for Online Document Clustering with Application to Novelty Detection

In this paper we propose a probabilistic model for online document clustering. We use a nonparametric Dirichlet process prior to model the growing number of clusters, and use a general English language model as the base distribution to handle the generation of novel clusters. Furthermore, cluster uncertainty is modeled with a Bayesian Dirichlet-multinomial distribution. We use an empirical Bayes method to estimate hyperparameters based on a historical dataset. Our probabilistic model is applied to the novelty detection task in Topic Detection and Tracking (TDT) and compared with existing approaches in the literature.
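The way a Dirichlet process prior lets the number of clusters grow can be sketched with its Chinese restaurant process form: a new document joins an existing cluster with probability proportional to that cluster's size, or opens a new cluster with probability proportional to a concentration parameter. The sketch below computes only these prior probabilities. The document likelihoods, the language model base distribution, and the hyperparameter estimation from the abstract are not modeled. The name `crp_prior` and the parameter `alpha` are assumptions.

```python
def crp_prior(cluster_sizes, alpha):
    """Prior assignment probabilities for the next item under a CRP.

    cluster_sizes: number of items already in each existing cluster.
    alpha:         concentration parameter (> 0); larger values favor new clusters.
    Returns (probabilities for the existing clusters, probability of a new cluster).
    """
    denom = sum(cluster_sizes) + alpha
    existing = [size / denom for size in cluster_sizes]
    new_cluster = alpha / denom
    return existing, new_cluster


# Four documents seen so far, split 3 and 1 between two clusters.
existing, new_cluster = crp_prior([3, 1], alpha=1.0)
print(existing, new_cluster)
```

In a novelty detection setting, these prior weights would be combined with how well the incoming document fits each cluster, and a document best explained by the "new cluster" option would be flagged as novel.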

new cluster, novelty detection task, probability

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
North America > United States > New York (0.04)
Europe > United Kingdom (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.47)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)