AITopics

We develop a maximum entropy (maxent) approach to generating recommendations in the context of a user's current navigation stream, suitable for environments where data is sparse, high-dimensional, and dynamic (conditions typical of many recommendation applications). We address sparsity and dimensionality reduction by first clustering items based on user access patterns, in an attempt to minimize the a priori probability that recommendations will cross cluster boundaries, and then recommending only within clusters. We address the inherently dynamic nature of the problem by explicitly modeling the data as a time series; we show how this representational expressivity fits naturally into a maxent framework. We conduct experiments on data from ResearchIndex, a popular online repository of over 470,000 computer science documents. We show that our maxent formulation outperforms several competing algorithms in offline tests simulating the recommendation of documents to ResearchIndex users.

maxent model, prediction, recommendation, (14 more...)

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > North Carolina > Orange County > Chapel Hill (0.04)
North America > United States > New Jersey > Mercer County > Princeton (0.04)
North America > United States > California > Los Angeles County > Pasadena (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Maximum Entropy (0.61)

Vert, Jean-Philippe, Kanehisa, Minoru

Graph-Driven Feature Extraction From Microarray Data Using Diffusion Kernels and Kernel CCA

We present an algorithm to extract features from high-dimensional gene expression profiles, based on the knowledge of a graph which links together genes known to participate in successive reactions in metabolic pathways. Motivated by the intuition that biologically relevant features are likely to exhibit smoothness with respect to the graph topology, the algorithm involves encoding the graph and the set of expression profiles into kernel functions, and performing a generalized form of canonical correlation analysis in the corresponding reproducing kernel Hilbert spaces. Function prediction experiments for the genes of the yeast S. cerevisiae validate this approach by showing a consistent increase in performance when a state-of-the-art classifier uses the vector of features instead of the original expression profile to predict the functional class of a gene.

expression profile, graph, roc index, (14 more...)

Country:

North America > United States (0.04)
Asia > Middle East > Jordan (0.04)
Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.04)
Europe > Netherlands > South Holland > Dordrecht (0.04)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Biomedical Informatics > Translational Bioinformatics (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.46)

Eskin, Eleazar, Weston, Jason, Noble, William S., Leslie, Christina S.

Mismatch String Kernels for SVM Protein Classification

We introduce a class of string kernels, called mismatch kernels, for use with support vector machines (SVMs) in a discriminative approach to the protein classification problem.
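As an illustration of the idea, here is a minimal brute-force sketch of a (k, m)-mismatch kernel in Python. It assumes the common definition in which each length-k substring of a sequence contributes a count to every length-k string that differs from it in at most m positions, and the kernel value is the inner product of the resulting count vectors. The function names, the toy alphabet, and the enumeration strategy are illustrative assumptions, not the authors' implementation, which the abstract does not describe.

```python
from collections import Counter
from itertools import combinations, product


def mismatch_neighborhood(kmer, m, alphabet):
    """Return every string of len(kmer) that differs from kmer in at most m positions."""
    neighbors = {kmer}
    # Choose m positions and try every letter there; re-using the original
    # letter at a position covers the cases with fewer than m mismatches.
    for positions in combinations(range(len(kmer)), m):
        for letters in product(alphabet, repeat=m):
            candidate = list(kmer)
            for pos, letter in zip(positions, letters):
                candidate[pos] = letter
            neighbors.add("".join(candidate))
    return neighbors


def mismatch_features(seq, k, m, alphabet):
    """Sparse feature map: count, for each k-mer beta, the k-mers of seq within m mismatches of beta."""
    features = Counter()
    for i in range(len(seq) - k + 1):
        for beta in mismatch_neighborhood(seq[i:i + k], m, alphabet):
            features[beta] += 1
    return features


def mismatch_kernel(x, y, k, m, alphabet):
    """Inner product of the two sparse feature vectors."""
    fx = mismatch_features(x, k, m, alphabet)
    fy = mismatch_features(y, k, m, alphabet)
    return sum(count * fy[beta] for beta, count in fx.items())
```

With m = 0 this reduces to counting shared exact k-mers. The enumeration is exponential in m, so the sketch is only practical for short k and small m.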

kernel, mismatch kernel, sequence, (15 more...)

Country: Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.51)

Pasula, Hanna, Marthi, Bhaskara, Milch, Brian, Russell, Stuart J., Shpitser, Ilya

Identity Uncertainty and Citation Matching

Identity uncertainty is a pervasive problem in real-world data analysis. It arises whenever objects are not labeled with unique identifiers or when those identifiers may not be perceived perfectly. In such cases, two observations may or may not correspond to the same object. In this paper, we consider the problem in the context of citation matching--the problem of deciding which citations correspond to the same publication. Our approach is based on the use of a relational probability model to define a generative model for the domain, including models of author and title corruption and a probabilistic citation grammar. Identity uncertainty is handled by extending standard models to incorporate probabilities over the possible mappings between terms in the language and objects in the domain. Inference is based on Markov chain Monte Carlo, augmented with specific methods for generating efficient proposals when the domain contains many objects. Results on several citation data sets show that the method outperforms current algorithms for citation matching. The declarative, relational nature of the model also means that our algorithm can determine object characteristics such as author names by combining multiple citations of multiple papers.

algorithm, probability model, surname, (13 more...)

Country:

North America > United States > Washington > King County > Seattle (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)

Vasconcelos, Nuno

Feature Selection by Maximum Marginal Diversity

We address the question of feature selection in the context of visual recognition. It is shown that, besides being efficient from a computational standpoint, the infomax principle is nearly optimal in the minimum Bayes error sense. The concept of marginal diversity is introduced, leading to a generic principle for feature selection (the principle of maximum marginal diversity) of extreme computational simplicity. The relationships between infomax and the maximization of marginal diversity are identified, uncovering the existence of a family of classification procedures for which near-optimal (in the Bayes error sense) feature selection does not require combinatorial search. Examination of this family in light of recent studies on the statistics of natural images suggests that visual recognition problems are a subset of it.

diversity, feature selection, marginal diversity, (12 more...)

Country:

North America > United States > Colorado > Larimer County > Fort Collins (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > California > San Diego County > San Diego (0.04)

Genre: Research Report (0.49)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)

Freeman, William T., Torralba, Antonio

Shape Recipes: Scene Representations that Refer to the Image

The goal of low-level vision is to estimate an underlying scene, given an observed image. Real-world scenes (e.g., albedos or shapes) can be very complex, conventionally requiring high-dimensional representations which are hard to estimate and store. We propose a low-dimensional representation, called a scene recipe, that relies on the image itself to describe the complex scene configurations. Shape recipes are an example: these are the regression coefficients that predict the bandpassed shape from image data. We describe the benefits of this representation, and show two uses illustrating their properties: (1) we improve stereo shape estimates by learning shape recipes at low resolution and applying them at full resolution; (2) shape recipes implicitly contain information about lighting and materials, and we use them for material segmentation.

representation, shape recipe, subband, (15 more...)

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > United States > District of Columbia > Washington (0.04)
North America > United States > California > Monterey County > Pacific Grove (0.04)
Europe > France (0.04)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.35)

Bartlett, M.S., Littlewort, G.C., Sejnowski, T.J., Movellan, J.R.

A Prototype for Automatic Recognition of Spontaneous Facial Actions

Spontaneous facial expressions differ substantially from posed expressions, similar to how continuous, spontaneous speech differs from isolated words produced on command. Previous methods for automatic facial expression recognition assumed images were collected in controlled environments in which the subjects deliberately faced the camera. Since people often nod or turn their heads, automatic recognition of spontaneous facial behavior requires methods for handling out-of-image-plane head rotations. Here we explore an approach based on 3-D warping of images into canonical views. We evaluated the performance of the approach as a front-end for a spontaneous expression recognition system using support vector machines and hidden Markov models. This system employed general-purpose learning mechanisms that can be applied to recognition of any facial movement. The system was tested for recognition of a set of facial actions defined by the Facial Action Coding System (FACS). We showed that 3-D tracking and warping, followed by machine learning techniques directly applied to the warped images, is a viable and promising technology for automatic facial expression recognition. One exciting aspect of the approach presented here is that information about movement dynamics emerged out of filters which were derived from the statistics of images.

expression, facial expression, sequence, (16 more...)

Country:

North America > United States > California > San Diego County > San Diego (0.04)
North America > United States > New York (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)

Technology:

Information Technology > Artificial Intelligence > Vision > Face Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.69)

Kannan, Anitha, Jojic, Nebojsa, Frey, Brendan

Fast Transformation-Invariant Factor Analysis

Dimensionality reduction techniques such as principal component analysis and factor analysis are used to discover a linear mapping between high-dimensional data samples and points in a lower-dimensional subspace. In [6], Jojic and Frey introduced the mixture of transformation-invariant component analyzers (MTCA), which can account for global transformations such as translations and rotations, perform clustering, and learn local appearance deformations by dimensionality reduction.

mtca, sequence, transformation, (13 more...)

Country:

North America > Canada > Ontario > Toronto (0.15)
North America > United States > New York > New York County > New York City (0.05)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.05)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Martin, David R., Fowlkes, Charless C., Malik, Jitendra

Learning to Detect Natural Image Boundaries Using Brightness and Texture

The goal of this work is to accurately detect and localize boundaries in natural scenes using local image measurements. We formulate features that respond to characteristic changes in brightness and texture associated with natural boundaries. In order to combine the information from these features in an optimal way, a classifier is trained using human labeled images as ground truth. We present precision-recall curves showing that the resulting detector outperforms existing approaches.
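To make the evaluation measure concrete, here is a minimal sketch of how a precision-recall curve can be computed from per-pixel detector scores and binary ground-truth labels. The function name and the toy inputs are hypothetical, and the sketch assumes exact pixel-wise correspondence between detections and labels; it does not reproduce whatever matching procedure the authors' benchmark uses.

```python
def precision_recall_curve(scores, labels):
    """Return (threshold, precision, recall) triples, one per distinct score.

    scores: detector outputs, higher means "more likely a boundary".
    labels: 1 if the pixel is a ground-truth boundary, else 0.
    """
    total_positive = sum(labels)
    points = []
    for threshold in sorted(set(scores), reverse=True):
        # A pixel is predicted as boundary when its score reaches the threshold.
        predicted = [s >= threshold for s in scores]
        true_positive = sum(1 for p, y in zip(predicted, labels) if p and y)
        predicted_positive = sum(predicted)  # at least 1, since threshold is a score
        precision = true_positive / predicted_positive
        recall = true_positive / total_positive if total_positive else 0.0
        points.append((threshold, precision, recall))
    return points
```

Lowering the threshold can only keep or raise recall, while precision may move in either direction, which is why detectors are compared by the whole curve rather than by a single operating point.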

boundary, classifier, pixel, (14 more...)

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)
Asia > Japan > Honshū > Kansai > Osaka Prefecture > Osaka (0.04)

Genre: Research Report (0.32)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.47)

Kajarekar, Sachin S., Hermansky, Hynek

Analysis of Information in Speech Based on MANOVA

We propose analysis of information in speech using three sources: language (phone), speaker, and channel. Information in speech is measured as mutual information between the source and the set of features extracted from the speech signal. We assume that the distribution of features can be modeled using a Gaussian distribution. The mutual information is computed using the results of analysis of variability in speech. We observe similarity in the results of phone variability and phone information, and show that the results of the proposed analysis have more meaningful interpretations than the analysis of variability.

1 Introduction

The speech signal carries information about the linguistic message, the speaker, and the communication channel. In previous work [1, 2], we proposed analysis of information in speech as analysis of variability in a set of features extracted from the speech signal. The variability was measured as covariance of the features, and analysis was performed using multivariate analysis of variance (MANOVA). Total variability was divided into three types, namely, intra-phone (or phone) variability, speaker variability, and channel variability.
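The link between variability and mutual information under a Gaussian assumption can be sketched for a single scalar feature. The sketch below assumes I(X; Y) = h(X) - h(X | Y), approximates both the overall feature distribution and each per-source distribution as Gaussian, and uses the Gaussian entropy 0.5 * ln(2 * pi * e * variance). This is a simplification chosen for illustration, with hypothetical names; it is not the authors' computation, which works with multivariate features.

```python
import math
from collections import defaultdict


def gaussian_mutual_information(features, sources):
    """Approximate I(feature; source) in nats for a scalar feature.

    Assumes Gaussian entropies for the pooled data and for each source value.
    Every group needs a nonzero variance.
    """
    groups = defaultdict(list)
    for x, y in zip(features, sources):
        groups[y].append(x)

    def variance(values):
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values) / len(values)

    n = len(features)
    # The 2*pi*e constants cancel because the source probabilities sum to 1.
    total_term = 0.5 * math.log(variance(features))
    conditional_term = sum(
        (len(values) / n) * 0.5 * math.log(variance(values))
        for values in groups.values()
    )
    return total_term - conditional_term
```

Because a Gaussian has the largest entropy for a given variance, treating the pooled distribution as Gaussian overestimates h(X), so this estimate is an upper bound when the per-source distributions really are Gaussian; for well-separated sources it can even exceed the entropy of the source.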

channel information, information, variability, (14 more...)

Country:

North America > United States > Oregon > Washington County > Beaverton (0.04)
North America > United States > Colorado (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.36)