Goto

Collaborating Authors

 Technology


Pictorial Structures for Molecular Modeling: Interpreting Density Maps

Neural Information Processing Systems

X-ray crystallography is currently the most common way protein structures are elucidated. One of the most time-consuming steps in the crystallographic process is interpretation of the electron density map, a task that involves finding patterns in a three-dimensional picture of a protein. This paper describes DEFT (DEFormable Template), an algorithm using pictorial structures to build a flexible protein model from the protein's amino-acid sequence. Matching this pictorial structure into the density map is a way of automating density-map interpretation. Also described are several extensions to the pictorial structure matching algorithm necessary for this automated interpretation. DEFT is tested on a set of density maps ranging from 2 to 4ร… resolution, producing rootmean-squared errors ranging from 1.38 to 1.84ร….


Triangle Fixing Algorithms for the Metric Nearness Problem

Neural Information Processing Systems

Various problems in machine learning, databases, and statistics involve pairwise distances among a set of objects. It is often desirable for these distances to satisfy the properties of a metric, especially the triangle inequality. Applications where metric data is useful include clustering, classification, metric-based indexing, and approximation algorithms for various graph problems. This paper presents the Metric Nearness Problem: Given a dissimilarity matrix, find the "nearest" matrix of distances that satisfy the triangle inequalities.


Bayesian inference in spiking neurons

Neural Information Processing Systems

We propose a new interpretation of spiking neurons as Bayesian integrators accumulating evidence over time about events in the external world or the body, and communicating to other neurons their certainties about these events. In this model, spikes signal the occurrence of new information, i.e. what cannot be predicted from the past activity. As a result, firing statistics are close to Poisson, albeit providing a deterministic representation of probabilities. We proceed to develop a theory of Bayesian inference in spiking neural networks, recurrent interactions implementing a variant of belief propagation. Many perceptual and motor tasks performed by the central nervous system are probabilistic, and can be described in a Bayesian framework [4, 3].


The Power of Selective Memory: Self-Bounded Learning of Prediction Suffix Trees

Neural Information Processing Systems

Prediction suffix trees (PST) provide a popular and effective tool for tasks such as compression, classification, and language modeling. In this paper we take a decision theoretic view of PSTs for the task of sequence prediction. Generalizing the notion of margin to PSTs, we present an online PST learning algorithm and derive a loss bound for it. The depth of the PST generated by this algorithm scales linearly with the length of the input. We then describe a self-bounded enhancement of our learning algorithm which automatically grows a bounded-depth PST. We also prove an analogous mistake-bound for the self-bounded algorithm. The result is an efficient algorithm that neither relies on a-priori assumptions on the shape or maximal depth of the target PST nor does it require any parameters. To our knowledge, this is the first provably-correct PST learning algorithm which generates a bounded-depth PST while being competitive with any fixed PST determined in hindsight.


Analysis of a greedy active learning strategy

Neural Information Processing Systems

We abstract out the core search problem of active learning schemes, to better understand the extent to which adaptive labeling can improve sample complexity. We give various upper and lower bounds on the number of labels which need to be queried, and we prove that a popular greedy active learning rule is approximately as good as any other strategy for minimizing this number of labels.


Semigroup Kernels on Finite Sets

Neural Information Processing Systems

Complex objects can often be conveniently represented by finite sets of simpler components, such as images by sets of patches or texts by bags of words. We study the class of positive definite (p.d.) kernels for two such objects that can be expressed as a function of the merger of their respective sets of components. We prove a general integral representation of such kernels and present two particular examples. One of them leads to a kernel for sets of points living in a space endowed itself with a positive definite kernel. We provide experimental results on a benchmark experiment of handwritten digits image classification which illustrate the validity of the approach.


Trait Selection for Assessing Beef Meat Quality Using Non-linear SVM

Neural Information Processing Systems

In this paper we show that it is possible to model sensory impressions of consumers about beef meat. This is not a straightforward task; the reason is that when we are aiming to induce a function that maps object descriptions into ratings, we must consider that consumers' ratings are just a way to express their preferences about the products presented in the same testing session. Therefore, we had to use a special purpose SVM polynomial kernel. The training data set used collects the ratings of panels of experts and consumers; the meat was provided by 103 bovines of 7 Spanish breeds with different carcass weights and aging periods. Additionally, to gain insight into consumer preferences, we used feature subset selection tools. The result is that aging is the most important trait for improving consumers' appreciation of beef meat.


Confidence Intervals for the Area Under the ROC Curve

Neural Information Processing Systems

In many applications, good ranking is a highly desirable performance for a classifier. The criterion commonly used to measure the ranking quality of a classification algorithm is the area under the ROC curve (AUC). To report it properly, it is crucial to determine an interval of confidence for its value. This paper provides confidence intervals for the AUC based on a statistical and combinatorial analysis using only simple parameters such as the error rate and the number of positive and negative examples. The analysis is distribution-independent, it makes no assumption about the distribution of the scores of negative or positive examples. The results are of practical use and can be viewed as the equivalent for AUC of the standard confidence intervals given in the case of the error rate. They are compared with previous approaches in several standard classification tasks demonstrating the benefits of our analysis.


Distributed Information Regularization on Graphs

Neural Information Processing Systems

We provide a principle for semi-supervised learning based on optimizing the rate of communicating labels for unlabeled points with side information. The side information is expressed in terms of identities of sets of points or regions with the purpose of biasing the labels in each region to be the same. The resulting regularization objective is convex, has a unique solution, and the solution can be found with a pair of local propagation operations on graphs induced by the regions. We analyze the properties of the algorithm and demonstrate its performance on document classification tasks.


Theories of Access Consciousness

Neural Information Processing Systems

Theories of access consciousness address how it is that some mental states but not others are available for evaluation, choice behavior, and verbal report. Farah, O'Reilly, and Vecera (1994) argue that quality of representation is critical; Dehaene, Sergent, and Changeux (2003) argue that the ability to communicate representations is critical. We present a probabilistic information transmission or PIT model that suggests both of these conditions are essential for access consciousness. Having successfully modeled data from the repetition priming literature in the past, we use the PIT model to account for data from two experiments on subliminal priming, showing that the model produces priming even in the absence of accessibility and reportability of internal states. The model provides a mechanistic basis for understanding the dissociation of priming and awareness. Philosophy has made many attempts to identify distinct aspects of consciousness. Perhaps the most famous effort is Block's (1995) delineation of phenomenal and access consciousness. Phenomenal consciousness has to do with "what it is like" to experience chocolate or a pin prick. Access consciousness refers to internal states whose content is "(1) inferentially promiscuous, i.e., poised to be used as a premise in reasoning, (2) poised for control of action, and (3) poised for rational control of speech."