Goto

Collaborating Authors

 Country


A Rational Analysis of Cognitive Control in a Speeded Discrimination Task

Neural Information Processing Systems

We are interested in the mechanisms by which individuals monitor and adjust their performance of simple cognitive tasks. We model a speeded discrimination task in which individuals are asked to classify a sequence of stimuli (Jones & Braver, 2001). Response conflict arises when one stimulus class is infrequent relative to another, resulting in more errors and slower reaction times for the infrequent class. How do control processes modulate behavior based on the relative class frequencies? We explain performance from a rational perspective that casts the goal of individuals as minimizing a cost that depends both on error rate and reaction time.



Fragment Completion in Humans and Machines

Neural Information Processing Systems

Partial information can trigger a complete memory. At the same time, human memory is not perfect. A cue can contain enough information to specify an item in memory, but fail to trigger that item. In the context of word memory, we present experiments that demonstrate some basic patterns in human memory errors. We use cues that consist of word fragments. We show that short and long cues are completed more accurately than medium length ones and study some of the factors that lead to this behavior. We then present a novel computational model that shows some of the flexibility and patterns of errors that occur in human memory.


Probabilistic principles in unsupervised learning of visual structure: human data and a model

Neural Information Processing Systems

To find out how the representations of structured visual objects depend on the co-occurrence statistics of their constituents, we exposed subjects to a set of composite images with tight control exerted over (1) the conditional probabilities of the constituent fragments, and (2) the value of Barlow's criterion of "suspicious coincidence" (the ratio of joint probability to the product of marginals). We then compared the part verification response times for various probe/target combinations before and after the exposure. For composite probes, the speedup was much larger for targets that contained pairs of fragments perfectly predictive of each other, compared to those that did not. This effect was modulated by the significance of their co-occurrence as estimated by Barlow's criterion. For lone-fragment probes, the speedup in all conditions was generally lower than for composites. These results shed light on the brain's strategies for unsupervised acquisition of structural information in vision.



Modeling Temporal Structure in Classical Conditioning

Neural Information Processing Systems

The Temporal Coding Hypothesis of Miller and colleagues [7] suggests that animals integrate related temporal patterns of stimuli into single memory representations. We formalize this concept using quasi-Bayes estimation to update the parameters of a constrained hidden Markov model. This approach allows us to account for some surprising temporal effects in the second order conditioning experiments of Miller et al. [1, 2, 3], which other models are unable to explain.


Model Based Population Tracking and Automatic Detection of Distribution Changes

Neural Information Processing Systems

Probabilistic mixture models are used for a broad range of data analysis tasks such as clustering, classification, predictive modeling, etc. Due to their inherent probabilistic nature, mixture models can easily be combined with other probabilistic or non-probabilistic techniques thus forming more complex data analysis systems. In the case of online data (where there is a stream of data available) models can be constantly updated to reflect the most current distribution of the incoming data. However, in many business applications the models themselves represent a parsimonious summary of the data and therefore it is not desirable to change models frequently, much less with every new data point. In such a framework it becomes crucial to track the applicability of the mixture model and detect the point in time when the model fails to adequately represent the data. In this paper we formulate the problem of change detection and propose a principled solution. Empirical results over both synthetic and real-life data sets are presented.



Stochastic Mixed-Signal VLSI Architecture for High-Dimensional Kernel Machines

Neural Information Processing Systems

A mixed-signal paradigm is presented for high-resolution parallel innerproduct computationin very high dimensions, suitable for efficient implementation ofkernels in image processing. At the core of the externally digital architecture is a high-density, low-power analog array performing binary-binary partial matrix-vector multiplication. Full digital resolution is maintained even with low-resolution analog-to-digital conversion, owing torandom statistics in the analog summation of binary products. A random modulation scheme produces near-Bernoulli statistics even for highly correlated inputs. The approach is validated with real image data, and with experimental results from a CID/DRAM analog array prototype in 0.5


Spectral Relaxation for K-means Clustering

Neural Information Processing Systems

In K-means clusters are represented by centers of mass of their members, and it can be shown that the K-means algorithm of alternating between assigning cluster membership for each data vector to the nearest cluster center and computing the center of each cluster as the centroid of its member data vectors is equivalent to finding the minimum of a sum-of-squares cost function using coordinate descend. Despite the popularity of K means clustering, one of its major drawbacks is that the coordinate descend search method is prone to local minima. Much research has been done on computing refined initial points and adding explicit constraints to the sum-of-squares cost function for K-means clustering so that the search can converge to better local minimum [1,2]. In this paper we tackle the problem from a different angle: we find an equivalent formulation of the sum-of-squares minimization as a trace maximization problem with special constraints; relaxing the constraints leads to a maximization problem that possesses optimal global solutions. As a byproduct we also have an easily computable lower bound for the minimum of the sum-of-squares cost function. Our work is inspired by [9, 3] where connection to Gram matrix and extension of K means method to general Mercer kernels were investigated. The rest of the paper is organized as follows: in section 2, we derive the equivalent trace maximization formulation and discuss its spectral relaxation. In section 3, we discuss how to assign cluster membership using pivoted QR decomposition, taking into account the special structure of the partial eigenvector matrix. Finally, in section 4, we illustrate the performance of the clustering algorithms using document clustering as an example.