Fast Discriminative Visual Codebooks using Randomized Clustering Forests
Moosmann, Frank, Triggs, Bill, Jurie, Frederic
Large numbers of descriptors and large codebooks are needed for good results and this becomes slow using k-means. We introduce Extremely Randomized Clustering Forests - ensembles of randomly created clustering trees - and show that these provide more accurate results, much faster training and testing and good resistance to background clutter in several state-of-the-art image classification tasks.
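As a rough illustration of the idea in this abstract, the sketch below uses the leaves of a few randomly built trees as "visual words" and encodes a set of descriptors as a histogram of leaf hits. It is a simplified sketch under stated assumptions, not the authors' method: splits are drawn purely at random and class labels are ignored, and the names `build_tree`, `build_forest`, `leaf_of`, and `encode` are hypothetical.

```python
import random

def build_tree(points, depth, rng, counter):
    """Split points recursively on a random dimension at a random threshold.
    Each leaf receives a forest-wide unique integer id (one codeword)."""
    def make_leaf():
        leaf_id = counter[0]
        counter[0] += 1
        return ("leaf", leaf_id)

    if depth == 0 or len(points) < 2:
        return make_leaf()
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:
        return make_leaf()
    thr = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] < thr]
    right = [p for p in points if p[dim] >= thr]
    if not left or not right:
        return make_leaf()
    return ("node", dim, thr,
            build_tree(left, depth - 1, rng, counter),
            build_tree(right, depth - 1, rng, counter))

def build_forest(points, n_trees, depth, seed=0):
    """Build several random trees; return them with the total leaf count."""
    rng = random.Random(seed)
    counter = [0]
    trees = [build_tree(points, depth, rng, counter) for _ in range(n_trees)]
    return trees, counter[0]

def leaf_of(tree, p):
    """Drop a descriptor down one tree and return the id of the leaf it reaches."""
    while tree[0] == "node":
        _, dim, thr, left, right = tree
        tree = left if p[dim] < thr else right
    return tree[1]

def encode(forest, n_leaves, descriptors):
    """Histogram of leaf hits over all trees: a bag-of-words style image code."""
    hist = [0] * n_leaves
    for d in descriptors:
        for tree in forest:
            hist[leaf_of(tree, d)] += 1
    return hist

# Usage with synthetic 2-D "descriptors" (made-up data, for illustration only).
data_rng = random.Random(1)
train = [(data_rng.random(), data_rng.random()) for _ in range(200)]
forest, n_leaves = build_forest(train, n_trees=4, depth=3, seed=0)
hist = encode(forest, n_leaves, train[:50])
```

Assigning a descriptor costs one threshold comparison per tree level, whereas k-means assignment compares against every centre, which is one plausible reading of the speed advantage the abstract reports.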
Modeling Dyadic Data with Binary Latent Factors
Meeds, Edward, Ghahramani, Zoubin, Neal, Radford M., Roweis, Sam T.
We introduce binary matrix factorization, a novel model for unsupervised matrix decomposition. The decomposition is learned by fitting a nonparametric Bayesian probabilistic model with binary latent variables to a matrix of dyadic data. Unlike bi-clustering models, which assign each row or column to a single cluster based on a categorical hidden feature, our binary feature model reflects the prior belief that items and attributes can be associated with more than one latent cluster at a time. We provide simple learning and inference rules for this new model and show how to extend it to an infinite model in which the number of features is not a priori fixed but is allowed to grow with the size of the data.
Dynamic Foreground/Background Extraction from Images and Videos using Random Patches
In this paper, we propose a novel exemplar-based approach to extract dynamic foreground regions from a changing background within a collection of images or a video sequence. By using image segmentation as a pre-processing step, we convert this traditional pixel-wise labeling problem into a lower-dimensional, supervised, binary labeling procedure on image segments. Our approach consists of three steps. First, a set of random image patches is spatially and adaptively sampled within each segment. Second, these sets of extracted samples are formed into two "bags of patches" to model the foreground and background appearance, respectively.
Bayesian Detection of Infrequent Differences in Sets of Time Series with Shared Structure
Listgarten, Jennifer, Neal, Radford M., Roweis, Sam T., Puckrin, Rachel, Cutler, Sean
We present a hierarchical Bayesian model for sets of related, but different, classes of time series data. Our model performs alignment simultaneously across all classes, while detecting and characterizing class-specific differences. During inference the model produces, for each class, a distribution over a canonical representation of the class. These class-specific canonical representations are automatically aligned to one another, preserving common substructures and highlighting differences.
Speakers optimize information density through syntactic reduction
If language users are rational, they might choose to structure their utterances so as to optimize communicative properties. In particular, information-theoretic and psycholinguistic considerations suggest that this may include maximizing the uniformity of information density in an utterance. We investigate this possibility in the context of syntactic reduction, where the speaker has the option of either marking a higher-order unit (a phrase) with an extra word, or leaving it unmarked. We demonstrate that speakers are more likely to reduce less information-dense phrases. In a second step, we combine a stochastic model of structured utterance production with a logistic-regression model of syntactic reduction to study which types of cues speakers employ when estimating the predictability of upcoming elements. We demonstrate that the trend toward predictability-sensitive syntactic reduction (Jaeger, 2006) is robust in the face of a wide variety of control variables, and present evidence that speakers use both surface and structural cues for predictability estimation.