 Sensing and Signal Processing


Group Sparse Coding

Neural Information Processing Systems

Bag-of-words document representations are often used in text, image and video processing. While it is relatively easy to determine a suitable word dictionary for text documents, there is no simple mapping from raw images or videos to dictionary terms. The classical approach builds a dictionary using vector quantization over a large set of useful visual descriptors extracted from a training set, and uses a nearest-neighbor algorithm to count the number of occurrences of each dictionary word in documents to be encoded. More robust approaches have been proposed recently that represent each visual descriptor as a sparse weighted combination of dictionary words. While favoring a sparse representation at the level of visual descriptors, those methods do not, however, ensure that images have a sparse representation. In this work, we use mixed-norm regularization to achieve sparsity at the image level as well as a small overall dictionary. This approach can also be used to encourage using the same dictionary words for all the images in a class, providing a discriminative signal in the construction of image representations. Experimental results on a benchmark image classification dataset show that when compact image or dictionary representations are needed for computational efficiency, the proposed approach yields better mean average precision in classification.
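One common way a mixed-norm penalty produces group-level sparsity is through group soft-thresholding, the standard proximal step for an l1/l2 penalty. The sketch below illustrates that general mechanism only; the abstract does not specify the paper's optimization procedure, and the function name, toy coefficients, and threshold value are all assumptions made for illustration.

```python
import math

def group_soft_threshold(row, lam):
    """Proximal step for lam * ||row||_2 (hypothetical helper).

    `row` holds the coefficients that one dictionary word receives across
    all descriptors of a group (for example, one image).
    """
    norm = math.sqrt(sum(v * v for v in row))
    if norm <= lam:
        # The whole row is zeroed: this dictionary word is unused by the group.
        return [0.0] * len(row)
    scale = 1.0 - lam / norm
    return [scale * v for v in row]

# Toy coefficients (assumed values): rows are dictionary words,
# columns are descriptors belonging to the same image.
coeffs = [
    [3.0, 4.0],    # strongly used word, l2 norm 5
    [0.1, -0.2],   # weakly used word, l2 norm about 0.22
]
shrunk = [group_soft_threshold(r, lam=1.0) for r in coeffs]
```

With these toy values the weakly used word is removed for the entire image, while the strongly used word is only shrunk. This is the sense in which a mixed norm can yield sparsity at the image level rather than per descriptor.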


Unsupervised Detection of Regions of Interest Using Iterative Link Analysis

Neural Information Processing Systems

This paper proposes a fast and scalable alternating optimization technique to detect regions of interest (ROIs) in cluttered Web images without labels. The proposed approach discovers highly probable regions of object instances by iteratively repeating the following two functions: (1) choose the exemplar set (i.e. a small number of highly ranked reference ROIs) across the dataset and (2) refine the ROIs of each image with respect to the exemplar set. These two subproblems are formulated as ranking in two different similarity networks of ROI hypotheses by link analysis. The experiments with the PASCAL 06 dataset show that our unsupervised localization performance is better than one of the state-of-the-art techniques and comparable to supervised methods. Also, we test the scalability of our approach with five objects in a Flickr dataset consisting of more than 200K images.
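Ranking by link analysis on a similarity network can be illustrated with a PageRank-style power iteration. This is a generic sketch, not the paper's formulation: the abstract does not say which link-analysis method is used or how the two networks are built, and the function name, damping factor, and toy similarity values below are assumptions.

```python
def rank_nodes(weights, damping=0.85, iters=100):
    """PageRank-style power iteration on a weighted graph (hypothetical helper).

    weights[i][j] is the similarity-based link weight from node i to node j.
    Returns one score per node; the scores sum to 1.
    """
    n = len(weights)
    scores = [1.0 / n] * n
    for _ in range(iters):
        new = [(1.0 - damping) / n] * n
        for i, row in enumerate(weights):
            total = sum(row)
            if total == 0:
                # Dangling node: spread its score uniformly over all nodes.
                for j in range(n):
                    new[j] += damping * scores[i] / n
            else:
                for j, w in enumerate(row):
                    new[j] += damping * scores[i] * w / total
        scores = new
    return scores

# Three ROI hypotheses (assumed values): 0 and 1 are mutually similar,
# 2 is only weakly linked to by the others.
similarity = [
    [0.0, 0.9, 0.1],
    [0.9, 0.0, 0.1],
    [0.5, 0.5, 0.0],
]
scores = rank_nodes(similarity)
```

In this toy graph the two mutually similar hypotheses end up with the highest scores, which is the kind of signal one could use to pick a small exemplar set.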


Hierarchical Modeling of Local Image Features through $L_p$-Nested Symmetric Distributions

Neural Information Processing Systems

We introduce a new family of distributions, called $L_p$-nested symmetric distributions, whose densities access the data exclusively through a hierarchical cascade of $L_p$-norms. This class generalizes the family of spherically and $L_p$-spherically symmetric distributions which have recently been successfully used for natural image modeling. Similar to those distributions, it allows for a nonlinear mechanism to reduce the dependencies between its variables. With suitable choices of the parameters and norms, this family also includes the Independent Subspace Analysis (ISA) model, which has been proposed as a means of deriving filters that mimic complex cells found in mammalian primary visual cortex. $L_p$-nested distributions are easy to estimate and allow us to explore the variety of models between ISA and the $L_p$-spherically symmetric models. Our main findings are that, without a preprocessing step of contrast gain control, the independent subspaces of ISA are in fact more dependent than the individual filter coefficients within a subspace and, with contrast gain control, where ISA finds more than one subspace, the filter responses were almost independent anyway.
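A hierarchical cascade of $L_p$-norms can be pictured as a tree in which each inner node applies its own $L_p$-norm to the values of its children. The sketch below evaluates such a nested function; the tree encoding, the function name, and the example exponents are assumptions chosen for illustration, not notation from the paper.

```python
def lp_nested(tree, x):
    """Evaluate a nested L_p function on the vector x (hypothetical helper).

    A tree node is either an int (a leaf, read as |x[index]|) or a tuple
    (p, children), read as (sum over children of child_value ** p) ** (1 / p).
    """
    if isinstance(tree, int):
        return abs(x[tree])
    p, children = tree
    return sum(lp_nested(child, x) ** p for child in children) ** (1.0 / p)

# f(x) = (|x0|^2 + (|x1| + |x2|)^2)^(1/2): outer exponent 2, inner exponent 1.
tree = (2.0, [0, (1.0, [1, 2])])
value = lp_nested(tree, [3.0, -1.0, 3.0])   # sqrt(3^2 + (1 + 3)^2)
doubled = lp_nested(tree, [6.0, -2.0, 6.0])
```

Scaling the input by a positive factor scales the output by the same factor, so `doubled` is twice `value`. A tree with a single inner node and only leaf children reduces to an ordinary $L_p$-norm.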


Statistical Models of Linear and Nonlinear Contextual Interactions in Early Visual Processing

Neural Information Processing Systems

A central hypothesis about early visual processing is that it represents inputs in a coordinate system matched to the statistics of natural scenes. Simple versions of this lead to Gabor-like receptive fields and divisive gain modulation from local surrounds; these have led to influential neural and psychological models of visual processing. However, these accounts are based on an incomplete view of the visual context surrounding each point. Here, we consider an approximate model of linear and non-linear correlations between the responses of spatially distributed Gabor-like receptive fields, which, when trained on an ensemble of natural scenes, unifies a range of spatial context effects. The full model accounts for neural surround data in primary visual cortex (V1), provides a statistical foundation for perceptual phenomena associated with Li's (2002) hypothesis that V1 builds a saliency map, and fits data on the tilt illusion.


Nonparametric Bayesian Texture Learning and Synthesis

Neural Information Processing Systems

We present a nonparametric Bayesian method for texture learning and synthesis. A texture image is represented by a 2D-Hidden Markov Model (2D-HMM) where the hidden states correspond to the cluster labeling of textons and the transition matrix encodes their spatial layout (the compatibility between adjacent textons). The 2D-HMM is coupled with the Hierarchical Dirichlet process (HDP), which allows the number of textons and the complexity of the transition matrix to grow as the input texture becomes irregular. The HDP makes use of a Dirichlet process prior which favors regular textures by penalizing the model complexity. This framework (HDP-2D-HMM) learns the texton vocabulary and their spatial layout jointly and automatically. The HDP-2D-HMM results in a compact representation of textures which allows fast texture synthesis with rendering quality comparable to the state-of-the-art image-based rendering methods. We also show that HDP-2D-HMM can be applied to perform image segmentation and synthesis.


Multi-Label Prediction via Compressed Sensing

Neural Information Processing Systems

We consider multi-label prediction problems with large output spaces under the assumption of output sparsity - that the target (label) vectors have small support. We develop a general theory for a variant of the popular error correcting output code scheme, using ideas from compressed sensing for exploiting this sparsity.


Remote Monitoring of Activity, Location, and Exertion Levels

AAAI Conferences

The purpose of this study was to develop and test a platform that would assist the Environmental Protection Agency (EPA), and the scientific community at large, in the generation of a human activity and energy expenditure database of sufficient detail to accurately predict human exposures and dose to various pollutants. The monitoring system developed is easily extendable to the collection of other health-related data. Our protocol tested the use of a digital voice recorder to collect activity/location diary data, assuming it to be a less burdensome and more reliable method than using paper and pencil diaries or hand-held computers. We expected the data to be more complete and reliable than retrospective reports (diaries filled out at the end of day) because the recorders are easy to use, the diary entries are made as the events occur, and we expected that participants would be more likely to complete the study because of the reduced burden. The data collection plan was also expected to show that the cost of the transcription of the diary can be reduced substantially by using speech and language processing to translate the digital diaries into the EPA's Comprehensive Human Activity Database (CHAD).


The Cyborg Astrobiologist: Testing a Novelty-Detection Algorithm on Two Mobile Exploration Systems at Rivas Vaciamadrid in Spain and at the Mars Desert Research Station in Utah

arXiv.org Machine Learning

(ABRIDGED) In previous work, two platforms have been developed for testing computer-vision algorithms for robotic planetary exploration (McGuire et al. 2004b,2005; Bartolo et al. 2007). The wearable-computer platform has been tested at geological and astrobiological field sites in Spain (Rivas Vaciamadrid and Riba de Santiuste), and the phone-camera has been tested at a geological field site in Malta. In this work, we (i) apply a Hopfield neural-network algorithm for novelty detection based upon color, (ii) integrate a field-capable digital microscope on the wearable computer platform, (iii) test this novelty detection with the digital microscope at Rivas Vaciamadrid, (iv) develop a Bluetooth communication mode for the phone-camera platform, in order to allow access to a mobile processing computer at the field sites, and (v) test the novelty detection on the Bluetooth-enabled phone-camera connected to a netbook computer at the Mars Desert Research Station in Utah. This systems engineering and field testing have together allowed us to develop a real-time computer-vision system that is capable, for example, of identifying lichens as novel within a series of images acquired in semi-arid desert environments. We acquired sequences of images of geologic outcrops in Utah and Spain consisting of various rock types and colors to test this algorithm. The algorithm robustly recognized previously-observed units by their color, while requiring only a single image or a few images to learn colors as familiar, demonstrating its fast learning capability.


Algorithms for Image Analysis and Combination of Pattern Classifiers with Application to Medical Diagnosis

arXiv.org Artificial Intelligence

Medical Informatics and the application of modern signal processing in the assistance of the diagnostic process in medical imaging is one of the more recent and active research areas today. This thesis addresses a variety of issues related to the general problem of medical image analysis, specifically in mammography, and presents a series of algorithms and design approaches for all the intermediate levels of a modern system for computer-aided diagnosis (CAD). The diagnostic problem is analyzed with a systematic approach, first defining the imaging characteristics and features that are relevant to probable pathology in mammograms. Next, these features are quantified and fused into new, integrated radiological systems that exhibit embedded digital signal processing, in order to improve the final result and minimize the radiological dose for the patient. At a higher level, special algorithms are designed for detecting and encoding these clinically interesting imaging features, in order to be used as input to advanced pattern classifiers and machine learning models. Finally, these approaches are extended in multi-classifier models under the scope of Game Theory and optimum collective decision, in order to produce efficient solutions for combining classifiers with minimum computational costs for advanced diagnostic systems. The material covered in this thesis is related to a total of 18 published papers, 6 in scientific journals and 12 in international conferences.


A Fully Automatic System for Restoration of Historical Document Images

AAAI Conferences

Historical document images are subject to intrinsic distortions, such as background noise and bleed-through interference due to aging, and extrinsic distortions, such as displacement and uneven surfaces introduced during the image acquisition procedure. In this paper, we propose a fully automatic restoration framework that corrects bleed-through distortion on double-sided handwritten historical document images. First, the two sides of a document are registered with corresponding control points which are selected by inspecting the images' gradient maps and minimizing a predefined dissimilarity measure. The established correspondences are refined by median filters and consistency checking. A piecewise linear mapping function is chosen to represent the spatial relationship between the two images. Based on the estimated transform model, a backward re-sampling strategy and bi-cubic spline interpolation are adopted to obtain the final registered images. Once the two sides of a page have been registered, enhancement/smearing feature images are extracted and iterative wavelet decomposition/construction is performed to restore the degraded images. Experiments on real documents from the National Archives of Singapore demonstrate a completely automatic framework for the restoration of historical document images.