AITopics

1304.4634

Country: North America > United States > California > San Francisco County > San Francisco (0.14)

Genre: Research Report > Promising Solution (0.34)

Industry: Government (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
(2 more...)

Rudovic, Ognjen, Pantic, Maja, Pavlovic, Vladimir

Heteroscedastic Conditional Ordinal Random Fields for Pain Intensity Estimation from Facial Images

arXiv.org Machine LearningApr-3-2013

We propose a novel method for automatic pain intensity estimation from facial images based on the framework of kernel Conditional Ordinal Random Fields (KCORF). We extend this framework to account for heteroscedasticity on the output labels(i.e., pain intensity scores) and introduce a novel dynamic features, dynamic ranks, that impose temporal ordinal constraints on the static ranks (i.e., intensity scores). Our experimental results show that the proposed approach outperforms state-of-the art methods for sequence classification with ordinal data and other ordinal regression models. The approach performs significantly better than other models in terms of Intra-Class Correlation measure, which is the most accepted evaluation measure in the tasks of facial behaviour intensity estimation.

facial image, heteroscedastic conditional ordinal random field, pain intensity estimation

1301.5063

Genre: Research Report > Promising Solution (0.53)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (0.60)
Information Technology > Artificial Intelligence > Vision > Face Recognition (0.60)

arXiv.org Machine LearningMar-24-2013

A Diffusion Process on Riemannian Manifold for Visual Tracking

Chen, Marcus, Jen, Cham Tat, Kim, Pang Sze, Goh, Alvina

Robust visual tracking for long video sequences is a research area that has many important applications. The main challenges include how the target image can be modeled and how this model can be updated. In this paper, we model the target using a covariance descriptor, as this descriptor is robust to problems such as pixel-pixel misalignment, pose and illumination changes, that commonly occur in visual tracking. We model the changes in the template using a generative process. We introduce a new dynamical model for the template update using a random walk on the Riemannian manifold where the covariance descriptors lie in. This is done using log-transformed space of the manifold to free the constraints imposed inherently by positive semidefinite matrices. Modeling template variations and poses kinetics together in the state space enables us to jointly quantify the uncertainties relating to the kinematic states and the template in a principled way. Finally, the sequential inference of the posterior distribution of the kinematic states and the template is done using a particle filter. Our results shows that this principled approach can be robust to changes in illumination, poses and spatial affine transformation. In the experiments, our method outperformed the current state-of-the-art algorithm - the incremental Principal Component Analysis method, particularly when a target underwent fast poses changes and also maintained a comparable performance in stable target tracking cases.

artificial intelligence, health & medicine, template, (18 more...)

1303.5913

Country:

North America > United States (0.14)
Europe (0.14)

Genre: Research Report > New Finding (0.69)

Industry:

Leisure & Entertainment (0.48)
Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.88)

LaValle, Steven M., Hutchinson, Seth A.

On Considering Uncertainty and Alternatives in Low-Level Vision

arXiv.org Artificial IntelligenceMar-6-2013

In this paper we address the uncertainty issues involved in the low-level vision task of image segmentation. Researchers in computer vision have worked extensively on this problem, in which the goal is to partition (or segment) an image into regions that are homogeneous or uniform in some sense. This segmentation is often utilized by some higher level process, such as an object recognition system. We show that by considering uncertainty in a Bayesian formalism, we can use statistical image models to build an approximate representation of a probability distribution over a space of alternative segmentations. We give detailed descriptions of the various levels of uncertainty associated with this problem, discuss the interaction of prior and posterior distributions, and provide the operations for constructing this representation.

artificial intelligence, bayesian inference, segmentation, (19 more...)

arXiv.org Artificial Intelligence

1303.146

Country: North America > United States > California (0.28)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.69)

Friedman, Nir, Russell, Stuart

Image Segmentation in Video Sequences: A Probabilistic Approach

arXiv.org Artificial IntelligenceFeb-6-2013

"Background subtraction" is an old technique for finding moving objects in a video sequence for example, cars driving on a freeway. The idea is that subtracting the current image from a timeaveraged background image will leave only nonstationary objects. It is, however, a crude approximation to the task of classifying each pixel of the current image; it fails with slow-moving objects and does not distinguish shadows from moving objects. The basic idea of this paper is that we can classify each pixel using a model of how that pixel looks when it is part of different classes. We learn a mixture-of-Gaussians classification model for each pixel using an unsupervised technique- an efficient, incremental version of EM. Unlike the standard image-averaging approach, this automatically updates the mixture component for each class according to likelihood of membership; hence slow-moving objects are handled perfectly. Our approach also identifies and eliminates shadows much more effectively than other techniques such as thresholding. Application of this method as part of the Roadwatch traffic surveillance project is expected to result in significant improvements in vehicle identification and tracking.

artificial intelligence, machine learning, pixel, (16 more...)

arXiv.org Artificial Intelligence

1302.1539

Country: North America > United States > California > Alameda County > Berkeley (0.14)

Technology:

Information Technology > Artificial Intelligence > Vision (0.95)
Information Technology > Sensing and Signal Processing > Image Processing (0.84)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.46)

Denil, Misha, de Freitas, Nando

Recklessly Approximate Sparse Coding

arXiv.org Machine LearningJan-6-2013

It has recently been observed that certain extremely simple feature encoding techniques are able to achieve state of the art performance on several standard image classification benchmarks including deep belief networks, convolutional nets, factored RBMs, mcRBMs, convolutional RBMs, sparse autoencoders and several others. Moreover, these "triangle" or "soft threshold" encodings are ex- tremely efficient to compute. Several intuitive arguments have been put forward to explain this remarkable performance, yet no mathematical justification has been offered. The main result of this report is to show that these features are realized as an approximate solution to the a non-negative sparse coding problem. Using this connection we describe several variants of the soft threshold features and demonstrate their effectiveness on two image classification benchmark tasks.

deep learning, neural network, sparse, (19 more...)

1208.0959

Country: North America > Canada > Ontario > Toronto (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(2 more...)

Ziegler, Andrew, Christiansen, Eric, Kriegman, David, Belongie, Serge J.

Locally Uniform Comparison Image Descriptor

Keypoint matching between pairs of images using popular descriptors like SIFT or a faster variant called SURF is at the heart of many computer vision algorithms including recognition, mosaicing, and structure from motion. For real-time mobile applications, very fast but less accurate descriptors like BRIEF and related methods use a random sampling of pairwise comparisons of pixel intensities in an image patch. Here, we introduce Locally Uniform Comparison Image Descriptor (LUCID), a simple description method based on permutation distances between the ordering of intensities of RGB values between two patches. LUCID is computable in linear time with respect to patch size and does not require floating point computation. An analysis reveals an underlying issue that limits the potential of BRIEF and related approaches compared to LUCID. Experiments demonstrate that LUCID is faster than BRIEF, and its accuracy is directly comparable to SURF while being more than an order of magnitude faster.

artificial intelligence, data mining, descriptor, (15 more...)

Country: North America > United States > California (0.14)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Vision (0.90)
Information Technology > Data Science > Data Mining (0.68)

Kiros, Ryan, Szepesvári, Csaba

Deep Representations and Codes for Image Auto-Annotation

The task of assigning a set of relevant tags to an image is challenging due to the size and variability of tag vocabularies. Consequently, most existing algorithms focus on tag assignment and fix an often large number of hand-crafted features to describe image characteristics. In this paper we introduce a hierarchical model for learning representations of full sized color images from the pixel level, removing the need for engineered feature representations and subsequent feature selection. We benchmark our model on the STL-10 recognition dataset, achieving state-of-the-art performance. When our features are combined with TagProp (Guillaumin et al.), we outperform or compete with existing annotation approaches that use over a dozen distinct image descriptors. Furthermore, using 256-bit codes and Hamming distance for training TagProp, we exchange only a small reduction in performance for efficient storage and fast comparisons. In our experiments, using deeper architectures always outperform shallow ones.

annotation, deep learning, neural network, (19 more...)

Country: North America > Canada > Alberta > Census Division No. 11 > Edmonton Metropolitan Region > Edmonton (0.14)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Zoran, Daniel, Weiss, Yair

Natural Images, Gaussian Mixtures and Dead Leaves

Simple Gaussian Mixture Models (GMMs) learned from pixels of natural image patches have been recently shown to be surprisingly strong performers in modeling the statistics of natural images. Here we provide an in depth analysis of this simple yet rich model. We show that such a GMM model is able to compete with even the most successful models of natural images in log likelihood scores, denoising performance and sample quality. We provide an analysis of what such a model learns from natural images as a function of number of mixture components - including covariance structure, contrast variation and intricate structures such as textures, boundaries and more. Finally, we show that the salient properties of the GMM learned from natural images can be derived from a simplified Dead Leaves model which explicitly models occlusion, explaining its surprising success relative to other models. 1 GMMs and natural image statistics models Many models for the statistics of natural image patches have been suggested in recent years.

artificial intelligence, machine learning, natural image, (16 more...)

Country: Asia > Middle East > Israel (0.14)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Sterne, Philip, Bornschein, Joerg, Sheikh, Abdul-saboor, Luecke, Joerg, Shelton, Jacquelyn A.

Why MCA? Nonlinear sparse coding with spike-and-slab prior for neurally plausible image encoding

Modelling natural images with sparse coding (SC) has faced two main challenges: flexibly representing varying pixel intensities and realistically representing lowlevel imagecomponents. This paper proposes a novel multiple-cause generative model of low-level image statistics that generalizes the standard SC model in two crucial points: (1) it uses a spike-and-slab prior distribution for a more realistic representation of component absence/intensity, and (2) the model uses the highly nonlinear combination rule of maximal causes analysis (MCA) instead of a linear combination.The major challenge is parameter optimization because a model with either (1) or (2) results in strongly multimodal posteriors. We show for the first time that a model combining both improvements can be trained efficiently while retaining the rich structure of the posteriors. We design an exact piecewise Gibbssampling method and combine this with a variational method based on preselection of latent dimensions. This combined training scheme tackles both analytical and computational intractability and enables application of the model to a large number of observed and hidden dimensions.

health & medicine, neurology, sparse, (19 more...)

Country: Europe > Germany (0.28)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.90)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.48)