Sensing and Signal Processing
Texture Classification Approach Based on Combination of Edge & Co-occurrence and Local Binary Pattern
Texture classification has received considerable attention from computer scientists since the late 1990s. When performed accurately, it supports many applications, such as pattern recognition, object tracking, and shape recognition. Many methods have been proposed to solve this problem, and nearly all of them try to extract and define features that separate the different texture classes well. This article proposes an approach that processes texture images using the Local Binary Pattern and the Gray Level Co-occurrence Matrix, then applies edge detection, and finally extracts statistical features from the resulting images to classify them. Although the approach is general and could be used in different applications, it has been tested on stone textures, and the results have been compared with some previous approaches to demonstrate the quality of the proposed approach.
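The two texture descriptors named in this abstract can be illustrated with a minimal sketch. It assumes the basic 3x3 Local Binary Pattern and a single-offset co-occurrence matrix on small integer images stored as lists of lists; the function names are hypothetical, and this is not the authors' pipeline (edge detection and the statistical features are omitted).

```python
from collections import Counter

# Clockwise offsets of the 8 neighbours, starting at the top-left pixel.
NEIGHBOUR_OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
                     (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(image, row, col):
    """Return the basic 8-bit LBP code of the interior pixel (row, col)."""
    center = image[row][col]
    code = 0
    for bit, (dr, dc) in enumerate(NEIGHBOUR_OFFSETS):
        # A neighbour at least as bright as the centre sets its bit.
        if image[row + dr][col + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """Normalised histogram of LBP codes over all interior pixels."""
    rows, cols = len(image), len(image[0])
    counts = Counter(lbp_code(image, r, c)
                     for r in range(1, rows - 1)
                     for c in range(1, cols - 1))
    total = sum(counts.values())
    return {code: n / total for code, n in counts.items()}


def glcm(image, levels, dr=0, dc=1):
    """Count how often grey level i has grey level j at offset (dr, dc)."""
    rows, cols = len(image), len(image[0])
    matrix = [[0] * levels for _ in range(levels)]
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                matrix[image[r][c]][image[r2][c2]] += 1
    return matrix
```

In practice, a classifier would be trained on features derived from such histograms and matrices (for example contrast or energy of the co-occurrence matrix); which statistics the article uses is not stated in the abstract.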
Hybrid Generative/Discriminative Learning for Automatic Image Annotation
Yang, Shuang Hong, Bian, Jiang, Zha, Hongyuan
Automatic image annotation (AIA) raises tremendous challenges to machine learning as it requires modeling of data that are ambiguous in both input and output, e.g., images containing multiple objects and labeled with multiple semantic tags. Even more challenging is that the number of candidate tags is usually huge (as large as the vocabulary size) yet each image is only related to a few of them. This paper presents a hybrid generative-discriminative classifier to simultaneously address the extreme data-ambiguity and overfitting-vulnerability issues in tasks such as AIA. Particularly: (1) an Exponential-Multinomial Mixture (EMM) model is established to capture both the input and output ambiguity and at the same time to encourage prediction sparsity; and (2) the prediction ability of the EMM model is explicitly maximized through discriminative learning that integrates variational inference of graphical models and the pairwise formulation of ordinal regression. Experiments show that our approach achieves both superior annotation performance and better tag scalability.
Perturbation of the Eigenvectors of the Graph Laplacian: Application to Image Denoising
Meyer, Francois G., Shen, Xilin
The original contributions of this paper are twofold: a new understanding of the influence of noise on the eigenvectors of the graph Laplacian of a set of image patches, and an algorithm to estimate a denoised set of patches from a noisy image. The algorithm relies on the following two observations: (1) the low-index eigenvectors of the diffusion, or graph Laplacian, operators are very robust to random perturbations of the weights and random changes in the connections of the patch-graph; and (2) patches extracted from smooth regions of the image are organized along smooth low-dimensional structures in the patch-set, and therefore can be reconstructed with few eigenvectors. Experiments demonstrate that our denoising algorithm outperforms the denoising gold-standards.
Cultural Analytics of Large Datasets from Flickr
Ushizima, Daniela (Lawrence Berkeley National Laboratory) | Manovich, Lev (University of California, San Diego) | Margolis, Todd (University of California, San Diego) | Douglas, Jeremy (Ashford University)
The deluge has become a metaphor for the amount of information to which we are subjected, and very often we feel we are drowning while our access to information is rising. Devising mechanisms for exploring massive image sets according to perceptual attributes is still a challenge, even more so when dealing with user-generated social media content. Such images tend to be heterogeneous, and relying on metadata alone can be misleading. This paper describes a set of tools designed to analyze large sets of user-created art-related images using image features describing color, texture, composition and orientation. The proposed pipeline makes it possible to discriminate between Flickr groups in terms of feature vectors and clustering parameters. The algorithms are general enough to be applied to other domains in which the main question is about the variability of the images.
Noisy Search with Comparative Feedback
We present theoretical results in terms of lower and upper bounds on the query complexity of noisy search with comparative feedback. In this search model, the noise in the feedback depends on the distance between query points and the search target. Consequently, the error probability in the feedback is not fixed but varies for the queries posed by the search algorithm. Our results show that a target out of n items can be found in O(log n) queries. We also show the surprising result that for k possible answers per query, the speedup is not log k (as for k-ary search) but only log log k in some cases.
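To make the search model concrete, here is a minimal sketch of a naive baseline: binary search in which each comparison is repeated and decided by majority vote. It is not the algorithm analyzed in the paper and does not achieve its query bounds. The oracle interface, the function names, and the distance-dependent noise model in `make_distance_oracle` are all assumptions made for illustration.

```python
import random


def majority_compare(oracle, query, repeats):
    """Ask the oracle `repeats` times whether target > query; majority wins."""
    votes = sum(1 for _ in range(repeats) if oracle(query))
    return 2 * votes > repeats  # ties (even `repeats`) count as False


def noisy_binary_search(n, oracle, repeats=15):
    """Locate a target in range(n) using repeated noisy comparisons.

    oracle(q) returns True when it believes the target is greater than q.
    """
    low, high = 0, n - 1
    while low < high:
        mid = (low + high) // 2
        if majority_compare(oracle, mid, repeats):
            low = mid + 1
        else:
            high = mid
    return low


def make_distance_oracle(target, base_error=0.4, rng=None):
    """Hypothetical oracle whose error rate shrinks with |target - query|."""
    rng = rng or random.Random(0)

    def oracle(query):
        truth = target > query
        # Assumed noise model: answers are least reliable near the target.
        error = base_error / (1 + abs(target - query))
        return truth if rng.random() >= error else not truth

    return oracle
```

This baseline spends `repeats` queries per comparison, so its total query count is `repeats` times the number of halving steps. An odd value of `repeats` avoids ties in the vote.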
On the Lagrangian Biduality of Sparsity Minimization Problems
Singaraju, Dheeraj, Elhamifar, Ehsan, Tron, Roberto, Yang, Allen Y., Sastry, S. Shankar
Recent results in Compressive Sensing have shown that, under certain conditions, the solution to an underdetermined system of linear equations with sparsity-based regularization can be accurately recovered by solving convex relaxations of the original problem. In this work, we present a novel primal-dual analysis on a class of sparsity minimization problems. We show that the Lagrangian bidual (i.e., the Lagrangian dual of the Lagrangian dual) of the sparsity minimization problems can be used to derive interesting convex relaxations: the bidual of the $\ell_0$-minimization problem is the $\ell_1$-minimization problem; and the bidual of the $\ell_{0,1}$-minimization problem for enforcing group sparsity on structured data is the $\ell_{1,\infty}$-minimization problem. The analysis provides a means to compute per-instance non-trivial lower bounds on the (group) sparsity of the desired solutions. In a real-world application, the bidual relaxation improves the performance of a sparsity-based classification framework applied to robust face recognition.
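For reference, the two problems named in this abstract are usually written as follows. This is a sketch in standard compressive-sensing notation; the symbols $A$, $b$, and $x$ are assumptions, and the abstract does not state the exact constraint set used in the paper.

$$
(P_0):\ \min_{x} \|x\|_0 \quad \text{s.t.} \quad Ax = b,
\qquad
(P_1):\ \min_{x} \|x\|_1 \quad \text{s.t.} \quad Ax = b,
$$

where $\|x\|_0$ counts the nonzero entries of $x$ and $\|x\|_1 = \sum_i |x_i|$. The first problem is combinatorial, while the second is convex and can be solved as a linear program.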
Portmanteau Vocabularies for Multi-Cue Image Representation
Khan, Fahad S., Weijer, Joost, Bagdanov, Andrew D., Vanrell, Maria
We describe a novel technique for feature combination in the bag-of-words model of image classification. Our approach builds discriminative compound words from primitive cues learned independently from training images. Our main observation is that modeling joint-cue distributions independently is more statistically robust for typical classification problems than attempting to empirically estimate the dependent, joint-cue distribution directly. We use information-theoretic vocabulary compression to find discriminative combinations of cues, and the resulting vocabulary of portmanteau words is compact, has the cue binding property, and supports individual weighting of cues in the final image representation. State-of-the-art results on both the Oxford Flower-102 and Caltech-UCSD Bird-200 datasets demonstrate the effectiveness of our technique compared to other, significantly more complex approaches to multi-cue image representation.
Let us first agree on what the term "semantics" means: An unorthodox approach to an age-old debate
Traditionally, semantics has been seen as a feature of human language. The advent of the information era has led to its widespread redefinition as an information feature. Contrary to this praxis, I define semantics as a special kind of information. Revitalizing the ideas of Bar-Hillel and Carnap, I have recreated and re-established the notion of semantics as the notion of Semantic Information. I have proposed a new definition of information (as a description, a linguistic text, a piece of a story or a tale) and a clear segregation between two different types of information: physical and semantic information. I hope I have clearly explained the (usually obscured and mysterious) interrelations between data and physical information as well as the relation between physical information and semantic information. Consequently, the usually indefinable notions of "information", "knowledge", "memory", "learning" and "semantics" have also received suitable illumination and explanation.
Heavy-tailed Distances for Gradient Based Image Descriptors
Jia, Yangqing, Darrell, Trevor
Many applications in computer vision measure the similarity between images or image patches based on some statistics such as oriented gradients. These are often modeled implicitly or explicitly with a Gaussian noise assumption, leading to the use of the Euclidean distance when comparing image descriptors. In this paper, we show that the statistics of gradient based image descriptors often follow a heavy-tailed distribution, which undermines any principled motivation for the use of Euclidean distances. We advocate for the use of a distance measure based on the likelihood ratio test with appropriate probabilistic models that fit the empirical data distribution. We instantiate this similarity measure with the Gamma-compound-Laplace distribution, and show significant improvement over existing distance measures in the application of SIFT feature matching, at relatively low computational cost.
Matrix Completion for Multi-label Image Classification
Cabral, Ricardo S., Torre, Fernando, Costeira, Joao P., Bernardino, Alexandre
Recently, image categorization has been an active research topic due to the urgent need to retrieve and browse digital images via semantic keywords. This paper formulates image categorization as a multi-label classification problem using recent advances in matrix completion. Under this setting, classification of testing data is posed as a problem of completing unknown label entries on a data matrix that concatenates training and testing features with training labels. We propose two convex algorithms for matrix completion based on a Rank Minimization criterion specifically tailored to visual data, and prove their convergence properties.