Images can convey rich semantics and evoke strong emotions in viewers. The research of my PhD thesis focuses on image emotion computing (IEC), which aims to predict the emotions that viewers perceive in given images. The development of IEC is greatly constrained by two main challenges: the affective gap and subjective evaluation. Previous works mainly focused on bridging the affective gap by finding features that express emotions better, such as elements-of-art based features and shape features. According to the emotion representation models, including categorical emotion states (CES) and dimensional emotion space (DES), three different tasks are traditionally performed in IEC: affective image classification, regression and retrieval. The state-of-the-art methods for these three tasks are image-centric, focusing on the dominant emotions for the majority of viewers. For my PhD thesis, I plan to answer the following questions: (1) Compared to the low-level elements-of-art based features, can we find higher-level features that are more interpretable and have a stronger link to emotions? (2) Are the emotions that an image evokes subjective, differing from viewer to viewer? If they are, how can we tackle user-centric emotion prediction? (3) For image-centric emotion computing, can we predict the emotion distribution instead of the dominant emotion category?
This article is a comprehensive overview of topic modeling and its associated techniques. In natural language understanding (NLU) tasks, there is a hierarchy of lenses through which we can extract meaning: from words to sentences to paragraphs to documents. At the document level, one of the most useful ways to understand text is by analyzing its topics. The process of learning, recognizing, and extracting these topics across a collection of documents is called topic modeling. In this post, we will explore topic modeling through four of the most popular techniques today: LSA, pLSA, LDA, and the newer, deep learning-based lda2vec.
Visual cortex neurons have receptive fields resembling oriented bandpass filters, and their response distributions on natural images are non-Gaussian. Inspired by this, we previously showed that comparing the response distribution to a normal distribution with the same variance gives a good thresholding criterion for detecting salient levels of edginess in images. However, (1) the results were based on comparison with human data, so no objective, quantitative performance measure was obtained. Furthermore, (2) why a normal distribution would serve as a good baseline was not investigated in full. In this paper, we first conduct a quantitative analysis of the normal-distribution baseline, using artificial images that closely mimic the statistics of natural images.
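
The abstract does not spell out the thresholding rule, so the following is only a minimal sketch of one plausible reading. It assumes the filter responses are zero-mean and takes as threshold the first response magnitude, beyond one standard deviation, at which the empirical density rises above that of a normal distribution with the same variance. The function name `gaussian_baseline_threshold`, the bin count, and the one-standard-deviation cutoff are all hypothetical choices, not taken from the paper.

```python
import math
import random


def gaussian_baseline_threshold(responses, n_bins=60, max_sigmas=6.0):
    """Hypothetical sketch: return the smallest magnitude beyond one standard
    deviation at which the empirical density of |response| exceeds the density
    of |x| for x ~ N(0, sigma^2), where sigma^2 matches the response variance.
    Returns None if no bin satisfies the condition."""
    n = len(responses)
    # Assumption: responses are zero-mean, so the variance is the mean square.
    sigma = math.sqrt(sum(r * r for r in responses) / n)
    width = max_sigmas * sigma / n_bins

    # Histogram of response magnitudes over [0, max_sigmas * sigma).
    counts = [0] * n_bins
    for r in responses:
        k = int(abs(r) / width)
        if k < n_bins:
            counts[k] += 1

    for k in range(n_bins):
        centre = (k + 0.5) * width
        if centre <= sigma:
            continue  # skip the central peak of the heavy-tailed distribution
        empirical = counts[k] / (n * width)
        # Half-normal density: distribution of |x| when x ~ N(0, sigma^2).
        normal = math.sqrt(2.0 / math.pi) / sigma * math.exp(
            -centre ** 2 / (2.0 * sigma ** 2)
        )
        if empirical > normal:
            return centre
    return None


if __name__ == "__main__":
    # Synthetic heavy-tailed "responses": the difference of two unit
    # exponentials follows a Laplace distribution (a stand-in, not real data).
    rng = random.Random(0)
    fake_responses = [rng.expovariate(1.0) - rng.expovariate(1.0)
                      for _ in range(100000)]
    print(gaussian_baseline_threshold(fake_responses))
```

On heavy-tailed data the crossing falls where the empirical tail overtakes the Gaussian tail. On near-Gaussian data the two densities almost coincide, so the returned value would be driven by histogram noise and should not be trusted.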
A fundamental goal of research in molecular biology is to understand protein structure. Protein crystallography is currently the most successful method for determining the three-dimensional (3D) conformation of a protein, yet it remains labor intensive and relies on an expert's ability to derive and evaluate a protein scene model. In this paper, the problem of protein structure determination is formulated as an exercise in scene analysis. A computational methodology is presented in which a 3D image of a protein is segmented into a graph of critical points. Bayesian and certainty factor approaches are described and used to analyze critical point graphs and identify meaningful substructures, such as alpha-helices and beta-sheets. Results of applying the methodologies to protein images at low and medium resolution are reported. The research is related to approaches to representation, segmentation and classification in vision, as well as to top-down approaches to protein structure prediction.
To recognize an object in an image one must have some internal model of how that object may appear. We show how to learn such a model from a series of training images depicting a class of objects. The model represents a 3D object by a set of characteristic views, each defining a probability distribution over variation in object appearance. Features identified in an image through perceptual organization are represented by a graph whose nodes include feature labels and numeric measurements. Image graphs are partitioned into characteristic views by an incremental conceptual clustering algorithm. A learning procedure generalizes multiple image graphs to form a characteristic view graph in which the numeric measurements are described by probability distributions. A matching procedure, using a similarity metric based on a nonparametric probability density estimator, compares image and characteristic view graphs to identify an instance of a modeled object in an image. We present experimental results from a system constructed to test this approach. The system is demonstrated learning to recognize partially occluded objects in images using shape cues.
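
The abstract names a nonparametric probability density estimator but does not say which one. As an illustration only, the sketch below assumes a Gaussian kernel density estimate over a single numeric measurement and scores an image feature by its average log-density under the training samples of a characteristic view. The function names, the bandwidth parameter, and the scoring rule are hypothetical and not taken from the paper.

```python
import math


def kde_density(x, samples, bandwidth):
    """Gaussian kernel density estimate at x from 1D training samples."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))
    return norm * sum(
        math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
    )


def match_score(measurements, view_samples, bandwidth, floor=1e-300):
    """Hypothetical similarity: mean log-density of the image measurements
    under the view's estimated distribution (higher means a better match).
    The floor avoids log(0) when a measurement lies far from all samples."""
    total = 0.0
    for m in measurements:
        total += math.log(max(kde_density(m, view_samples, bandwidth), floor))
    return total / len(measurements)


if __name__ == "__main__":
    # Made-up training measurements for one characteristic view.
    view = [0.9, 1.0, 1.1, 1.05, 0.95]
    print(match_score([1.0, 1.02], view, bandwidth=0.1))  # close to the view
    print(match_score([3.0, 3.5], view, bandwidth=0.1))   # far from the view
```

A kernel estimate makes no assumption about the shape of the measurement distribution, which is the usual motivation for a nonparametric estimator. The bandwidth controls how strongly the score generalizes beyond the training views.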