Images can convey rich semantics and evoke strong emotions in viewers. The research of my PhD thesis focuses on image emotion computing (IEC), which aims to predict the emotion perceptions of given images. The development of IEC is greatly constrained by two main challenges: affective gap and subjective evaluation. Previous works mainly focused on finding features that can express emotions better to bridge the affective gap, such as elements-of-art based features and shape features. According to the emotion representation models, including categorical emotion states (CES) and dimensional emotion space (DES), three different tasks are traditionally performed on IEC: affective image classification, regression and retrieval. The state-of-the-art methods on the three above tasks are image-centric, focusing on the dominant emotions for the majority of viewers. For my PhD thesis, I plan to answer the following questions: (1) Compared to the low-level elements-of-art based features, can we find some higher level features that are more interpretable and have stronger link to emotions? (2) Are the emotions that are evoked in viewers by an image subjective and different? If they are, how can we tackle the user-centric emotion prediction? (3) For image-centric emotion computing, can we predict the emotion distribution instead of the dominant emotion category?
This article is a comprehensive overview of Topic Modeling and its associated techniques. In natural language understanding (NLU) tasks, there is a hierarchy of lenses through which we can extract meaning -- from words to sentences to paragraphs to documents. At the document level, one of the most useful ways to understand text is by analyzing its topics. The process of learning, recognizing, and extracting these topics across a collection of documents is called topic modeling. In this post, we will explore topic modeling through 4 of the most popular techniques today: LSA, pLSA, LDA, and the newer, deep learning-based lda2vec.
Variants of GANs have now done insane stuff, like converting images of zebras to horses and vice versa. I found GANs fascinating, and in an effort to understand them better, I thought that I'd write this article, and in the process of explaining the math and code behind them, understand them better myself. Here's a link to a github repo I made for GAN resources: GANs learn a probability distribution of a dataset by pitting two neural networks against each other. Here's a great article that explains probability distributions and other concepts for those who aren't familiar with them: It tries to create images that look very similar to the dataset. The other model, the discriminator, acts like the police, and tries to detect whether the images generated were fake or not.
A fundamental goal of research in molecular biology is to understand protein structure. Protein crystallography is currently the most successful method for determining the three-dimensional (3D) conformation of a protein, yet it remains labor intensive and relies on an expert's ability to derive and evaluate a protein scene model. In this paper, the problem of protein structure determination is formulated as an exercise in scene analysis. A computational methodology is presented in which a 3D image of a protein is segmented into a graph of critical points. Bayesian and certainty factor approaches are described and used to analyze critical point graphs and identify meaningful substructures, such as alpha-helices and beta-sheets. Results of applying the methodologies to protein images at low and medium resolution are reported. The research is related to approaches to representation, segmentation and classification in vision, as well as to top-down approaches to protein structure prediction.
Visual cortex neurons have receptive fields resembling oriented bandpass filters, and their response distributions on natural images are non-Gaussian. Inspired by this, we previously showed that comparing the response distribution to normal distribution with the same variance gives a good thresholding criterion for detecting salient levels of edginess in images. However, (1) the results were based on comparison with human data, thus, an objective, quantitative performance measure was not taken. Furthermore, (2) why a normal distribution would serve as a good baseline was not investigated in full. In this paper, we first conduct a quantitative analysis of the normal-distribution baseline, using artificial images that closely mimic the statistics of natural images.