A Multi-Modal Approach to Infer Image Affect

Sundaresan, Ashok; Murugesan, Sugumar; Davis, Sean; Kappaganthu, Karthik; Jin, ZhongYi; Jain, Divya; Maunder, Anurag

Mar-13-2018–arXiv.org Machine Learning

The group affect, or emotion, in an image of people can be inferred by extracting features about both the people in the picture and the overall makeup of the scene. State-of-the-art work on this problem investigates a combination of facial features, scene extraction, and even audio tonality. This paper combines three additional modalities: human pose, text-based tagging, and CNN-extracted features/predictions. To the best of our knowledge, this is the first time all of the modalities have been extracted using deep neural networks. We evaluate the performance of our approach against baselines and highlight insights throughout the paper.
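
The abstract does not say how the modalities are combined. As a minimal sketch only, assuming simple feature concatenation followed by a linear softmax classifier, combining per-modality features might look like the following. The modality names, feature sizes, label set, and random weights are all hypothetical and stand in for trained extractors and a trained classifier; this is not the authors' method.

```python
import numpy as np

def fuse_modalities(features):
    """Concatenate per-modality feature vectors after L2-normalizing each one.

    `features` maps a modality name to a 1-D array. Names and sizes are
    hypothetical; the source abstract does not specify them.
    """
    parts = []
    for name in sorted(features):  # fixed order keeps the fused layout stable
        vec = np.asarray(features[name], dtype=float)
        norm = np.linalg.norm(vec)
        # Normalize so no single modality dominates purely by scale.
        parts.append(vec / norm if norm > 0 else vec)
    return np.concatenate(parts)

def softmax(logits):
    """Numerically stable softmax over a 1-D array of logits."""
    shifted = logits - np.max(logits)
    exp = np.exp(shifted)
    return exp / exp.sum()

def predict_affect(fused, weights, bias, labels):
    """Apply a linear classifier to the fused vector; return (label, probabilities)."""
    probs = softmax(weights @ fused + bias)
    return labels[int(np.argmax(probs))], probs

# Toy inputs: random vectors stand in for real extractor outputs.
rng = np.random.default_rng(0)
example = {
    "face": rng.normal(size=8),
    "pose": rng.normal(size=6),
    "tags": rng.normal(size=4),
    "cnn": rng.normal(size=10),
}
labels = ["positive", "neutral", "negative"]  # assumed label set

fused = fuse_modalities(example)
weights = rng.normal(size=(len(labels), fused.size))  # untrained, illustrative only
bias = np.zeros(len(labels))
label, probs = predict_affect(fused, weights, bias, labels)
print(label, probs)
```

Concatenation is only one option; averaging per-modality predictions (late fusion) is another common choice, and the sketch makes no claim about which the paper uses.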


