A Supervised Neural Autoregressive Topic Model for Simultaneous Image Classification and Annotation

Zheng, Yin, Zhang, Yu-Jin, Larochelle, Hugo

May-22-2013–arXiv.org Machine Learning

Hugo Larochelle, Département d'Informatique, Université de Sherbrooke, Sherbrooke (QC), Canada, J1K 2R1, hugo.larochelle@usherbrooke.ca

March 22, 2018

Abstract

Topic modeling based on latent Dirichlet allocation (LDA) has been a framework of choice to perform scene recognition and annotation. Recently, a new type of topic model called the Document Neural Autoregressive Distribution Estimator (DocNADE) was proposed and demonstrated state-of-the-art performance for document modeling. In this work, we show how to successfully apply and extend this model to the context of visual scene modeling. Specifically, we propose SupDocNADE, a supervised extension of DocNADE that increases the discriminative power of the hidden topic features by incorporating label information into the training objective of the model. We also describe how to leverage information about the spatial position of the visual words and how to embed additional image annotations, so as to simultaneously perform image classification and annotation. We test our model on the Scene15, LabelMe and UIUC-Sports datasets and show that it compares favorably to other topic models such as the supervised variant of LDA.

1 Introduction

Image classification and annotation are two important tasks in computer vision. In image classification, one tries to describe the image globally with a single descriptive label (such as coast, outdoor, inside city, etc.), while annotation focuses on tagging the local content within the image (such as whether it contains "sky", a "car", a "tree", etc.). Since these two problems are related, it is natural to attempt to solve them jointly. For example, an image labeled as street is more likely to be annotated with "car", "pedestrian" or "building" than with "beach" or "sea water".

Country:
- North America > Canada > Quebec > Estrie Region > Sherbrooke (0.24)

Genre:
- Research Report > New Finding (0.94)

Technology:
- Information Technology
  - Sensing and Signal Processing > Image Processing (1.00)
  - Artificial Intelligence
    - Vision > Image Understanding (1.00)
    - Representation & Reasoning > Uncertainty (1.00)
    - Natural Language > Discourse & Dialogue (1.00)
    - Machine Learning
      - Statistical Learning (1.00)
      - Neural Networks (1.00)
