Order-Free RNN With Visual Attention for Multi-Label Classification

Chen, Shang-Fu (National Taiwan University) | Chen, Yi-Chen (National Taiwan University) | Yeh, Chih-Kuan (Carnegie Mellon University) | Wang, Yu-Chiang Frank (National Taiwan University)

AAAI Conferences 

We propose a recurrent neural network (RNN) based model for image multi-label classification. Our model uniquely integrates the learning of visual attention and Long Short-Term Memory (LSTM) layers, which jointly learns the labels of interest and their co-occurrences, while the associated image regions are visually attended. Different from existing approaches that utilize either model in their network architectures, training of our model does not require predefined label orders. Moreover, a robust inference process is introduced so that prediction errors would not propagate and thus affect the resulting performance.

While a number of research works (Zhang and Zhou 2006; Nam et al. 2014; Gong et al. 2013; Wei et al. 2014; Wang et al. 2016) start to advance CNN architectures for multi-label classification, CNN-RNN (Wang et al. 2016) embeds image and semantic structures by projecting both features into a joint embedding space. By further utilizing the component of Long Short-Term Memory (LSTM) (Hochreiter and Schmidhuber 1997), a recurrent neural network (RNN) structure is introduced to memorize long-term label dependency. As a result, CNN-RNN exhibits promising multi-label classification performance with cross-label correlation implicitly preserved.
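To make the described architecture concrete, the following is a minimal NumPy sketch of one attention-guided LSTM decoding step for multi-label prediction. It is not the paper's exact formulation: all dimensions, weight matrices, and the additive attention form are hypothetical choices, randomly initialized for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical sizes: R image regions, d-dim CNN features,
# hidden size H, L candidate labels.
R, d, H, L = 4, 8, 16, 5

V = rng.standard_normal((R, d))   # CNN feature map: one d-dim vector per region
h = np.zeros(H)                   # LSTM hidden state
c = np.zeros(H)                   # LSTM cell state

# Additive attention: score each region against the current hidden state.
Wv = rng.standard_normal((H, d))
Wh = rng.standard_normal((H, H))
w  = rng.standard_normal(H)

scores = np.tanh(V @ Wv.T + h @ Wh.T) @ w   # (R,) one score per region
alpha  = softmax(scores)                    # attention weights, sum to 1
z      = alpha @ V                          # attended context vector, (d,)

# One LSTM cell update driven by the attended context.
W = rng.standard_normal((4 * H, d + H)) * 0.1
b = np.zeros(4 * H)

gates = W @ np.concatenate([z, h]) + b
i_g, f_g, o_g, g_g = np.split(gates, 4)
c = sigmoid(f_g) * c + sigmoid(i_g) * np.tanh(g_g)
h = sigmoid(o_g) * np.tanh(c)

# Order-free output: score every candidate label at this step, rather
# than committing to a fixed label sequence.
Wo = rng.standard_normal((L, H))
label_probs = sigmoid(Wo @ h)
```

Repeating this step while feeding back the updated state lets the model attend to different regions as successive labels are predicted; scoring all candidate labels at each step is what removes the need for a predefined label order.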
