 Institute of Computing Technology, Chinese Academy of Sciences


Auto-Balanced Filter Pruning for Efficient Convolutional Neural Networks

AAAI Conferences

In recent years, considerable research effort has been devoted to compression techniques for convolutional neural networks (CNNs). Many works so far have focused on CNN connection pruning methods, which produce sparse parameter tensors in convolutional or fully-connected layers. Several studies have demonstrated that even simple methods can effectively eliminate connections of a CNN. However, since these methods make parameter tensors sparser but not smaller, the compression may not translate directly into acceleration without support from specially designed hardware. In this paper, we propose an iterative approach named Auto-balanced Filter Pruning, where we pre-train the network in an innovative auto-balanced way to transfer the representational capacity of its convolutional layers to a fraction of the filters, prune the redundant ones, then re-train it to restore the accuracy. In this way, a smaller version of the original network is learned and the floating-point operations (FLOPs) are reduced. By applying this method to several common CNNs, we show that a large portion of the filters can be discarded without an obvious accuracy drop, leading to a significant reduction in computational burden. Concretely, we reduce the inference cost of LeNet-5 on MNIST, VGG-16 and ResNet-56 on CIFAR-10 by 95.1%, 79.7% and 60.9%, respectively.
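
The difference between sparse connection pruning and whole-filter pruning can be illustrated with a minimal sketch. It assumes an L1-norm importance criterion and a fixed keep ratio, neither of which is stated in the abstract, and it does not reproduce the auto-balanced pre-training step. The function names (`l1_norms`, `prune_filters`, `conv_flops`) and the nested-list weight layout are hypothetical, chosen only for illustration.

```python
def l1_norms(layer):
    """Sum of absolute weights per filter.

    layer: list of filters; each filter is a list of input channels;
    each channel is a flat list of kernel values (assumed layout).
    """
    return [sum(abs(v) for ch in f for v in ch) for f in layer]


def prune_filters(layer, next_layer, keep_ratio):
    """Drop the lowest-norm filters and the matching input channels downstream.

    The L1 criterion is an assumption for this sketch, not the paper's rule.
    """
    norms = l1_norms(layer)
    n_keep = max(1, round(len(layer) * keep_ratio))
    ranked = sorted(range(len(layer)), key=lambda i: norms[i], reverse=True)
    keep = sorted(ranked[:n_keep])
    pruned = [layer[i] for i in keep]
    # Each removed filter removes one output channel, so the next layer
    # loses the corresponding input channel in every one of its filters.
    pruned_next = [[f[i] for i in keep] for f in next_layer]
    return pruned, pruned_next, keep


def conv_flops(layer, out_h, out_w):
    """Multiply-accumulate count: output positions times total kernel weights."""
    return out_h * out_w * sum(len(ch) for f in layer for ch in f)


# Toy network: 4 filters over 1 input channel, then 2 filters over 4 channels.
layer1 = [[[w] * 9] for w in (0.5, 0.01, -0.4, 0.02)]
layer2 = [[[0.1] * 9 for _ in range(4)] for _ in range(2)]
small1, small2, kept = prune_filters(layer1, layer2, keep_ratio=0.5)
```

Because entire filters are removed, both layers become genuinely smaller tensors, so the FLOP count falls without any sparse-computation support.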


Visual Relationship Detection With Deep Structural Ranking

AAAI Conferences

Visual relationship detection aims to describe the interactions between pairs of objects. Unlike individual object learning tasks, the number of possible relationships is much larger, which makes it hard to explore them based only on the visual appearance of objects. In addition, due to limited human effort, the annotations for visual relationships are usually incomplete, which increases the difficulty of model training and evaluation. In this paper, we propose a novel framework, called Deep Structural Ranking, for visual relationship detection. To complement the representation ability of visual appearance, we integrate multiple cues for predicting the relationships contained in an input image. Moreover, we design a new ranking objective function by enforcing the annotated relationships to have higher relevance scores. Unlike previous works, our proposed method can both facilitate the co-occurrence of relationships and mitigate the incompleteness problem. Experimental results show that our proposed method outperforms the state of the art on two widely used datasets. We also demonstrate its superiority in detecting zero-shot relationships.
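
The idea of forcing annotated relationships to score higher than unannotated ones can be sketched as a pairwise margin ranking loss. This is a generic formulation under assumed inputs, not necessarily the objective used in the paper; the function name `ranking_hinge_loss` and the `margin` parameter are hypothetical.

```python
def ranking_hinge_loss(pos_scores, neg_scores, margin=1.0):
    """Average pairwise hinge loss between annotated and unannotated candidates.

    pos_scores: relevance scores of annotated relationships in one image.
    neg_scores: relevance scores of unannotated candidate relationships.
    A pair contributes zero once the annotated score exceeds the
    unannotated one by at least `margin`.
    """
    n_pairs = len(pos_scores) * len(neg_scores)
    if n_pairs == 0:
        return 0.0
    total = 0.0
    for sp in pos_scores:
        for sn in neg_scores:
            total += max(0.0, margin - sp + sn)
    return total / n_pairs


loss_mixed = ranking_hinge_loss([2.0, 0.5], [0.0, 1.0])
loss_separated = ranking_hinge_loss([3.0], [0.0])
```

A ranking loss of this kind only asks that annotated relationships outrank the rest, so a plausible but unannotated relationship is not pushed toward an absolute score of zero, which is one way to soften the effect of incomplete annotations.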


Deep Structured Learning for Visual Relationship Detection

AAAI Conferences

In computer vision and artificial intelligence research, learning the relationships between objects is an important way to understand images deeply. Most recent works detect visual relationships by learning objects and predicates separately at the feature level, but the dependencies between objects and predicates have not been fully considered. In this paper, we introduce deep structured learning for visual relationship detection. Specifically, we propose a deep structured model that learns relationships using both feature-level and label-level prediction, improving on the learning ability of feature-level prediction alone. The feature-level prediction learns relationships from discriminative features, and the label-level prediction learns relationships by capturing dependencies between objects and predicates based on the relationships learnt at the feature level. Additionally, we use the structured SVM (SSVM) loss function as our optimization goal, and decompose this goal into subject, predicate, and object optimizations, which are simpler and more independent. Our experiments on the Visual Relationship Detection (VRD) dataset and the large-scale Visual Genome (VG) dataset validate the effectiveness of our method, which outperforms state-of-the-art methods.
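
How a structured hinge loss can split into subject, predicate, and object terms is sketched below. The sketch assumes that the triple score is a sum of per-component scores and that the task cost is a Hamming cost (one unit per wrong component); under those assumptions the loss-augmented maximum separates into three independent maximizations. The paper's actual scoring function and decomposition may differ, and the names `component_hinge` and `decomposed_ssvm_loss` are hypothetical.

```python
def component_hinge(scores, gold):
    """Structured hinge for one component with a 0/1 cost.

    scores: one score per candidate label; gold: index of the true label.
    """
    augmented = max(
        s + (0.0 if i == gold else 1.0) for i, s in enumerate(scores)
    )
    return augmented - scores[gold]


def decomposed_ssvm_loss(subj_scores, pred_scores, obj_scores, gold):
    """Sum of three independent hinge terms (assumes additive triple scores)."""
    gold_subj, gold_pred, gold_obj = gold
    return (
        component_hinge(subj_scores, gold_subj)
        + component_hinge(pred_scores, gold_pred)
        + component_hinge(obj_scores, gold_obj)
    )


loss = decomposed_ssvm_loss(
    subj_scores=[2.0, 0.0],
    pred_scores=[0.5, 1.0, 0.0],
    obj_scores=[1.0, 0.5],
    gold=(0, 0, 0),
)
```

In this toy case the subject is already separated by the margin and contributes nothing, while the predicate and object terms are still violated.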


Mechanism-Aware Neural Machine for Dialogue Response Generation

AAAI Conferences

In everyday dialogue, people's responses to the same utterance may differ widely in content semantics, speaking style, communication intention, and so on. Previous generative conversational models ignore these 1-to-n relationships between a post and its diverse responses, and tend to return high-frequency but meaningless responses. In this study we propose a mechanism-aware neural machine for dialogue response generation. It assumes that there exist some latent responding mechanisms, each of which can generate different responses for a single input post. With this assumption we model different responding mechanisms as latent embeddings, and develop an encoder-diverter-decoder framework to train its modules in an end-to-end fashion. With the learned latent mechanisms, for the first time these decomposed modules can be used to encode the input into a mechanism-aware context, and decode the responses with controlled generation styles and topics. Finally, experiments with human judgements, intuitive examples, and detailed discussions demonstrate the quality and diversity of the generated responses, with a 9.80% increase in acceptable ratio over the best of six baseline methods.
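
One way to picture the diverter step is as a soft selection over mechanism embeddings given the encoded post. The sketch below assumes a dot-product score followed by a softmax and builds the mechanism-aware context by concatenation; these choices, and the names `softmax` and `divert`, are illustrative assumptions rather than the architecture described in the paper. The encoder and decoder are omitted.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def divert(context, mechanisms):
    """Score each latent mechanism against the encoded post.

    context: encoder output for the post (list of floats, assumed).
    mechanisms: one embedding per latent mechanism, same length as context.
    Returns the mechanism distribution, the most probable mechanism index,
    and the context concatenated with that mechanism's embedding.
    """
    logits = [sum(c * e for c, e in zip(context, emb)) for emb in mechanisms]
    probs = softmax(logits)
    best = max(range(len(probs)), key=lambda i: probs[i])
    return probs, best, context + mechanisms[best]


probs, best, aware_context = divert(
    context=[1.0, 0.0],
    mechanisms=[[0.0, 1.0], [2.0, 0.0], [-1.0, 0.0]],
)
```

Decoding from a different mechanism index for the same post would then yield a different response, which is how such a design could address the 1-to-n relationship between a post and its replies.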