Fang, Rui
Dual Adversarial Alignment for Realistic Support-Query Shift Few-shot Learning
Jiang, Siyang, Fang, Rui, Chen, Hsi-Wen, Ding, Wei, Chen, Ming-Syan
Support-query shift few-shot learning aims to classify unseen examples (the query set) against labeled data (the support set) using a learned embedding in a low-dimensional space, under a distribution shift between the support set and the query set. However, in real-world scenarios the shifts are usually unknown and varied, making them difficult to estimate in advance. Therefore, in this paper, we propose a novel and more difficult challenge, RSQS, focusing on Realistic Support-Query Shift few-shot learning. The key feature of RSQS is that the individual samples in each meta-task are subjected to multiple distribution shifts. In addition, we propose a unified adversarial feature alignment method called DUal adversarial ALignment framework (DuaL) to relieve RSQS from two aspects, i.e., inter-domain bias and intra-domain variance. On the one hand, for the inter-domain bias, we corrupt the original data in advance and use the synthesized perturbed inputs to train the repairer network by minimizing distance at the feature level. On the other hand, for the intra-domain variance, we propose a generator network to synthesize hard, i.e., less similar, examples from the support set in a self-supervised manner and introduce regularized optimal transportation to derive a smooth optimal transportation plan. Lastly, a benchmark of RSQS is built with several state-of-the-art baselines on three datasets (CIFAR100, mini-ImageNet, and Tiered-ImageNet). Experimental results show that DuaL significantly outperforms the state-of-the-art methods in our benchmark.
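The abstract above mentions regularized optimal transportation without giving details. As an illustration only, here is a minimal sketch of one common form, entropy-regularized transport solved with Sinkhorn iterations. The function name `sinkhorn_plan`, the uniform marginals, and the parameters `reg` and `n_iters` are assumptions made for this example and are not taken from the paper.

```python
import math


def sinkhorn_plan(cost, reg=0.5, n_iters=200):
    """Approximate an entropy-regularized transport plan (illustrative sketch).

    cost: list of rows, cost[i][j] = cost of moving mass from source i to target j.
    reg: regularization strength; larger values give a smoother plan.
    Both marginals are assumed uniform, which is a simplification.
    """
    n, m = len(cost), len(cost[0])
    a = [1.0 / n] * n  # source marginal (assumed uniform)
    b = [1.0 / m] * m  # target marginal (assumed uniform)
    # Gibbs kernel derived from the cost matrix.
    kernel = [[math.exp(-c / reg) for c in row] for row in cost]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale rows and columns to match the marginals.
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]


plan = sinkhorn_plan([[0.0, 1.0], [1.0, 0.0]])
```

In this toy case most of the mass stays on the zero-cost diagonal, while the regularization leaves a small amount on the off-diagonal entries, which is what makes the plan smooth.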
Filter Pruning via Filters Similarity in Consecutive Layers
Wang, Xiaorui, Wang, Jun, Tang, Xin, Gao, Peng, Fang, Rui, Xie, Guotong
Filter pruning is widely adopted to compress and accelerate Convolutional Neural Networks (CNNs), but most previous works ignore the relationship between filters and channels in different layers. Processing each layer independently fails to utilize the collaborative relationship across layers. In this paper, we propose a novel pruning method that explicitly leverages the Filters Similarity in Consecutive Layers (FSCL). FSCL compresses models by pruning filters whose corresponding features are less valuable to the model. Extensive experiments demonstrate the effectiveness of FSCL, which yields remarkable improvement over the state of the art in accuracy, FLOPs, and parameter reduction on several benchmark models and datasets.
STAGE: Span Tagging and Greedy Inference Scheme for Aspect Sentiment Triplet Extraction
Liang, Shuo, Wei, Wei, Mao, Xian-Ling, Fu, Yuanyuan, Fang, Rui, Chen, Dangyang
Aspect Sentiment Triplet Extraction (ASTE) has become an emerging task in sentiment analysis research, aiming to extract triplets of the aspect term, its corresponding opinion term, and its associated sentiment polarity from a given sentence. Recently, many neural-network-based models with different tagging schemes have been proposed, but almost all of them have limitations: they rely heavily on 1) the prior assumption that each word is associated with only a single role (e.g., aspect term or opinion term) and 2) word-level interactions, treating each opinion/aspect as a set of independent words. As a result, they perform poorly on complex ASTE cases, such as a word associated with multiple roles or an aspect/opinion term with multiple words. We therefore propose a novel approach, Span TAgging and Greedy infErence (STAGE), to extract sentiment triplets at the span level, where each span may consist of multiple words and play different roles simultaneously. To this end, this paper formulates the ASTE task as a multi-class span classification problem. Specifically, STAGE generates more accurate aspect sentiment triplet extractions by exploring span-level information and constraints, and consists of two components, namely, a span tagging scheme and a greedy inference strategy. The former tags all possible candidate spans based on a newly defined tagging set. The latter retrieves the aspect/opinion term with the maximum length from the candidate sentiment snippet to output sentiment triplets. Furthermore, we propose a simple but effective model based on STAGE, which outperforms the state-of-the-art methods by a large margin on four widely used datasets. Moreover, STAGE can be easily generalized to other pair/triplet extraction tasks, which also demonstrates the superiority of the proposed scheme.
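To make the span-level view above concrete, the following is a rough sketch of two ingredients the abstract names: enumerating all candidate spans of a sentence, and greedily picking the longest span among candidates. The helper names `enumerate_spans` and `longest_span`, the inclusive `(start, end)` index convention, and the tie-breaking rule are assumptions for this example; the paper's actual tagging set and inference procedure are not reproduced here.

```python
def enumerate_spans(tokens, max_len=None):
    """List every candidate span as inclusive (start, end) token indices."""
    n = len(tokens)
    limit = n if max_len is None else max_len
    return [(i, j) for i in range(n) for j in range(i, min(n, i + limit))]


def longest_span(spans):
    """Greedy choice: the span covering the most tokens.

    Ties are broken in favor of the earliest start (an assumption of this sketch).
    """
    return max(spans, key=lambda s: (s[1] - s[0], -s[0]))


tokens = ["battery", "life", "great"]
candidates = enumerate_spans(tokens)
# Suppose a classifier (not shown) kept these spans as aspect candidates.
aspect = longest_span([(0, 0), (0, 1), (1, 1)])
```

Here `aspect` is `(0, 1)`, i.e. the two-word term "battery life" is preferred over either single word.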
PASH at TREC 2021 Deep Learning Track: Generative Enhanced Model for Multi-stage Ranking
Qiao, Yixuan, Chen, Hao, Wang, Jun, Lai, Yongquan, Liu, Tuozhen, Ye, Xianbin, Tang, Xin, Fang, Rui, Gao, Peng, Xie, Wenfeng, Xie, Guotong
This paper describes the PASH participation in the TREC 2021 Deep Learning Track. In the recall stage, we adopt a scheme combining sparse and dense retrieval methods. In the multi-stage ranking phase, point-wise and pair-wise ranking strategies are applied one after another, based on a model continually pre-trained on general knowledge and document-level data. Compared to the TREC 2020 Deep Learning Track, we have additionally introduced the generative model T5 to further enhance performance.
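The abstract does not say how the sparse and dense retrieval results are combined in the recall stage. One widely used option is reciprocal rank fusion, sketched below purely as an illustration; the function name `reciprocal_rank_fusion`, the constant `k=60`, and the document ids are assumptions for this example and do not describe the PASH system.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document ids into one list.

    Each document scores 1 / (k + rank) per list it appears in (rank starts at 1);
    documents are returned by descending total score, ties broken by id.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda d: (-scores[d], d))


sparse_hits = ["d1", "d2", "d3"]  # hypothetical output of a sparse retriever
dense_hits = ["d3", "d1", "d4"]   # hypothetical output of a dense retriever
fused = reciprocal_rank_fusion([sparse_hits, dense_hits])
```

Documents found by both retrievers (`d1`, `d3`) rise above those found by only one, without requiring the two retrievers' raw scores to be comparable.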
Collaborative Language Grounding Toward Situated Human-Robot Dialogue
Chai, Joyce Y. (Michigan State University) | Fang, Rui (Thomson Reuters) | Liu, Changsong (Michigan State University) | She, Lanbo (Michigan State University)
To enable situated human-robot dialogue, techniques to support grounded language communication are essential. One particular challenge is to ground human language to the robot's internal representation of the physical world. Although copresent in a shared environment, humans and robots have mismatched capabilities in reasoning, perception, and action. Their representations of the shared environment and joint tasks are significantly misaligned. Humans and robots will need to make extra effort to bridge the gap and strive for a common ground of the shared world. Only then is the robot able to engage in language communication and joint tasks. Thus, computational models for language grounding will need to take collaboration into consideration. A robot not only needs to incorporate collaborative effort from human partners to better connect human language to its own representation, but also needs to make extra collaborative effort to communicate its representation in language that humans can understand. To address these issues, the Language and Interaction Research group (LAIR) at Michigan State University has investigated multiple aspects of collaborative language grounding. This article gives a brief introduction to this research effort and discusses several collaborative approaches to grounding language to perception and action.
Collaborative Models for Referring Expression Generation in Situated Dialogue
Fang, Rui (Michigan State University) | Doering, Malcolm (Michigan State University) | Chai, Joyce Y. (Michigan State University)
In situated dialogue with artificial agents (e.g., robots), although a human and an agent are co-present, the agent's representation and the human's representation of the shared environment are significantly mismatched. Because of this misalignment, our previous work has shown that when the agent applies traditional approaches to generate referring expressions for describing target objects with minimum descriptions, the intended objects often cannot be correctly identified by the human. To address this problem, motivated by collaborative behaviors in human referential communication, we have developed two collaborative models - an episodic model and an installment model - for referring expression generation. Both models, instead of generating a single referring expression to describe a target object as in the previous work, generate multiple small expressions that lead to the target object with the goal of minimizing the collaborative effort. In particular, our installment model incorporates human feedback in a reinforcement learning framework to learn the optimal generation strategies. Our empirical results have shown that the episodic model and the installment model outperform previous non-collaborative models with an absolute gain of 6% and 21% respectively.