Xu, Canwen
A Survey on Model Compression for Natural Language Processing
Xu, Canwen, McAuley, Julian
With recent developments in new architectures like the Transformer and in pretraining techniques, significant progress has been made in applications of natural language processing (NLP). However, the high energy cost and long inference delay of Transformers are preventing NLP from entering broader scenarios, including edge and mobile computing. Efficient NLP research aims to comprehensively consider computation, time, and carbon emissions for the entire life-cycle of NLP, including data preparation, model training, and inference. In this survey, we focus on the inference stage and review the current state of model compression for NLP, including benchmarks, metrics, and methodology. We outline the current obstacles and future research directions.
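As a loose illustration of the inference-stage efficiency metrics such a survey compares (parameter count and wall-clock latency), here is a minimal sketch; the model name, input, and measurement setup are assumptions chosen for illustration, not taken from the survey itself.

```python
# Minimal sketch: measuring two common inference-efficiency metrics
# (parameter count and average latency) for a pretrained Transformer.
# The model name and batch shape are illustrative assumptions.
import time

import torch
from transformers import AutoModel, AutoTokenizer

model_name = "bert-base-uncased"  # assumed example model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name).eval()

num_params = sum(p.numel() for p in model.parameters())

inputs = tokenizer("a short example sentence", return_tensors="pt")
with torch.no_grad():
    model(**inputs)  # warm-up run
    start = time.perf_counter()
    for _ in range(10):
        model(**inputs)
    latency_ms = (time.perf_counter() - start) / 10 * 1000

print(f"parameters: {num_params / 1e6:.1f}M, avg latency: {latency_ms:.1f} ms")
```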
Beyond Preserved Accuracy: Evaluating Loyalty and Robustness of BERT Compression
Xu, Canwen, Zhou, Wangchunshu, Ge, Tao, Xu, Ke, McAuley, Julian, Wei, Furu
Recent studies on compression of pretrained language models (e.g., BERT) usually use preserved accuracy as the metric for evaluation. In this paper, we propose two new metrics, label loyalty and probability loyalty, which measure how closely a compressed model (i.e., the student) mimics the original model (i.e., the teacher). We also explore the effect of compression on robustness under adversarial attacks. We benchmark quantization, pruning, knowledge distillation, and progressive module replacing in terms of loyalty and robustness. By combining multiple compression techniques, we provide a practical strategy to achieve better accuracy, loyalty, and robustness.
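As a rough illustration of the loyalty idea, the sketch below computes a label-loyalty score (agreement between student and teacher predicted labels) and a probability-loyalty score derived from a divergence between their output distributions. The exact formulations used in the paper may differ; treat this as an assumption-laden sketch rather than the paper's definition.

```python
# Hedged sketch of loyalty-style metrics between a teacher and a compressed
# student: label loyalty as prediction agreement, probability loyalty as a
# score based on the Jensen-Shannon divergence of the output distributions.
import torch
import torch.nn.functional as F


def label_loyalty(student_logits: torch.Tensor, teacher_logits: torch.Tensor) -> float:
    """Fraction of examples where student and teacher predict the same label."""
    return (student_logits.argmax(-1) == teacher_logits.argmax(-1)).float().mean().item()


def probability_loyalty(student_logits: torch.Tensor, teacher_logits: torch.Tensor) -> float:
    """1 minus the root Jensen-Shannon divergence between output distributions,
    averaged over examples (higher means the student tracks the teacher more closely)."""
    p = F.softmax(student_logits, dim=-1)
    q = F.softmax(teacher_logits, dim=-1)
    m = 0.5 * (p + q)
    js = 0.5 * (p * (p / m).log()).sum(-1) + 0.5 * (q * (q / m).log()).sum(-1)
    return (1.0 - js.clamp(min=0).sqrt()).mean().item()


# Toy usage with random logits for a 3-class task.
s, t = torch.randn(8, 3), torch.randn(8, 3)
print(label_loyalty(s, t), probability_loyalty(s, t))
```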
Meta Learning for Knowledge Distillation
Zhou, Wangchunshu, Xu, Canwen, McAuley, Julian
We present Meta Learning for Knowledge Distillation (MetaDistil), a simple yet effective alternative to traditional knowledge distillation (KD) methods, in which the teacher model is fixed during training. We show that the teacher network can learn to better transfer knowledge to the student network (i.e., learning to teach) using feedback from the performance of the distilled student network in a meta learning framework. Moreover, we introduce a pilot update mechanism to improve the alignment between the inner-learner and meta-learner in meta learning algorithms that focus on an improved inner-learner. Experiments on various benchmarks show that MetaDistil can yield significant improvements over traditional KD algorithms and is less sensitive to the choice of student capacity and hyperparameters, facilitating the use of KD on different tasks and models.
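For context, the sketch below shows the conventional KD objective that MetaDistil departs from: a fixed teacher and a student trained on a mixture of task loss and KL divergence over temperature-softened logits. This is a generic illustration of standard KD, not the MetaDistil update itself; the temperature and mixing weight are assumed values.

```python
# Generic knowledge distillation loss with a fixed teacher (the baseline that
# MetaDistil modifies by letting the teacher adapt to student feedback).
# Temperature and mixing weight are illustrative assumptions.
import torch
import torch.nn.functional as F


def kd_loss(student_logits, teacher_logits, labels, temperature=2.0, alpha=0.5):
    """Mixture of hard-label cross-entropy and soft-label KL to the teacher."""
    ce = F.cross_entropy(student_logits, labels)
    kl = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2
    return alpha * ce + (1.0 - alpha) * kl


# Toy usage: 4 examples, 3 classes.
s, t = torch.randn(4, 3, requires_grad=True), torch.randn(4, 3)
loss = kd_loss(s, t, torch.tensor([0, 1, 2, 1]))
loss.backward()
```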
Blow the Dog Whistle: A Chinese Dataset for Cant Understanding with Common Sense and World Knowledge
Xu, Canwen, Zhou, Wangchunshu, Ge, Tao, Xu, Ke, McAuley, Julian, Wei, Furu
Cant is important for understanding advertising, comedies, and dog-whistle politics. However, computational research on cant is hindered by a lack of available datasets. In this paper, we propose a large and diverse Chinese dataset for creating and understanding cant from a computational linguistics perspective. We formulate a task for cant understanding and provide both quantitative and qualitative analysis of word embedding similarity methods and pretrained language models. Experiments suggest that the task requires deep language understanding, common sense, and world knowledge, and can thus serve as a good testbed for pretrained language models and help models perform better on other tasks. The code is available at https://github.com/JetRunner/dogwhistle. The data and leaderboard are available at https://competitions.codalab.org/competitions/30451.
DLocRL: A Deep Learning Pipeline for Fine-Grained Location Recognition and Linking in Tweets
Xu, Canwen, Li, Jing, Luo, Xiangyang, Pei, Jiaxin, Li, Chenliang, Ji, Donghong
In recent years, with the prevalence of social media and smart devices, people casually reveal their locations, such as shops, hotels, and restaurants, in their tweets. Recognizing and linking such fine-grained location mentions to well-defined location profiles is beneficial for retrieval and recommendation systems. In this paper, we propose DLocRL, a new deep learning pipeline for fine-grained location recognition and linking in tweets, and verify its effectiveness on a real-world Twitter dataset.