AITopics | Duan, Sufeng

Duan, Sufeng

Improving Non-autoregressive Machine Translation with Error Exposure and Consistency Regularization

Chen, Xinran, Duan, Sufeng, Liu, Gongshen

arXiv.org Artificial Intelligence, Feb-15-2024

Being one of the IR-NAT (Iterative-refinement-based NAT) frameworks, the Conditional Masked Language Model (CMLM) adopts the mask-predict paradigm to re-predict the masked low-confidence tokens. However, CMLM suffers from a data distribution discrepancy between training and inference, where the observed tokens are generated differently in the two cases. In this paper, we address this problem with the training approaches of error exposure and consistency regularization (EECR). We construct mixed sequences based on model predictions during training, and propose to optimize over the masked tokens under imperfect observation conditions. We also design a consistency learning method that constrains the data distribution of the masked tokens under different observing situations, to narrow the gap between training and inference. Experiments on five translation benchmarks obtain average improvements of 0.68 and 0.40 BLEU over the base models, respectively, and our CMLMC-EECR achieves the best performance, with translation quality comparable to the Transformer. The experimental results demonstrate the effectiveness of our method.

machine learning, natural language, translation, (18 more...)

arXiv.org Artificial Intelligence

2402.09725

Country: North America > Puerto Rico (0.14)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
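
The mask-predict step mentioned in the abstract above (re-predicting the masked low-confidence tokens) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the `<mask>` symbol, the function names, and the linearly decaying masking schedule are all assumptions made for the example, and the model call that would actually re-predict the masked positions is left out.

```python
from typing import List, Sequence

MASK = "<mask>"  # hypothetical mask symbol, chosen for this sketch


def num_to_mask(length: int, step: int, total_steps: int) -> int:
    """Number of tokens to re-mask at a given refinement step.

    Assumes a linearly decaying schedule: many tokens are masked in
    early steps, none after the final step.
    """
    return length * (total_steps - step) // total_steps


def remask_low_confidence(
    tokens: Sequence[str], confidences: Sequence[float], n: int
) -> List[str]:
    """Return a copy of `tokens` with the n lowest-confidence positions masked."""
    if len(tokens) != len(confidences):
        raise ValueError("tokens and confidences must have the same length")
    # Order positions by confidence, lowest first; ties are broken by position.
    order = sorted(range(len(tokens)), key=lambda i: (confidences[i], i))
    to_mask = set(order[:n])
    return [MASK if i in to_mask else tok for i, tok in enumerate(tokens)]


if __name__ == "__main__":
    draft = ["the", "cat", "sat", "down"]
    probs = [0.9, 0.4, 0.8, 0.2]
    n = num_to_mask(len(draft), step=1, total_steps=2)
    print(remask_low_confidence(draft, probs, n))
```

In a full decoder, the masked sequence would be fed back to the model, which re-predicts the masked positions conditioned on the remaining (observed) tokens. At inference those observed tokens are the model's own earlier predictions, which is the training/inference discrepancy the abstract describes.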

Multi-grained Evidence Inference for Multi-choice Reading Comprehension

Zhao, Yilin, Zhao, Hai, Duan, Sufeng

arXiv.org Artificial Intelligence, Oct-27-2023

Multi-choice Machine Reading Comprehension (MRC) is a major and challenging task in which machines answer questions according to provided options. Answers in multi-choice MRC cannot be directly extracted from the given passages, and essentially require machines to be capable of reasoning from accurately extracted evidence. However, the critical evidence may be as simple as just one word or phrase, while it is hidden in the given redundant, noisy passage with multiple linguistic hierarchies, from phrase, fragment, and sentence up to the entire passage. We thus propose a novel general-purpose model enhancement which integrates multi-grained evidence comprehensively, named Multi-grained evidence inferencer (Mugen), to make up for this inability. Mugen extracts three different granularities of evidence (coarse-, middle- and fine-grained) and integrates the evidence with the original passages, achieving significant and consistent performance improvement on four multi-choice MRC benchmarks.

machine learning, mugen, natural language, (18 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/TASLP.2023.3313885

2310.1807

Country:

North America > United States (0.28)
Europe (0.28)

Genre: Research Report (1.00)

Industry: Education > Assessment & Standards > Student Performance (0.63)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

To Understand Representation of Layer-aware Sequence Encoders as Multi-order-graph

Duan, Sufeng, Zhao, Hai

arXiv.org Artificial Intelligence, Mar-14-2023

In this paper, we propose an explanation of representation for self-attention network (SAN) based neural sequence encoders, which regards the information captured by the model and the encoding of the model as graph structures and the generation of these graph structures, respectively. The proposed explanation applies to existing works on SAN-based models and can explain the relationship among the ability to capture structural or linguistic information, the depth of the model, and the length of the sentence. It can also be extended to other models, such as recurrent neural network based models. Based on our explanation, we also propose a revisited multigraph called Multi-order-Graph (MoG), which models the graph structures in the SAN-based model as subgraphs in MoG and converts the encoding of the SAN-based model into the generation of MoG. We further introduce a Graph-Transformer that enhances the ability to capture multiple subgraphs of different orders and focuses on subgraphs of high orders. Experimental results on multiple neural machine translation tasks show that the Graph-Transformer can yield effective performance improvement.

[...] which the encoder takes a sentence as input and generates the corresponding contextualized representations for the decoder for NLP tasks with various modeling ways, generally, there are mainly three types of encoder architectures, recurrent neural network (RNN) [1], [2], [3], convolutional neural network (CNN), and self-attention network (SAN) from Transformer [4].

These works show that SAN-based models can embed structural and linguistic information, and the information embedding ability is related to the model depth and sentence length. So far, although we may get intuitions as follows, (1) different layers in SAN-based models may deliver different sorts of information, (2) increasing the depth of the model can improve the performance while improvement may be tiny when the model is too deep, [...]

artificial intelligence, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2101.06397

Country:

Europe (1.00)
North America > Canada (0.68)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.86)

SG-Net: Syntax Guided Transformer for Language Representation

Zhang, Zhuosheng, Wu, Yuwei, Zhou, Junru, Duan, Sufeng, Zhao, Hai, Wang, Rui

arXiv.org Artificial Intelligence, Jan-7-2021

Understanding human language is one of the key themes of artificial intelligence. For language representation, the capacity to effectively model linguistic knowledge from detail-riddled and lengthy texts while getting rid of noise is essential to improving performance. Traditional attentive models attend to all words without explicit constraint, which results in inaccurate concentration on some dispensable words. In this work, we propose using syntax to guide the text modeling by incorporating explicit syntactic constraints into attention mechanisms for better linguistically motivated word representations. In detail, for a self-attention network (SAN) sponsored Transformer-based encoder, we introduce a syntactic dependency of interest (SDOI) design into the SAN to form an SDOI-SAN with syntax-guided self-attention. The syntax-guided network (SG-Net) is then composed of this extra SDOI-SAN and the SAN from the original Transformer encoder through a dual contextual architecture for better linguistics-inspired representation. The proposed SG-Net is applied to typical Transformer encoders. Extensive experiments on popular benchmark tasks, including machine reading comprehension, natural language inference, and neural machine translation, show the effectiveness of the proposed SG-Net design.

deep learning, neural network, representation, (22 more...)

arXiv.org Artificial Intelligence

2012.13915

Country:

Asia > China (0.69)
North America > United States > California (0.28)

Genre: Research Report > Experimental Study (0.68)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
(2 more...)
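
The idea of adding explicit syntactic constraints to attention, as described in the SG-Net abstract above, can be illustrated with a small masking sketch. The abstract does not spell out how the syntactic dependency of interest (SDOI) is built, so this example assumes that each word may attend only to itself and its ancestor heads in a dependency tree. The `heads` encoding (one head index per token, `-1` for the root) and both function names are hypothetical.

```python
import math
from typing import List


def ancestor_mask(heads: List[int]) -> List[List[int]]:
    """Build an attention mask from dependency heads.

    mask[i][j] is 1 when token j is token i itself or one of its
    ancestors in the dependency tree, and 0 otherwise.
    """
    n = len(heads)
    mask = [[0] * n for _ in range(n)]
    for i in range(n):
        j = i
        seen = set()  # guards against malformed (cyclic) head lists
        while j != -1 and j not in seen:
            mask[i][j] = 1
            seen.add(j)
            j = heads[j]
    return mask


def masked_softmax(scores: List[float], mask_row: List[int]) -> List[float]:
    """Softmax over the allowed positions only; masked positions get weight 0."""
    allowed = [s for s, m in zip(scores, mask_row) if m]
    top = max(allowed)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) if m else 0.0 for s, m in zip(scores, mask_row)]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    # "She reads books": "reads" is the root, the other two words depend on it.
    heads = [1, -1, 1]
    mask = ancestor_mask(heads)
    print(mask)
    # Attention weights for "She" given raw scores over the three tokens.
    print(masked_softmax([0.0, 0.0, 5.0], mask[0]))
```

Here the high raw score for "books" has no effect on the weights for "She", because "books" is not among its ancestors. Per the abstract, SG-Net combines such a syntax-guided attention output with the ordinary self-attention output in a dual contextual architecture; that combination is not shown in this sketch.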