Jin, Lifeng
The Trickle-down Impact of Reward (In-)consistency on RLHF
Shen, Lingfeng, Chen, Sihao, Song, Linfeng, Jin, Lifeng, Peng, Baolin, Mi, Haitao, Khashabi, Daniel, Yu, Dong
Standard practice within Reinforcement Learning from Human Feedback (RLHF) involves optimizing against a Reward Model (RM), which itself is trained to reflect human preferences for desirable generations. A notable yet understudied subject is the (in-)consistency of RMs -- whether they can recognize semantic changes to different prompts and appropriately adapt their reward assignments -- and their impact on the downstream RLHF model. In this paper, we visit a series of research questions relevant to RM inconsistency: (1) How can we measure the consistency of reward models? (2) How consistent are the existing RMs, and how can we improve them? (3) In what ways does reward inconsistency influence the chatbots resulting from RLHF model training? We propose Contrast Instructions -- a benchmarking strategy for the consistency of RMs. Each example in Contrast Instructions features a pair of lexically similar instructions with different ground-truth responses. A consistent RM is expected to rank the corresponding instruction and response higher than other combinations. We observe that current RMs trained with the standard ranking objective fail miserably on Contrast Instructions compared to average humans. To show that RM consistency can be improved efficiently without extra training budget, we propose two techniques, ConvexDA and RewardFusion, which enhance reward consistency through extrapolation during the RM training and inference stages, respectively. We show that RLHF models trained with a more consistent RM yield more useful responses, suggesting that reward inconsistency exhibits a trickle-down effect on the downstream RLHF process.
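As an illustration of the Contrast Instructions idea, the sketch below scores a reward model on contrast pairs: for each pair of lexically similar instructions with their own ground-truth responses, a consistent RM should score the matched instruction-response combination above the mismatched one in both directions. The `reward_model` callable and the toy overlap-based scorer are placeholders for illustration, not the paper's models.

```python
import re
from typing import Callable, List, Tuple

# A Contrast Instructions example: two lexically similar instructions,
# each paired with its own ground-truth response.
ContrastPair = Tuple[str, str, str, str]  # (instr_a, resp_a, instr_b, resp_b)

def consistency_accuracy(
    reward_model: Callable[[str, str], float],
    pairs: List[ContrastPair],
) -> float:
    """Fraction of contrast pairs where the RM scores each instruction's
    own response above the response of its lexically similar counterpart."""
    hits = 0
    for instr_a, resp_a, instr_b, resp_b in pairs:
        # The matched (instruction, response) combination should out-score
        # the mismatched one, in both directions.
        a_ok = reward_model(instr_a, resp_a) > reward_model(instr_a, resp_b)
        b_ok = reward_model(instr_b, resp_b) > reward_model(instr_b, resp_a)
        hits += int(a_ok and b_ok)
    return hits / len(pairs)

if __name__ == "__main__":
    # Toy reward model: word overlap between instruction and response.
    def toy_rm(instruction: str, response: str) -> float:
        tokens = lambda s: set(re.findall(r"[a-z]+", s.lower()))
        return float(len(tokens(instruction) & tokens(response)))

    pairs = [(
        "List three fruits that are red.",
        "Apples, cherries, and strawberries are red fruits.",
        "List three fruits that are yellow.",
        "Bananas, lemons, and pineapples are yellow fruits.",
    )]
    print(f"consistency: {consistency_accuracy(toy_rm, pairs):.2f}")
```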
Stabilizing RLHF through Advantage Model and Selective Rehearsal
Peng, Baolin, Song, Linfeng, Tian, Ye, Jin, Lifeng, Mi, Haitao, Yu, Dong
Large Language Models (LLMs) have revolutionized natural language processing, yet aligning these models with human values and preferences using RLHF remains a significant challenge. This challenge is characterized by various instabilities, such as reward hacking and catastrophic forgetting. In this technical report, we propose two innovations to stabilize RLHF training: (i) Advantage Model, which directly models the advantage score, i.e., the extra reward compared to the expected reward, and regulates score distributions across tasks to prevent reward hacking; and (ii) Selective Rehearsal, which mitigates catastrophic forgetting by strategically selecting data for PPO training and knowledge rehearsing. Our experimental analysis on public and proprietary datasets reveals that the proposed methods not only increase stability in RLHF training but also achieve higher reward scores and win rates.
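A minimal sketch of the advantage idea described above, assuming a generic `reward_model(prompt, response)` scorer: the advantage of a response is its reward minus the expected reward over reference responses for the same prompt, and scores are then standardized per task so one task's reward scale cannot dominate. Function and variable names are illustrative, not the paper's implementation.

```python
from collections import defaultdict
from statistics import mean, pstdev
from typing import Callable, Dict, List, Tuple

def advantage_scores(
    reward_model: Callable[[str, str], float],
    samples: List[Tuple[str, str, str]],            # (task, prompt, response)
    reference_responses: Dict[str, List[str]],      # prompt -> sampled reference responses
) -> List[float]:
    """Advantage = reward minus the expected reward for the same prompt,
    then standardized per task so score distributions stay comparable."""
    raw: List[Tuple[str, float]] = []
    for task, prompt, response in samples:
        expected = mean(reward_model(prompt, r) for r in reference_responses[prompt])
        raw.append((task, reward_model(prompt, response) - expected))

    # Per-task standardization keeps one task's reward scale from dominating.
    by_task: Dict[str, List[float]] = defaultdict(list)
    for task, adv in raw:
        by_task[task].append(adv)
    stats = {t: (mean(v), pstdev(v) or 1.0) for t, v in by_task.items()}
    return [(adv - stats[t][0]) / stats[t][1] for t, adv in raw]
```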
Discover, Explanation, Improvement: An Automatic Slice Detection Framework for Natural Language Processing
Hua, Wenyue, Jin, Lifeng, Song, Linfeng, Mi, Haitao, Zhang, Yongfeng, Yu, Dong
Pretrained natural language processing (NLP) models have achieved high overall performance, but they still make systematic errors. Instead of manual error analysis, research on slice detection models (SDMs), which automatically identify underperforming groups of datapoints, has attracted increasing attention in computer vision, both for understanding model behavior and for informing future model training and design. However, little research on SDMs, or quantitative evaluation of their effectiveness, has been conducted on NLP tasks. Our paper fills this gap by proposing a benchmark named "Discover, Explain, Improve (DEIM)" for NLP classification tasks, along with a new SDM, Edisa. Edisa discovers coherent and underperforming groups of datapoints; DEIM then unites them under human-understandable concepts and provides comprehensive evaluation tasks and corresponding quantitative metrics. The evaluation in DEIM shows that Edisa can accurately select error-prone datapoints with informative semantic features that summarize error patterns. Detecting difficult datapoints directly boosts model performance without tuning any original model parameters, showing that the discovered slices are actionable for users.
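The sketch below illustrates the basic slice-detection recipe that DEIM evaluates: cluster datapoint representations, then flag clusters whose error rate exceeds the overall error rate. It assumes scikit-learn for clustering; Edisa's actual model is more sophisticated than this baseline, and all names here are illustrative.

```python
import numpy as np
from sklearn.cluster import KMeans

def detect_underperforming_slices(
    embeddings: np.ndarray,   # (n, d) datapoint representations
    correct: np.ndarray,      # (n,) 1 if the model got the datapoint right, else 0
    n_slices: int = 8,
    min_size: int = 20,
):
    """Cluster datapoints and flag clusters whose error rate is above the
    overall error rate: candidate 'slices' to inspect, explain, or up-weight."""
    labels = KMeans(n_clusters=n_slices, n_init=10, random_state=0).fit_predict(embeddings)
    overall_err = 1.0 - correct.mean()
    slices = []
    for k in range(n_slices):
        idx = np.where(labels == k)[0]
        if len(idx) < min_size:
            continue  # ignore tiny clusters with unreliable error estimates
        err = 1.0 - correct[idx].mean()
        if err > overall_err:
            slices.append({"slice": k, "size": int(len(idx)), "error_rate": float(err)})
    return sorted(slices, key=lambda s: s["error_rate"], reverse=True)
```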
Friend-training: Learning from Models of Different but Related Tasks
Zhang, Mian, Jin, Lifeng, Song, Linfeng, Mi, Haitao, Zhou, Xiabing, Yu, Dong
Current self-training methods such as standard self-training, co-training, tri-training, and others often focus on improving model performance on a single task, utilizing differences in input features, model architectures, and training processes. However, many tasks in natural language processing are about different but related aspects of language, and models trained for one task can be great teachers for other related tasks. In this work, we propose friend-training, a cross-task self-training framework, where models trained to do different tasks are used in an iterative training, pseudo-labeling, and retraining process to help each other select better pseudo-labels. With two dialogue understanding tasks, conversational semantic role labeling and dialogue rewriting, chosen for a case study, we show that models trained with the friend-training framework achieve the best performance compared with strong baselines.
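A schematic of one friend-training round under simplifying assumptions: each model pseudo-labels the unlabeled pool for its own task, and a pseudo-label is kept only when the friend model's prediction on the related task is consistent with it. The `agree` and `retrain_*` callables stand in for the paper's cross-task selection and training procedures; they are placeholders, not the published method.

```python
from typing import Any, Callable, List, Tuple

def friend_training_round(
    model_a, model_b,                          # models for related tasks A and B
    unlabeled: List[Any],                      # shared unlabeled dialogue data
    agree: Callable[[Any, Any, Any], bool],    # cross-task agreement check
    retrain_a: Callable[[List[Tuple[Any, Any]]], Any],
    retrain_b: Callable[[List[Tuple[Any, Any]]], Any],
) -> Tuple[Any, Any]:
    """One round of friend-training: pseudo-label, select by cross-task
    agreement, then retrain each model on the selected pseudo-labels."""
    selected_a, selected_b = [], []
    for x in unlabeled:
        pred_a = model_a.predict(x)   # e.g., conversational semantic roles
        pred_b = model_b.predict(x)   # e.g., a dialogue rewrite
        if agree(x, pred_a, pred_b):  # keep only mutually consistent pseudo-labels
            selected_a.append((x, pred_a))
            selected_b.append((x, pred_b))
    # Retrain each model on its own labeled data plus the selected pseudo-labels.
    return retrain_a(selected_a), retrain_b(selected_b)
```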
Hierarchical Context Tagging for Utterance Rewriting
Jin, Lisa, Song, Linfeng, Jin, Lifeng, Yu, Dong, Gildea, Daniel
Utterance rewriting aims to recover coreferences and omitted information from the latest turn of a multi-turn dialogue. Recently, methods that tag rather than linearly generate sequences have proven stronger in both in- and out-of-domain rewriting settings. This is due to a tagger's smaller search space as it can only copy tokens from the dialogue context. However, these methods may suffer from low coverage when phrases that must be added to a source utterance cannot be covered by a single context span. This can occur in languages like English that introduce tokens such as prepositions into the rewrite for grammaticality. We propose a hierarchical context tagger (HCT) that mitigates this issue by predicting slotted rules (e.g., "besides_") whose slots are later filled with context spans. HCT (i) tags the source string with token-level edit actions and slotted rules and (ii) fills in the resulting rule slots with spans from the dialogue context. This rule tagging allows HCT to add out-of-context tokens and multiple spans at once; we further cluster the rules to truncate the long tail of the rule distribution. Experiments on several benchmarks show that HCT can outperform state-of-the-art rewriting systems by ~2 BLEU points.
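To make the slotted-rule mechanism concrete, the snippet below fills a rule such as "besides _" with a span copied from the dialogue context, which is roughly what a second, span-filling stage would do after rule tagging; the rule string and span indices are illustrative only, not HCT's actual inventory.

```python
from typing import List, Tuple

def apply_slotted_rule(rule: str, context_tokens: List[str],
                       spans: List[Tuple[int, int]]) -> str:
    """Fill each '_' slot in the rule with a span copied from the dialogue
    context, yielding the phrase to insert into the rewritten utterance."""
    parts = rule.split("_")
    assert len(parts) == len(spans) + 1, "one context span per slot"
    out = []
    for part, (start, end) in zip(parts, spans):
        out.append(part)
        out.append(" ".join(context_tokens[start:end]))
    out.append(parts[-1])
    return "".join(out).strip()

if __name__ == "__main__":
    # Context turn: "Do you like jazz ?"   Source turn: "What about rock ?"
    context = "Do you like jazz ?".split()
    # The rule contributes the out-of-context token "besides"; the slot copies "jazz".
    print(apply_slotted_rule("besides _", context, [(3, 4)]))  # -> "besides jazz"
```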
Domain-Adaptive Pretraining Methods for Dialogue Understanding
Wu, Han, Xu, Kun, Song, Linfeng, Jin, Lifeng, Zhang, Haisong, Song, Linqi
Language models like BERT and SpanBERT pretrained on open-domain data have obtained impressive gains on various NLP tasks. In this paper, we probe the effectiveness of domain-adaptive pretraining objectives on downstream tasks. In particular, three objectives, including a novel objective focusing on modeling predicate-argument relations, are evaluated on two challenging dialogue understanding tasks. Experimental results demonstrate that domain-adaptive pretraining with proper objectives can significantly improve the performance of a strong baseline on these tasks, achieving new state-of-the-art performance.
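As a rough illustration of domain-adaptive pretraining in general (not the paper's predicate-argument objective), the sketch below continues masked language modeling on in-domain dialogue transcripts before any task fine-tuning. It assumes the Hugging Face transformers and datasets libraries and a generic BERT checkpoint; the data and hyperparameters are placeholders.

```python
from datasets import Dataset
from transformers import (AutoModelForMaskedLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

# Toy in-domain corpus: dialogue transcripts for continued pretraining.
dialogues = [
    "A: Who booked the flight? B: Sarah booked it yesterday.",
    "A: Can you reschedule the meeting? B: Sure, how about Friday?",
]

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

# Tokenize the dialogue text and drop the raw strings.
dataset = Dataset.from_dict({"text": dialogues}).map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=128),
    batched=True,
    remove_columns=["text"],
)

# Continue pretraining with a standard masked-LM objective on dialogue data.
trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="dapt-dialogue", num_train_epochs=1,
                           per_device_train_batch_size=2),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer=tokenizer,
                                                  mlm_probability=0.15),
)
trainer.train()
```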