AITopics | Yu, Ping

Yu, Ping

Diverse Preference Optimization

Lanchantin, Jack, Chen, Angelica, Dhuliawala, Shehzaad, Yu, Ping, Weston, Jason, Sukhbaatar, Sainbayar, Kulikov, Ilia

arXiv.org Artificial Intelligence, Feb-10-2025

Post-training of language models, whether through reinforcement learning, preference optimization, or supervised finetuning, tends to sharpen the output probability distribution and reduce the diversity of generated responses. This is particularly a problem for creative generative tasks where varied responses are desired. In this work we introduce Diverse Preference Optimization (DivPO), an optimization method which learns to generate much more diverse responses than standard pipelines, while maintaining the quality of the generations. In DivPO, preference pairs are selected by first considering a pool of responses and a measure of diversity among them, then selecting chosen examples that are rarer but high quality and rejected examples that are more common but low quality. DivPO results in generating 45.6% more diverse persona attributes, and a 74.6% increase in story diversity, while maintaining win rates similar to standard baselines.
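The pair-selection rule in this abstract can be illustrated with a small sketch. This is not the authors' implementation: the frequency count used as the diversity measure, the fixed quality threshold, and the names `select_divpo_pair` and `min_quality` are all assumptions introduced here for illustration.

```python
from collections import Counter


def select_divpo_pair(pool, min_quality=0.5):
    """Pick a (chosen, rejected) pair from a pool of (response, quality) tuples.

    Assumption: diversity is approximated by how often a response occurs in
    the pool, and "high quality" means quality >= min_quality.
    """
    counts = Counter(response for response, _ in pool)
    high = [(r, q) for r, q in pool if q >= min_quality]
    low = [(r, q) for r, q in pool if q < min_quality]
    if not high or not low:
        return None  # no usable pair for this prompt
    # Chosen: the rarest response among the high-quality ones.
    chosen = min(high, key=lambda rq: counts[rq[0]])[0]
    # Rejected: the most common response among the low-quality ones.
    rejected = max(low, key=lambda rq: counts[rq[0]])[0]
    return chosen, rejected


# Toy pool of persona attributes with made-up quality scores.
pool = [
    ("teacher", 0.9), ("teacher", 0.8), ("beekeeper", 0.85),
    ("asdf", 0.1), ("asdf", 0.2), ("zzz", 0.3),
]
pair = select_divpo_pair(pool)
```

In this toy pool the rare but good "beekeeper" becomes the chosen example and the frequent but poor "asdf" becomes the rejected one.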

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2501.18101

Genre: Research Report (0.82)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.95)

R.I.P.: Better Models by Survival of the Fittest Prompts

Yu, Ping, Yuan, Weizhe, Golovneva, Olga, Wu, Tianhao, Sukhbaatar, Sainbayar, Weston, Jason, Xu, Jing

arXiv.org Artificial Intelligence, Jan-30-2025

Training data quality is one of the most important drivers of final model quality. In this work, we introduce a method for evaluating data integrity based on the assumption that low-quality input prompts result in high-variance and low-quality responses. This is achieved by measuring the rejected response quality and the reward gap between the chosen and rejected responses in a preference pair. Our method, Rejecting Instruction Preferences (RIP), can be used to filter prompts from existing training sets, or to make high quality synthetic datasets, yielding large performance gains across various benchmarks compared to unfiltered data. Using Llama 3.1-8B-Instruct, RIP improves AlpacaEval2 LC Win Rate by 9.4%, Arena-Hard by 8.7%, and WildBench by 9.9%. Using Llama 3.3-70B-Instruct, RIP improves Arena-Hard from 67.5 to 82.9, moving from 18th place to 6th overall on the leaderboard.
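A filter built on the two signals named in this abstract might look like the sketch below. It is a hedged illustration only: the thresholds, the direction of each cutoff (inferred from the stated assumption that poor prompts yield low-quality, high-variance responses), and the names `PreferencePair` and `rip_filter` are assumptions, not the paper's code.

```python
from dataclasses import dataclass


@dataclass
class PreferencePair:
    prompt: str
    chosen_reward: float    # reward-model score of the chosen response
    rejected_reward: float  # reward-model score of the rejected response


def rip_filter(pairs, min_rejected_reward, max_reward_gap):
    """Keep prompts whose rejected response is still decent and whose
    chosen/rejected reward gap is small (both cutoffs are hypothetical)."""
    kept = []
    for pair in pairs:
        gap = pair.chosen_reward - pair.rejected_reward
        if pair.rejected_reward >= min_rejected_reward and gap <= max_reward_gap:
            kept.append(pair)
    return kept


# Toy data with made-up rewards.
pairs = [
    PreferencePair("Explain binary search.", 0.9, 0.7),
    PreferencePair("do the thing", 0.8, 0.1),
    PreferencePair("asdf??", 0.3, 0.2),
]
kept_prompts = [p.prompt for p in rip_filter(pairs, 0.5, 0.4)]
```

Only the first prompt survives here: the other two have rejected responses that score below the hypothetical cutoff.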

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2501.18578

Country:

North America > Canada (0.14)
North America > United States (0.14)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Adaptive Decoding via Latent Preference Optimization

Dhuliawala, Shehzaad, Kulikov, Ilia, Yu, Ping, Celikyilmaz, Asli, Weston, Jason, Sukhbaatar, Sainbayar, Lanchantin, Jack

arXiv.org Artificial Intelligence, Nov-14-2024

During language model decoding, it is known that using higher temperature sampling gives more creative responses, while lower temperatures give more factually accurate ones. However, such models are commonly applied to general instruction following, which involves both creative and fact-seeking tasks, using a single fixed temperature across all examples and tokens. In this work, we introduce Adaptive Decoding, a layer added to the model to select the sampling temperature dynamically at inference time, at either the token or example level, in order to optimize performance. To learn its parameters we introduce Latent Preference Optimization (LPO), a general approach to train discrete latent variables such as choices of temperature. Our method outperforms all fixed decoding temperatures across a range of tasks that require different temperatures, including UltraFeedback, Creative Story Writing, and GSM8K.
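Token-level temperature selection of the kind described here can be sketched as a two-stage sample: first draw a temperature from a discrete set, then draw the token at that temperature. The sketch assumes the temperature logits come from some learned layer (not modeled here); the set `TEMPERATURES` and the function `adaptive_sample` are hypothetical names, and training with LPO is out of scope.

```python
import math
import random

TEMPERATURES = [0.0, 0.5, 1.0]  # hypothetical discrete choices


def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def adaptive_sample(token_logits, temp_logits, rng):
    """Sample a temperature, then sample a token id at that temperature.

    temp_logits stands in for the output of the added layer; here it is
    simply passed in by the caller.
    """
    temp = rng.choices(TEMPERATURES, weights=softmax(temp_logits), k=1)[0]
    if temp == 0.0:
        # Temperature zero is treated as greedy decoding.
        token = max(range(len(token_logits)), key=token_logits.__getitem__)
        return temp, token
    token_probs = softmax([logit / temp for logit in token_logits])
    token = rng.choices(range(len(token_logits)), weights=token_probs, k=1)[0]
    return temp, token


rng = random.Random(0)
# A layer output that strongly prefers greedy decoding for this token.
temp, token = adaptive_sample([0.1, 2.0, -1.0], [100.0, -100.0, -100.0], rng)
```

Treating the temperature as a sampled discrete latent variable is what makes a preference-style objective over temperature choices possible in principle; the sketch only shows the inference side.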

large language model, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2411.09661

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Distilling System 2 into System 1

Yu, Ping, Xu, Jing, Weston, Jason, Kulikov, Ilia

arXiv.org Artificial Intelligence, Jul-9-2024

Large language models (LLMs) can spend extra compute during inference to generate intermediate thoughts, which helps to produce better final responses. Since Chain-of-Thought (Wei et al., 2022), many such System 2 techniques have been proposed such as Rephrase and Respond (Deng et al., 2023a), System 2 Attention (Weston and Sukhbaatar, 2023) and Branch-Solve-Merge (Saha et al., 2023). In this work we investigate self-supervised methods to "compile" (distill) higher quality outputs from System 2 techniques back into LLM generations without intermediate reasoning token sequences, as this reasoning has been distilled into System 1. We show that several such techniques can be successfully distilled, resulting in improved results compared to the original System 1 performance, and with less inference cost than System 2. We posit that such System 2 distillation will be an important feature of future continually learning AI systems, enabling them to focus System 2 capabilities on the reasoning tasks that they cannot yet do well.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2407.06023

Country: North America > United States > Illinois (0.14)

Genre:

Research Report (0.64)
Workflow (0.46)

Industry: Education (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Following Length Constraints in Instructions

Yuan, Weizhe, Kulikov, Ilia, Yu, Ping, Cho, Kyunghyun, Sukhbaatar, Sainbayar, Weston, Jason, Xu, Jing

arXiv.org Artificial Intelligence, Jun-25-2024

Aligned instruction following models can better fulfill user requests than their unaligned counterparts. However, it has been shown that there is a length bias in evaluation of such models, and that training algorithms tend to exploit this bias by learning longer responses. In this work we show how to train models that can be controlled at inference time with instructions containing desired length constraints. Such models are superior in length instructed evaluations, outperforming standard instruction following models such as GPT4, Llama 3 and Mixtral.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2406.17744

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

TCMD: A Traditional Chinese Medicine QA Dataset for Evaluating Large Language Models

Yu, Ping, Song, Kaitao, He, Fengchen, Chen, Ming, Lu, Jianfeng

arXiv.org Artificial Intelligence, Jun-7-2024

The recent unprecedented advancements in Large Language Models (LLMs) have propelled the medical community by establishing advanced medical-domain models. However, due to the limited collection of medical datasets, there are only a few comprehensive benchmarks available to gauge progress in this area. In this paper, we introduce a new medical question-answering (QA) dataset, called TCMD, that contains a large amount of manual instruction for solving Traditional Chinese Medicine examination tasks. Specifically, TCMD collects a large number of questions across diverse domains with their annotated medical subjects and thus supports us in comprehensively assessing the capability of LLMs in the TCM domain. Extensive evaluation of various general LLMs and medical-domain-specific LLMs is conducted. Moreover, we also analyze the robustness of current LLMs in solving TCM QA tasks by introducing randomness. The inconsistency of the experimental results also reveals the shortcomings of current LLMs in solving QA tasks. We also expect that our dataset can further facilitate the development of LLMs in the TCM area.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2406.04941

Country: Asia > China (0.14)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Efficient Tool Use with Chain-of-Abstraction Reasoning

Gao, Silin, Dwivedi-Yu, Jane, Yu, Ping, Tan, Xiaoqing Ellen, Pasunuru, Ramakanth, Golovneva, Olga, Sinha, Koustuv, Celikyilmaz, Asli, Bosselut, Antoine, Wang, Tianlu

arXiv.org Artificial Intelligence, Jan-30-2024

To achieve faithful reasoning that aligns with human expectations, large language models (LLMs) need to ground their reasoning to real-world knowledge (e.g., web facts, math and physical rules). Tools help LLMs access this external knowledge, but there remain challenges in fine-tuning LLM agents (e.g., Toolformer) to invoke tools in multi-step reasoning problems, where inter-connected tool calls require holistic and efficient tool usage planning. In this work, we propose a new method for LLMs to better leverage tools in multi-step reasoning. Our method, Chain-of-Abstraction (CoA), trains LLMs to first decode reasoning chains with abstract placeholders, and then call domain tools to reify each reasoning chain by filling in specific knowledge. This planning with abstract chains enables LLMs to learn more general reasoning strategies, which are robust to shifts of domain knowledge (e.g., math results) relevant to different reasoning questions. It also allows LLMs to perform decoding and calling of external tools in parallel, which avoids the inference delay caused by waiting for tool responses. In mathematical reasoning and Wiki QA domains, we show that our method consistently outperforms previous chain-of-thought and tool-augmented baselines on both in-distribution and out-of-distribution test sets, with an average ~6% absolute QA accuracy improvement. LLM agents trained with our method also show more efficient tool use, with inference speed being on average ~1.4x faster than baseline tool-augmented LLMs.
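The "decode an abstract chain, then reify it with tools" idea can be shown with a toy calculator tool. The bracketed placeholder syntax `[a op b = yN]`, the `reify` function, and the arithmetic-only tool are assumptions made for this sketch; the abstract does not specify the chain format or the tool interface.

```python
import operator
import re

# A tiny "domain tool": a four-function calculator.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.truediv}

# Hypothetical placeholder format: [lhs op rhs = name]
STEP = re.compile(r"\[(\S+) ([-+*/]) (\S+) = (\w+)\]")


def reify(chain):
    """Fill each abstract placeholder by calling the calculator tool."""
    values = {}

    def resolve(token):
        # A token is either an earlier placeholder name or a literal number.
        return values[token] if token in values else float(token)

    def fill(match):
        lhs, op, rhs, name = match.groups()
        values[name] = OPS[op](resolve(lhs), resolve(rhs))
        return f"{values[name]:g}"

    # re.sub visits matches left to right, so y1 is defined before y2 uses it.
    return STEP.sub(fill, chain)


chain = "Together they have [20 + 35 = y1] apples; doubling gives [y1 * 2 = y2]."
answer = reify(chain)
```

Because the chain itself contains no computed numbers, the same abstract plan stays valid if the operands change, which is the robustness property the abstract points to.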

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2401.17464

Country:

Asia (0.28)
North America > United States > New York (0.15)

Genre: Research Report (0.82)

Industry:

Media > Film (1.00)
Leisure & Entertainment > Sports > Tennis (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

YAYI 2: Multilingual Open-Source Large Language Models

Luo, Yin, Kong, Qingchao, Xu, Nan, Cao, Jia, Hao, Bao, Qu, Baoyu, Chen, Bo, Zhu, Chao, Zhao, Chenyang, Zhang, Donglei, Feng, Fan, Zhao, Feifei, Sun, Hailong, Yang, Hanxuan, Pan, Haojun, Liu, Hongyu, Guo, Jianbin, Du, Jiangtao, Wang, Jingyi, Li, Junfeng, Sun, Lei, Liu, Liduo, Dong, Lifeng, Liu, Lili, Wang, Lin, Zhang, Liwen, Wang, Minzheng, Wang, Pin, Yu, Ping, Li, Qingxiao, Yan, Rui, Zou, Rui, Li, Ruiqun, Huang, Taiwen, Wang, Xiaodong, Wu, Xiaofei, Peng, Xin, Zhang, Xina, Fang, Xing, Xiao, Xinglin, Hao, Yanni, Dong, Yao, Wang, Yigang, Liu, Ying, Jiang, Yongyu, Wang, Yungan, Wang, Yuqi, Wang, Zhangsheng, Yu, Zhaoxin, Luo, Zhen, Mao, Wenji, Wang, Lei, Zeng, Dajun

arXiv.org Artificial Intelligence, Dec-22-2023

As the latest advancement in natural language processing, large language models (LLMs) have achieved human-level language understanding and generation abilities in many real-world tasks, and have even been regarded as a potential path to artificial general intelligence. To better facilitate research on LLMs, many open-source LLMs, such as Llama 2 and Falcon, have recently been proposed and have achieved performance comparable to proprietary models. However, these models are primarily designed for English scenarios and exhibit poor performance in Chinese contexts. In this technical report, we propose YAYI 2, including both base and chat models, with 30 billion parameters. YAYI 2 is pre-trained from scratch on a multilingual corpus that contains 2.65 trillion tokens filtered by our pre-training data processing pipeline. The base model is aligned with human values through supervised fine-tuning with millions of instructions and reinforcement learning from human feedback. Extensive experiments on multiple benchmarks, such as MMLU and CMMLU, consistently demonstrate that the proposed YAYI 2 outperforms other similar-sized open-source models.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2312.14862

Country:

Asia > China (0.14)
North America > United States (0.14)

Genre: Research Report (0.82)

Industry:

Information Technology > Security & Privacy (1.00)
Law (0.93)
Education (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

OPT-R: Exploring the Role of Explanations in Finetuning and Prompting for Reasoning Skills of Large Language Models

AlKhamissi, Badr, Verma, Siddharth, Yu, Ping, Jin, Zhijing, Celikyilmaz, Asli, Diab, Mona

arXiv.org Artificial Intelligence, Oct-24-2023

In this paper, we conduct a thorough investigation into the reasoning capabilities of Large Language Models (LLMs), focusing specifically on the Open Pretrained Transformers (OPT) models as a representative of such models. Our study entails finetuning three different sizes of OPT on a carefully curated reasoning corpus, resulting in two sets of finetuned models: OPT-R, finetuned without explanations, and OPT-RE, finetuned with explanations. We then evaluate all models on 57 out-of-domain tasks drawn from the SUPER-NATURALINSTRUCTIONS benchmark, covering 26 distinct reasoning skills, utilizing three prompting techniques. Through a comprehensive grid of 27 configurations and 6,156 test evaluations, we investigate the dimensions of finetuning, prompting, and scale to understand the role of explanations on different reasoning skills. Our findings reveal that having explanations in the fewshot exemplar has no significant impact on the model's performance when the model is finetuned, while positively affecting the non-finetuned counterpart. Moreover, we observe a slight yet consistent increase in classification accuracy as we incorporate explanations during prompting and finetuning, respectively. Finally, we offer insights on which skills benefit the most from incorporating explanations during finetuning and prompting, such as Numerical (+20.4%) and Analogical (+13.9%) reasoning, as well as skills that exhibit negligible or negative effects.

explanation, large language model, machine learning, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.18653/v1/2023.nlrse-1.10

2305.12001

Country: Asia > Middle East > UAE (0.14)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

Self-Alignment with Instruction Backtranslation

Li, Xian, Yu, Ping, Zhou, Chunting, Schick, Timo, Zettlemoyer, Luke, Levy, Omer, Weston, Jason, Lewis, Mike

arXiv.org Artificial Intelligence, Aug-14-2023

We present a scalable method to build a high quality instruction following language model by automatically labelling human-written text with corresponding instructions. Our approach, named instruction backtranslation, starts with a language model finetuned on a small amount of seed data, and a given web corpus. The seed model is used to construct training examples by generating instruction prompts for web documents (self-augmentation), and then selecting high quality examples from among these candidates (self-curation). This data is then used to finetune a stronger model. Finetuning LLaMa on two iterations of our approach yields a model that outperforms all other LLaMa-based models on the Alpaca leaderboard not relying on distillation data, demonstrating highly effective self-alignment.
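One round of the self-augmentation and self-curation steps can be sketched as below. The seed model is replaced by two toy functions, and every name here (`backtranslation_round`, `toy_generate`, `toy_score`, `min_score`) is hypothetical; the scoring scale and threshold are illustrative assumptions, not details from the abstract.

```python
def backtranslation_round(documents, generate_instruction, score_pair, min_score):
    """One round of self-augmentation followed by self-curation."""
    # Self-augmentation: predict an instruction that each document would answer.
    candidates = [(generate_instruction(doc), doc) for doc in documents]
    # Self-curation: keep only the pairs the scorer rates highly enough.
    return [
        (instruction, doc)
        for instruction, doc in candidates
        if score_pair(instruction, doc) >= min_score
    ]


# Toy stand-ins for the seed model (assumptions for illustration only).
def toy_generate(doc):
    return "Write a short note about " + doc.split()[0].lower() + "."


def toy_score(instruction, doc):
    # Pretend that very short documents make poor training responses.
    return 5 if len(doc.split()) >= 5 else 1


docs = [
    "Sourdough needs a long, slow fermentation to develop flavor.",
    "Click here",
]
curated = backtranslation_round(docs, toy_generate, toy_score, min_score=4)
```

In the method described above, the curated pairs would then be used to finetune a stronger model, which can serve as the generator and scorer for the next iteration.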

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2308.06259

Country: North America > United States (0.46)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Consumer Health (1.00)
Government (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.48)
