
Collaborating Authors

 Yuan, Xingdi


debug-gym: A Text-Based Environment for Interactive Debugging

arXiv.org Artificial Intelligence

Large Language Models (LLMs) are increasingly relied upon for coding tasks, yet in most scenarios it is assumed that all relevant information either can be accessed in context or matches their training data. We posit that LLMs can benefit from the ability to interactively explore a codebase to gather the information relevant to their task. To achieve this, we present a textual environment, namely debug-gym, for developing LLM-based agents in an interactive coding setting. Our environment is lightweight and provides a preset of useful tools, such as a Python debugger (pdb), designed to facilitate an LLM-based agent's interactive debugging. Beyond coding and debugging tasks, this approach can be generalized to other tasks that would benefit from information-seeking behavior by an LLM agent.
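
To make the interaction loop concrete, here is a minimal sketch of how an LLM-based agent might drive a text-based debugging environment of this kind. The Environment and Observation classes, the llm_policy function, and the command strings are illustrative assumptions, not the actual debug-gym API.

```python
# Minimal sketch of an agent loop over a text-based debugging environment.
# The classes, tool names, and commands below are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class Observation:
    text: str   # textual feedback from the environment (e.g. pdb output, test results)
    done: bool  # True once the failing tests pass

class Environment:
    """Stand-in for a debug-gym-style environment exposing tools as text commands."""
    def reset(self) -> Observation: ...
    def step(self, command: str) -> Observation: ...

def llm_policy(history: list[str]) -> str:
    """Placeholder for an LLM call mapping the interaction history to the next
    tool command, e.g. 'pdb b buggy.py:42' or 'rewrite buggy.py ...' (hypothetical)."""
    raise NotImplementedError

def debug_episode(env: Environment, max_steps: int = 50) -> bool:
    obs = env.reset()
    history = [obs.text]
    for _ in range(max_steps):
        command = llm_policy(history)   # agent picks a tool invocation
        obs = env.step(command)         # environment returns textual feedback
        history.append(f"> {command}\n{obs.text}")
        if obs.done:
            return True                 # bug fixed, tests pass
    return False
```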


Can Language Models Serve as Text-Based World Simulators?

arXiv.org Artificial Intelligence

Virtual environments play a key role in benchmarking advances in complex planning and decision-making tasks but are expensive and complicated to build by hand. Can current language models themselves serve as world simulators, correctly predicting how actions change different world states, thus bypassing the need for extensive manual coding? Our goal is to answer this question in the context of text-based simulators. Our approach is to build and use a new benchmark, called ByteSized32-State-Prediction, containing a dataset of text game state transitions and accompanying game tasks. We use this to directly quantify, for the first time, how well LLMs can serve as text-based world simulators. We test GPT-4 on this dataset and find that, despite its impressive performance, it is still an unreliable world simulator without further innovations. This work thus contributes both new insights into the capabilities and weaknesses of current LLMs and a novel benchmark to track future progress as new models appear.
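
As a rough illustration of the state-prediction setup, the sketch below asks an LLM to predict the next game state from a JSON-encoded current state and an action, then measures exact-match accuracy against the gold transitions. The call_llm function, the prompt wording, and the dataset field names are assumptions for illustration, not the benchmark's actual interface.

```python
# Sketch of LLM-as-world-simulator evaluation: predict the next state, then
# compare against the gold transition. Prompt and field names are invented.
import json

def call_llm(prompt: str) -> str:
    """Placeholder for any chat-completion API call returning raw text."""
    raise NotImplementedError

def predict_next_state(state: dict, action: str) -> dict:
    prompt = (
        "You are simulating a text game.\n"
        f"Current state (JSON): {json.dumps(state)}\n"
        f"Action: {action}\n"
        "Return the full next state as JSON."
    )
    return json.loads(call_llm(prompt))

def transition_accuracy(dataset: list[dict]) -> float:
    """dataset items assumed to look like {'state': ..., 'action': ..., 'next_state': ...}."""
    correct = 0
    for ex in dataset:
        pred = predict_next_state(ex["state"], ex["action"])
        correct += int(pred == ex["next_state"])  # exact match on the predicted state
    return correct / len(dataset)
```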


Language-guided Skill Learning with Temporal Variational Inference

arXiv.org Artificial Intelligence

We present an algorithm for skill discovery from expert demonstrations. The algorithm first utilizes Large Language Models (LLMs) to propose an initial segmentation of the trajectories. Following that, a hierarchical variational inference framework incorporates the LLM-generated segmentation information to discover reusable skills by merging trajectory segments. To further control the trade-off between compression and reusability, we introduce a novel auxiliary objective based on the Minimum Description Length principle that helps guide this skill discovery process. Our results demonstrate that agents equipped with our method are able to discover skills that help accelerate learning and outperform baseline skill learning approaches on new long-horizon tasks in BabyAI, a grid world navigation environment, as well as ALFRED, a household simulation environment.
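
The sketch below illustrates the flavor of a Minimum Description Length trade-off with a toy score: the cost of the skill library plus the cost of encoding trajectories as sequences of skill indices. The per-skill cost and the counting scheme are invented stand-ins, not the paper's variational objective.

```python
# Toy MDL-style score for a candidate skill segmentation: library cost plus
# the cost of encoding each trajectory as a sequence of skill indices.
import math
from collections import Counter

def mdl_score(trajectories_as_skills: list[list[str]]) -> float:
    """trajectories_as_skills: each trajectory rewritten as a list of skill names."""
    usage = Counter(s for traj in trajectories_as_skills for s in traj)
    total = sum(usage.values())
    # Cost of the library: a fixed per-skill cost (toy assumption).
    library_cost = 16.0 * len(usage)
    # Cost of the data: code length of skill tokens under their empirical
    # frequencies (shorter codes for frequently reused skills).
    data_cost = -sum(n * math.log2(n / total) for n in usage.values())
    return library_cost + data_cost

# Merging segments into fewer, more reusable skills lowers the data cost;
# keeping many one-off skills inflates the library cost.
```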


OPEx: A Component-Wise Analysis of LLM-Centric Agents in Embodied Instruction Following

arXiv.org Artificial Intelligence

Embodied Instruction Following (EIF) is a crucial task in embodied learning, requiring agents to interact with their environment through egocentric observations to fulfill natural language instructions. Recent advancements have seen a surge in employing large language models (LLMs) within a framework-centric approach to enhance performance in embodied learning tasks, including EIF. Despite these efforts, a unified understanding of how various components, ranging from visual perception to action execution, affect task performance is still lacking. To address this gap, we introduce OPEx, a comprehensive framework that delineates the core components essential for solving embodied learning tasks: Observer, Planner, and Executor. Through extensive evaluations, we provide a deep analysis of how each component influences EIF task performance. Furthermore, we innovate within this space by deploying a multi-agent dialogue strategy on a TextWorld counterpart, further enhancing task performance. Our findings reveal that LLM-centric design markedly improves EIF outcomes, identify visual perception and low-level action execution as critical bottlenecks, and demonstrate that augmenting LLMs with a multi-agent framework further elevates performance.
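
A hypothetical decomposition along the lines the abstract describes might look like the following, with an Observer turning raw observations into text, a Planner producing subgoals, and an Executor emitting low-level actions. The method signatures and the control loop are assumptions for illustration, not the OPEx implementation.

```python
# Hypothetical Observer / Planner / Executor decomposition of an EIF agent.
# Class names follow the abstract; signatures and the loop are assumptions.
class Observer:
    def perceive(self, egocentric_obs) -> str:
        """Turn raw (visual) observations into a textual scene description."""
        ...

class Planner:
    def plan(self, instruction: str, scene: str) -> list[str]:
        """Use an LLM to break the instruction into subgoals given the scene."""
        ...

class Executor:
    def act(self, subgoal: str, scene: str) -> str:
        """Map the current subgoal to a low-level action."""
        ...

def run_episode(env, instruction, observer, planner, executor, max_steps=100):
    obs = env.reset()
    for _ in range(max_steps):
        scene = observer.perceive(obs)
        subgoals = planner.plan(instruction, scene)
        action = executor.act(subgoals[0], scene)   # execute the first pending subgoal
        obs, done = env.step(action)
        if done:
            break
```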


Policy Improvement using Language Feedback Models

arXiv.org Artificial Intelligence

We introduce Language Feedback Models (LFMs) that identify desirable behaviour (actions that help achieve tasks specified in the instruction) for imitation learning in instruction following. To train LFMs, we obtain feedback from Large Language Models (LLMs) on visual trajectories verbalized to language descriptions. First, by using LFMs to identify desirable behaviour to imitate, we improve the task-completion rate over strong behavioural cloning baselines on three distinct language grounding environments (Touchdown, ScienceWorld, and ALFWorld). Second, LFMs outperform using LLMs as experts to directly predict actions, when controlling for the number of LLM output tokens. Third, LFMs generalize to unseen environments, improving task-completion rate by 3.5-12.0% through one round of adaptation. Finally, LFMs can be modified to provide human-interpretable feedback without performance loss, allowing human verification of desirable behaviour for imitation learning.
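
The following sketch shows one way a trained feedback model could be used to filter trajectory steps before behavioural cloning, in the spirit of the abstract. The lfm_is_desirable function and the data layout are placeholders, not the paper's training code.

```python
# Sketch: use a language feedback model to keep only desirable steps of
# verbalized trajectories, then feed them to imitation learning.
def lfm_is_desirable(instruction: str, step_description: str) -> bool:
    """Placeholder: the trained LFM judges whether this step helps the task."""
    raise NotImplementedError

def build_imitation_dataset(episodes):
    """episodes: assumed list of (instruction, [(state_text, action), ...]) pairs."""
    dataset = []
    for instruction, steps in episodes:
        for state_text, action in steps:
            # Keep only steps the feedback model marks as desirable behaviour.
            if lfm_is_desirable(instruction, f"In {state_text}, the agent did {action}."):
                dataset.append((instruction, state_text, action))
    return dataset  # fed to a behavioural-cloning objective downstream
```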


V-STaR: Training Verifiers for Self-Taught Reasoners

arXiv.org Artificial Intelligence

Common self-improvement approaches for large language models (LLMs), such as STaR (Zelikman et al., 2022), iteratively fine-tune LLMs on self-generated solutions to improve their problem-solving ability. However, these approaches discard the large amounts of incorrect solutions generated during this process, potentially neglecting valuable information in such solutions. To address this shortcoming, we propose V-STaR, which utilizes both the correct and incorrect solutions generated during the self-improvement process to train, via DPO, a verifier that judges the correctness of model-generated solutions. This verifier is used at inference time to select one solution among many candidate solutions. Running V-STaR for multiple iterations results in progressively better reasoners and verifiers, delivering a 4% to 17% test accuracy improvement over existing self-improvement and verification approaches on common code generation and math reasoning benchmarks with LLaMA2 models.
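
At inference time, the verifier can be used for best-of-n selection, roughly as sketched below. The generate_solution and verifier_score functions are placeholders for the fine-tuned generator and the DPO-trained verifier; the sampling budget is arbitrary.

```python
# Sketch of verifier-guided best-of-n inference: sample several candidate
# solutions, score each with the verifier, keep the highest-scoring one.
def generate_solution(problem: str) -> str:
    """Placeholder: one sample from the self-taught generator."""
    raise NotImplementedError

def verifier_score(problem: str, solution: str) -> float:
    """Placeholder: the verifier's correctness score for this candidate."""
    raise NotImplementedError

def best_of_n(problem: str, n: int = 16) -> str:
    candidates = [generate_solution(problem) for _ in range(n)]
    return max(candidates, key=lambda sol: verifier_score(problem, sol))
```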


Guiding Language Model Math Reasoning with Planning Tokens

arXiv.org Artificial Intelligence

Large language models (LLMs) have recently attracted considerable interest for their ability to perform complex reasoning tasks, such as chain-of-thought reasoning. However, most of the existing approaches to enhance this ability rely heavily on data-driven methods, while neglecting the structural aspects of the model's reasoning capacity. We find that while LLMs can manage individual reasoning steps well, they struggle with maintaining consistency across an entire reasoning chain. To solve this, we introduce planning tokens at the start of each reasoning step, serving as a guide for the model, and add their embeddings to the model parameters. Our approach requires a negligible increase in trainable parameters (just 0.001%) and can be applied through either full fine-tuning or a more parameter-efficient scheme. We demonstrate our method's effectiveness by applying it to three different LLMs, showing notable accuracy improvements across three math word problem datasets w.r.t. standard fine-tuning baselines.
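
One plausible way to add such token embeddings with a negligible parameter increase is sketched below using Hugging Face transformers; the model name, the planning-token strings, and the data-formatting helper are assumptions for illustration, not the paper's implementation.

```python
# Illustrative sketch (not the paper's code): register a few planning tokens,
# grow the embedding matrix by a handful of rows, and prefix each reasoning
# step with one of them in the training data.
from transformers import AutoTokenizer, AutoModelForCausalLM

PLAN_TOKENS = ["<plan_add>", "<plan_sub>", "<plan_mul>"]  # hypothetical step types

MODEL_ID = "meta-llama/Llama-2-7b-hf"  # example model id, not necessarily the one used
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

tokenizer.add_tokens(PLAN_TOKENS)              # register the new tokens
model.resize_token_embeddings(len(tokenizer))  # adds only a few new embedding rows

def insert_planning_tokens(steps: list[tuple[str, str]]) -> str:
    """steps: (plan_token, step_text) pairs; prepend the token to each reasoning step."""
    return "\n".join(f"{tok} {text}" for tok, text in steps)
```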


Joint Prompt Optimization of Stacked LLMs using Variational Inference

arXiv.org Artificial Intelligence

Large language models (LLMs) can be seen as atomic units of computation mapping sequences to a distribution over sequences. Thus, they can be seen as stochastic language layers in a language network, where the learnable parameters are the natural language prompts at each layer. By stacking two such layers and feeding the output of one layer to the next, we obtain a Deep Language Network (DLN). We first show how to effectively perform prompt optimization for a 1-Layer language network (DLN-1). Then, we present an extension that applies to 2-layer DLNs (DLN-2), where two prompts must be learned. The key idea is to consider the output of the first layer as a latent variable, which requires inference, and prompts to be learned as the parameters of the generative distribution. We first test the effectiveness of DLN-1 in multiple reasoning and natural language understanding tasks. Then, we show that DLN-2 can reach higher performance than a single layer, showing promise that we might reach comparable performance to GPT-4, even when each LLM in the network is smaller and less powerful.
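
The two-layer forward pass can be pictured as two chained LLM calls whose only learnable parameters are their prompts, as in the toy sketch below. The call_llm function is a placeholder, and the prompt-optimization (variational) machinery is omitted.

```python
# Toy sketch of a two-layer Deep Language Network: each "layer" is an LLM call
# whose learnable parameter is its natural-language prompt.
def call_llm(prompt: str, text: str) -> str:
    """Placeholder for a single LLM call: prompt + input -> output text."""
    raise NotImplementedError

class DLN2:
    def __init__(self, prompt_1: str, prompt_2: str):
        self.prompt_1 = prompt_1  # learnable prompt of layer 1
        self.prompt_2 = prompt_2  # learnable prompt of layer 2

    def forward(self, x: str) -> str:
        hidden = call_llm(self.prompt_1, x)     # layer-1 output, treated as a latent variable
        return call_llm(self.prompt_2, hidden)  # layer-2 reads the hidden text

# During training, candidate hidden texts are sampled and both prompts are
# updated to maximize a variational bound on producing the correct final answer.
```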


GPT-3-driven pedagogical agents for training children's curious question-asking skills

arXiv.org Artificial Intelligence

In order to train children's ability to ask curiosity-driven questions, previous research has explored designing specific exercises that rely on providing semantic and linguistic cues to help formulate such questions. But despite showing pedagogical efficiency, this method is still limited as it relies on generating the said cues by hand, which can be a very costly process. In this context, we propose to leverage advances in the field of natural language processing (NLP) and investigate the efficiency of using a large language model (LLM) for automating the production of the pedagogical content of a curious question-asking (QA) training. We study generating the said content using the "prompt-based" method, which consists of explaining the task to the LLM in natural text. We evaluate the output using human expert annotations and comparisons with hand-generated content. Results indeed suggest the relevance and usefulness of this content. We also conduct a field study in a primary school (75 children aged 9-10), where we evaluate children's QA performance after receiving this training. We compare 3 types of content: 1) hand-generated content that proposes "closed" cues leading to predefined questions; 2) GPT-3-generated content that proposes the same type of cues; 3) GPT-3-generated content that proposes "open" cues leading to several possible questions. We observe similar QA performance between the two "closed" trainings (showing the scalability of the approach using GPT-3), and better performance for participants with the "open" training. These results suggest the efficiency of using LLMs to support children in generating more curious questions, using a natural language prompting approach that affords usability by teachers and other users who are not specialists in AI techniques. Furthermore, the results also show that open-ended content may be more suitable for training curious question-asking skills.
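
For a sense of what the prompt-based generation might look like, the sketch below builds a natural-language instruction asking the model for either a "closed" or an "open" cue. The wording and the function are invented for illustration, not the study's actual prompt.

```python
# Hypothetical prompt builder for generating question-asking cues in the
# "prompt-based" style described above; the wording is invented.
def build_cue_prompt(story: str, cue_style: str = "open") -> str:
    instruction = (
        "You are helping children aged 9-10 practice asking curious questions.\n"
        f"Read this short text:\n{story}\n\n"
    )
    if cue_style == "closed":
        instruction += "Give a sentence starter that leads to one specific question about the text."
    else:
        instruction += "Give a hint about an interesting aspect of the text that could lead to several different questions."
    return instruction
```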


Augmenting Autotelic Agents with Large Language Models

arXiv.org Artificial Intelligence

Humans learn to master open-ended repertoires of skills by imagining and practicing their own goals. This autotelic learning process, literally the pursuit of self-generated (auto) goals (telos), becomes more and more open-ended as the goals become more diverse, abstract and creative. The resulting exploration of the space of possible skills is supported by an inter-individual exploration: goal representations are culturally evolved and transmitted across individuals, in particular using language. Current artificial agents mostly rely on predefined goal representations corresponding to goal spaces that are either bounded (e.g. list of instructions), or unbounded (e.g. the space of possible visual inputs) but are rarely endowed with the ability to reshape their goal representations, to form new abstractions or to imagine creative goals. In this paper, we introduce a language model augmented autotelic agent (LMA3) that leverages a pretrained language model (LM) to support the representation, generation and learning of diverse, abstract, human-relevant goals. The LM is used as an imperfect model of human cultural transmission; an attempt to capture aspects of humans' common-sense, intuitive physics and overall interests. Specifically, it supports three key components of the autotelic architecture: 1) a relabeler that describes the goals achieved in the agent's trajectories, 2) a goal generator that suggests new high-level goals along with their decomposition into subgoals the agent already masters, and 3) reward functions for each of these goals. Without relying on any hand-coded goal representations, reward functions or curriculum, we show that LMA3 agents learn to master a large diversity of skills in a task-agnostic text-based environment.
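
The three LM-backed components listed in the abstract could be exposed through interfaces like the ones sketched below; the call_lm placeholder, the prompts, and the signatures are assumptions, not the LMA3 codebase.

```python
# Sketch of the three LM-backed components named in the abstract:
# a relabeler, a goal generator, and goal-conditioned reward functions.
def call_lm(prompt: str) -> str:
    """Placeholder for a pretrained language model call returning text."""
    raise NotImplementedError

def relabel(trajectory_text: str) -> list[str]:
    """Describe which goals were (incidentally) achieved in a trajectory."""
    return call_lm(f"List the goals achieved in this trajectory:\n{trajectory_text}").splitlines()

def generate_goals(mastered_goals: list[str]) -> list[str]:
    """Propose new high-level goals, given goals the agent already masters."""
    return call_lm("Suggest new, more abstract goals that build on: " + "; ".join(mastered_goals)).splitlines()

def reward(goal: str, trajectory_text: str) -> float:
    """LM-based reward: 1.0 if the LM judges the goal achieved, else 0.0."""
    answer = call_lm(f"Did this trajectory achieve the goal '{goal}'? Answer yes or no.\n{trajectory_text}")
    return 1.0 if answer.strip().lower().startswith("yes") else 0.0
```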