AITopics | Xu, Mengda

Plotting

Xu, Mengda

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

O3D: Offline Data-driven Discovery and Distillation for Sequential Decision-Making with Large Language Models

Xiao, Yuchen, Sun, Yanchao, Xu, Mengda, Madhushani, Udari, Vann, Jared, Garg, Deepeka, Ganesh, Sumitra

arXiv.org Artificial IntelligenceFeb-5-2024

Recent advancements in large language models (LLMs) have exhibited promising performance in solving sequential decision-making problems. By imitating few-shot examples provided in the prompts (i.e., in-context learning), an LLM agent can interact with an external environment and complete given tasks without additional training. However, such few-shot examples are often insufficient to generate high-quality solutions for complex and long-horizon tasks, while the limited context length cannot consume larger-scale demonstrations with long interaction horizons. To this end, we propose an offline learning framework that utilizes offline data at scale (e.g, logs of human interactions) to improve LLM-powered policies without finetuning. The proposed method O3D (Offline Data-driven Discovery and Distillation) automatically discovers reusable skills and distills generalizable knowledge across multiple tasks based on offline interaction data, advancing the capability of solving downstream tasks. Empirical results under two interactive decision-making benchmarks (ALFWorld and WebShop) verify that O3D can notably enhance the decision-making capabilities of LLMs through the offline discovery and distillation process, and consistently outperform baselines across various LLMs.

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2310.14403

Genre: Research Report > New Finding (0.46)

Industry: Education (0.67)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

XSkill: Cross Embodiment Skill Discovery

Xu, Mengda, Xu, Zhenjia, Chi, Cheng, Veloso, Manuela, Song, Shuran

arXiv.org Artificial IntelligenceSep-28-2023

Human demonstration videos are a widely available data source for robot learning and an intuitive user interface for expressing desired behavior. However, directly extracting reusable robot manipulation skills from unstructured human videos is challenging due to the big embodiment difference and unobserved action parameters. To bridge this embodiment gap, this paper introduces XSkill, an imitation learning framework that 1) discovers a cross-embodiment representation called skill prototypes purely from unlabeled human and robot manipulation videos, 2) transfers the skill representation to robot actions using conditional diffusion policy, and finally, 3) composes the learned skill to accomplish unseen tasks specified by a human prompt video. Our experiments in simulation and real-world environments show that the discovered skill prototypes facilitate both skill transfer and composition for unseen tasks, resulting in a more general and scalable imitation learning framework. The benchmark, code, and qualitative results are on https://xskill.cs.columbia.edu/

artificial intelligence, cross embodiment skill discovery, xskill

arXiv.org Artificial Intelligence

2307.09955

Genre: Research Report (0.40)

Industry: Education (0.53)

Technology: Information Technology > Artificial Intelligence > Robots (1.00)

Add feedback

Towards Multi-Agent Reinforcement Learning driven Over-The-Counter Market Simulations

Vadori, Nelson, Ardon, Leo, Ganesh, Sumitra, Spooner, Thomas, Amrouni, Selim, Vann, Jared, Xu, Mengda, Zheng, Zeyu, Balch, Tucker, Veloso, Manuela

arXiv.org Artificial IntelligenceAug-1-2023

We study a game between liquidity provider and liquidity taker agents interacting in an over-the-counter market, for which the typical example is foreign exchange. We show how a suitable design of parameterized families of reward functions coupled with shared policy learning constitutes an efficient solution to this problem. By playing against each other, our deep-reinforcement-learning-driven agents learn emergent behaviors relative to a wide spectrum of objectives encompassing profit-and-loss, optimal execution and market share. In particular, we find that liquidity providers naturally learn to balance hedging and skewing, where skewing refers to setting their buy and sell prices asymmetrically as a function of their inventory. We further introduce a novel RL-based calibration algorithm which we found performed well at imposing constraints on the game equilibrium. On the theoretical side, we are able to show convergence rates for our multi-agent policy gradient algorithm under a transitivity assumption, closely related to generalized ordinal potential games.

artificial intelligence, machine learning, reinforcement learning, (19 more...)

arXiv.org Artificial Intelligence

2210.07184

Country: North America > United States (0.45)

Genre: Research Report (1.00)

Industry:

Leisure & Entertainment > Games (1.00)
Banking & Finance > Trading (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Mixture of basis for interpretable continual learning with distribution shifts

Xu, Mengda, Ganesh, Sumitra, Pasula, Pranay

arXiv.org Artificial IntelligenceJan-5-2022

Continual learning in environments with shifting data distributions is a challenging problem with several real-world applications. In this paper we consider settings in which the data distribution(task) shifts abruptly and the timing of these shifts are not known. Furthermore, we consider a semi-supervised task-agnostic setting in which the learning algorithm has access to both task-segmented and unsegmented data for offline training. We propose a novel approach called mixture of Basismodels (MoB) for addressing this problem setting. The core idea is to learn a small set of basis models and to construct a dynamic, task-dependent mixture of the models to predict for the current task. We also propose a new methodology to detect observations that are out-of-distribution with respect to the existing basis models and to instantiate new models as needed. We test our approach in multiple domains and show that it attains better prediction error than existing methods in most cases while using fewer models than other multiple model approaches. Moreover, we analyze the latent task representations learned by MoB and show that similar tasks tend to cluster in the latent space and that the latent representation shifts at the task boundaries when tasks are dissimilar.

artificial intelligence, banking & finance, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2201.01853

Genre: Research Report (1.00)

Industry:

Banking & Finance (0.94)
Education > Educational Setting (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Towards a fully RL-based Market Simulator

Ardon, Leo, Vadori, Nelson, Spooner, Thomas, Xu, Mengda, Vann, Jared, Ganesh, Sumitra

arXiv.org Artificial IntelligenceOct-13-2021

We present a new financial framework where two families of RL-based agents representing the Liquidity Providers and Liquidity Takers learn simultaneously to satisfy their objective. Thanks to a parametrized reward formulation and the use of Deep RL, each group learns a shared policy able to generalize and interpolate over a wide range of behaviors. This is a step towards a fully RL-based market simulator replicating complex market conditions particularly suited to study the dynamics of the financial market under various scenarios.

artificial intelligence, banking & finance, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2110.06829

Country: North America > United States > California > San Francisco County > San Francisco (0.14)

Genre: Research Report (0.50)

Industry: Banking & Finance > Trading (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Add feedback