λ: A Benchmark for Data-Efficiency in Long-Horizon Indoor Mobile Manipulation Robotics
Jaafar, Ahmed, Raman, Shreyas Sundara, Wei, Yichen, Harithas, Sudarshan, Juliani, Sofia, Wernerfelt, Anneke, Quartey, Benedict, Idrees, Ifrah, Liu, Jason Xinyu, Tellex, Stefanie
Efficiently learning and executing long-horizon mobile manipulation (MoMa) tasks is crucial for advancing robotics in household and workplace settings. However, current MoMa models are data-inefficient, and no realistically sized benchmarks exist for evaluating the data efficiency of improved models. To address this gap, we introduce the LAMBDA (λ) benchmark (Long-horizon Actions for Mobile-manipulation Benchmarking of Directed Activities), which evaluates the data efficiency of models on language-conditioned, long-horizon, multi-room, multi-floor, pick-and-place tasks using a dataset small enough to be feasibly collected. The benchmark includes 571 human-collected demonstrations that provide realism and diversity in simulated and real-world settings. Unlike planner-generated data, these trajectories offer natural variability and replay-verifiability, ensuring robust learning and evaluation. We benchmark several models, including learning-based models and a neuro-symbolic modular approach that combines foundation models with task and motion planning. The learning-based models achieve suboptimal success rates, even when leveraging pretrained weights, underscoring significant data inefficiencies, whereas the neuro-symbolic approach performs significantly better while being more data-efficient. These findings highlight the need for more data-efficient learning-based MoMa approaches. λ addresses this gap by serving as a key benchmark for evaluating the data efficiency of such future models on household robotics tasks.
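As a rough illustration of how a demonstration benchmark of this kind might be used to measure data efficiency (not the LAMBDA evaluation protocol itself), the minimal sketch below trains a stand-in policy on growing fractions of a 571-demonstration set and reports a toy success rate; Demo, train, and rollout_success are hypothetical placeholders, not the benchmark's API.

# Hedged sketch: computing a data-efficiency curve over a demonstration
# benchmark. All names (Demo, train, rollout_success) are illustrative
# placeholders, not the benchmark's actual interface.
import random
from dataclasses import dataclass

@dataclass
class Demo:
    instruction: str   # e.g. "take the mug from the kitchen to the office desk"
    trajectory: list   # sequence of (observation, action) pairs

def train(demos):
    """Stand-in for fitting a language-conditioned policy on the given demos."""
    return {"num_demos": len(demos)}   # placeholder "policy"

def rollout_success(policy, demo):
    """Stand-in for replaying the task in simulation and checking success."""
    return random.random() < min(1.0, policy["num_demos"] / 571)   # toy proxy

def data_efficiency_curve(demos, fractions=(0.1, 0.25, 0.5, 1.0), seed=0):
    rng = random.Random(seed)
    curve = {}
    for frac in fractions:
        subset = rng.sample(demos, int(frac * len(demos)))
        policy = train(subset)
        successes = [rollout_success(policy, d) for d in demos]
        curve[frac] = sum(successes) / len(successes)
    return curve

if __name__ == "__main__":
    demos = [Demo(f"task {i}", []) for i in range(571)]
    print(data_efficiency_curve(demos))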
A Survey of Robotic Language Grounding: Tradeoffs between Symbols and Embeddings
Cohen, Vanya, Liu, Jason Xinyu, Mooney, Raymond, Tellex, Stefanie, Watkins, David
With large language models, robots can understand language more flexibly and more capably than ever before. This survey reviews and situates recent literature along a spectrum with two poles: 1) mapping between language and some manually defined formal representation of meaning, and 2) mapping between language and high-dimensional vector spaces that translate directly to low-level robot policies. Using a formal representation allows the meaning of the language to be precisely represented, limits the size of the learning problem, and leads to a framework for interpretability and formal safety guarantees. Methods that embed language and perceptual data into high-dimensional spaces avoid this manually specified symbolic structure and thus have the potential to be more general, given enough data, but require more data and compute to train. We discuss the benefits and tradeoffs of each approach and conclude by providing directions for future work that achieves the best of both worlds.
Open-vocabulary Pick and Place via Patch-level Semantic Maps
Jia, Mingxi, Huang, Haojie, Zhang, Zhewen, Wang, Chenghao, Zhao, Linfeng, Wang, Dian, Liu, Jason Xinyu, Walters, Robin, Platt, Robert, Tellex, Stefanie
Controlling robots through natural language instructions in open-vocabulary scenarios is pivotal for enhancing human-robot collaboration and synthesizing complex robot behavior. However, achieving this capability poses significant challenges because the system must generalize from limited data to a wide range of tasks and environments. Existing methods rely on large, costly datasets and struggle with generalization. This paper introduces Grounded Equivariant Manipulation (GEM), a novel approach that leverages the generative capabilities of pre-trained vision-language models and geometric symmetries to facilitate few-shot and zero-shot learning for open-vocabulary robot manipulation tasks. Our experiments demonstrate GEM's high sample efficiency and superior generalization across diverse pick-and-place tasks in both simulated and real-world settings, showcasing its ability to adapt to novel instructions and unseen objects with minimal data. GEM represents a significant step forward in language-conditioned robot control, bridging the gap between semantic understanding and action generation in robotic systems.
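To make the patch-level idea concrete, here is a heavily hedged sketch of scoring image patches against a language query and choosing the best-matching patch as a pick candidate; the embed function is a deterministic random stand-in for a pretrained vision-language encoder and is not GEM's model or API.

# Hedged sketch: scoring image patches against a language query. The embedding
# below is a random stand-in, not a real vision-language model.
import zlib
import numpy as np

def embed(text_or_patch, dim=32):
    # Placeholder embedding: a real system would call a pretrained VLM encoder.
    rng = np.random.default_rng(zlib.crc32(text_or_patch.encode()))
    v = rng.standard_normal(dim)
    return v / np.linalg.norm(v)

def patch_level_semantic_map(patch_ids, query):
    """Cosine similarity between each patch embedding and the query embedding."""
    q = embed(query)
    return {pid: float(embed(pid) @ q) for pid in patch_ids}

def select_pick_patch(patch_ids, query):
    scores = patch_level_semantic_map(patch_ids, query)
    return max(scores, key=scores.get)

if __name__ == "__main__":
    patches = [f"patch_{r}_{c}" for r in range(4) for c in range(4)]  # 4x4 grid
    print(select_pick_patch(patches, "the red mug"))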
Grounding Complex Natural Language Commands for Temporal Tasks in Unseen Environments
Liu, Jason Xinyu, Yang, Ziyi, Idrees, Ifrah, Liang, Sam, Schornstein, Benjamin, Tellex, Stefanie, Shah, Ankit
Grounding navigational commands to linear temporal logic (LTL) leverages its unambiguous semantics for reasoning about long-horizon tasks and verifying the satisfaction of temporal constraints. Existing approaches require training data from the specific environment and landmarks that will be used in natural language to understand commands in those environments. We propose Lang2LTL, a modular system and a software package that leverages large language models (LLMs) to ground temporal navigational commands to LTL specifications in environments without prior language data. We comprehensively evaluate Lang2LTL for five well-defined generalization behaviors. Lang2LTL demonstrates the state-of-the-art ability of a single model to ground navigational commands to diverse temporal specifications in 21 city-scaled environments. Finally, we demonstrate a physical robot using Lang2LTL can follow 52 semantically diverse navigational commands in two indoor environments.
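The minimal sketch below illustrates the general shape of a modular language-to-LTL grounding pipeline of the kind described above (extract referring expressions, ground them to map symbols, translate the lifted command, then substitute); the landmark table, translation rule, and helper functions are illustrative placeholders, not the Lang2LTL package's actual interface.

# Hedged sketch of a modular command-to-LTL grounding pipeline. All stages are
# toy stand-ins; a real system might prompt an LLM at each step.
KNOWN_LANDMARKS = {"the coffee shop": "coffee_shop", "the bank": "bank"}

def extract_referring_expressions(command):
    # Placeholder: match phrases against a known-landmark table.
    return [phrase for phrase in KNOWN_LANDMARKS if phrase in command.lower()]

def ground_to_landmarks(expressions):
    # Map each referring expression to a symbol in the target environment's map.
    return {exp: KNOWN_LANDMARKS[exp] for exp in expressions}

def lift_and_translate(command, groundings):
    # Replace grounded phrases with placeholders, translate the lifted command
    # to an LTL template (a trivial rule here), then substitute the symbols.
    lifted = command.lower()
    for i, exp in enumerate(groundings):
        lifted = lifted.replace(exp, f"<p{i}>")
    template = "F ( <p0> & F <p1> )" if "then" in lifted else "F <p0>"
    for i, sym in enumerate(groundings.values()):
        template = template.replace(f"<p{i}>", sym)
    return template

if __name__ == "__main__":
    cmd = "Go to the coffee shop then the bank"
    exps = extract_referring_expressions(cmd)
    print(lift_and_translate(cmd, ground_to_landmarks(exps)))
    # -> F ( coffee_shop & F bank )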
Skill Transfer for Temporally-Extended Task Specifications
Liu, Jason Xinyu, Shah, Ankit, Rosen, Eric, Konidaris, George, Tellex, Stefanie
Deploying robots in real-world domains, such as households and flexible manufacturing lines, requires robots to be taskable on demand. Linear temporal logic (LTL) is a widely used specification language with a compositional grammar that naturally induces commonalities across tasks. However, most prior research on reinforcement learning with LTL specifications treats every new formula independently. We propose LTL-Transfer, a novel algorithm that enables subpolicy reuse across tasks by segmenting policies for training tasks into portable transition-centric skills capable of satisfying a wide array of unseen LTL specifications while respecting safety-critical constraints. Experiments in a Minecraft-inspired domain show that LTL-Transfer can satisfy over 90% of 500 unseen tasks after training on only 50 task specifications, without ever violating a safety constraint. We also deployed LTL-Transfer on a quadruped mobile manipulator in an analog household environment to demonstrate zero-shot transfer to many fetch-and-delivery tasks.
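A toy sketch of the transition-centric reuse idea follows, assuming a hypothetical skill library keyed by the symbolic transition each skill achieves; the tiny automaton and breadth-first search below are illustrative, not the paper's actual formulation.

# Hedged sketch: chaining stored transition-centric skills to satisfy the
# automaton of an unseen specification. Data structures are illustrative.
from collections import deque

# Each stored skill is indexed by the symbolic transition it reliably causes.
SKILL_LIBRARY = {
    ("start", "at_mailbox"): "walk_to_mailbox",
    ("at_mailbox", "holding_letter"): "pick_letter",
    ("holding_letter", "at_house"): "walk_to_house",
    ("at_house", "delivered"): "drop_letter",
}

def plan_with_skills(dfa_edges, init, accept):
    """Breadth-first search over automaton states using only stored skills."""
    frontier, parents = deque([init]), {init: None}
    while frontier:
        state = frontier.popleft()
        if state == accept:
            plan = []
            while parents[state] is not None:
                prev, skill = parents[state]
                plan.append(skill)
                state = prev
            return list(reversed(plan))
        for (src, dst), skill in SKILL_LIBRARY.items():
            if src == state and (src, dst) in dfa_edges and dst not in parents:
                parents[dst] = (state, skill)
                frontier.append(dst)
    return None   # the new specification is not coverable by the skill library

if __name__ == "__main__":
    # Toy automaton for an unseen delivery task.
    edges = {("start", "at_mailbox"), ("at_mailbox", "holding_letter"),
             ("holding_letter", "at_house"), ("at_house", "delivered")}
    print(plan_with_skills(edges, "start", "delivered"))
    # -> ['walk_to_mailbox', 'pick_letter', 'walk_to_house', 'drop_letter']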