Jaafar, Ahmed
λ: A Benchmark for Data-Efficiency in Long-Horizon Indoor Mobile Manipulation Robotics
Jaafar, Ahmed, Raman, Shreyas Sundara, Wei, Yichen, Harithas, Sudarshan, Juliani, Sofia, Wernerfelt, Anneke, Quartey, Benedict, Idrees, Ifrah, Liu, Jason Xinyu, Tellex, Stefanie
Efficiently learning and executing long-horizon mobile manipulation (MoMa) tasks is crucial for advancing robotics in household and workplace settings. However, current MoMa models are data-inefficient, and evaluating more efficient models requires realistic-sized benchmarks that do not yet exist. To address this, we introduce the LAMBDA (λ) benchmark (Long-horizon Actions for Mobile-manipulation Benchmarking of Directed Activities), which evaluates the data efficiency of models on language-conditioned, long-horizon, multi-room, multi-floor, pick-and-place tasks using a dataset of manageable size that is feasible to collect. The benchmark comprises 571 human-collected demonstrations that provide realism and diversity in simulated and real-world settings. Unlike planner-generated data, these trajectories offer natural variability and replay verifiability, ensuring robust learning and evaluation. We benchmark several models, including learning-based models and a neuro-symbolic modular approach that combines foundation models with task and motion planning. The learning-based models show suboptimal success rates, even when leveraging pretrained weights, underscoring significant data inefficiencies, whereas the neuro-symbolic approach performs significantly better while being more data-efficient. These findings highlight the need for more data-efficient learning-based MoMa approaches. λ addresses this gap by serving as a key benchmark for evaluating the data efficiency of such future models on household robotics tasks.
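To make the evaluation protocol concrete, here is a minimal sketch of a data-efficiency evaluation loop in the spirit of the benchmark: train a model on growing subsets of the demonstrations and report success rate per data budget. All names (Policy, make_policy, run_episode) are hypothetical placeholders, not the benchmark's actual API.

```python
from typing import Callable, Dict, Protocol, Sequence


class Policy(Protocol):
    def train(self, demos: Sequence) -> None: ...


def evaluate_data_efficiency(
    demos: Sequence,                                  # human-collected demonstrations
    make_policy: Callable[[], Policy],                # fresh model per data budget
    eval_tasks: Sequence,                             # held-out language-conditioned tasks
    run_episode: Callable[[Policy, object], bool],    # True iff the task succeeds
    fractions: Sequence[float] = (0.1, 0.25, 0.5, 1.0),
) -> Dict[float, float]:
    """Train on growing demonstration subsets; report success rate per budget."""
    results: Dict[float, float] = {}
    for frac in fractions:
        subset = list(demos)[: int(len(demos) * frac)]
        policy = make_policy()            # re-initialize so budgets are comparable
        policy.train(subset)              # e.g., behavior cloning on trajectories
        wins = sum(run_episode(policy, task) for task in eval_tasks)
        results[frac] = wins / len(eval_tasks)
    return results
```

A data-efficient model is one whose success curve rises quickly at small fractions of the 571 demonstrations rather than only at the full budget.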
Compositional Zero-Shot Learning for Attribute-Based Object Reference in Human-Robot Interaction
Gao, Peng, Jaafar, Ahmed, Reily, Brian, Reardon, Christopher, Zhang, Hao
Language-enabled robots have been widely studied in recent years to enable natural human-robot interaction and teaming in various real-world applications. Such robots must be able to comprehend referring expressions, identifying a particular object from visual perception using a set of referring attributes extracted from natural language. However, visual observations of an object may not be available when it is referred to, and the number of objects and attributes may be unbounded in open worlds. To address these challenges, we implement an attribute-based compositional zero-shot learning method that uses a list of attributes to perform referring expression comprehension in open worlds. We evaluate the approach on two datasets, MIT-States and Clothing 16K. Preliminary experimental results show that our implemented approach allows a robot to correctly identify the objects referred to by human commands.
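The following is a minimal sketch of the generic compositional zero-shot pattern the abstract describes, not the paper's specific architecture: embed attribute and object primitives separately, compose them into a pair embedding, and score the pair against image features, so that unseen attribute-object compositions of seen primitives can still be ranked.

```python
import torch
import torch.nn as nn


class CompositionalScorer(nn.Module):
    """Score (attribute, object) compositions against image features.

    A generic CZSL sketch: primitives are embedded independently, then a
    small MLP composes them, enabling zero-shot scoring of unseen pairs.
    """

    def __init__(self, n_attrs: int, n_objs: int, dim: int):
        super().__init__()
        self.attr_emb = nn.Embedding(n_attrs, dim)
        self.obj_emb = nn.Embedding(n_objs, dim)
        self.compose = nn.Sequential(
            nn.Linear(2 * dim, dim), nn.ReLU(), nn.Linear(dim, dim)
        )

    def forward(self, img_feat, attr_ids, obj_ids):
        # Compose primitive embeddings into one pair embedding per candidate.
        pair = torch.cat([self.attr_emb(attr_ids), self.obj_emb(obj_ids)], dim=-1)
        pair = self.compose(pair)
        # Cosine similarity between image features and composed embeddings;
        # the highest-scoring pair resolves the referring expression.
        return nn.functional.cosine_similarity(img_feat, pair, dim=-1)
```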
Do Physicians Know How to Prompt? The Need for Automatic Prompt Optimization Help in Clinical Note Generation
Yao, Zonghai, Jaafar, Ahmed, Wang, Beining, Zhu, Yue, Yang, Zhichao, Yu, Hong
This study examines the effect of prompt engineering on the performance of Large Language Models (LLMs) in clinical note generation. We introduce an Automatic Prompt Optimization (APO) framework to refine initial prompts and compare the outputs of medical experts, non-medical experts, and APO-enhanced GPT-3.5 and GPT-4. Results highlight the superior performance of GPT-4 with APO in standardizing prompt quality across clinical note sections. A human-in-the-loop evaluation shows that experts maintain content quality after APO, with a preference for their own modifications, suggesting the value of expert customization. We recommend a two-phase optimization process: leveraging APO with GPT-4 for consistency, then expert input for personalization.
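As a rough illustration of the kind of loop an APO framework runs, here is a sketch of greedy prompt optimization: an LLM proposes refined variants of the current prompt, each variant is scored on a small development set of clinical notes, and the best one is kept. The hooks propose and score are hypothetical stand-ins for an LLM call and a note-quality metric; this is not the paper's exact algorithm.

```python
from typing import Callable, Tuple


def optimize_prompt(
    seed_prompt: str,
    propose: Callable[[str], str],   # hypothetical: LLM rewrites the prompt
    score: Callable[[str], float],   # hypothetical: note quality on a dev set
    n_rounds: int = 3,
    n_candidates: int = 4,
) -> Tuple[str, float]:
    """Greedy APO loop: propose rewrites each round, keep the best so far."""
    best, best_score = seed_prompt, score(seed_prompt)
    for _ in range(n_rounds):
        for _ in range(n_candidates):
            candidate = propose(best)
            s = score(candidate)
            if s > best_score:
                best, best_score = candidate, s
    return best, best_score
```

In the two-phase process the abstract recommends, a loop like this would supply the consistent baseline prompt, after which clinicians apply their own customizations.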