Embodied AI
Alexa Arena: A User-Centric Interactive Platform for Embodied AI
We introduce Alexa Arena, a user-centric simulation platform to facilitate research in building assistive conversational embodied agents. Alexa Arena features multi-room layouts and an abundance of interactable objects. With user-friendly graphics and control mechanisms, the platform supports the development of gamified robotic tasks readily accessible to general human users, allowing high-efficiency data collection and embodied-AI system evaluation. Along with the platform, we introduce a dialog-enabled task completion benchmark with online human evaluations.
Generations in Dialogue: Embodied AI, robotics, perception, and action with Professor Roberto Martín-Martín
Generations in Dialogue: Bridging Perspectives in AI is a podcast from AAAI featuring thought-provoking discussions between AI experts, practitioners, and enthusiasts from different age groups and backgrounds. Each episode delves into how generational experiences shape views on AI, exploring the challenges, opportunities, and ethical considerations that come with the advancement of this transformative technology. In the third episode of this new series from AAAI, host Ella Lan chats to Professor Roberto Martín-Martín about taking a screwdriver to his toys as a child, how his research focus has evolved over time, how different generations interact with technology, making robots for everyone, being inspired by colleagues, advice for early-career researchers, and how machines can enhance human capabilities. Roberto Martín-Martín is an Assistant Professor of Computer Science at the University of Texas at Austin, where his research integrates robotics, computer vision, and machine learning to build autonomous agents capable of perceiving, learning, and acting in the real world. He previously worked as an AI Researcher at Salesforce AI and as a Postdoctoral Scholar at the Stanford Vision and Learning Lab with Silvio Savarese and Fei-Fei Li, leading projects in visuomotor learning, mobile manipulation, and human-robot interaction.
- North America > United States > Texas > Travis County > Austin (0.25)
- Europe > Spain > Galicia > Madrid (0.05)
- Research Report > New Finding (0.36)
- Overview (0.36)
Image Quality Assessment for Embodied AI
Li, Chunyi, Xiao, Jiaohao, Zhang, Jianbo, Wen, Farong, Zhang, Zicheng, Tian, Yuan, Zhu, Xiangyang, Liu, Xiaohong, Cheng, Zhengxue, Lin, Weisi, Zhai, Guangtao
Embodied AI has developed rapidly in recent years, but it is still mainly deployed in laboratories, with various real-world distortions limiting its application. Traditionally, Image Quality Assessment (IQA) methods are applied to predict human preferences for distorted images; however, no IQA method assesses the usability of an image in embodied tasks, namely, its perceptual quality for robots. To provide accurate and reliable quality indicators for future embodied scenarios, we first propose the topic of IQA for Embodied AI. Specifically, we (1) construct a perception-cognition-decision-execution pipeline based on the Mertonian system and meta-cognitive theory, and define a comprehensive subjective score collection process; (2) establish the Embodied-IQA database, containing over 36k reference/distorted image pairs with more than 5M fine-grained annotations provided by Vision-Language Models, Vision-Language-Action models, and real-world robots; (3) train and validate mainstream IQA methods on Embodied-IQA, demonstrating the need for more accurate quality indicators for Embodied AI. We sincerely hope that this evaluation will promote the application of Embodied AI under complex real-world distortions. Project page: https://github.com/lcysyzxdxc/EmbodiedIQA
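The perception-cognition-decision-execution pipeline in the abstract above can be sketched as a minimal control flow: perceive a feature from the image, map it to a quality estimate, and gate execution on that estimate. Everything here (the variance feature, the normalization constant, the threshold) is an illustrative assumption, not the paper's actual method.

```python
# Hypothetical sketch of a perception-cognition-decision-execution pipeline
# that scores an image's usability for an embodied task. The feature, the
# normalization, and the threshold are all invented for illustration.

def perceive(image):
    # Perception stage stand-in: pixel variance as a crude sharpness feature.
    mean = sum(image) / len(image)
    return sum((p - mean) ** 2 for p in image) / len(image)

def cognize(feature):
    # Cognition stage stand-in: squash the raw feature into a [0, 1] score.
    return min(feature / 100.0, 1.0)

def decide(quality, threshold=0.5):
    # Decision stage: gate execution on the estimated perceptual quality.
    return "execute" if quality >= threshold else "request_new_view"

def run_pipeline(image):
    quality = cognize(perceive(image))
    return quality, decide(quality)

flat = [128] * 16       # a featureless image: unusable for the task
contrasty = [0, 255] * 8  # a high-contrast image: usable
```

The point of the sketch is the gating structure, not the feature itself: a real system would replace `perceive` and `cognize` with learned models producing the paper's fine-grained quality annotations.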
- Research Report (0.63)
- Workflow (0.46)
- North America > United States > Washington > King County > Seattle (0.04)
- North America > United States > New Mexico > Los Alamos County > Los Alamos (0.04)
- Europe > Poland (0.04)
- Information Technology (0.46)
- Leisure & Entertainment > Games (0.46)
PersONAL: Towards a Comprehensive Benchmark for Personalized Embodied Agents
Ziliotto, Filippo, Akkara, Jelin Raphael, Daniele, Alessandro, Ballan, Lamberto, Serafini, Luciano, Campari, Tommaso
Recent advances in Embodied AI have enabled agents to perform increasingly complex tasks and adapt to diverse environments. However, deploying such agents in realistic human-centered scenarios, such as domestic households, remains challenging, particularly due to the difficulty of modeling individual human preferences and behaviors. In this work, we introduce PersONAL (PERSonalized Object Navigation And Localization), a comprehensive benchmark designed to study personalization in Embodied AI. Agents must identify, retrieve, and navigate to objects associated with specific users, responding to natural-language queries such as "find Lily's backpack". PersONAL comprises over 2,000 high-quality episodes across 30+ photorealistic homes from the HM3D dataset. Each episode includes a natural-language scene description with explicit associations between objects and their owners, requiring agents to reason over user-specific semantics. The benchmark supports two evaluation modes: (1) active navigation in unseen environments, and (2) object grounding in previously mapped scenes. Experiments with state-of-the-art baselines reveal a substantial gap to human performance, highlighting the need for embodied agents capable of perceiving, reasoning over, and memorizing personalized information, paving the way towards real-world assistive robots.
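The object-owner association at the heart of PersONAL can be illustrated with a toy episode record and a keyword-matched query. The field names and the matching logic are assumptions for illustration only; the benchmark's actual episode format and grounding models are far richer.

```python
# Toy sketch of a PersONAL-style episode: a scene description plus explicit
# object-owner associations, queried with natural language. Field names and
# the naive keyword matcher are illustrative assumptions.

episode = {
    "scene": "hm3d_house_042",  # hypothetical scene identifier
    "description": "Lily's backpack is on the desk; Tom's mug is in the kitchen.",
    "objects": [
        {"name": "backpack", "owner": "Lily", "location": "desk"},
        {"name": "mug", "owner": "Tom", "location": "kitchen"},
    ],
}

def ground_query(query, episode):
    """Return the object whose owner and name both appear in the query."""
    q = query.lower()
    for obj in episode["objects"]:
        if obj["owner"].lower() in q and obj["name"].lower() in q:
            return obj
    return None
```

A query like `ground_query("find Lily's backpack", episode)` resolves to the desk object, while a mismatched owner-object pair ("find Tom's backpack") correctly resolves to nothing, which is exactly the user-specific reasoning the benchmark tests.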
Embodied Arena: A Comprehensive, Unified, and Evolving Evaluation Platform for Embodied AI
Ni, Fei, Zhang, Min, Li, Pengyi, Yuan, Yifu, Zhang, Lingfeng, Liu, Yuecheng, Han, Peilong, Kou, Longxin, Ma, Shaojin, Qiao, Jinbin, Bravo, David Gamaliel Arcos, Wang, Yuening, Hu, Xiao, Zhang, Zhanguang, Yao, Xianze, Li, Yutong, Zhang, Zhao, Wen, Ying, Chen, Ying-Cong, Liang, Xiaodan, Lin, Liang, He, Bin, Bou-Ammar, Haitham, Wang, He, Xu, Huazhe, Deng, Jiankang, Luo, Shan, Jiang, Shuqiang, Pan, Wei, Gao, Yang, Zafeiriou, Stefanos, Peters, Jan, Zhuang, Yuzheng, Zhang, Yingxue, Zheng, Yan, Tang, Hongyao, Hao, Jianye
Embodied AI development significantly lags behind large foundation models due to three critical challenges: (1) lack of systematic understanding of core capabilities needed for Embodied AI, making research lack clear objectives; (2) absence of unified and standardized evaluation systems, rendering cross-benchmark evaluation infeasible; and (3) underdeveloped automated and scalable acquisition methods for embodied data, creating critical bottlenecks for model scaling. To address these obstacles, we present Embodied Arena, a comprehensive, unified, and evolving evaluation platform for Embodied AI. Our platform establishes a systematic embodied capability taxonomy spanning three levels (perception, reasoning, task execution), seven core capabilities, and 25 fine-grained dimensions, enabling unified evaluation with systematic research objectives. We introduce a standardized evaluation system built upon unified infrastructure supporting flexible integration of 22 diverse benchmarks across three domains (2D/3D Embodied Q&A, Navigation, Task Planning) and 30+ advanced models from 20+ worldwide institutes. Additionally, we develop a novel LLM-driven automated generation pipeline ensuring scalable embodied evaluation data with continuous evolution for diversity and comprehensiveness. Embodied Arena publishes three real-time leaderboards (Embodied Q&A, Navigation, Task Planning) with dual perspectives (benchmark view and capability view), providing comprehensive overviews of advanced model capabilities. In particular, we present nine findings summarized from the evaluation results on the leaderboards of Embodied Arena. This helps to establish clear research directions and pinpoint critical research problems, thereby driving forward progress in the field of Embodied AI.
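The dual benchmark/capability leaderboard views described above can be sketched as a small aggregation over a nested taxonomy: per-benchmark scores roll up through fine-grained capabilities to the three top levels. The capability and benchmark names and all scores below are placeholders, not Embodied Arena's actual taxonomy or results.

```python
# Sketch of a three-level capability taxonomy with a capability-view rollup
# over per-benchmark scores. Names and numbers are invented placeholders.

taxonomy = {
    "perception": ["object_recognition", "spatial_understanding"],
    "reasoning": ["embodied_qa", "planning_inference"],
    "task_execution": ["navigation", "manipulation", "task_planning"],
}

# Hypothetical per-benchmark scores, keyed by fine-grained capability.
benchmark_scores = {
    "object_recognition": [0.8, 0.9],
    "spatial_understanding": [0.6],
    "embodied_qa": [0.7, 0.5],
    "planning_inference": [0.4],
    "navigation": [0.5, 0.7],
    "manipulation": [0.3],
    "task_planning": [0.6],
}

def capability_view(level):
    """Average every benchmark score under one top-level capability."""
    scores = [s for cap in taxonomy[level] for s in benchmark_scores[cap]]
    return sum(scores) / len(scores)
```

The benchmark view is just `benchmark_scores` read directly; the capability view is the rollup, which is what lets one model be compared across benchmarks that probe the same underlying skill.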
Multi-Modal Multi-Task (M3T) Federated Foundation Models for Embodied AI: Potentials and Challenges for Edge Integration
Borazjani, Kasra, Abdisarabshali, Payam, Nadimi, Fardis, Khosravan, Naji, Liwang, Minghui, Wang, Xianbin, Hong, Yiguang, Hosseinalipour, Seyyedali
As embodied AI systems become increasingly multi-modal, personalized, and interactive, they must learn effectively from diverse sensory inputs, adapt continually to user preferences, and operate safely under resource and privacy constraints. These challenges expose a pressing need for machine learning models capable of swift, context-aware adaptation while balancing model generalization and personalization. Here, two methods emerge as suitable candidates, each offering parts of these capabilities: multi-modal multi-task foundation models (M3T-FMs) provide a pathway toward generalization across tasks and modalities, whereas federated learning (FL) offers the infrastructure for distributed, privacy-preserving model updates and user-level model personalization. However, when used in isolation, each of these approaches falls short of meeting the complex and diverse capability requirements of real-world embodied AI environments. In this vision paper, we introduce multi-modal multi-task federated foundation models (M3T-FFMs) for embodied AI, a new paradigm that unifies the strengths of M3T-FMs with the privacy-preserving distributed training nature of FL, enabling intelligent systems at the wireless edge. We collect critical deployment dimensions of M3T-FFMs in embodied AI ecosystems under a unified framework, which we name "EMBODY": Embodiment heterogeneity, Modality richness and imbalance, Bandwidth and compute constraints, On-device continual learning, Distributed control and autonomy, and Yielding safety, privacy, and personalization. For each, we identify concrete challenges and envision actionable research directions. We also present an evaluation framework for deploying M3T-FFMs in embodied AI systems, along with the associated trade-offs. Finally, we present a prototype implementation of M3T-FFMs and evaluate their energy and latency performance.
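The FL half of the M3T-FFM vision, privacy-preserving distributed updates, can be illustrated with the classic federated-averaging pattern: each client fine-tunes a shared parameter vector on private data and only the updated weights (never the data) reach the server, which averages them. The local update rule and the toy client data below are assumptions for illustration; real M3T-FFM training would federate foundation-model adapters across heterogeneous embodied devices.

```python
# Minimal federated-averaging sketch: clients adapt a shared weight vector
# locally on private data; the server averages the resulting models.
# The gradient-free "move toward the data mean" update is a toy stand-in.

def local_update(weights, client_data, lr=0.1):
    # Toy local step: nudge every weight toward the client's data mean.
    target = sum(client_data) / len(client_data)
    return [w + lr * (target - w) for w in weights]

def fed_avg(client_weights):
    # Server-side aggregation: coordinate-wise mean over client models.
    n = len(client_weights)
    return [sum(ws[i] for ws in client_weights) / n
            for i in range(len(client_weights[0]))]

global_weights = [0.0, 0.0]
clients = [[1.0, 1.0, 1.0], [3.0, 3.0]]  # two clients' private data
updates = [local_update(global_weights, data) for data in clients]
global_weights = fed_avg(updates)
```

Personalization, one of the "EMBODY" dimensions, would enter this sketch by letting each client keep some weights local instead of averaging everything.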
- Information Technology > Artificial Intelligence > Robots (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.69)
Embodied AI in Social Spaces: Responsible and Adaptive Robots in Complex Settings -- UKAIRS 2025
Landowska, Aleksandra, Bergin, Aislinn D Gomez, Abioye, Ayodeji O., Deshmukh, Jayati, Bouadouki, Andriana, Wheadon, Maria, Georgara, Athina, Price, Dominic, Nguyen, Tuyen, Ao, Shuang, Singh, Lokesh, Long, Yi, Miele, Raffaele, Fischer, Joel E., Ramchurn, Sarvapali D.
This paper introduces and overviews a multidisciplinary project aimed at developing responsible and adaptive multi-human multi-robot (MHMR) systems for complex, dynamic settings. The project integrates co-design, ethical frameworks, and multimodal sensing to create AI-driven robots that are emotionally responsive, context-aware, and aligned with the needs of diverse users. We outline the project's vision, methodology, and early outcomes, demonstrating how embodied AI can support sustainable, ethical, and human-centred futures.
- Europe > United Kingdom > England > Nottinghamshire > Nottingham (0.20)
- Europe > United Kingdom > England > Hampshire > Southampton (0.08)
- Europe > United Kingdom > England > Tyne and Wear > Newcastle (0.06)
Multimodal Data Storage and Retrieval for Embodied AI: A Survey
Embodied AI (EAI) agents continuously interact with the physical world, generating vast, heterogeneous multimodal data streams that traditional management systems are ill-equipped to handle. In this survey, we first systematically evaluate five storage architectures (Graph Databases, Multi-Model Databases, Data Lakes, Vector Databases, and Time-Series Databases), focusing on their suitability for addressing EAI's core requirements, including physical grounding, low-latency access, and dynamic scalability. We then analyze five retrieval paradigms (Fusion Strategy-Based Retrieval, Representation Alignment-Based Retrieval, Graph-Structure-Based Retrieval, Generation Model-Based Retrieval, and Efficient Retrieval-Based Optimization), revealing a fundamental tension between achieving long-term semantic coherence and maintaining real-time responsiveness. Based on this comprehensive analysis, we identify key bottlenecks, spanning from the foundational Physical Grounding Gap to systemic challenges in cross-modal integration, dynamic adaptation, and open-world generalization. Finally, we outline a forward-looking research agenda encompassing physics-aware data models, adaptive storage-retrieval co-optimization, and standardized benchmarking, to guide future research toward principled data management solutions for EAI. Our survey is based on a comprehensive review of more than 180 related studies, providing a rigorous roadmap for designing the robust, high-performance data management frameworks essential for the next generation of autonomous embodied systems.
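One of the retrieval paradigms the survey analyzes, embedding-based retrieval as offered by vector databases, can be sketched in a few lines: observations are stored as embedding/payload pairs and fetched by cosine similarity. The two-dimensional embeddings and the records below are fabricated for illustration; a production EAI store would face exactly the latency and grounding tensions the survey describes.

```python
# Toy vector-store sketch: multimodal observations stored as embeddings and
# retrieved by cosine similarity. Embeddings and payloads are fabricated.
import math

store = []  # list of (embedding, payload) pairs

def add(embedding, payload):
    store.append((embedding, payload))

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def retrieve(query, k=1):
    # Rank stored observations by similarity to the query embedding.
    ranked = sorted(store, key=lambda item: cosine(item[0], query), reverse=True)
    return [payload for _, payload in ranked[:k]]

add([1.0, 0.0], {"modality": "rgb", "note": "kitchen view"})
add([0.0, 1.0], {"modality": "lidar", "note": "hallway scan"})
```

The linear scan in `retrieve` is the crux of the survey's tension: it preserves exact semantic ranking but scales poorly, which is why real systems trade exactness for approximate-nearest-neighbor indexes.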
- Asia > China > Beijing > Beijing (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Asia > Middle East > Jordan (0.04)
- Asia > China > Guangdong Province > Guangzhou (0.04)
- Information Technology (0.93)
- Education (0.67)
- Health & Medicine > Therapeutic Area > Neurology (0.46)
Empowering Virtual Agents With Intelligent Systems
While embodied AI is commonly understood as general-purpose intelligence that empowers various forms of robotics [9], we believe that its scope extends significantly beyond robotic platforms alone. Embodied AI, as we define it, refers to intelligent systems capable of learning from and actively interacting with their environments, continuously adapting based on real-time sensor feedback and context-driven decision-making. Specifically, we define Environmental Embodied AI as an intelligent virtual agent capable of real-time perception, learning, and interaction with its surrounding environment through sensor inputs, enabling it to actuate environmental elements. Distinct from traditional embodied AI systems primarily associated with robotic platforms, Environmental Embodied AI specifically emphasizes non-robotic applications, employing virtual agents to directly influence physical or operational states within environments. These intelligent systems autonomously analyze environmental data, dynamically adapting their behavior to optimize outcomes and significantly reduce ecological footprints, inherently supporting environmentally sustainable practices.
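A non-robotic agent of the kind defined above reduces, at its simplest, to a sense-decide-actuate loop over environmental state. The thermostat-style controller below is an invented example (the setpoint, band, and commands are all assumptions) meant only to show the loop structure, not any system from the article.

```python
# Illustrative sense-decide-actuate loop for a non-robotic "environmental"
# agent: a virtual controller reading a sensor and issuing setpoint commands.
# All values and rules are invented for illustration.

def decide(temperature, setpoint=21.0, band=0.5):
    # Hysteresis rule: actuate only when the reading leaves the comfort band.
    if temperature > setpoint + band:
        return "cool"
    if temperature < setpoint - band:
        return "heat"
    return "hold"

def run(readings):
    # Map a stream of sensor readings to actuation commands.
    return [decide(t) for t in readings]

commands = run([20.0, 21.2, 23.0])
```

The "learning" and "adaptation" in the definition would replace the fixed hysteresis rule with a policy updated from feedback; the loop around it stays the same.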