AITopics

World models are emerging as a transformative paradigm in artificial intelligence, enabling agents to construct internal representations of their environments for predictive reasoning, planning, and decision-making. By learning latent dynamics, world models provide a sample-efficient framework that is especially valuable in data-constrained or safety-critical scenarios. In this paper, we present a comprehensive overview of world models, highlighting their architecture, training paradigms, and applications across prediction, generation, planning, and causal reasoning. We compare and distinguish world models from related concepts such as digital twins, the metaverse, and foundation models, clarifying their unique role as embedded cognitive engines for autonomous agents. We further propose Wireless Dreamer, a novel world model-based reinforcement learning framework tailored for wireless edge intelligence optimization, particularly in low-altitude wireless networks (LAWNs). Through a weather-aware UAV trajectory planning case study, we demonstrate the effectiveness of our framework in improving learning efficiency and decision quality.

artificial intelligence, machine learning, world model, (14 more...)

2506.00417

Country: Asia (0.46)

Genre:

Overview (0.68)
Research Report (0.50)
Workflow (0.47)

Industry:

Information Technology (0.69)
Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)

France, Kordel K., Peddi, Rohith, Dennler, Nik, Daescu, Ovidiu

Position: Olfaction Standardization is Essential for the Advancement of Embodied Artificial Intelligence

Despite extraordinary progress in artificial intelligence (AI), modern systems remain incomplete representations of human cognition. Vision, audition, and language have received disproportionate attention due to well-defined benchmarks, standardized datasets, and consensus-driven scientific foundations. In contrast, olfaction - a high-bandwidth, evolutionarily critical sense - has been largely overlooked. This omission presents a foundational gap in the construction of truly embodied and ethically aligned super-human intelligence. We argue that the exclusion of olfactory perception from AI architectures is not due to irrelevance but to structural challenges: unresolved scientific theories of smell, heterogeneous sensor technologies, lack of standardized olfactory datasets, absence of AI-oriented benchmarks, and difficulty in evaluating sub-perceptual signal processing. These obstacles have hindered the development of machine olfaction despite its tight coupling with memory, emotion, and contextual reasoning in biological systems. In this position paper, we assert that meaningful progress toward general and embodied intelligence requires serious investment in olfactory research by the AI community. We call for cross-disciplinary collaboration - spanning neuroscience, robotics, machine learning, and ethics - to formalize olfactory benchmarks, develop multimodal datasets, and define the sensory capabilities necessary for machines to understand, navigate, and act within human environments. Recognizing olfaction as a core modality is essential not only for scientific completeness, but for building AI systems that are ethically grounded in the full scope of the human experience.

machine learning, natural language, olfaction, (17 more...)

2506.00398

Country:

North America > United States (0.93)
Europe (0.68)

Genre:

Research Report (0.64)
Overview (0.46)

Industry:

Consumer Products & Services (1.00)
Materials (0.93)
Health & Medicine > Therapeutic Area > Neurology (0.88)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(3 more...)

Feng, Shuai, Chan, Wei-Chuang, Chouhan, Srishti, Ayala, Junior Francisco Garcia, Medicherla, Srujananjali, Clark, Kyle, Shi, Mingwei

Whispers of Many Shores: Cultural Alignment through Collaborative Cultural Expertise

Current LLMs often lack the nuanced understanding required for diverse cultural contexts, and adapting them typically involves costly full fine-tuning. To address this, we introduce a novel soft prompt fine-tuning framework that enables efficient and modular cultural alignment. Our method utilizes vectorized prompt tuning to dynamically route queries to a committee of culturally specialized 'expert' LLM configurations, created by optimizing soft prompt embeddings without altering the base model's parameters. Extensive experiments demonstrate that our framework significantly enhances cultural sensitivity and adaptability, improving alignment scores from 0.208 to 0.820 (cf. Table 1), offering a robust solution for culturally-aware LLM deployment. This research paves the way for subsequent investigations into enhanced cultural coverage and dynamic expert adaptation, crucial for realizing autonomous AI with deeply nuanced understanding in a globally interconnected world.
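As a rough illustration of the routing idea in this abstract, the sketch below picks the expert whose key vector is closest to a query embedding and prepends that expert's soft prompt to the query's token embeddings, leaving the base model untouched. Everything in it is hypothetical: the names `SoftPromptExpert`, `route`, and `build_input`, the cosine-similarity routing rule, and the toy vectors are assumptions for illustration and are not taken from the paper.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass(frozen=True)
class SoftPromptExpert:
    """Hypothetical expert: a routing key plus learned soft prompt embeddings."""
    name: str
    key: list[float]                # vector used only for routing
    soft_prompt: list[list[float]]  # prompt embeddings, shape (tokens, dim)


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def route(query_vec: list[float], experts: list[SoftPromptExpert]) -> SoftPromptExpert:
    """Assumed routing rule: choose the expert with the most similar key."""
    return max(experts, key=lambda e: cosine(query_vec, e.key))


def build_input(query_tokens: list[list[float]], expert: SoftPromptExpert) -> list[list[float]]:
    """Prepend the expert's soft prompt; the base model's weights are not modified."""
    return expert.soft_prompt + query_tokens
```

In this sketch only the soft prompt embeddings and routing keys would be trained, which is what keeps each expert modular and cheap to add compared with full fine-tuning.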

large language model, machine learning, natural language, (17 more...)

2506.00242

Country:

North America > United States (1.00)
Europe (1.00)
Asia (1.00)
Africa (0.68)

Genre:

Research Report (0.65)
Questionnaire & Opinion Survey (0.46)

Industry:

Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.48)
Health & Medicine > Consumer Health (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

MedOrch: Medical Diagnosis with Tool-Augmented Reasoning Agents for Flexible Extensibility

He, Yexiao, Li, Ang, Liu, Boyi, Yao, Zhewei, He, Yuxiong

Healthcare decision-making represents one of the most challenging domains for Artificial Intelligence (AI), requiring the integration of diverse knowledge sources, complex reasoning, and various external analytical tools. Current AI systems often rely on either task-specific models, which offer limited adaptability, or general language models that are not grounded in specialized external knowledge and tools. We introduce MedOrch, a novel framework that orchestrates multiple specialized tools and reasoning agents to provide comprehensive medical decision support. MedOrch employs a modular, agent-based architecture that facilitates the flexible integration of domain-specific tools without altering the core system. Furthermore, it ensures transparent and traceable reasoning processes, enabling clinicians to meticulously verify each intermediate step underlying the system's recommendations. We evaluate MedOrch across three distinct medical applications: Alzheimer's disease diagnosis, chest X-ray interpretation, and medical visual question answering, using authentic clinical datasets. The results demonstrate MedOrch's competitive performance across these diverse medical tasks. Notably, in Alzheimer's disease diagnosis, MedOrch achieves an accuracy of 93.26%, surpassing the state-of-the-art baseline by over four percentage points. For predicting Alzheimer's disease progression, it attains a 50.35% accuracy, marking a significant improvement. In chest X-ray analysis, MedOrch exhibits superior performance with a Macro AUC of 61.2% and a Macro F1-score of 25.5%. Moreover, in complex multimodal visual question answering (Image+Table), MedOrch achieves an accuracy of 54.47%. These findings underscore MedOrch's potential to advance healthcare AI by enabling reasoning-driven tool utilization for multimodal medical data processing and supporting intricate cognitive tasks in clinical decision-making.
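The two architectural properties named in this abstract, tools that can be added without changing the core system and a reasoning trace that can be checked step by step, can be sketched generically as follows. This is not MedOrch's implementation: the `ToolRegistry` class, its methods, and the toy text tools are hypothetical names chosen for illustration only.

```python
from typing import Any, Callable


class ToolRegistry:
    """Hypothetical registry: tools are plugged in by name, the core loop never changes."""

    def __init__(self) -> None:
        self._tools: dict[str, Callable[[Any], Any]] = {}

    def register(self, name: str, fn: Callable[[Any], Any]) -> None:
        """Add a tool without touching the orchestration logic."""
        self._tools[name] = fn

    def run(self, plan: list[str], case: Any) -> list[dict[str, Any]]:
        """Execute each planned tool on the case and record every intermediate step."""
        trace = []
        for step, name in enumerate(plan, start=1):
            if name not in self._tools:
                raise KeyError(f"unknown tool: {name}")
            output = self._tools[name](case)
            trace.append({"step": step, "tool": name, "output": output})
        return trace


# Toy tools standing in for domain-specific analyzers (assumptions, not real ones).
registry = ToolRegistry()
registry.register("word_count", lambda text: len(text.split()))
registry.register("mentions_memory", lambda text: "memory" in text.lower())

trace = registry.run(["word_count", "mentions_memory"], "Patient reports memory loss")
for entry in trace:
    print(entry)
```

Because every tool call is appended to the trace with its output, a reviewer can inspect each intermediate result rather than only the final recommendation.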

large language model, machine learning, natural language, (16 more...)

2506.00235

Genre:

Research Report > New Finding (0.68)
Research Report > Experimental Study (0.46)

Industry:

Health & Medicine > Therapeutic Area > Neurology > Alzheimer's Disease (1.00)
Health & Medicine > Diagnostic Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(2 more...)

Sorrel: A simple and flexible framework for multi-agent reinforcement learning

Gelpí, Rebekah A., Ju, Yibing, Jackson, Ethan C., Tang, Yikai, Verch, Shon, Voelcker, Claas, Cunningham, William A.

We introduce Sorrel (https://github.com/social-ai-uoft/sorrel), a simple Python interface for generating and testing new multi-agent reinforcement learning environments. This interface emphasizes simplicity and accessibility, and uses a more psychologically intuitive structure for the basic agent-environment loop, making it a useful tool for social scientists to investigate how learning and social interaction lead to the development and change of group dynamics. In this short paper, we outline the basic design philosophy and features of Sorrel.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

2506.00228

Country:

North America > Canada > Ontario > Toronto (0.16)
North America > United States (0.14)

Genre: Research Report (0.40)

Industry: Education (0.35)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.34)

Literature Review Of Multi-Agent Debate For Problem-Solving

Tillmann, Arne

Multi-agent large language models (MA-LLMs) are a rapidly growing research area that leverages multiple interacting language agents to tackle complex tasks, outperforming single-agent large language models. This literature review synthesizes the latest research on agent profiles, communication structures, and decision-making processes, drawing insights from both traditional multi-agent systems and state-of-the-art MA-LLM studies. In doing so, it aims to address the lack of direct comparisons in the field, illustrating how factors like scalability, communication structure, and decision-making processes influence MA-LLM performance. By examining frequent practices and outlining current challenges, the review reveals that multi-agent approaches can yield superior results but also face elevated computational costs and under-explored challenges unique to MA-LLM. Overall, these findings provide researchers and practitioners with a roadmap for developing robust and efficient multi-agent AI solutions.

artificial intelligence, large language model, natural language, (17 more...)

2506.00066

Country:

Asia (1.00)
North America > United States (0.69)
Europe > Germany > Lower Saxony (0.28)

Genre:

Overview (1.00)
Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.69)

Sucholutsky, Ilia, Collins, Katherine M., Jacoby, Nori, Thompson, Bill D., Hawkins, Robert D.

Using LLMs to Advance the Cognitive Science of Collectives

Cognitive science and artificial intelligence (AI) have grown up together as fields. The computational models of human minds developed in cognitive science have long served as benchmarks to articulate what it means for a system to be flexibly intelligent. Recent advances in AI, particularly around large language models (LLMs), are creating new opportunities to reciprocate this influence. Already, LLMs are being offered as scalable "cognitive models" of human behavior [Binz et al., 2024], automatic analysts of unstructured psychological text [Rathje et al., 2024], and components in neurosymbolic cognitive architectures (e.g., [Wong et al., 2023]). However, most applications of LLMs to cognitive science have so far focused on individual cognition.

artificial intelligence, large language model, natural language, (14 more...)

2506.00052

Country: North America > United States (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Cognitive Architectures (1.00)

From Motion to Behavior: Hierarchical Modeling of Humanoid Generative Behavior Control

Zhang, Jusheng, Tang, Jinzhou, Liu, Sidi, Li, Mingyan, Zhang, Sheng, Wang, Jian, Wang, Keze

Human motion generative modeling or synthesis aims to characterize complicated human motions of daily activities in diverse real-world environments. However, current research predominantly focuses on either low-level, short-period motions or high-level action planning, without taking into account the hierarchical goal-oriented nature of human activities. In this work, we take a step forward from human motion generation to human behavior modeling, which is inspired by cognitive science. We present a unified framework, dubbed Generative Behavior Control (GBC), to model diverse human motions driven by various high-level intentions by aligning motions with hierarchical behavior plans generated by large language models (LLMs). Our insight is that human motions can be jointly controlled by task and motion planning in robotics, but guided by LLMs to achieve improved motion diversity and physical fidelity. Meanwhile, to overcome the limitations of existing benchmarks, i.e., lack of behavioral plans, we propose the GBC-100K dataset, annotated with a hierarchical granularity of semantic and motion plans driven by target goals. Our experiments demonstrate that GBC can generate more diverse and purposeful high-quality human motions with 10× longer horizons compared with existing methods when trained on GBC-100K, laying a foundation for future research on behavioral modeling of human motions. Our dataset and source code will be made publicly available.

large language model, machine learning, natural language, (20 more...)

2506.00043

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots > Robot Planning & Action (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(2 more...)

Neural Information Processing Systems

Jun-2-2025, 01:11:33 GMT

Reviews: Robust Multi-agent Counterfactual Prediction

Reviewers found this paper to be an original and useful addition to the field of multi-agent games. While some of the presentation could be clarified (see specific reviewer comments, e.g. about the revelation game), there was a consensus that the paper is generally well-written and clear enough for publication, with the proposed corrections.

review, robust multi-agent counterfactual prediction

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.78)

Neural Information Processing Systems

Jun-2-2025, 00:33:33 GMT

Reviews: Multi-Agent Common Knowledge Reinforcement Learning

My two biggest complaints center on 1) the illustrative single-step matrix game of section 4.1 and figure 3 and 2) the practical applications of MACKRL. 1) Since the primary role of the single-step matrix game in section 4.1 is illustrative, it should be much clearer what is going on. How are all 3 policies parameterized? What information does each have access to? What is the training data? First, let's focus on the JAL policy. As presented up until this point in the paper, JAL means centralized training *and* execution.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.52)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.51)