AITopics

2508.13721

Genre: Research Report > New Finding (1.00)

Industry: Leisure & Entertainment > Games (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Tokmak, Abdullah, Schön, Thomas B., Baumann, Dominik

Towards safe control parameter tuning in distributed multi-agent systems

Many safety-critical real-world problems, such as autonomous driving and collaborative robots, are of a distributed multi-agent nature. To optimize the performance of these systems while ensuring safety, we can cast them as distributed optimization problems, where each agent aims to optimize their parameters to maximize a coupled reward function subject to coupled constraints. Prior work either studies a centralized setting, does not consider safety, or struggles with sample efficiency. Since we require sample efficiency and work with unknown and nonconvex rewards and constraints, we solve this optimization problem using safe Bayesian optimization with Gaussian process regression. Moreover, we consider nearest-neighbor communication between the agents. To capture the behavior of non-neighboring agents, we reformulate the static global optimization problem as a time-varying local optimization problem for each agent, essentially introducing time as a latent variable. To this end, we propose a custom spatio-temporal kernel to integrate prior knowledge. We show the successful deployment of our algorithm in simulations.
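The abstract does not specify the custom spatio-temporal kernel, so the following is only a minimal sketch of the general idea. It assumes a product of two squared-exponential kernels, one over an agent's control parameters and one over time. The function names, lengthscales, and the single-observation posterior are hypothetical illustrations, not the authors' method.

```python
import math

def rbf(a, b, lengthscale):
    """Squared-exponential kernel between two equal-length vectors."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-0.5 * sq_dist / lengthscale ** 2)

def spatio_temporal_kernel(x1, t1, x2, t2, ls_space=0.5, ls_time=5.0):
    """Assumed product kernel: parameter similarity scaled by similarity in time.
    The lengthscales are illustrative placeholders, not values from the paper."""
    return rbf(x1, x2, ls_space) * rbf([t1], [t2], ls_time)

def posterior_from_one_observation(x_obs, t_obs, y_obs, x_new, t_new, noise=1e-4):
    """Zero-mean GP posterior (mean, std) at (x_new, t_new) given one noisy observation."""
    k_oo = spatio_temporal_kernel(x_obs, t_obs, x_obs, t_obs) + noise
    k_no = spatio_temporal_kernel(x_new, t_new, x_obs, t_obs)
    k_nn = spatio_temporal_kernel(x_new, t_new, x_new, t_new)
    mean = k_no / k_oo * y_obs
    var = max(k_nn - k_no ** 2 / k_oo, 0.0)  # clamp tiny negative round-off
    return mean, math.sqrt(var)

# Same parameters queried right away and 20 time steps later.
mean_now, std_now = posterior_from_one_observation([0.2], 0.0, 1.0, [0.2], 0.0)
mean_later, std_later = posterior_from_one_observation([0.2], 0.0, 1.0, [0.2], 20.0)
```

Under these assumptions, an old observation says less about the current reward as time passes, so the posterior uncertainty at the same parameters grows. In a time-varying local problem, this is one way unobserved changes made by non-neighboring agents could be reflected in the confidence bounds that safe Bayesian optimization relies on.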

artificial intelligence, control parameter, optimization problem, (16 more...)

2508.13608

Country: Europe (0.68)

Genre: Research Report (0.40)

Industry:

Transportation > Ground > Road (0.48)
Energy (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

Park, Junyeong, Cho, Hyeonseo, Ahn, Sungjin

CrafterDojo: A Suite of Foundation Models for Building Open-Ended Embodied Agents in Crafter

Developing general-purpose embodied agents is a core challenge in AI. Minecraft provides rich complexity and internet-scale data, but its slow speed and engineering overhead make it unsuitable for rapid prototyping. Crafter offers a lightweight alternative that retains key challenges from Minecraft, yet its use has remained limited to narrow tasks due to the absence of foundation models that have driven progress in the Minecraft setting. In this paper, we present CrafterDojo, a suite of foundation models and tools that unlock the Crafter environment as a lightweight, prototyping-friendly, and Minecraft-like testbed for general-purpose embodied agent research. CrafterDojo addresses this by introducing CrafterVPT, CrafterCLIP, and CrafterSteve-1 for behavior priors, vision-language grounding, and instruction following, respectively. In addition, we provide toolkits for generating behavior and caption datasets (CrafterPlay and CrafterCaption), reference agent implementations, benchmark evaluations, and a complete open-source codebase.

agent, artificial intelligence, c-steve-1, (16 more...)

2508.1353

Genre: Research Report > New Finding (0.46)

Industry: Leisure & Entertainment > Games > Computer Games (0.97)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

LM Agents May Fail to Act on Their Own Risk Knowledge

Tang, Yuzhi, Li, Tianxiao, Li, Elizabeth, Maddison, Chris J., Dong, Honghua, Ruan, Yangjun

Language model (LM) agents have demonstrated significant potential for automating real-world tasks, yet they pose a diverse array of potential, severe risks in safety-critical scenarios. In this work, we identify a significant gap between LM agents' risk awareness and safety execution abilities: while they often answer "Yes" to queries like "Is executing 'sudo rm -rf /*' dangerous?", they will likely fail to identify such risks in instantiated trajectories or even directly perform these risky actions when acting as agents. To systematically investigate this, we develop a comprehensive evaluation framework to examine agents' safety across three progressive dimensions: 1) their knowledge about potential risks, 2) their ability to identify corresponding risks in execution trajectories, and 3) their actual behaviors to avoid executing these risky actions. Our evaluation reveals two critical performance gaps that resemble the generator-validator gaps observed in LMs: while agents demonstrate near-perfect risk knowledge (>98% pass rates), they fail to apply this knowledge when identifying risks in actual scenarios (with performance dropping by >23%) and often still execute risky actions (<26% pass rates). Notably, this trend persists across more capable LMs as well as in specialized reasoning models like DeepSeek-R1, indicating that simply scaling model capabilities or inference compute does not inherently resolve safety concerns. Instead, we take advantage of these observed gaps to develop a risk verifier that independently critiques the proposed actions by agents, with an abstractor that converts specific execution trajectories into abstract descriptions where LMs can more effectively identify the risks. Our overall system achieves a significant reduction of risky action execution by 55.3% over vanilla-prompted agents.

large language model, machine learning, natural language, (21 more...)

2508.13465

Genre: Research Report > New Finding (0.67)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)

Rajcic, Nina, Søgaard, Anders

Goal-Directedness is in the Eye of the Beholder

Our ability to predict the behavior of complex agents turns on the attribution of goals. Probing for goal-directed behavior comes in two flavors: Behavioral and mechanistic. The former proposes that goal-directedness can be estimated through behavioral observation, whereas the latter attempts to probe for goals in internal model states. We work through the assumptions behind both approaches, identifying technical and conceptual problems that arise from formalizing goals in agent systems. We arrive at the perhaps surprising position that goal-directedness cannot be measured objectively. We outline new directions for modeling goal-directedness as an emergent property of dynamic, multi-agent systems.

agent, artificial intelligence, assumption, (16 more...)

2508.13247

Genre: Research Report (0.41)

Industry: Leisure & Entertainment (0.68)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

Fushan Li, Michael Bowling

Ease-of-Teaching and Language Structure from Emergent Communication

Neural Information Processing Systems, Aug-19-2025, 23:38:10 GMT

Neural Information Processing Systems http://nips.cc/

listener, regime, topographic similarity, (14 more...)

Country:

North America > Canada > Alberta > Census Division No. 11 > Edmonton Metropolitan Region > Edmonton (0.14)
Europe > Denmark > Capital Region > Copenhagen (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.96)
Information Technology > Artificial Intelligence > Natural Language (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

Gabriele Farina, Christian Kroer, Tuomas Sandholm

Optimistic Regret Minimization for Extensive-Form Games via Dilated Distance-Generating Functions

Neural Information Processing Systems, Aug-19-2025, 23:33:02 GMT

In order to apply these algorithms to extensive-form games, a distance-generating function is needed.

algorithm, decision point, tuomas sandholm, (14 more...)

Country:

North America > United States > Texas (0.05)
North America > United States > New Jersey > Mercer County > Princeton (0.04)
North America > United States > District of Columbia > Washington (0.04)
North America > Canada (0.04)

Genre: Research Report (0.46)

Industry: Leisure & Entertainment > Games (0.96)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.47)

Hanrui Zhang, Yu Cheng, Vincent Conitzer

Distinguishing Distributions When Samples Are Strategically Transformed

Neural Information Processing Systems, Aug-19-2025, 23:13:06 GMT

Often, a principal must make a decision based on data provided by an agent.

agent, bad distribution, dtv, (14 more...)

Country:

North America > United States > North Carolina > Durham County > Durham (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > Canada (0.04)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.69)

Neural Information Processing Systems, Aug-19-2025, 22:13:13 GMT

fdff3c4130c24c40c88aa41eb52d2a27-Supplemental-Conference.pdf

agent, artificial intelligence, machine learning, (18 more...)

Country: North America > United States > Arizona > Maricopa County > Tempe (0.05)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Robots (0.97)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.70)

Neural Information Processing Systems, Aug-19-2025, 22:13:09 GMT

Explicable Policy Search

Human teammates often form conscious and subconscious expectations of each other during interaction. Teaming success is contingent on whether such expectations can be met. Similarly, for an intelligent agent to operate beside a human, it must consider the human's expectation of its behavior. Disregarding such expectations can lead to the loss of trust and degraded team performance. A key challenge here is that the human's expectation may not align with the agent's

artificial intelligence, machine learning, reinforcement learning, (17 more...)