AITopics

2505.1476

Country:

North America > United States (0.28)
Europe > Spain (0.28)

Genre: Research Report (0.64)

Industry: Telecommunications (0.35)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.46)

WIREDMay-21-2025, 16:19:37 GMT

Jack Dorsey's Block Made an AI Agent to Boost Its Own Productivity

At a company-wide hackathon this month, developers at finance firm Block built a dizzying number of prototype tools including a database debugger, a program for identifying duplicated code, and an app that automates Bitcoin support. The sudden productivity boost was driven by Goose, an artificial intelligence agent developed by Block several months ago that can help with coding and other work like knocking together data visualizations or mocking up new product features. "We've always had really strong hack weeks, but this one was at another level," says Jackie Brosamer, who leads the AI and data platform at Block. "We have tens of ideas that we're looking to bring to production." Goose helped developers at Block to develop a new agent-to-agent communication server at the hackathon.

goose, jack dorsey, own productivity, (7 more...)

WIRED

Industry: Banking & Finance (0.80)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

Embedded Mean Field Reinforcement Learning for Perimeter-defense Game

Wang, Li, Yu, Xin, Lv, Xuxin, Ai, Gangzheng, Wu, Wenjun

With the rapid advancement of unmanned aerial vehicles (UAVs) and missile technologies, perimeter-defense game between attackers and defenders for the protection of critical regions have become increasingly complex and strategically significant across a wide range of domains. However, existing studies predominantly focus on small-scale, simplified two-dimensional scenarios, often overlooking realistic environmental perturbations, motion dynamics, and inherent heterogeneity--factors that pose substantial challenges to real-world applicability. To bridge this gap, we investigate large-scale heterogeneous perimeter-defense game in a three-dimensional setting, incorporating realistic elements such as motion dynamics and wind fields. We derive the Nash equilibrium strategies for both attackers and defenders, characterize the victory regions, and validate our theoretical findings through extensive simulations. To tackle large-scale heterogeneous control challenges in defense strategies, we propose an Embedded Mean-Field Actor-Critic (EMFAC) framework. EMFAC leverages representation learning to enable high-level action aggregation in a mean-field manner, supporting scalable coordination among defenders. Furthermore, we introduce a lightweight agent-level attention mechanism based on reward representation, which selectively filters observations and mean-field information to enhance decision-making efficiency and accelerate convergence in large-scale tasks. Extensive simulations across varying scales demonstrate the effectiveness and adaptability of EMFAC, which outperforms established baselines in both convergence speed and overall performance. To further validate practicality, we test EMFAC in small-scale real-world experiments and conduct detailed analyses, offering deeper insights into the framework's effectiveness in complex scenarios.

artificial intelligence, machine learning, reinforcement learning, (13 more...)

2505.14209

Country:

Asia > China > Beijing > Beijing (0.04)
Asia > China > Zhejiang Province > Hangzhou (0.04)
North America > United States > Maryland > Prince George's County > College Park (0.04)
(2 more...)

Genre: Research Report (0.64)

Industry:

Government > Military (1.00)
Education (0.68)
Information Technology (0.66)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Model Cards for AI Teammates: Comparing Human-AI Team Familiarization Methods for High-Stakes Environments

Bowers, Ryan, Agbeyibor, Richard, Kolb, Jack, Feigh, Karen

-- We compare three methods of familiarizing a human with an artificial intelligence (AI) teammate ("agent") prior to operation in a collaborative, fast-paced intelligence, surveillance, and reconnaissance (ISR) environment. In a between-subjects user study (n=60), participants either read documentation about the agent, trained alongside the agent prior to the mission, or were given no familiarization. Results showed that the most valuable information about the agent included details of its decision-making algorithms and its relative strengths and weaknesses compared to the human. This information allowed the familiarization groups to form sophisticated team strategies more quickly than the control group. Documentation-based familiarization led to the fastest adoption of these strategies, but also biased participants towards risk-averse behavior that prevented high scores. Participants familiarized through direct interaction were able to infer much of the same information through observation, and were more willing to take risks and experiment with different control modes, but reported weaker understanding of the agent's internal processes. Significant differences were seen between individual participants' risk tolerance and methods of AI interaction, which should be considered when designing human-AI control interfaces. Based on our findings, we recommend a human-AI team familiarization method that combines AI documentation, structured in-situ training, and exploratory interaction. I. INTRODUCTION Governments have long sought to reduce reliance on human operators in high-stakes domains such as aircraft surveillance, coastal scanning, and mountainous search-and-rescue. Simultaneously, the capabilities of deep learning techniques and accessibility of high-powered compute resources have made autonomous teammates technically viable for many use cases. In recent years governments have begun supporting research to apply embodied artificial intelligence (AI) platforms to reduce the number of humans sent into high-risk scenarios by increasing the level of authority granted to AI systems in human-AI teams.

artificial intelligence, deep learning, machine learning, (17 more...)

2505.13773

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > Georgia > Fulton County > Atlanta (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Government (0.70)
Aerospace & Defense (0.68)
Leisure & Entertainment > Games > Computer Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.54)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.46)

MLZero: A Multi-Agent System for End-to-end Machine Learning Automation

Fang, Haoyang, Han, Boran, Erickson, Nick, Zhang, Xiyuan, Zhou, Su, Dagar, Anirudh, Zhang, Jiani, Turkmen, Ali Caner, Hu, Cuixiong, Rangwala, Huzefa, Wu, Ying Nian, Wang, Bernie, Karypis, George

Existing AutoML systems have advanced the automation of machine learning (ML); however, they still require substantial manual configuration and expert input, particularly when handling multimodal data. We introduce MLZero, a novel multi-agent framework powered by Large Language Models (LLMs) that enables end-to-end ML automation across diverse data modalities with minimal human intervention. A cognitive perception module is first employed, transforming raw multimodal inputs into perceptual context that effectively guides the subsequent workflow. To address key limitations of LLMs, such as hallucinated code generation and outdated API knowledge, we enhance the iterative code generation process with semantic and episodic memory. MLZero demonstrates superior performance on MLE-Bench Lite, outperforming all competitors in both success rate and solution quality, securing six gold medals. Additionally, when evaluated on our Multimodal AutoML Agent Benchmark, which includes 25 more challenging tasks spanning diverse data modalities, MLZero outperforms the competing methods by a large margin with a success rate of 0.92 (+263.6\%) and an average rank of 2.28. Our approach maintains its robust effectiveness even with a compact 8B LLM, outperforming full-size systems from existing solutions.

large language model, machine learning, natural language, (13 more...)

2505.13941

Country:

North America > Canada > Ontario > Toronto (0.14)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
South America > Paraguay > Asunción > Asunción (0.04)
(8 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Instructional Material > Course Syllabus & Notes (0.92)

Industry:

Transportation > Air (1.00)
Information Technology (1.00)
Education (1.00)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Prompt Stability Matters: Evaluating and Optimizing Auto-Generated Prompt in General-Purpose Systems

Chen, Ke, Zhou, Yufei, Zhang, Xitong, Wang, Haohan

Automatic prompt generation plays a crucial role in enabling general-purpose multi-agent systems to perform diverse tasks autonomously. Existing methods typically evaluate prompts based on their immediate task performance, overlooking the intrinsic qualities that determine their reliability. This outcome-centric view not only limits interpretability but also fails to account for the inherent stochasticity of large language models (LLMs). In this work, we bring attention to prompt stability-the consistency of model responses across repeated executions-as a key factor for building robust and effective prompt generation systems. To quantify this, we propose semantic stability as a criterion for assessing the response consistency of prompts, and fine-tune a LLaMA-based evaluator to measure it automatically across tasks. These components have enabled us to develop the first stability-aware general-purpose prompt generation system that leverages stability feedback to iteratively enhance both prompt quality and system-level performance. Furthermore, we establish a logical chain between prompt stability and task success by analyzing the structural dependencies within our system, proving stability as a necessary condition for effective system-level execution. Empirical results across general and domain-specific tasks demonstrate that our stability-aware framework improves both accuracy and output consistency. By shifting the focus from one-off results to persistent reliability, our work offers a new perspective on prompt design and contributes practical tools for building more trustworthy general-purpose systems.

artificial intelligence, natural language, stability, (16 more...)

2505.13546

Country: North America > United States > Illinois (0.14)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)

Bhardwaj, Devansh, Beniwal, Arjun, Chaudhari, Shreyas, Kalyan, Ashwin, Rajpurohit, Tanmay, Narasimhan, Karthik R., Deshpande, Ameet, Murahari, Vishvak

Agent Context Protocols Enhance Collective Inference

AI agents have become increasingly adept at complex tasks such as coding, reasoning, and multimodal understanding. However, building generalist systems requires moving beyond individual agents to collective inference -- a paradigm where multi-agent systems with diverse, task-specialized agents complement one another through structured communication and collaboration. Today, coordination is usually handled with imprecise, ad-hoc natural language, which limits complex interaction and hinders interoperability with domain-specific agents. We introduce Agent context protocols (ACPs): a domain- and agent-agnostic family of structured protocols for agent-agent communication, coordination, and error handling. ACPs combine (i) persistent execution blueprints -- explicit dependency graphs that store intermediate agent outputs -- with (ii) standardized message schemas, enabling robust and fault-tolerant multi-agent collective inference. ACP-powered generalist systems reach state-of-the-art performance: 28.3 % accuracy on AssistantBench for long-horizon web assistance and best-in-class multimodal technical reports, outperforming commercial AI systems in human evaluation. ACPs are highly modular and extensible, allowing practitioners to build top-tier generalist agents quickly.

agent, artificial intelligence, query, (15 more...)

2505.14569

Country:

Europe (1.00)
North America > United States > California (0.28)

Genre: Research Report (0.50)

Industry:

Health & Medicine (0.68)
Banking & Finance (0.68)
Consumer Products & Services > Travel (0.49)
Transportation > Ground > Road (0.46)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

Reachability Barrier Networks: Learning Hamilton-Jacobi Solutions for Smooth and Flexible Control Barrier Functions

Kim, Matthew, Sharpless, William, Jeong, Hyun Joe, Tonkens, Sander, Bansal, Somil, Herbert, Sylvia

Recent developments in autonomous driving and robotics underscore the necessity of safety-critical controllers. Control barrier functions (CBFs) are a popular method for appending safety guarantees to a general control framework, but they are notoriously difficult to generate beyond low dimensions. Existing methods often yield non-differentiable or inaccurate approximations that lack integrity, and thus fail to ensure safety. In this work, we use physics-informed neural networks (PINNs) to generate smooth approximations of CBFs by computing Hamilton-Jacobi (HJ) optimal control solutions. These reachability barrier networks (RBNs) avoid traditional dimensionality constraints and support the tuning of their conservativeness post-training through a parameterized discount term. To ensure robustness of the discounted solutions, we leverage conformal prediction methods to derive probabilistic safety guarantees for RBNs. We demonstrate that RBNs are highly accurate in low dimensions, and safer than the standard neural CBF approach in high dimensions. Namely, we showcase the RBNs in a 9D multi-vehicle collision avoidance problem where it empirically proves to be 5.5x safer and 1.9x less conservative than the neural CBFs, offering a promising method to synthesize CBFs for general nonlinear autonomous systems.

artificial intelligence, controller, machine learning, (18 more...)

2505.11755

Country: North America > United States (0.28)

Genre: Research Report (0.84)

Industry: Transportation (0.70)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

Bi-level Mean Field: Dynamic Grouping for Large-Scale MARL

Zheng, Yuxuan, Zhou, Yihe, Xu, Feiyang, Song, Mingli, Liu, Shunyu

Large-scale Multi-Agent Reinforcement Learning (MARL) often suffers from the curse of dimensionality, as the exponential growth in agent interactions significantly increases computational complexity and impedes learning efficiency. To mitigate this, existing efforts that rely on Mean Field (MF) simplify the interaction landscape by approximating neighboring agents as a single mean agent, thus reducing overall complexity to pairwise interactions. However, these MF methods inevitably fail to account for individual differences, leading to aggregation noise caused by inaccurate iterative updates during MF learning. In this paper, we propose a Bi-level Mean Field (BMF) method to capture agent diversity with dynamic grouping in large-scale MARL, which can alleviate aggregation noise via bi-level interaction. Specifically, BMF introduces a dynamic group assignment module, which employs a Variational AutoEncoder (VAE) to learn the representations of agents, facilitating their dynamic grouping over time. Furthermore, we propose a bi-level interaction module to model both inter- and intra-group interactions for effective neighboring aggregation. Experiments across various tasks demonstrate that the proposed BMF yields results superior to the state-of-the-art methods.

artificial intelligence, machine learning, reinforcement learning, (12 more...)

2505.06706

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.94)

Quantum-Enhanced Hybrid Reinforcement Learning Framework for Dynamic Path Planning in Autonomous Systems

Tomar, Sahil, Alam, Shamshe, Kumar, Sandeep, Mathur, Amit

In this paper, a novel quantum classical hybrid framework is proposed that synergizes quantum with Classical Reinforcement Learning. By leveraging the inherent parallelism of quantum computing, the proposed approach generates robust Q tables and specialized turn cost estimations, which are then integrated with a classical Reinforcement Learning pipeline. The Classical Quantum fusion results in rapid convergence of training, reducing the training time significantly and improved adaptability in scenarios featuring static, dynamic, and moving obstacles. Simulator based evaluations demonstrate significant enhancements in path efficiency, trajectory smoothness, and mission success rates, underscoring the potential of framework for real time, autonomous navigation in complex and unpredictable environments. Furthermore, the proposed framework was tested beyond simulations on practical scenarios, including real world map data such as the IIT Delhi campus, reinforcing its potential for real time, autonomous navigation in complex and unpredictable environments.

machine learning, obstacle, reinforcement learning, (16 more...)

2504.2066

Genre: Research Report (1.00)

Industry:

Transportation (1.00)
Information Technology > Robotics & Automation (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.86)