subagent
Probabilistic Modeling of Latent Agentic Substructures in Deep Neural Networks
Lee, Su Hyeong, Kondor, Risi, Ngo, Richard
We develop a theory of intelligent agency grounded in probabilistic modeling for neural models. Agents are represented as outcome distributions with epistemic utility given by log score, and compositions are defined through weighted logarithmic pooling that strictly improves every member's welfare. We prove that strict unanimity is impossible under linear pooling or in binary outcome spaces, but possible with three or more outcomes. Our framework admits recursive structure via cloning invariance, continuity, and openness, while tilt-based analysis rules out trivial duplication. Finally, we formalize an agentic alignment phenomenon in LLMs using our theory: eliciting a benevolent persona ("Luigi'") induces an antagonistic counterpart ("Waluigi"), while a manifest-then-suppress Waluigi strategy yields strictly larger first-order misalignment reduction than pure Luigi reinforcement alone. These results clarify how developing a principled mathematical framework for how subagents can coalesce into coherent higher-level entities provides novel implications for alignment in agentic AI systems.
The AI Data Scientist
Akimov, Farkhad, Nwadike, Munachiso Samuel, Iklassov, Zangir, Takáč, Martin
Imagine decision-makers uploading data and, within minutes, receiving clear, actionable insights delivered straight to their fingertips. That is the promise of the AI Data Scientist, an autonomous Agent powered by large language models (LLMs) that closes the gap between evidence and action. Rather than simply writing code or responding to prompts, it reasons through questions, tests ideas, and delivers end-to-end insights at a pace far beyond traditional workflows. Guided by the scientific tenet of the hypothesis, this Agent uncovers explanatory patterns in data, evaluates their statistical significance, and uses them to inform predictive modeling. It then translates these results into recommendations that are both rigorous and accessible. At the core of the AI Data Scientist is a team of specialized LLM Subagents, each responsible for a distinct task such as data cleaning, statistical testing, validation, and plain-language communication. These Subagents write their own code, reason about causality, and identify when additional data is needed to support sound conclusions. Together, they achieve in minutes what might otherwise take days or weeks, enabling a new kind of interaction that makes deep data science both accessible and actionable.
- Information Technology (0.68)
- Banking & Finance (0.68)
- Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.46)
- Education > Educational Setting > Online (0.46)
GeoFlow: Agentic Workflow Automation for Geospatial Tasks
Bhattaram, Amulya, Chung, Justin, Chung, Stanley, Gupta, Ranit, Ramamoorthy, Janani, Gullapalli, Kartikeya, Marculescu, Diana, Stamoulis, Dimitrios
We present GeoFlow, a method that automatically generates agentic workflows for geospatial tasks. Unlike prior work that focuses on reasoning decomposition and leaves API selection implicit, our method provides each agent with detailed tool-calling objectives to guide geospatial API invocation at runtime. GeoFlow increases agentic success by 6.8% and reduces token usage by up to fourfold across major LLM families compared to state-of-the-art approaches.
- Research Report > New Finding (0.47)
- Research Report > Promising Solution (0.34)
Towards resilient cities: A hybrid simulation framework for risk mitigation through data driven decision making
Carraminana, David, Bernardos, Ana M., Besada, Juan A., Casar, Jose R.
Providing a comprehensive view of the city operation and offering useful metrics for decision making is a well known challenge for urban risk analysis systems. Existing systems are, in many cases, generalizations of previous domain specific tools and or methodologies that may not cover all urban interdependencies and makes it difficult to have homogeneous indicators. In order to overcome this limitation while seeking for effective support to decision makers, this article introduces a novel hybrid simulation framework for risk mitigation. The framework is built on a proposed city concept that considers urban space as a Complex Adaptive System composed by interconnected Critical Infrastructures. In this concept, a Social System, which models daily patterns and social interactions of the citizens in the Urban Landscape, drives the CIs demand to configure the full city picture. The frameworks hybrid design integrates agent based and network based modeling by breaking down city agents into system dependent subagents, to enable both inter and intra system interaction simulation, respectively. A layered structure of indicators at different aggregation levels is also developed, to ensure that decisions are not only data driven but also explainable. Therefore, the proposed simulation framework can serve as a DSS tool that allows the quantitative analysis of the impact of threats at different levels. First, system level metrics can be used to get a broad view on the city resilience. Then, agent level metrics back those figures and provide better explainability. On implementation, the proposed framework enables component reusability (for eased coding), simulation federation (enabling the integration of existing system oriented simulators), discrete simulation in accelerated time (for rapid scenario simulation) and decision oriented visualization (for informed outputs).
- Research Report (0.50)
- Workflow (0.46)
- Water & Waste Management > Water Management (1.00)
- Transportation > Infrastructure & Services (1.00)
- Transportation > Ground > Road (1.00)
- (8 more...)
TDAG: A Multi-Agent Framework based on Dynamic Task Decomposition and Agent Generation
Wang, Yaoxiang, Wu, Zhiyong, Yao, Junfeng, Su, Jinsong
The emergence of Large Language Models (LLMs) like ChatGPT has inspired the development of LLM-based agents capable of addressing complex, real-world tasks. However, these agents often struggle during task execution due to methodological constraints, such as error propagation and limited adaptability. To address this issue, we propose a multi-agent framework based on dynamic Task Decomposition and Agent Generation (TDAG). This framework dynamically decomposes complex tasks into smaller subtasks and assigns each to a specifically generated subagent, thereby enhancing adaptability in diverse and unpredictable real-world tasks. Simultaneously, existing benchmarks often lack the granularity needed to evaluate incremental progress in complex, multi-step tasks. In response, we introduce ItineraryBench in the context of travel planning, featuring interconnected, progressively complex tasks with a fine-grained evaluation system. ItineraryBench is designed to assess agents' abilities in memory, planning, and tool usage across tasks of varying complexity. Our experimental results reveal that TDAG significantly outperforms established baselines, showcasing its superior adaptability and context awareness in complex task scenarios.
Inroads into Autonomous Network Defence using Explained Reinforcement Learning
Foley, Myles, Wang, Mia, M, Zoe, Hicks, Chris, Mavroudis, Vasilios
Computer network defence is a complicated task that has necessitated a high degree of human involvement. However, with recent advancements in machine learning, fully autonomous network defence is becoming increasingly plausible. This paper introduces an end-to-end methodology for studying attack strategies, designing defence agents and explaining their operation. First, using state diagrams, we visualise adversarial behaviour to gain insight about potential points of intervention and inform the design of our defensive models. We opt to use a set of deep reinforcement learning agents trained on different parts of the task and organised in a shallow hierarchy. Our evaluation shows that the resulting design achieves a substantial performance improvement compared to prior work. Finally, to better investigate the decision-making process of our agents, we complete our analysis with a feature ablation and importance study.
- Asia (0.04)
- North America > United States > Virginia > Arlington County > Arlington (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- Information Technology > Security & Privacy (1.00)
- Leisure & Entertainment > Games > Computer Games (0.46)
- Government > Military > Cyberwarfare (0.46)
Deceptive Reinforcement Learning in Model-Free Domains
This paper investigates deceptive reinforcement learning for privacy preservation in model-free and continuous action space domains. In reinforcement learning, the reward function defines the agent's objective. In adversarial scenarios, an agent may need to both maximise rewards and keep its reward function private from observers. Recent research presented the ambiguity model (AM), which selects actions that are ambiguous over a set of possible reward functions, via pre-trained $Q$-functions. Despite promising results in model-based domains, our investigation shows that AM is ineffective in model-free domains due to misdirected state space exploration. It is also inefficient to train and inapplicable in continuous action space domains. We propose the deceptive exploration ambiguity model (DEAM), which learns using the deceptive policy during training, leading to targeted exploration of the state space. DEAM is also applicable in continuous action spaces. We evaluate DEAM in discrete and continuous action space path planning environments. DEAM achieves similar performance to an optimal model-based version of AM and outperforms a model-free version of AM in terms of path cost, deceptiveness and training efficiency. These results extend to the continuous domain.
- Research Report > Experimental Study (0.69)
- Research Report > New Finding (0.46)
Low impact agency: review and discussion
The problem of artificial intelligence safety can be seen as can be seen as ensuring an agent with the power of causing harm chooses to not do so. In the limit, the agent can be powerful enough that causing existential catastrophe is within its limit, and it has incentives to doing so [6], so our task is to guarantee that it chooses not to. A possible approach is penalize changes in the world caused by agent, leading to the agent not causing catastrophe because that leads to large changes in the world[24]. The hope is that this is a relatively easy objective to align the agent with, as opposed to aligning it with the full range of human values. So, our desideratum is that the AI achieves something while doing as little in the world as possible .
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- (2 more...)
Developing cooperative policies for multi-stage tasks
Erskine, Jordan, Lehnert, Chris
This paper proposes the Cooperative Soft Actor Critic (CSAC) method of enabling consecutive reinforcement learning agents to cooperatively solve a long time horizon multi-stage task. This method is achieved by modifying the policy of each agent to maximise both the current and next agent's critic. Cooperatively maximising each agent's critic allows each agent to take actions that are beneficial for its task as well as subsequent tasks. Using this method in a multi-room maze domain, the cooperative policies were able to outperform both uncooperative policies as well as a single agent trained across the entire domain. CSAC achieved a success rate of at least 20\% higher than the uncooperative policies, and converged on a solution at least 4 times faster than the single agent.
- Oceania > Australia > Queensland > Brisbane (0.04)
- Asia > Middle East > Jordan (0.04)
MSDF: A Deep Reinforcement Learning Framework for Service Function Chain Migration
Chen, Ruoyun, Lu, Hancheng, Lu, Yujiao, Liu, Jinxue
Under dynamic traffic, service function chain (SFC) migration is considered as an effective way to improve resource utilization. However, the lack of future network information leads to non-optimal solutions, which motivates us to study reinforcement learning based SFC migration from a long-term perspective. In this paper, we formulate the SFC migration problem as a minimization problem with the objective of total network operation cost under constraints of users' quality of service. We firstly design a deep Q-network based algorithm to solve single SFC migration problem, which can adjust migration strategy online without knowing future information. Further, a novel multi-agent cooperative framework, called MSDF, is proposed to address the challenge of considering multiple SFC migration on the basis of single SFC migration. MSDF reduces the complexity thus accelerates the convergence speed, especially in large scale networks. Experimental results demonstrate that MSDF outperforms typical heuristic algorithms under various scenarios.
- North America > United States (0.04)
- Asia > China > Anhui Province > Hefei (0.04)