Agentic Metacognition: Designing a "Self-Aware" Low-Code Agent for Failure Prediction and Human Handoff
The inherent non-deterministic nature of autonomous agents, particularly within low-code/no-code (LCNC) environments, presents significant reliability challenges. Agents can become trapped in unforeseen loops, generate inaccurate outputs, or encounter unrecoverable failures, leading to user frustration and a breakdown of trust. This report proposes a novel architectural pattern to address these issues: the integration of a secondary, "metacognitive" layer that actively monitors the primary LCNC agent. Inspired by human introspection, this layer is designed to predict impending task failures based on a defined set of triggers, such as excessive latency or repetitive actions. Upon predicting a failure, the metacognitive agent proactively initiates a human handoff, providing the user with a clear summary of the agent's "thought process" and a detailed explanation of why it could not proceed. An empirical analysis of a prototype system demonstrates that this approach significantly increases the overall task success rate. However, this performance gain comes with a notable increase in computational overhead. The findings reframe human handoffs not as an admission of defeat but as a core design feature that enhances system resilience, improves user experience, and builds trust by providing transparency into the agent's internal state. The report discusses the practical and ethical implications of this approach and identifies key directions for future research.
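The failure-prediction triggers described above (excessive latency, repetitive actions) can be sketched as a small watchdog that observes the primary agent's action stream. This is a minimal illustration of the pattern, not the prototype's implementation; the class name, thresholds, and trigger messages are all invented for this sketch.

```python
import time
from collections import deque

class MetacognitiveMonitor:
    """Sketch of a metacognitive layer: watches a primary agent's action
    stream and predicts failure from two triggers, excessive step latency
    and repeated actions. All names and thresholds are illustrative."""

    def __init__(self, max_step_seconds=30.0, repeat_window=6, max_repeats=3):
        self.max_step_seconds = max_step_seconds
        self.recent_actions = deque(maxlen=repeat_window)  # sliding window
        self.max_repeats = max_repeats
        self.last_step_time = time.monotonic()

    def observe(self, action):
        """Record one primary-agent action; return a handoff reason or None."""
        now = time.monotonic()
        elapsed = now - self.last_step_time
        self.last_step_time = now
        self.recent_actions.append(action)
        if elapsed > self.max_step_seconds:
            return f"latency: step took {elapsed:.1f}s (> {self.max_step_seconds}s)"
        count = self.recent_actions.count(action)
        if count > self.max_repeats:
            return f"loop: action {action!r} repeated {count} times"
        return None

monitor = MetacognitiveMonitor(max_repeats=2)
for act in ["open_form", "submit", "submit", "submit"]:
    reason = monitor.observe(act)
print(reason)  # the repetition trigger fires on the third 'submit'
```

On handoff, the returned reason string would seed the human-readable explanation of why the agent could not proceed.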
Solving Multiagent Path Finding on Highly Centralized Networks
Fioravantes, Foivos, Knop, Dušan, Křišťan, Jan Matyáš, Melissinos, Nikolaos, Opler, Michal, Vu, Tung Anh
The Multiagent Path Finding (MAPF) problem consists of identifying the trajectories that a set of agents should follow inside a given network in order to reach their desired destinations as soon as possible, but without colliding with each other. We aim to minimize the makespan, i.e., the maximum time any agent takes to reach its goal. In this work, we complement a recent thread of results that aim to systematically study the algorithmic behavior of this problem through the parameterized complexity point of view. First, we show that MAPF is NP-hard when the given network has a star-like topology (bounded vertex cover number) or is a tree with $11$ leaves. Both of these results fill important gaps in our understanding of the tractability of this problem that were left untreated in the recent work of [Fioravantes et al. Exact Algorithms and Lowerbounds for Multiagent Path Finding: Power of Treelike Topology. AAAI'24]. Nevertheless, our main contribution is an exact algorithm that scales well as the input grows (FPT) when the topology of the given network is highly centralized (bounded distance to clique). This parameter is significant as it mirrors real-world networks, in which a set of central hubs (e.g., processing areas) is connected to only a few peripheral nodes.
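The makespan objective and the collision constraints can be made concrete with a brute-force search over joint agent configurations. This toy solver is only an illustration of the problem definition, not the paper's FPT algorithm: it is exponential in the number of agents.

```python
from collections import deque
from itertools import product

def mapf_makespan(adj, starts, goals):
    """Breadth-first search over joint configurations: returns the minimum
    makespan (time by which every agent has arrived), forbidding both
    vertex collisions and swap (edge) collisions. Toy illustration only."""
    start, goal = tuple(starts), tuple(goals)
    frontier, seen = deque([(start, 0)]), {start}
    while frontier:
        state, t = frontier.popleft()
        if state == goal:
            return t
        moves = [adj[v] + [v] for v in state]  # each agent moves or waits
        for nxt in product(*moves):
            if len(set(nxt)) < len(nxt):       # two agents on one vertex
                continue
            if any(i != j and nxt[i] == state[j] and nxt[j] == state[i]
                   for i in range(len(nxt)) for j in range(len(nxt))):
                continue                        # agents swapping along an edge
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, t + 1))
    return None  # no collision-free schedule exists

# 4-cycle 0-1-2-3-0: two agents exchange positions by circling the cycle.
adj = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
print(mapf_makespan(adj, [0, 2], [2, 0]))  # → 2
```

On a path graph the same swap would be infeasible, which is exactly why tree-like topologies make the problem hard.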
Watson: A Cognitive Observability Framework for the Reasoning of Foundation Model-Powered Agents
Rombaut, Benjamin, Masoumzadeh, Sogol, Vasilevski, Kirill, Lin, Dayi, Hassan, Ahmed E.
As foundation models (FMs) play an increasingly prominent role in complex software systems, such as FM-powered agentic software (i.e., Agentware), they introduce significant challenges for developers regarding observability. Unlike traditional software, agents operate autonomously, using extensive data and opaque implicit reasoning, making it difficult to observe and understand their behavior during runtime, especially when they take unexpected actions or encounter errors. In this paper, we highlight the limitations of traditional operational observability in the context of FM-powered software, and introduce cognitive observability as a new type of required observability that has emerged for such innovative systems. We then propose a novel framework that provides cognitive observability into the implicit reasoning processes of agents (a.k.a. reasoning observability), and demonstrate the effectiveness of our framework in boosting the debuggability of Agentware and, in turn, the capabilities of Agentware, through a case study on AutoCodeRover, a cutting-edge Agentware for autonomous program improvement.
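The core idea of reasoning observability, capturing an agent's implicit reasoning steps as structured, queryable events rather than only operational metrics, can be sketched as follows. The event schema and method names here are assumptions for illustration, not Watson's actual interface.

```python
import json
import time

class ReasoningTrace:
    """Minimal sketch of reasoning observability: each reasoning step the
    agent emits is captured as a structured event so it can be replayed
    and queried offline. The schema is an assumed, illustrative one."""

    def __init__(self):
        self.events = []

    def record(self, step_type, content, **metadata):
        self.events.append({
            "t": time.time(),
            "step": step_type,   # e.g. "plan", "tool_call", "reflect"
            "content": content,
            **metadata,
        })

    def find(self, step_type):
        """Query the trace, e.g. to debug an unexpected action."""
        return [e for e in self.events if e["step"] == step_type]

    def dump(self):
        return json.dumps(self.events, indent=2)

trace = ReasoningTrace()
trace.record("plan", "locate the failing test before editing source")
trace.record("tool_call", "grep -r test_parse", tool="shell")
trace.record("reflect", "no match found; widen the search")
print(len(trace.find("tool_call")))  # → 1
```

The point of the structured form is that questions like "which tool calls preceded the bad edit?" become queries over the trace instead of log archaeology.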
Good Parenting is all you need -- Multi-agentic LLM Hallucination Mitigation
Kwartler, Ted, Berman, Matthew, Aqrawi, Alan
This study explores the ability of Large Language Model (LLM) agents to detect and correct hallucinations in AI-generated content. A primary agent was tasked with creating a blog about a fictional Danish artist named Flipfloppidy, which was then reviewed by another agent for factual inaccuracies. Most LLMs hallucinated the existence of this artist. Across 4,900 test runs involving various combinations of primary and reviewing agents, advanced AI models such as Llama3-70b and GPT-4 variants demonstrated near-perfect accuracy in identifying hallucinations and successfully revised outputs in 85% to 100% of cases following feedback. These findings underscore the potential of advanced AI models to significantly enhance the accuracy and reliability of generated content, providing a promising approach to improving AI workflow orchestration.
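The generate-review-revise loop the study evaluates can be sketched with stub functions standing in for the LLM calls. The stubs and the fixed-point loop below are illustrative; the study's actual agents are LLMs exchanging natural-language feedback.

```python
def generate_with_review(primary, reviewer, prompt, max_rounds=3):
    """Sketch of a two-agent loop: a primary agent drafts content, a
    reviewing agent flags factual inaccuracies, and the primary revises
    until the reviewer finds nothing (or a round budget runs out)."""
    draft = primary(prompt)
    for _ in range(max_rounds):
        issues = reviewer(draft)          # list of flagged claims
        if not issues:
            return draft                  # reviewer is satisfied
        draft = primary(prompt, feedback=issues)
    return draft

# Toy stubs: the reviewer rejects any mention of the fictional artist.
def primary(prompt, feedback=None):
    if feedback:
        return "Blog post about Danish art, fictional figures removed."
    return "Blog post about the celebrated Danish artist Flipfloppidy."

def reviewer(draft):
    if "Flipfloppidy" in draft:
        return ["'Flipfloppidy' is not a documented artist"]
    return []

result = generate_with_review(primary, reviewer, "Write a blog post")
print(result)  # revised draft with the hallucinated artist removed
```

The study's 85% to 100% revision success rates correspond to how often the real primary agent's second draft actually incorporates the reviewer's feedback.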
Polaris: A Safety-focused LLM Constellation Architecture for Healthcare
Mukherjee, Subhabrata, Gamble, Paul, Ausin, Markel Sanz, Kant, Neel, Aggarwal, Kriti, Manjunath, Neha, Datta, Debajyoti, Liu, Zhengliang, Ding, Jiayuan, Busacca, Sophia, Bianco, Cezanne, Sharma, Swapnil, Lasko, Rae, Voisard, Michelle, Harneja, Sanchay, Filippova, Darya, Meixiong, Gerry, Cha, Kevin, Youssefi, Amir, Buvanesh, Meyhaa, Weingram, Howard, Bierman-Lytle, Sebastian, Mangat, Harpreet Singh, Parikh, Kim, Godil, Saad, Miller, Alex
We develop Polaris, the first safety-focused LLM constellation for real-time patient-AI healthcare conversations. Unlike prior LLM works in healthcare focusing on tasks like question answering, our work specifically focuses on long multi-turn voice conversations. Our one-trillion-parameter constellation system is composed of several multibillion-parameter LLMs as cooperative agents: a stateful primary agent that focuses on driving an engaging conversation and several specialist support agents focused on healthcare tasks performed by nurses, to increase safety and reduce hallucinations. We develop a sophisticated training protocol for iterative co-training of the agents that optimizes for diverse objectives. We train our models on proprietary data, clinical care plans, healthcare regulatory documents, medical manuals, and other medical reasoning documents. We align our models to speak like medical professionals, using organic healthcare conversations and simulated ones between patient actors and experienced nurses. This allows our system to express unique capabilities such as rapport building, trust building, empathy, and bedside manner. Finally, we present the first comprehensive clinician evaluation of an LLM system for healthcare. We recruited over 1100 U.S. licensed nurses and over 130 U.S. licensed physicians to perform end-to-end conversational evaluations of our system by posing as patients and rating the system on several measures. We demonstrate Polaris performs on par with human nurses on aggregate across dimensions such as medical safety, clinical readiness, conversational quality, and bedside manner. Additionally, we conduct a challenging task-based evaluation of the individual specialist support agents, where we demonstrate our LLM agents significantly outperform both a much larger general-purpose LLM (GPT-4) and a model from their own medium-size class (LLaMA-2 70B).
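The constellation pattern, a conversational primary agent whose draft replies are checked by domain specialists, can be sketched with plain functions. The keyword routing, agent names, and canned replies below are invented for illustration; Polaris's agents are trained LLMs, not rule-based checkers.

```python
def constellation_respond(user_turn, primary, specialists):
    """Sketch of a constellation-style turn: the primary agent drafts a
    reply, each specialist whose domain is triggered by the turn reviews
    it, and any specialist findings are appended as safety notes."""
    draft = primary(user_turn)
    notes = []
    for name, (triggers, check) in specialists.items():
        if any(kw in user_turn.lower() for kw in triggers):
            finding = check(user_turn, draft)  # None means no concern
            if finding:
                notes.append(finding)
    return draft if not notes else draft + " " + " ".join(notes)

# Toy primary agent and two toy specialists (all invented for this sketch).
primary = lambda turn: "I hear you, let's talk through that together."
specialists = {
    "medication": (["dose", "mg", "pill"],
                   lambda turn, draft: "Please confirm any dose change with your care team."),
    "labs":       (["blood test", "a1c"],
                   lambda turn, draft: None),
}
reply = constellation_respond("Can I double my pill dose tonight?",
                              primary, specialists)
print(reply)
```

Keeping safety checks in separate specialist agents, rather than inside the conversational agent, is what lets each be trained and evaluated on its own nursing task.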
Social Planning: Achieving Goals by Altering Others' Mental States
Pearce, Chris (University of Auckland) | Meadows, Ben (University of Auckland) | Langley, Pat (University of Auckland) | Barley, Mike (University of Auckland)
In this paper, we discuss a computational approach to the cognitivetask of social planning. First, we specify a class of planningproblems that involve an agent who attempts to achieve its goalsby altering other agents' mental states. Next, we describe SFPS,a flexible problem solver that generates social plans of this sort,including ones that include deception and reasoning about otheragents' beliefs. We report the results for experiments on socialscenarios that involve different levels of sophistication and thatdemonstrate both SFPS's capabilities and the sources of its power.Finally, we discuss how our approach to social planning has beeninformed by earlier work in the area and propose directions foradditional research on the topic.
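The defining feature of this problem class, that actions operate on other agents' belief states rather than on the physical world, can be made concrete with a toy operator. The `inform`/`deceive` operators and the belief representation below are illustrative assumptions, not SFPS's actual formalism.

```python
def apply_action(action, beliefs):
    """Toy rendering of social planning: world state includes each agent's
    belief set, and speech acts are operators over those beliefs. The
    operator names and tuple-literal encoding are invented for this sketch."""
    target, content = action["target"], action["content"]
    new = {agent: set(b) for agent, b in beliefs.items()}  # copy-on-write
    if action["type"] == "inform":
        new[target].add(content)                 # target adopts a true fact
    elif action["type"] == "deceive":
        new[target].add(content)                 # target adopts a falsehood
        new[target].discard(("not",) + content)  # and drops its negation
    return new

# A planner searching over such operators could make a guard believe the
# alarm is armed, achieving a goal purely by changing a mental state.
beliefs = {"guard": {("door", "locked"), ("not", "alarm", "armed")}}
beliefs = apply_action({"type": "deceive", "target": "guard",
                        "content": ("alarm", "armed")}, beliefs)
print(("alarm", "armed") in beliefs["guard"])  # → True
```

A social planner then searches over sequences of such operators, which is why reasoning about nested beliefs dominates the cost of the approach.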