gatekeeper
Autonomy Architectures for Safe Planning in Unknown Environments Under Budget Constraints
Cherenson, Daniel M., Agrawal, Devansh R., Panagou, Dimitra
Mission planning can often be formulated as a constrained control problem under multiple path constraints (i.e., safety constraints) and budget constraints (i.e., resource expenditure constraints). In a priori unknown environments, verifying that an offline solution will satisfy the constraints for all time can be difficult, if not impossible. We present ReRoot, a novel sampling-based framework that enforces safety and budget constraints for nonlinear systems in unknown environments. The main idea is that ReRoot grows multiple reverse RRT* trees online, starting from renewal sets, i.e., sets where the budget constraints are renewed. The dynamically feasible backup trajectories guarantee safety and reduce resource expenditure, which provides a principled backup policy when integrated into the gatekeeper safety verification architecture. We demonstrate our approach in simulation with a fixed-wing UAV in a GNSS-denied environment with a budget constraint on localization error that can be renewed at visual landmarks.
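Below is a minimal, hedged sketch of the reverse-tree idea described in the abstract: grow a tree rooted at a renewal set so that the path from any node back to the root is a backup trajectory whose budget constraint is renewed on arrival. The names, the single-integrator dynamics, and the omission of RRT*'s rewiring step are simplifying assumptions for illustration, not the authors' implementation.

```python
import math
import random

class Node:
    def __init__(self, state, parent=None, cost=0.0):
        self.state = state      # (x, y) position
        self.parent = parent    # next node on the backup path toward the renewal set
        self.cost = cost        # resource expenditure accumulated to the root

def steer(a, b, step=0.5):
    """Move from a toward b by at most `step` (single-integrator assumption)."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    d = math.hypot(dx, dy)
    if d <= step:
        return b
    return (a[0] + step * dx / d, a[1] + step * dy / d)

def grow_reverse_tree(renewal_center, n_samples=500, bounds=(-10.0, 10.0)):
    """Grow a reverse tree rooted at a renewal set: edges point toward the
    root, so the path from any node to the root is a backup trajectory
    whose budget is renewed on arrival."""
    root = Node(renewal_center)
    nodes = [root]
    for _ in range(n_samples):
        sample = (random.uniform(*bounds), random.uniform(*bounds))
        nearest = min(nodes, key=lambda n: math.dist(n.state, sample))
        new_state = steer(nearest.state, sample)
        edge_cost = math.dist(nearest.state, new_state)
        nodes.append(Node(new_state, nearest, nearest.cost + edge_cost))
    return nodes

def backup_trajectory(node):
    """Read off the backup trajectory from a tree node down to the renewal set."""
    path = []
    while node is not None:
        path.append(node.state)
        node = node.parent
    return path
```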
- North America > United States > Michigan > Washtenaw County > Ann Arbor (0.14)
- Asia > Middle East > Republic of Türkiye > Karaman Province > Karaman (0.04)
- Asia > Middle East > Jordan (0.04)
- Transportation (0.46)
- Aerospace & Defense (0.46)
R3R: Decentralized Multi-Agent Collision Avoidance with Infinite-Horizon Safety
Vielmetti, Thomas Marshall, Agrawal, Devansh R., Panagou, Dimitra
Existing decentralized methods for multi-agent motion planning lack formal, infinite-horizon safety guarantees, especially for communication-constrained systems. We present R3R, to our knowledge the first decentralized and asynchronous framework for multi-agent motion planning under distance-based communication constraints with infinite-horizon safety guarantees for systems of nonlinear agents. R3R's novelty lies in combining our gatekeeper safety framework with a geometric constraint called R-Boundedness, which together establish a formal link between an agent's communication radius and its ability to plan safely. We constrain trajectories to within a fixed planning radius that is a function of the agent's communication radius, which enables trajectories to be shown provably safe for all time, using only local information. Our algorithm is fully asynchronous, and ensures the forward invariance of these guarantees even in time-varying networks where agents asynchronously join, leave, and replan. We validate our approach in simulations of up to 128 Dubins vehicles, demonstrating 100% safety in dense, obstacle-rich scenarios. Our results demonstrate that R3R's performance scales with agent density rather than problem size, providing a practical solution for scalable and provably safe multi-agent systems.
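As a rough illustration of the planning-radius constraint described above, the sketch below checks that a candidate trajectory stays within a ball whose radius is derived from the communication radius. The specific relation in `planning_radius` is an assumption for illustration only, not the paper's derivation of R-Boundedness.

```python
import math

def planning_radius(r_comm, safety_margin=1.0):
    # Assumed relation for illustration: keep committed trajectories well
    # inside the communication ball so potential conflicts are always
    # observable by communicating neighbors.
    return (r_comm - safety_margin) / 2.0

def is_r_bounded(trajectory, origin, r_comm):
    """Check that every waypoint stays inside the planning ball around `origin`."""
    r_plan = planning_radius(r_comm)
    return all(math.dist(p, origin) <= r_plan for p in trajectory)
```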
Communication Bias in Large Language Models: A Regulatory Perspective
Kuenzler, Adrian, Schmid, Stefan
Large language models (LLMs) are a prominent subset of AI, built on advanced neural network architectures that can generate new data, including text, images, and audio. LLMs utilize various technologies to identify patterns in a given set of training data, without requiring explicit instructions about what to look for [12, 35]. LLMs typically assume that the training data follows a probability distribution, and once they have identified existing patterns, they can generate new instances that are similar to the original data. By drawing from and combining training data, LLMs can create new content that transcends the initial dataset [17].
- Asia > China > Hong Kong (0.04)
- Europe > Germany > Berlin (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- (4 more...)
- Law > Statutes (1.00)
- Information Technology > Security & Privacy (1.00)
- Government > Regional Government > Europe Government (0.93)
- (2 more...)
Modeling Transformers as complex networks to analyze learning dynamics
The process by which Large Language Models (LLMs) acquire complex capabilities during training remains a key open question in mechanistic interpretability. This project investigates whether these learning dynamics can be characterized through the lens of Complex Network Theory (CNT). I introduce a novel methodology to represent a Transformer-based LLM as a directed, weighted graph where nodes are the model's computational components (attention heads and MLPs) and edges represent causal influence, measured via an intervention-based ablation technique. By tracking the evolution of this component-graph across 143 training checkpoints of the Pythia-14M model on a canonical induction task, I analyze a suite of graph-theoretic metrics. The results reveal that the network's structure evolves through distinct phases of exploration, consolidation, and refinement. Specifically, I identify the emergence of a stable hierarchy of information spreader components and a dynamic set of information gatherer components, whose roles reconfigure at key learning junctures. This work demonstrates that a component-level network perspective offers a powerful macroscopic lens for visualizing and understanding the self-organizing principles that drive the formation of functional circuits in LLMs.
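A hedged sketch of the component-graph construction described above: nodes are attention heads and MLPs, and directed edge weights come from an intervention-based ablation measurement. The `ablated_effect` callable is a stand-in for that measurement, and the names and threshold are illustrative assumptions.

```python
import networkx as nx

def build_component_graph(components, ablated_effect, threshold=1e-3):
    """components: names like 'L0.H1' or 'L1.MLP'; ablated_effect(src, dst)
    returns the measured causal influence of src on dst (assumed interface)."""
    g = nx.DiGraph()
    g.add_nodes_from(components)
    for src in components:
        for dst in components:
            if src != dst:
                w = ablated_effect(src, dst)
                if abs(w) > threshold:   # keep only non-negligible influence
                    g.add_edge(src, dst, weight=w)
    return g

def top_spreaders(g, k=5):
    """Rank components by weighted out-degree ('information spreaders');
    weighted in-degree would identify 'information gatherers'."""
    return sorted(g.nodes, key=lambda n: g.out_degree(n, weight="weight"),
                  reverse=True)[:k]
```

Tracking such metrics across training checkpoints is what lets the phases of exploration, consolidation, and refinement show up as changes in graph structure.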
- Europe > Middle East > Cyprus > Nicosia > Nicosia (0.04)
- North America > United States (0.04)
- Europe > Italy > Lombardy > Milan (0.04)
Guarding Your Conversations: Privacy Gatekeepers for Secure Interactions with Cloud-Based AI Models
Uzor, GodsGift, Al-Qudah, Hasan, Ineza, Ynes, Serwadda, Abdul
The interactive nature of Large Language Models (LLMs), which closely track user data and context, has prompted users to share personal and private information in unprecedented ways. Even when users opt out of allowing their data to be used for training, these privacy settings offer limited protection when LLM providers operate in jurisdictions with weak privacy laws, invasive government surveillance, or poor data security practices. In such cases, the risk of sensitive information, including Personally Identifiable Information (PII), being mishandled or exposed remains high. To address this, we propose the concept of an "LLM gatekeeper", a lightweight, locally run model that filters out sensitive information from user queries before they are sent to the potentially untrustworthy, though highly capable, cloud-based LLM. Through experiments with human subjects, we demonstrate that this dual-model approach introduces minimal overhead while significantly enhancing user privacy, without compromising the quality of LLM responses. Large Language Models (LLMs) like ChatGPT have revolutionized digital interactions by providing personalized, context-aware responses that evolve with the dialogue. Unlike traditional information sources, LLMs' dynamic engagement often leads users to share increasingly personal details over multiple sessions, sometimes unknowingly. This gradual accumulation of sensitive information, compounded by the public's limited understanding of risks like neural network memorization, increases the likelihood of unintentional disclosure. The issue is further exacerbated when proprietary LLMs operate in jurisdictions with weak privacy regulations, limited data security, or invasive governmental surveillance.
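The gatekeeper pattern described here can be sketched as follows. A simple regex redactor stands in for the lightweight local model (the paper's filter is a locally run LLM, not a regex), and all names are illustrative.

```python
import re

# Stand-in patterns for a few common PII types (illustrative, not exhaustive).
PII_PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b"),
    "SSN":   re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def gatekeep(query: str) -> str:
    """Replace detected PII with typed placeholders before the query
    leaves the device."""
    for label, pattern in PII_PATTERNS.items():
        query = pattern.sub(f"[{label}]", query)
    return query

def ask_cloud_llm(query: str, send) -> str:
    # `send` is the untrusted cloud call; only the filtered query crosses it.
    return send(gatekeep(query))
```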
- North America > United States > Texas > Lubbock County > Lubbock (0.04)
- Europe > Italy > Tuscany > Florence (0.04)
- Asia (0.04)
Online Safety under Multiple Constraints and Input Bounds using gatekeeper: Theory and Applications
Agrawal, Devansh R., Panagou, Dimitra
Increasing use of robotic systems in real-world applications necessitates advanced controllers that ensure safety, robustness, and effectiveness in human-machine teaming [1]. This letter formalizes and builds upon our recent work on online safety verification and control [2], which introduces gatekeeper as a novel algorithmic component between the planner and the controller of the autonomous system. To briefly illustrate the principle behind gatekeeper, consider an Unmanned Aerial Vehicle (UAV) navigating an unknown environment. The UAV follows a nominal trajectory, generated by its planner and tracked by its controller. At each iteration, gatekeeper performs two key steps: (i) it evaluates the currently known safe set (derived from onboard sensing) and a backup set, which represents a region the UAV can retreat to if the nominal trajectory is predicted to exit the safe set in the future; (ii) it constructs a candidate trajectory by stitching together the nominal trajectory (up to a future time horizon) and a backup trajectory that leads safely into the backup set.
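A compact sketch of the two gatekeeper steps just described; the trajectory and set interfaces (`state_at`, `truncate`, `concatenate`, `contains`) are hypothetical stand-ins, not the paper's API.

```python
def gatekeeper_step(nominal, committed, safe_set, backup_policy, t_k, horizon):
    """One gatekeeper iteration: stitch the nominal trajectory (up to
    t_k + horizon) with a backup trajectory into the backup set, and commit
    the result only if it stays inside the currently known safe set."""
    t_switch = t_k + horizon
    backup = backup_policy(nominal.state_at(t_switch))   # leads into the backup set
    candidate = nominal.truncate(t_switch).concatenate(backup)
    if all(safe_set.contains(x) for x in candidate.states()):
        return candidate    # valid: replaces the committed trajectory
    return committed        # invalid: keep tracking the previous commitment
```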
- North America > United States > Michigan > Washtenaw County > Ann Arbor (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Sequential Diagnosis with Language Models
Nori, Harsha, Daswani, Mayank, Kelly, Christopher, Lundberg, Scott, Ribeiro, Marco Tulio, Wilson, Marc, Liu, Xiaoxuan, Sounderajah, Viknesh, Carlson, Jonathan, Lungren, Matthew P, Gross, Bay, Hames, Peter, Suleyman, Mustafa, King, Dominic, Horvitz, Eric
Artificial intelligence holds great promise for expanding access to expert medical knowledge and reasoning. However, most evaluations of language models rely on static vignettes and multiple-choice questions that fail to reflect the complexity and nuance of evidence-based medicine in real-world settings. In clinical practice, physicians iteratively formulate and revise diagnostic hypotheses, adapting each subsequent question and test to what they've just learned, and weigh the evolving evidence before committing to a final diagnosis. To emulate this iterative process, we introduce the Sequential Diagnosis Benchmark, which transforms 304 diagnostically challenging New England Journal of Medicine clinicopathological conference (NEJM-CPC) cases into stepwise diagnostic encounters. A physician or AI begins with a short case abstract and must iteratively request additional details from a gatekeeper model that reveals findings only when explicitly queried. Performance is assessed not just by diagnostic accuracy but also by the cost of physician visits and tests performed. We also present the MAI Diagnostic Orchestrator (MAI-DxO), a model-agnostic orchestrator that simulates a panel of physicians, proposes likely differential diagnoses and strategically selects high-value, cost-effective tests. When paired with OpenAI's o3 model, MAI-DxO achieves 80% diagnostic accuracy, four times higher than the 20% average of generalist physicians. MAI-DxO also reduces diagnostic costs by 20% compared to physicians, and 70% compared to off-the-shelf o3. When configured for maximum accuracy, MAI-DxO achieves 85.5% accuracy. These performance gains with MAI-DxO generalize across models from the OpenAI, Gemini, Claude, Grok, DeepSeek, and Llama families. We highlight how AI systems, when guided to think iteratively and act judiciously, can advance diagnostic precision and cost-effectiveness in clinical care.
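The stepwise encounter can be sketched as below; all interfaces (`agent`, `gatekeeper`, the cost table) are assumed for illustration and are not the benchmark's actual API.

```python
def run_encounter(agent, gatekeeper, case_abstract, test_costs, max_steps=20):
    """One stepwise diagnostic encounter: the agent starts from a short case
    abstract, explicitly queries the gatekeeper for findings, and pays for
    each visit or test before committing to a diagnosis."""
    transcript = [case_abstract]
    total_cost = 0.0
    for _ in range(max_steps):
        action = agent.next_action(transcript)   # a question, a test, or a diagnosis
        if action.kind == "diagnose":
            return action.diagnosis, total_cost
        total_cost += test_costs.get(action.name, 0.0)
        transcript.append(gatekeeper.reveal(action))  # reveals only what is asked
    return agent.final_diagnosis(transcript), total_cost
```

Scoring on both the returned diagnosis and the accumulated cost is what lets the benchmark reward judicious, high-value querying rather than exhaustive testing.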
- North America > United States (0.28)
- South America > Argentina (0.04)
- Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
- Health & Medicine > Diagnostic Medicine (1.00)
- Health & Medicine > Therapeutic Area > Oncology (0.93)
IDA-Bench: Evaluating LLMs on Interactive Guided Data Analysis
Li, Hanyu, Liu, Haoyu, Zhu, Tingyu, Guo, Tianyu, Zheng, Zeyu, Deng, Xiaotie, Jordan, Michael I.
Large Language Models (LLMs) show promise as data analysis agents, but existing benchmarks overlook the iterative nature of the field, where experts' decisions evolve as they gain deeper insight into the dataset. To address this, we introduce IDA-Bench, a novel benchmark evaluating LLM agents in multi-round interactive scenarios. Derived from complex Kaggle notebooks, tasks are presented as sequential natural language instructions by an LLM-simulated user. Agent performance is judged by comparing its final numerical output to the human-derived baseline. Initial results show that even state-of-the-art coding agents (like Claude-3.7-thinking) succeed on < 50% of the tasks, highlighting limitations not evident in single-turn tests. This work underscores the need to improve LLMs' multi-round capabilities for building more reliable data analysis agents, highlighting the necessity of achieving a balance between instruction following and reasoning.
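A minimal sketch of the evaluation protocol as described: an LLM-simulated user issues sequential instructions derived from a Kaggle notebook, and the agent's final numerical output is compared against the human-derived baseline. The interfaces and the tolerance are assumptions, not IDA-Bench's actual code.

```python
def evaluate_task(agent, simulated_user, baseline, tolerance=1e-2):
    """Run one multi-round IDA-Bench-style task and score it (assumed API)."""
    history = []
    while simulated_user.has_next_instruction(history):
        instruction = simulated_user.next_instruction(history)
        response = agent.act(instruction, history)   # runs analysis, returns result
        history.append((instruction, response))
    final_answer = agent.final_numeric_output(history)
    return abs(final_answer - baseline) <= tolerance  # task counted as success
```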
- Europe > Austria > Vienna (0.14)
- Asia > Middle East > Jordan (0.14)
- North America > United States > California > Alameda County > Berkeley (0.04)
- (11 more...)
- Workflow (1.00)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Health & Medicine > Therapeutic Area (1.00)
- Education (0.93)
- Transportation (0.69)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)
Superplatforms Have to Attack AI Agents
Lin, Jianghao, Zhu, Jiachen, Zhou, Zheli, Xi, Yunjia, Liu, Weiwen, Yu, Yong, Zhang, Weinan
Over the past decades, superplatforms, digital companies that integrate a vast range of third-party services and applications into a single, unified ecosystem, have built their fortunes on monopolizing user attention through targeted advertising and algorithmic content curation. Yet the emergence of AI agents driven by large language models (LLMs) threatens to upend this business model. Agents can not only free user attention with autonomy across diverse platforms and therefore bypass the user-attention-based monetization, but might also become the new entrance for digital traffic. Hence, we argue that superplatforms have to attack AI agents to defend their centralized control of digital traffic entrance. Specifically, we analyze the fundamental conflict between user-attention-based monetization and agent-driven autonomy through the lens of our gatekeeping theory. We show how AI agents can disintermediate superplatforms and potentially become the next dominant gatekeepers, thereby forming the urgent necessity for superplatforms to proactively constrain and attack AI agents. Moreover, we go through the potential technologies for superplatform-initiated attacks, covering a brand-new, unexplored technical area with unique challenges. We have to emphasize that, despite our position, this paper does not advocate for adversarial attacks by superplatforms on AI agents, but rather offers an envisioned trend to highlight the emerging tensions between superplatforms and AI agents. Our aim is to raise awareness and encourage critical discussion for collaborative solutions, prioritizing user interests and preserving the openness of digital ecosystems in the age of AI agents.
Enabling Safety for Aerial Robots: Planning and Control Architectures
Naveed, Kaleb Ben, Agrawal, Devansh R., Cherenson, Daniel M., Lee, Haejoon, Gilbert, Alia, Parwana, Hardik, Chipade, Vishnu S., Bentz, William, Panagou, Dimitra
To do this, it uses the concept of a perceived safe set B_k and of a backup safe set C_k. The perceived safe set B_k is constructed using the sensory information available up to time t_k, and is possibly time-varying. Similarly, the backup safe set C_k (also potentially time-varying) represents the set where the robot should terminate its trajectory in case a violation of safety is predicted. More specifically, at each iteration k, gatekeeper constructs a candidate trajectory (defined later) using newly available information, checks whether the candidate trajectory is valid, and if so, replaces the old committed trajectory with the new candidate trajectory. The candidate trajectory is constructed as a concatenation ("stitching") of the nominal mission-optimized trajectory and of the backup trajectory, which by construction terminates at the backup set, a robustly controlled-invariant set. The candidate trajectory is considered valid if it lies strictly within the perceived safe set. Thus, if a candidate trajectory is valid, it can be safely tracked for all t ≥ t_k; otherwise the committed trajectory from the prior iteration is used and tracked by the controller. Since the committed trajectory is updated with a candidate trajectory only if the latter is valid, it follows that the committed trajectory can always be safely tracked.
B. Energy-Aware Planning: Introducing eware and meSch
The Energy-Aware Filter (eware) introduced in [2] is an application of gatekeeper.
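In the notation of this passage, the validity condition can be written compactly (the stitching horizon T_k and the trajectory notation are assumed from context, not taken verbatim from the paper):

$$x^{\mathrm{can}}(t) \in \mathcal{B}_k \quad \forall\, t \in [t_k,\, t_k + T_k], \qquad x^{\mathrm{can}}(t) \in \mathcal{C}_k \quad \forall\, t \ge t_k + T_k,$$

i.e., the candidate remains in the perceived safe set over the stitching horizon and thereafter stays in the backup set, which is robustly controlled-invariant; hence a valid candidate can be tracked safely for all t ≥ t_k.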
- North America > United States > Michigan > Washtenaw County > Ann Arbor (0.15)
- Asia > Middle East > Republic of Türkiye > Karaman Province > Karaman (0.04)