 Agents


Improving Consistency in Vehicle Trajectory Prediction Through Preference Optimization

arXiv.org Artificial Intelligence

Trajectory prediction is an essential step in the pipeline of an autonomous vehicle. Inaccurate or inconsistent predictions regarding the movement of agents in its surroundings lead to poorly planned maneuvers and potentially dangerous situations for the end-user. Current state-of-the-art deep-learning-based trajectory prediction models can achieve excellent accuracy on public datasets. However, when used in more complex, interactive scenarios, they often fail to capture important interdependencies between agents, leading to inconsistent predictions among agents in the traffic scene. Inspired by the efficacy of incorporating human preference into large language models, this work fine-tunes trajectory prediction models in multi-agent settings using preference optimization. Our experiments, using state-of-the-art models on three separate datasets, show that taking automatically calculated preference rankings among predicted futures as input to the fine-tuning process significantly improves scene consistency while minimally sacrificing trajectory prediction accuracy and without adding any computational requirements at inference time.
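
As a rough illustration of the idea in this abstract, the sketch below pairs a DPO-style pairwise loss with an automatically computed ranking of two predicted scene futures. The abstract names neither the paper's objective nor its ranking metric, so everything here is an assumption: the function names, the `beta` value, and the `min_pairwise_gap` consistency score are hypothetical and chosen only to make the mechanism concrete.

```python
import math


def preference_loss(logp_pref, logp_rej, ref_logp_pref, ref_logp_rej, beta=0.1):
    """DPO-style loss for one (preferred, rejected) pair of predicted scene futures.

    logp_* are log-likelihoods under the model being fine-tuned; ref_logp_* are
    the same quantities under a frozen copy of the pre-trained model.
    """
    margin = beta * ((logp_pref - ref_logp_pref) - (logp_rej - ref_logp_rej))
    # -log(sigmoid(margin)), split by sign so exp() never overflows.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


def min_pairwise_gap(scene):
    """Hypothetical consistency score: the smallest distance between any two
    agents at the same time step (larger means fewer near-collisions)."""
    gaps = []
    for i in range(len(scene)):
        for j in range(i + 1, len(scene)):
            for p, q in zip(scene[i], scene[j]):
                gaps.append(math.dist(p, q))
    return min(gaps)


def rank_pair(scene_a, scene_b):
    """Return (preferred, rejected) according to the hypothetical score."""
    if min_pairwise_gap(scene_a) >= min_pairwise_gap(scene_b):
        return scene_a, scene_b
    return scene_b, scene_a
```

Here a scene is a list of agent trajectories, each a list of `(x, y)` points. Because the ranking is computed from the predictions themselves, no human labels are needed, and because the loss is used only during fine-tuning, the inference-time cost of the model is unchanged.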


An AI-native experimental laboratory for autonomous biomolecular engineering

arXiv.org Artificial Intelligence

Autonomous scientific research, capable of independently conducting complex experiments and serving non-specialists, represents a long-held aspiration. Achieving it requires a fundamental paradigm shift driven by artificial intelligence (AI). While autonomous experimental systems are emerging, they remain confined to areas featuring singular objectives and well-defined, simple experimental workflows, such as chemical synthesis and catalysis. We present an AI-native autonomous laboratory, targeting highly complex scientific experiments for applications like autonomous biomolecular engineering. This system autonomously manages instrumentation, formulates experiment-specific procedures and optimization heuristics, and concurrently serves multiple user requests. Founded on a co-design philosophy of models, experiments, and instruments, the platform supports the co-evolution of AI models and the automation system. This establishes an end-to-end, multi-user autonomous laboratory that handles complex, multi-objective experiments across diverse instrumentation. Our autonomous laboratory supports fundamental nucleic acid functions, including synthesis, transcription, amplification, and sequencing. It also enables applications in fields such as disease diagnostics, drug development, and information storage. Without human intervention, it autonomously optimizes experimental performance to match state-of-the-art results achieved by human scientists. In multi-user scenarios, the platform significantly improves instrument utilization and experimental efficiency. This platform paves the way for advanced biomaterials research to overcome dependencies on experts and resource barriers, establishing a blueprint for science-as-a-service at scale.


STELLA: Self-Evolving LLM Agent for Biomedical Research

arXiv.org Artificial Intelligence

Modern biomedical research is defined by both immense opportunity and staggering complexity. As a cornerstone of science, it generates vast quantities of data from large-scale experiments, but this progress is hampered by a research landscape that is profoundly fragmented (1-3). The knowledge, specialized software, and databases required to make discoveries are numerous, constantly evolving, and dispersed, forcing researchers to expend significant time and effort on the manual and labor-intensive task of discovering, learning, and integrating these disparate resources. While the advent of AI agents holds the promise of automating this intricate work (4-6), current systems inherit a critical limitation: they typically rely on manually curated, static toolsets (7-14). This approach is inefficient, fails to scale, and cannot keep pace with the rapid evolution of biomedical science, leaving the agents perpetually behind the cutting edge. This raises a critical question: Can we design a self-evolving agent that transcends these limitations by automatically discovering and integrating new tools, continuously updating its knowledge base, and iteratively upgrading its own capabilities through direct experience? Here we present STELLA, a generalist biomedical AI agent designed around the core principle of self-evolution (15). STELLA learns and improves from every problem it solves, continuously enhancing its own reasoning strategies and technical abilities.


The Download: AI agents hype, and Google's electricity plans

MIT Technology Review

At Google's I/O 2025 event in May, the company showed off a digital assistant that didn't just answer questions; it helped work on a bicycle repair by finding a matching user manual, locating a YouTube tutorial, and even calling a local store to ask about a part, all with minimal human nudging. Such capabilities could soon extend far outside the Google ecosystem. The vision is exciting: Intelligent software agents that act like digital coworkers, booking your flights, rescheduling meetings, filing expenses, and talking to each other behind the scenes to get things done. But if we're not careful, we're going to derail the whole idea before it has a chance to deliver real benefits. And when expectations get out of hand, a backlash isn't far behind.


Introducing the NASA Onboard Artificial Intelligence Research (OnAIR) platform: an interview with Evana Gizzi

AIHub

The Thirty-Seventh Annual Conference on Innovative Applications of Artificial Intelligence (IAAI 2025), which took place alongside AAAI 2025, serves as a showcase for successful applications and novel uses of AI. One such application is the Onboard Artificial Intelligence Research (OnAIR) platform, introduced by Evana Gizzi and colleagues in their paper OnAIR: Applications of The NASA On-Board Artificial Intelligence Research Platform. This open-source software pipeline and cognitive architecture tool has been designed to aid space research and missions. We spoke to Evana, Artificial Intelligence Research Lead at NASA Goddard Space Flight Center, about the OnAIR platform, some of the particular challenges of deploying AI-based solutions in space, and how the tool has been used so far.


Don't let hype about AI agents get ahead of reality

MIT Technology Review

Let's start with the term "agent" itself. Right now, it's being slapped on everything from simple scripts to sophisticated AI workflows. There's no shared definition, which leaves plenty of room for companies to market basic automation as something much more advanced. That kind of "agentwashing" doesn't just confuse customers; it invites disappointment. We don't necessarily need a rigid standard, but we do need clearer expectations about what these systems are supposed to do, how autonomously they operate, and how reliably they perform. And reliability is the next big challenge.


Using multi-agent architecture to mitigate the risk of LLM hallucinations

arXiv.org Artificial Intelligence

Recent advancements in Large Language Models (LLMs) have significantly enhanced the ability to develop systems that comprehend customer requests and determine the necessary actions to fulfill them. In today's competitive market, delivering superior customer service is crucial for attracting and retaining clients. Satisfied customers are more likely to become loyal, repeat buyers, and advocate for your brand, leading to increased revenue and market share (Strikingly, 2024). In industries characterized by intense competition, implementing LLM-based services that effectively address customer needs and enhance satisfaction is becoming a key determinant of a company's growth and success. By leveraging LLMs, businesses can deliver more personalized, efficient, and scalable support, and thereby improve customer experience and foster loyalty (Iopex, 2024).


Automated Vehicles Should be Connected with Natural Language

arXiv.org Artificial Intelligence

Multi-agent collaborative driving promises improvements in traffic safety and efficiency through collective perception and decision making. However, existing communication media (including raw sensor data, neural network features, and perception results) suffer limitations in bandwidth efficiency, information completeness, and agent interoperability. Moreover, traditional approaches have largely ignored decision-level fusion, neglecting critical dimensions of collaborative driving. In this paper we argue that addressing these challenges requires a transition from purely perception-oriented data exchanges to explicit intent and reasoning communication using natural language. Natural language balances semantic density and communication bandwidth, adapts flexibly to real-time conditions, and bridges heterogeneous agent platforms. By enabling the direct communication of intentions, rationales, and decisions, it transforms collaborative driving from reactive perception-data sharing into proactive coordination, advancing safety, efficiency, and transparency in intelligent transportation systems.


LANet: A Lane Boundaries-Aware Approach For Robust Trajectory Prediction

arXiv.org Artificial Intelligence

Accurate motion forecasting is critical for safe and efficient autonomous driving, enabling vehicles to predict future trajectories and make informed decisions in complex traffic scenarios. Most of the current designs of motion prediction models are based on the major representation of lane centerlines, which limits their capability to capture critical road environments and traffic rules and constraints. In this work, we propose an enhanced motion forecasting model informed by multiple vector map elements, including lane boundaries and road edges, that facilitates a richer and more complete representation of driving environments. An effective feature fusion strategy is developed to merge information in different vector map components, where the model learns holistic information on road structures and their interactions with agents. Since encoding more information about the road environment increases memory usage and is computationally expensive, we developed an effective pruning mechanism that filters the most relevant map connections to the target agent, ensuring computational efficiency while maintaining essential spatial and semantic relationships for accurate trajectory prediction. Overcoming the limitations of lane centerline-based models, our method provides a more informative and efficient representation of the driving environment and advances the state of the art for autonomous vehicle motion forecasting. We verify our approach with extensive experiments on the Argoverse 2 motion forecasting dataset, where our method maintains competitiveness on AV2 while achieving improved performance.
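
The pruning step mentioned in this abstract can be pictured with a minimal sketch. The abstract does not say how relevance is measured, so the code below assumes the simplest stand-in: keep the `k` map elements whose closest point is nearest to the target agent. The function name, the dictionary layout, and the distance criterion are all hypothetical and are not the paper's mechanism.

```python
import math


def prune_map_elements(target_xy, elements, k=2):
    """Keep the k map elements nearest to the target agent.

    Each element is a dict with an "id" and a list of (x, y) "points";
    relevance is assumed to be plain Euclidean distance to the closest point.
    """
    def min_dist(element):
        return min(math.dist(target_xy, p) for p in element["points"])

    return sorted(elements, key=min_dist)[:k]


# Toy scene: one lane boundary, one road edge, one distant centerline.
elements = [
    {"id": "far_centerline", "points": [(50.0, 50.0), (60.0, 50.0)]},
    {"id": "lane_boundary", "points": [(0.0, 1.5), (10.0, 1.5)]},
    {"id": "road_edge", "points": [(0.0, 4.0), (10.0, 4.0)]},
]
kept = prune_map_elements((1.0, 0.0), elements, k=2)
kept_ids = [e["id"] for e in kept]
```

A fixed `k` bounds the memory and compute spent per target agent no matter how many map elements the scene contains, which is the trade-off the abstract describes.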


Adapting Probabilistic Risk Assessment for AI

arXiv.org Artificial Intelligence

Modern general-purpose artificial intelligence (AI) systems present an urgent risk management challenge, as their rapidly evolving capabilities and potential for catastrophic harm outpace our ability to reliably assess their risks. Current methods often rely on selective testing and undocumented assumptions about risk priorities, frequently failing to make a serious attempt at assessing the set of pathways through which AI systems pose direct or indirect risks to society and the biosphere. This paper introduces the probabilistic risk assessment (PRA) for AI framework, adapting established PRA techniques from high-reliability industries (e.g., nuclear power, aerospace) for the new challenges of advanced AI. The framework guides assessors in identifying potential risks, estimating likelihood and severity bands, and explicitly documenting evidence, underlying assumptions, and analyses at appropriate granularities. The framework's implementation tool synthesizes the results into a risk report card with aggregated risk estimates from all assessed risks. It introduces three methodological advances: (1) Aspect-oriented hazard analysis provides systematic hazard coverage guided by a first-principles taxonomy of AI system aspects (e.g. capabilities, domain knowledge, affordances); (2) Risk pathway modeling analyzes causal chains from system aspects to societal impacts using bidirectional analysis and incorporating prospective techniques; and (3) Uncertainty management employs scenario decomposition, reference scales, and explicit tracing protocols to structure credible projections with novelty or limited data. Additionally, the framework harmonizes diverse assessment methods by integrating evidence into comparable, quantified absolute risk estimates for lifecycle decisions. We have implemented this as a workbook tool for AI developers, evaluators, and regulators.
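
To make the notion of likelihood and severity bands feeding a report card concrete, here is a minimal sketch of a conventional likelihood-times-severity scoring scheme. It is not the framework's workbook: the band names, their numeric values, the multiplication rule, and the `report_card` layout are all hypothetical, since the abstract does not specify the scales or the aggregation method.

```python
# Hypothetical ordinal bands; the framework's actual scales are not given here.
LIKELIHOOD = {"rare": 1, "possible": 2, "likely": 3}
SEVERITY = {"minor": 1, "major": 2, "catastrophic": 3}


def risk_score(likelihood, severity):
    """Combine two bands into one score (assumed rule: simple product)."""
    return LIKELIHOOD[likelihood] * SEVERITY[severity]


def report_card(assessments):
    """Aggregate per-risk scores into a small summary.

    assessments maps a risk name to a (likelihood band, severity band) pair.
    """
    scores = {name: risk_score(l, s) for name, (l, s) in assessments.items()}
    return {
        "scores": scores,
        "highest": max(scores, key=scores.get),
        "total": sum(scores.values()),
    }


card = report_card({
    "misuse": ("possible", "catastrophic"),
    "outage": ("likely", "minor"),
})
```

Storing the bands alongside each named risk keeps the assumptions behind every score visible, in the spirit of the explicit documentation the abstract calls for.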