AITopics | failure trajectory

Collaborating Authors

failure trajectory

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

c0e19ce0dbabbc0d17a4f8d4324cc8e3-Paper.pdf

Neural Information Processing SystemsFeb-10-2026, 23:56:29 GMT

failure trajectory, safety specification, trajectory, (14 more...)

Neural Information Processing Systems

Country:

Asia > India > West Bengal > Kharagpur (0.04)
North America > United States > Montana (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Industry: Transportation (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

SRPO: Self-Referential Policy Optimization for Vision-Language-Action Models

Fei, Senyu, Wang, Siyin, Ji, Li, Li, Ao, Zhang, Shiduo, Liu, Liming, Hou, Jinlong, Gong, Jingjing, Zhao, Xianzhong, Qiu, Xipeng

arXiv.org Artificial IntelligenceDec-2-2025

Vision-Language-Action (VLA) models excel in robotic manipulation but are constrained by their heavy reliance on expert demonstrations, leading to demonstration bias and limiting performance. Reinforcement learning (RL) is a vital post-training strategy to overcome these limits, yet current VLA-RL methods, including group-based optimization approaches, are crippled by severe reward sparsity. Relying on binary success indicators wastes valuable information in failed trajectories, resulting in low training efficiency. To solve this, we propose Self-Referential Policy Optimization (SRPO), a novel VLA-RL framework. SRPO eliminates the need for external demonstrations or manual reward engineering by leveraging the model's own successful trajectories, generated within the current training batch, as a self-reference. This allows us to assign a progress-wise reward to failed attempts. A core innovation is the use of latent world representations to measure behavioral progress robustly. Instead of relying on raw pixels or requiring domain-specific fine-tuning, we utilize the compressed, transferable encodings from a world model's latent space. These representations naturally capture progress patterns across environments, enabling accurate, generalized trajectory comparison. Empirical evaluations on the LIBERO benchmark demonstrate SRPO's efficiency and effectiveness. Starting from a supervised baseline with 48.9% success, SRPO achieves a new state-of-the-art success rate of 99.2% in just 200 RL steps, representing a 103% relative improvement without any extra supervision. Furthermore, SRPO shows substantial robustness, achieving a 167% performance improvement on the LIBERO-Plus benchmark.

large language model, machine learning, trajectory, (16 more...)

arXiv.org Artificial Intelligence

2511.15605

Country: North America (0.28)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(2 more...)

Add feedback

FailSafe: Reasoning and Recovery from Failures in Vision-Language-Action Models

Lin, Zijun, Duan, Jiafei, Fang, Haoquan, Fox, Dieter, Krishna, Ranjay, Tan, Cheston, Wen, Bihan

arXiv.org Artificial IntelligenceOct-28-2025

Recent advances in robotic manipulation have integrated low-level robotic control into Vision-Language Models (VLMs), extending them into Vision-Language-Action (VLA) models. Although state-of-the-art VLAs achieve strong performance in downstream robotic applications, supported by large-scale crowd-sourced robot training data, they still inevitably encounter failures during execution. Enabling robots to reason and recover from unpredictable and abrupt failures remains a critical challenge. Existing robotic manipulation datasets, collected in either simulation or the real world, primarily provide only ground-truth trajectories, leaving robots unable to recover once failures occur. Moreover, the few datasets that address failure detection typically offer only textual explanations, which are difficult to utilize directly in VLA models. To address this gap, we introduce FailSafe, a novel failure generation and recovery system that automatically produces diverse failure cases paired with executable recovery actions. FailSafe can be seamlessly applied to any manipulation task in any simulator, enabling scalable creation of failure action data. To demonstrate its effectiveness, we fine-tune LLaVa-OneVision-7B (LLaVa-OV-7B) to build FailSafe-VLM. Experimental results show that FailSafe-VLM successfully helps robotic arms detect and recover from potential failures, improving the performance of three state-of-the-art VLA models (pi0-FAST, OpenVLA, OpenVLA-OFT) by up to 22.6% on average across several tasks in Maniskill. Furthermore, FailSafe-VLM could generalize across different spatial configurations, camera viewpoints, object and robotic embodiments. We plan to release the FailSafe code to the community.

failsafe-vlm, large language model, natural language, (15 more...)

arXiv.org Artificial Intelligence

2510.01642

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.47)
Information Technology > Communications > Social Media > Crowdsourcing (0.34)

Add feedback

From Perception Logs to Failure Modes: Language-Driven Semantic Clustering of Failures for Robot Safety

Gupta, Aryaman, Ciftci, Yusuf Umut, Bansal, Somil

arXiv.org Artificial IntelligenceOct-21-2025

As robotic systems become increasingly integrated into real-world environments -- ranging from autonomous vehicles to household assistants -- they inevitably encounter diverse and unstructured scenarios that lead to failures. While such failures pose safety and reliability challenges, they also provide rich perceptual data for improving future performance. However, manually analyzing large-scale failure datasets is impractical. In this work, we present a method for automatically organizing large-scale robotic failure data into semantically meaningful failure clusters, enabling scalable learning from failure without human supervision. Our approach leverages the reasoning capabilities of Multimodal Large Language Models (MLLMs), trained on internet-scale data, to infer high-level failure causes from raw perceptual trajectories and discover interpretable structure within uncurated failure logs. These semantic clusters reveal patterns and hypothesized causes of failure, enabling scalable learning from experience. We demonstrate that the discovered failure modes can guide targeted data collection for policy refinement, accelerating iterative improvement in agent policies and overall safety. Additionally, we show that these semantic clusters can benefit online failure monitoring systems, offering a lightweight yet powerful safeguard for real-time operation. We demonstrate that this framework enhances robot learning and robustness by transforming real-world failures into actionable and interpretable signals for adaptation.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2506.0657

Country: North America > United States > California (0.28)

Genre: Research Report (0.64)

Industry: Transportation > Ground > Road (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)
(3 more...)

Add feedback

Counterexample Guided RL Policy Refinement Using Bayesian Optimization

Neural Information Processing SystemsAug-17-2025, 04:44:31 GMT

Constructing Reinforcement Learning (RL) policies that adhere to safety requirements is an emerging field of study.

machine learning, reinforcement learning, trajectory, (14 more...)

Neural Information Processing Systems

Country:

Asia > India > West Bengal > Kharagpur (0.04)
North America > United States > Montana (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Industry: Transportation (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Failure Probability Estimation for Black-Box Autonomous Systems using State-Dependent Importance Sampling Proposals

Delecki, Harrison, Katz, Sydney M., Kochenderfer, Mykel J.

arXiv.org Machine LearningDec-2-2024

Estimating the probability of failure is a critical step in developing safety-critical autonomous systems. Direct estimation methods such as Monte Carlo sampling are often impractical due to the rarity of failures in these systems. Existing importance sampling approaches do not scale to sequential decision-making systems with large state spaces and long horizons. We propose an adaptive importance sampling algorithm to address these limitations. Our method minimizes the forward Kullback-Leibler divergence between a state-dependent proposal distribution and a relaxed form of the optimal importance sampling distribution. Our method uses Markov score ascent methods to estimate this objective. We evaluate our approach on four sequential systems and show that it provides more accurate failure probability estimates than baseline Monte Carlo and importance sampling techniques. This work is open sourced.

disturbance, proposal, trajectory, (16 more...)

arXiv.org Machine Learning

2412.02154

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > United States > California > Santa Clara County > Stanford (0.04)

Genre: Research Report (1.00)

Industry: Transportation > Air (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.62)

Add feedback

Trial and Error: Exploration-Based Trajectory Optimization for LLM Agents

Song, Yifan, Yin, Da, Yue, Xiang, Huang, Jie, Li, Sujian, Lin, Bill Yuchen

arXiv.org Artificial IntelligenceJul-10-2024

Large Language Models (LLMs) have become integral components in various autonomous agent systems. In this study, we present an exploration-based trajectory optimization approach, referred to as ETO. This learning method is designed to enhance the performance of open LLM agents. Contrary to previous studies that exclusively train on successful expert trajectories, our method allows agents to learn from their exploration failures. This leads to improved performance through an iterative optimization framework. During the exploration phase, the agent interacts with the environment while completing given tasks, gathering failure trajectories to create contrastive trajectory pairs. In the subsequent training phase, the agent utilizes these trajectory preference pairs to update its policy using contrastive learning methods like DPO. This iterative cycle of exploration and training fosters continued improvement in the agents. Our experiments on three complex tasks demonstrate that ETO consistently surpasses baseline performance by a large margin. Furthermore, an examination of task-solving efficiency and potential in scenarios lacking expert trajectory underscores the effectiveness of our approach.

agent, expert trajectory, trajectory, (15 more...)

arXiv.org Artificial Intelligence

2403.02502

Country:

North America > United States > Ohio (0.04)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.97)

Add feedback

Degradation Modeling and Prognostic Analysis Under Unknown Failure Modes

Fu, Ying, Huh, Ye Kwon, Liu, Kaibo

arXiv.org Artificial IntelligenceFeb-29-2024

Operating units often experience various failure modes in complex systems, leading to distinct degradation paths. Relying on a prognostic model trained on a single failure mode may lead to poor generalization performance across multiple failure modes. Therefore, accurately identifying the failure mode is of critical importance. Current prognostic approaches either ignore failure modes during degradation or assume known failure mode labels, which can be challenging to acquire in practice. Moreover, the high dimensionality and complex relations of sensor signals make it challenging to identify the failure modes accurately. To address these issues, we propose a novel failure mode diagnosis method that leverages a dimension reduction technique called UMAP (Uniform Manifold Approximation and Projection) to project and visualize each unit's degradation trajectory into a lower dimension. Then, using these degradation trajectories, we develop a time series-based clustering method to identify the training units' failure modes. Finally, we introduce a monotonically constrained prognostic model to predict the failure mode labels and RUL of the test units simultaneously using the obtained failure modes of the training units. The proposed prognostic model provides failure mode-specific RUL predictions while preserving the monotonic property of the RUL predictions across consecutive time steps. We evaluate the proposed model using a case study with the aircraft gas turbine engine dataset.

failure mode, rul prediction, trajectory, (14 more...)

arXiv.org Artificial Intelligence

2402.19294

Country:

North America > United States > Wisconsin > Dane County > Madison (0.14)
North America > United States > Washington > King County > Seattle (0.04)
North America > United States > Texas > Dallas County > Dallas (0.04)

Genre: Research Report > New Finding (0.46)

Industry:

Aerospace & Defense (1.00)
Energy (0.66)
Transportation > Air (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Model-based Validation as Probabilistic Inference

Delecki, Harrison, Corso, Anthony, Kochenderfer, Mykel J.

arXiv.org Artificial IntelligenceMay-16-2023

Estimating the distribution over failures is a key step in validating autonomous systems. Existing approaches focus on finding failures for a small range of initial conditions or make restrictive assumptions about the properties of the system under test. We frame estimating the distribution over failure trajectories for sequential systems as Bayesian inference. Our model-based approach represents the distribution over failure trajectories using rollouts of system dynamics and computes trajectory gradients using automatic differentiation. Our approach is demonstrated in an inverted pendulum control system, an autonomous vehicle driving scenario, and a partially observable lunar lander. Sampling is performed using an off-the-shelf implementation of Hamiltonian Monte Carlo with multiple chains to capture multimodality and gradient smoothing for safe trajectories. In all experiments, we observed improvements in sample efficiency and parameter space coverage compared to black-box baseline approaches. This work is open sourced.

machine learning, reinforcement learning, trajectory, (19 more...)

arXiv.org Artificial Intelligence

2305.0993

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > United States > California > Santa Clara County > Stanford (0.04)

Genre: Research Report (0.64)

Industry: Transportation > Air (0.89)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.90)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.88)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.68)

Add feedback

Adaptive Failure Search Using Critical States from Domain Experts

Du, Peter, Driggs-Campbell, Katherine

arXiv.org Artificial IntelligenceApr-1-2023

Uncovering potential failure cases is a crucial step in the validation of safety critical systems such as autonomous vehicles. Failure search may be done through logging substantial vehicle miles in either simulation or real world testing. Due to the sparsity of failure events, naive random search approaches require significant amounts of vehicle operation hours to find potential system weaknesses. As a result, adaptive searching techniques have been proposed to efficiently explore and uncover failure trajectories of an autonomous policy in simulation. Adaptive Stress Testing (AST) is one such method that poses the problem of failure search as a Markov decision process and uses reinforcement learning techniques to find high probability failures. However, this formulation requires a probability model for the actions of all agents in the environment. In systems where the environment actions are discrete and dependencies among agents exist, it may be infeasible to fully characterize the distribution or find a suitable proxy. This work proposes the use of a data driven approach to learn a suitable classifier that tries to model how humans identify {critical states and use this to guide failure search in AST. We show that the incorporation of critical states into the AST framework generates failure scenarios with increased safety violations in an autonomous driving policy with a discrete action space.

artificial intelligence, machine learning, trajectory, (17 more...)

arXiv.org Artificial Intelligence

2304.00365

Country:

North America > United States > Illinois (0.04)
Asia > India (0.04)

Genre: Research Report (0.40)

Industry:

Transportation > Ground > Road (0.90)
Automobiles & Trucks (0.88)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback