error recovery


Conditional Multi-Stage Failure Recovery for Embodied Agents

arXiv.org Artificial Intelligence

Embodied agents performing complex tasks are susceptible to execution failures, motivating the need for effective failure recovery mechanisms. In this work, we introduce a conditional multistage failure recovery framework that employs zero-shot chain prompting. The framework is structured into four error-handling stages, with three operating during task execution and one functioning as a post-execution reflection phase. Our approach utilises the reasoning capabilities of LLMs to analyse execution challenges within their environmental context and devise strategic solutions. We evaluate our method on the TfD benchmark of the TEACH dataset and achieve state-of-the-art performance, outperforming a baseline without error recovery by 11.5% and surpassing the strongest existing model by 19%.


Human-AI Interaction and User Satisfaction: Empirical Evidence from Online Reviews of AI Products

arXiv.org Artificial Intelligence

Human-AI Interaction (HAI) guidelines and design principles have become increasingly important in both industry and academia to guide the development of AI systems that align with user needs and expectations. However, large-scale empirical evidence on how HAI principles shape user satisfaction in practice remains limited. This study addresses that gap by analyzing over 100,000 user reviews of AI-related products from G2.com, a leading review platform for business software and services. Based on widely adopted industry guidelines, we identify seven core HAI dimensions and examine their coverage and sentiment within the reviews. We find that the sentiment on four HAI dimensions (adaptability, customization, error recovery, and security) is positively associated with overall user satisfaction. Moreover, we show that engagement with HAI dimensions varies by professional background: users with technical job roles are more likely to discuss system-focused aspects, such as reliability, while non-technical users emphasize interaction-focused features like customization and feedback. Interestingly, the relationship between HAI sentiment and overall satisfaction is not moderated by job role, suggesting that once an HAI dimension has been identified by users, its effect on satisfaction is consistent across job roles.


Learning to Recover from Plan Execution Errors during Robot Manipulation: A Neuro-symbolic Approach

arXiv.org Artificial Intelligence

Automatically detecting and recovering from failures is an important but challenging problem for autonomous robots. Most of the recent work on learning to plan from demonstrations lacks the ability to detect and recover from errors in the absence of an explicit state representation and/or a (sub-)goal check function. We propose an approach (blending learning with symbolic search) for automated error discovery and recovery, without needing annotated data of failures. Central to our approach is a neuro-symbolic state representation, in the form of a dense scene graph, structured based on the objects present within the environment. This enables efficient learning of the transition function and of a discriminator that not only identifies failures but also localizes them, facilitating fast re-planning via computation of a heuristic distance function. We also present an anytime version of our algorithm, where instead of recovering to the last correct state, we search for a sub-goal in the original plan that minimizes the total distance to the goal given a re-planning budget. Experiments on a physics simulator with a variety of simulated failures show the effectiveness of our approach compared to existing baselines, in terms of both the efficiency and the accuracy of our recovery mechanism.
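
The anytime recovery idea in the abstract above (pick a sub-goal in the original plan that minimizes the total distance to the goal, within a re-planning budget) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `pick_recovery_subgoal`, the representation of the plan as a list of states, the caller-supplied `distance` heuristic, and the interpretation of the budget as a cap on the number of candidate sub-goals examined are all assumptions made for illustration.

```python
from typing import Callable, Sequence, TypeVar

State = TypeVar("State")


def pick_recovery_subgoal(
    current: State,
    plan: Sequence[State],
    distance: Callable[[State, State], float],
    budget: int,
) -> int:
    """Return the index of the plan sub-goal with the lowest estimated total cost.

    Hypothetical helper: total cost = distance(current, sub-goal)
    + estimated cost of following the original plan from that sub-goal
    to the final goal. At most `budget` candidates are examined
    (in plan order, an assumption), and the best one found so far is returned.
    """
    if not plan:
        raise ValueError("plan must contain at least one sub-goal")
    if budget < 1:
        raise ValueError("budget must allow at least one candidate")

    # cost_to_goal[i]: summed step distances along the original plan from i to the end.
    cost_to_goal = [0.0] * len(plan)
    for i in range(len(plan) - 2, -1, -1):
        cost_to_goal[i] = cost_to_goal[i + 1] + distance(plan[i], plan[i + 1])

    best_index, best_cost = 0, float("inf")
    for i in range(min(budget, len(plan))):
        total = distance(current, plan[i]) + cost_to_goal[i]
        if total < best_cost:  # strict comparison keeps the earliest minimum
            best_index, best_cost = i, total
    return best_index


# Toy usage: states are points on a line and the heuristic is absolute difference.
toy_plan = [0, 2, 4, 6]
chosen = pick_recovery_subgoal(3, toy_plan, lambda a, b: abs(a - b), budget=4)
```

A larger budget lets the search consider sub-goals further along the plan, so the chosen recovery target can only stay the same or improve as the budget grows, which is the property that makes the procedure "anytime" in this sketch.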


Dissociation of Faithful and Unfaithful Reasoning in LLMs

arXiv.org Artificial Intelligence

Large language models (LLMs) improve their performance in downstream tasks when they generate Chain of Thought reasoning text before producing an answer. Our research investigates how LLMs recover from errors in Chain of Thought, reaching the correct final answer despite mistakes in the reasoning text. Through analysis of these error recovery behaviors, we find evidence for unfaithfulness in Chain of Thought, but we also identify many clear examples of faithful error recovery behaviors. We identify factors that shift LLM recovery behavior: LLMs recover more frequently from obvious errors and in contexts that provide more evidence for the correct answer. However, unfaithful recoveries show the opposite behavior, occurring more frequently for more difficult error positions. Our results indicate that there are distinct mechanisms driving faithful and unfaithful error recoveries. Our results challenge the view that LLM reasoning is a uniform, coherent process.


The 3 AIs needed to Create truly Intelligent Assistants and Chatbots (Part II)

#artificialintelligence

In Part I, we talked about the state of Artificial Intelligence (AI) and how it is not enough to guarantee success. There are three other types of AIs needed for applications to be truly intelligent. We already covered the first -- Aided Introductions (aka Onboarding). Now, let's look at the second one: At its core, a "conversational experience" is based on the premise of good communication between the system and the user. Since users should always have the freedom to decide their preferred modality -- tapping, typing or speaking -- it is the system's responsibility to always make its current state visible.


Language access to distributed data with error recovery

Classics

This paper discusses an effort in the application of artificial intelligence to the access of data from a large, distributed data base over a computer network. A running system is described that provides real-time access over the ARPANET to a data base distributed over several machines. The system accepts a rather wide range of natural language questions about the data, plans a sequence of appropriate queries to the data base management system to answer the question, determines on which machine(s) to carry out the queries, establishes links to those machines over the ARPANET, monitors the prosecution of the queries and recovers from certain errors in execution, and prepares a relevant answer. In addition to the components that make up the demonstration system, more sophisticated functionally equivalent components are discussed and proposed. The work described in this paper represents the joint efforts of an integrated, energetic group at SRI. Members of this group include Rich Fikes (now at Xerox PARC), Koichi Furukawa (now at ETL).