AITopics | substate

AutoEval: A Practical Framework for Autonomous Evaluation of Mobile Agents

Sun, Jiahui, Hua, Zhichao, Xia, Yubin

arXiv.org Artificial Intelligence, Mar-4-2025

Accurate and systematic evaluation of mobile agents can significantly advance their development and real-world applicability. However, existing benchmarks for mobile agents lack practicality and scalability due to the extensive manual effort required to define task reward signals and implement the corresponding evaluation code. To this end, we propose AutoEval, an autonomous agent evaluation framework that tests a mobile agent without any manual effort. First, we design a Structured Substate Representation to describe the UI state changes during agent execution, such that task reward signals can be automatically generated. Second, we utilize a Judge System that can autonomously evaluate agents' performance given the automatically generated task reward signals. Given only a task description, our framework evaluates agents and provides fine-grained performance feedback on that task without any extra manual effort. We implement a prototype of our framework and validate the automatically generated task reward signals, finding over 93% coverage of human-annotated reward signals. Moreover, to demonstrate the effectiveness of our autonomous Judge System, we manually verify its judgments and show that it achieves 94% accuracy. Finally, we evaluate state-of-the-art mobile agents using our framework, providing detailed insights into their performance characteristics and limitations.
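The abstract does not spell out the substate schema or the Judge System's internals, so the following is only a minimal sketch under stated assumptions: a substate is modeled as a per-page dictionary of UI fields, a reward signal is an expected field value, and coverage is the fraction of human-annotated signals that also appear in the generated set. The names `RewardSignal`, `judge`, and `coverage` are hypothetical and are not taken from AutoEval.

```python
from dataclasses import dataclass


# Hypothetical type: the abstract does not define the real substate schema.
@dataclass(frozen=True)
class RewardSignal:
    page: str      # UI page the signal refers to
    key: str       # substate field on that page, e.g. "alarm_time"
    expected: str  # value the field should reach for the signal to fire


def judge(signals, trace):
    """Return per-signal pass/fail over a trace of (page, substate dict) steps."""
    results = {}
    for s in signals:
        results[s] = any(
            page == s.page and substate.get(s.key) == s.expected
            for page, substate in trace
        )
    return results


def coverage(generated, annotated):
    """Fraction of human-annotated signals that also occur in the generated set."""
    annotated = set(annotated)
    if not annotated:
        return 1.0
    return len(set(generated) & annotated) / len(annotated)


# Toy task: "set an alarm for 07:00 and enable it".
signals = [
    RewardSignal("clock", "alarm_time", "07:00"),
    RewardSignal("clock", "alarm_enabled", "true"),
]
trace = [
    ("home", {}),
    ("clock", {"alarm_time": "07:00", "alarm_enabled": "false"}),
]
results = judge(signals, trace)
```

Reporting one pass/fail result per signal, rather than a single task-level verdict, is one way to read the "fine-grained performance feedback" the abstract mentions.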

agent, pagenode, substate, (14 more...)

arXiv:2503.02403

Country:

North America > Mexico (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)
(4 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Communications > Mobile (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Model Checking of vGOAL

Yang, Yi, Holvoet, Tom

arXiv.org Artificial Intelligence, Jun-24-2024

Developing autonomous decision-making requires safety assurance. Agent programming languages like AgentSpeak and Gwendolen provide tools for programming autonomous decision-making. However, despite numerous efforts to apply model checking to these languages, challenges persist, such as a faithful semantic mapping between agent programs and the generated models, efficient model generation, and efficient model checking. As an extension of the agent programming language GOAL, vGOAL has been proposed to formally specify autonomous decisions with an emphasis on safety. This paper tackles these challenges through two automated model-checking processes for vGOAL: one for Computation Tree Logic (CTL) and another for Probabilistic Computation Tree Logic (PCTL). Compared with existing model-checking approaches for agent programming languages, our approach has three main advantages. First, it efficiently performs automated model-checking analysis for a given vGOAL specification, including efficiently generating input models for NuSMV and Storm and leveraging these efficient model checkers. Second, semantic equivalence is established for both nondeterministic and probabilistic models of vGOAL: from vGOAL to transition systems or discrete-time Markov chains (DTMCs). Third, an algorithm is proposed for efficiently detecting errors, which is particularly useful for vGOAL specifications that describe complex scenarios. Validation and experiments in a real-world autonomous logistics system with three autonomous mobile robots illustrate both the efficiency and practical usability of the automated CTL and PCTL model-checking process for vGOAL.
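To make the CTL side concrete, here is a small hand-rolled sketch of checking a safety property on an explicit transition system. It is not the paper's pipeline (which generates models for NuSMV and Storm) and does not use either tool; the toy robot states and the function names `ef` and `ag` are assumptions made for illustration. It computes EF as a backward-reachability fixpoint and derives AG from the standard identity AG p = ¬EF ¬p.

```python
def ef(transitions, target):
    """States satisfying EF target: least fixpoint via backward reachability."""
    sat = set(target)
    changed = True
    while changed:
        changed = False
        for src, dsts in transitions.items():
            if src not in sat and any(d in sat for d in dsts):
                sat.add(src)
                changed = True
    return sat


def ag(transitions, good):
    """States satisfying AG good, using AG p == not EF (not p)."""
    states = set(transitions) | {d for dsts in transitions.values() for d in dsts}
    bad = states - set(good)
    return states - ef(transitions, bad)


# Hypothetical robot model: "collision" is the only unsafe state.
unsafe_model = {
    "idle": ["moving"],
    "moving": ["docked", "collision"],
    "docked": ["idle"],
    "collision": ["collision"],
}
safe_model = dict(unsafe_model, moving=["docked"])  # collision edge removed
safe_states = {"idle", "moving", "docked"}

holds_in_unsafe = "idle" in ag(unsafe_model, safe_states)
holds_in_safe = "idle" in ag(safe_model, safe_states)
```

In the first model a collision is reachable from `idle`, so the safety property fails there; removing the single offending edge makes it hold. A symbolic checker such as NuSMV performs the same fixpoint computation over a far more compact state encoding.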

agent, transition system, vgoal specification, (13 more...)

arXiv:2406.17206

Country:

Europe > Belgium > Flanders > Flemish Brabant > Leuven (0.04)
North America > United States (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)
Europe > Portugal > Lisbon > Lisbon (0.04)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)

Compact Belief State Representation for Task Planning

Safronov, Evgenii, Colledanchise, Michele, Natale, Lorenzo

arXiv.org Artificial Intelligence, Aug-21-2020

Task planning in probabilistic belief state domains allows generating complex and robust execution policies in domains affected by state uncertainty. The performance of a task planner relies on the belief state representation. However, current belief state representations easily become intractable as the number of variables and the execution time grow. To address this problem, we developed a novel belief state representation based on Cartesian product and union operations over belief substates. These two operations and single-variable assignment nodes form an And-Or directed acyclic graph of Belief State (AOBS). We show how to apply actions with probabilistic outcomes and measure the probability of conditions holding over a belief state. We evaluated AOBS performance in simulated forward state space exploration. We compared the size of AOBS with the size of Binary Decision Diagrams (BDDs) that were previously used to represent belief states. We show that the AOBS representation is not only much more compact than a full belief state but also scales better than BDDs in most cases.
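The structure described above can be sketched roughly as follows. This is an illustration under assumptions, not the authors' data structure: the abstract does not say where probabilities are stored, so this sketch attaches weights to the branches of union (Or) nodes and assumes the children of a product (And) node cover disjoint variables. The class and function names are hypothetical.

```python
from dataclasses import dataclass
from math import prod
from typing import Tuple, Union


@dataclass(frozen=True)
class Assign:   # leaf: a single-variable assignment
    var: str
    value: object


@dataclass(frozen=True)
class And:      # Cartesian product of substates over disjoint variables
    children: Tuple["Node", ...]


@dataclass(frozen=True)
class Or:       # union of alternative substates; weights are an assumption
    branches: Tuple[Tuple[float, "Node"], ...]


Node = Union[Assign, And, Or]


def variables(node):
    """Set of variable names mentioned below this node."""
    if isinstance(node, Assign):
        return {node.var}
    if isinstance(node, And):
        return set().union(*(variables(c) for c in node.children))
    return set().union(*(variables(c) for _, c in node.branches))


def prob(node, var, value):
    """Probability that var == value in the belief state rooted at node."""
    if isinstance(node, Assign):
        return 1.0 if (node.var == var and node.value == value) else 0.0
    if isinstance(node, And):
        # Variables are disjoint, so at most one child determines var.
        for child in node.children:
            if var in variables(child):
                return prob(child, var, value)
        return 0.0
    return sum(w * prob(child, var, value) for w, child in node.branches)


def count_states(node):
    """Number of concrete states a fully enumerated belief state would hold."""
    if isinstance(node, Assign):
        return 1
    if isinstance(node, And):
        return prod(count_states(c) for c in node.children)
    return sum(count_states(c) for _, c in node.branches)


door = Or(((0.7, Assign("door", "open")), (0.3, Assign("door", "closed"))))
obj = Or(((0.5, Assign("obj", "table")), (0.5, Assign("obj", "shelf"))))
belief = And((door, obj))
```

Here the graph holds four leaves yet stands for `count_states(belief)` concrete states; with n independent binary variables the enumerated belief state grows as 2^n while this graph grows linearly, which is the kind of compactness the abstract claims.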

artificial intelligence, belief revision, belief state, (18 more...)

arXiv:2008.10386

Country: Europe > Italy > Liguria > Genoa (0.04)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Belief Revision (1.00)