Goto

Collaborating Authors

 equivalence query





Active Learning of Symbolic Automata Over Rational Numbers

Hagedorn, Sebastian, Muñoz, Martín, Riveros, Cristian, Icarte, Rodrigo Toro

arXiv.org Artificial Intelligence

Automata learning has many applications in artificial intelligence and software engineering. Central to these applications is the $L^*$ algorithm, introduced by Angluin. The $L^*$ algorithm learns deterministic finite-state automata (DFAs) in polynomial time when provided with a minimally adequate teacher. Unfortunately, the $L^*$ algorithm can only learn DFAs over finite alphabets, which limits its applicability. In this paper, we extend $L^*$ to learn symbolic automata whose transitions use predicates over rational numbers, i.e., over infinite and dense alphabets. Our result makes the $L^*$ algorithm applicable to new settings like (real) RGX, and time series. Furthermore, our proposed algorithm is optimal in the sense that it asks a number of queries to the teacher that is at most linear with respect to the number of transitions, and to the representation size of the predicates.


Towards Verified Code Reasoning by LLMs

Sistla, Meghana, Balakrishnan, Gogul, Rondon, Pat, Cambronero, José, Tufano, Michele, Chandra, Satish

arXiv.org Artificial Intelligence

While LLM-based agents are able to tackle a wide variety of code reasoning questions, the answers are not always correct. This prevents the agent from being useful in situations where high precision is desired: (1) helping a software engineer understand a new code base, (2) helping a software engineer during code review sessions, and (3) ensuring that the code generated by an automated code generation system meets certain requirements (e.g. fixes a bug, improves readability, implements a feature). As a result of this lack of trustworthiness, the agent's answers need to be manually verified before they can be trusted. Manually confirming responses from a code reasoning agent requires human effort and can result in slower developer productivity, which weakens the assistance benefits of the agent. In this paper, we describe a method to automatically validate the answers provided by a code reasoning agent by verifying its reasoning steps. At a very high level, the method consists of extracting a formal representation of the agent's response and, subsequently, using formal verification and program analysis tools to verify the agent's reasoning steps. We applied this approach to a benchmark set of 20 uninitialized variable errors detected by sanitizers and 20 program equivalence queries. For the uninitialized variable errors, the formal verification step was able to validate the agent's reasoning on 13/20 examples, and for the program equivalence queries, the formal verification step successfully caught 6/8 incorrect judgments made by the agent.






On Exact Learning of $d$-Monotone Functions

Bshouty, Nader H.

arXiv.org Artificial Intelligence

In this paper, we study the learnability of the Boolean class of $d$-monotone functions $f:{\cal X}\to\{0,1\}$ from membership and equivalence queries, where $({\cal X},\le)$ is a finite lattice. We show that the class of $d$-monotone functions that are represented in the form $f=F(g_1,g_2,\ldots,g_d)$, where $F$ is any Boolean function $F:\{0,1\}^d\to\{0,1\}$ and $g_1,\ldots,g_d:{\cal X}\to \{0,1\}$ are any monotone functions, is learnable in time $\sigma({\cal X})\cdot (size(f)/d+1)^{d}$ where $\sigma({\cal X})$ is the maximum sum of the number of immediate predecessors in a chain from the largest element to the smallest element in the lattice ${\cal X}$ and $size(f)=size(g_1)+\cdots+size(g_d)$, where $size(g_i)$ is the number of minimal elements in $g_i^{-1}(1)$. For the Boolean function $f:\{0,1\}^n\to\{0,1\}$, the class of $d$-monotone functions that are represented in the form $f=F(g_1,g_2,\ldots,g_d)$, where $F$ is any Boolean function and $g_1,\ldots,g_d$ are any monotone DNF, is learnable in time $O(n^2)\cdot (size(f)/d+1)^{d}$ where $size(f)=size(g_1)+\cdots+size(g_d)$. In particular, this class is learnable in polynomial time when $d$ is constant. Additionally, this class is learnable in polynomial time when $size(g_i)$ is constant for all $i$ and $d=O(\log n)$.