Santolucito, Mark
Research Vision: Multi-Agent Path Planning for Cops And Robbers Via Reactive Synthesis
Fishell, William, Rodriguez, Andoni, Santolucito, Mark
Reactive synthesis is classically modeled as a game, though it is most often applied to domains such as arbiter circuits and communication protocols [1]. We aim to show how reactive synthesis can be applied to a literal game - cops and robbers - to generate strategies for the agents in that game. We propose a game that requires the coordination of multiple agents over a space of datatypes and operations richer than is easily captured by the traditional Linear Temporal Logic (LTL) approach of synthesis over Boolean streams [2]. In particular, we draw inspiration from prior work on Coordination Synthesis [3], LTL modulo theories (LTLt) [4], and Temporal Stream Logic Modulo Theories (TSL-MT) [5, 6] to describe our problem and potential solution spaces. The traditional game [7] asks whether K cops can catch a single robber on a graph. In a temporal logic setting, this amounts to a safety condition for the robber (it is never caught by the cops) and the dual liveness condition for the cops (they eventually catch the robber). We modify the traditional, graph-theory-focused version of the game into a more visual game on a grid system, allowing for various configurations, including an environment with various node types such as walls, safe zones, and open spaces.
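As a purely illustrative sketch (not the encoding used in the paper), suppose each cop i has grid coordinates (x_{c_i}, y_{c_i}), the robber has coordinates (x_r, y_r), and a signal caught is derived from them. The dual objectives can then be written in an LTL-modulo-theories style as:

    \mathit{caught} \;\leftrightarrow\; \bigvee_{i=1}^{K} \big( x_{c_i} = x_r \,\wedge\, y_{c_i} = y_r \big)

    \text{robber (safety):} \quad \square\, \neg \mathit{caught}
    \qquad
    \text{cops (liveness):} \quad \lozenge\, \mathit{caught}

The equalities over grid coordinates are exactly the kind of theory reasoning that plain Boolean LTL does not capture directly, which is what motivates the LTLt and TSL-MT perspective above.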
Using a Feedback Loop for LLM-based Infrastructure as Code Generation
Palavalli, Mayur Amarnath, Santolucito, Mark
Code generation with Large Language Models (LLMs) has helped to increase software developer productivity in coding tasks, but has yet to have a significant impact on the software development tasks that surround the code itself. In particular, the challenge of infrastructure management remains an open problem. We investigate the ability of an LLM agent to construct infrastructure using the Infrastructure as Code (IaC) paradigm. Specifically, we investigate the use of a feedback loop that returns errors and warnings on the generated IaC, allowing the LLM agent to improve the code. We find that the effectiveness of the loop decreases roughly exponentially with each iteration, until it plateaus and further iterations become ineffective.
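A minimal sketch of such a feedback loop, assuming hypothetical callables generate (the LLM call) and check (an IaC linter or validation step); neither name comes from the paper:

    from typing import Callable, List

    def iac_feedback_loop(
        generate: Callable[[str], str],     # hypothetical LLM call: prompt -> IaC template
        check: Callable[[str], List[str]],  # hypothetical checker: template -> errors/warnings
        prompt: str,
        max_iters: int = 5,
    ) -> str:
        """Iteratively regenerate IaC, feeding checker output back to the LLM."""
        template = generate(prompt)
        for _ in range(max_iters):
            issues = check(template)
            if not issues:
                break  # the template passes all checks
            feedback = prompt + "\nThe previous attempt had these issues:\n" + "\n".join(issues)
            template = generate(feedback)
        return template

The loop bound matters in practice: per the finding above, later iterations yield diminishing returns, so a small max_iters is usually enough.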
Combining LLM Code Generation with Formal Specifications and Reactive Program Synthesis
Murphy, William, Holzer, Nikolaus, Qiao, Feitong, Cui, Leyi, Rothkopf, Raven, Koenig, Nathan, Santolucito, Mark
In the past few years, Large Language Models (LLMs) have exploded in usefulness and popularity for code generation tasks. However, LLMs still struggle with accuracy and are unsuitable for high-risk applications without additional oversight and verification. In particular, they perform poorly at generating code for highly complex systems, especially those with unusual or out-of-sample logic. For such systems, verifying the code generated by the LLM may take longer than writing it by hand. We introduce a solution that divides code generation into two parts: one handled by an LLM and one handled by formal methods-based program synthesis. We develop a benchmark to test our solution and show that our method allows the pipeline to solve problems previously intractable for LLM code generation.
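A rough sketch of this division of labor, assuming hypothetical callables synthesize (a reactive synthesis tool turning a temporal specification into controller code) and llm_generate (an LLM call producing individual function bodies); neither name refers to a specific tool from the paper:

    from typing import Callable, Dict

    def split_pipeline(
        spec: str,                            # temporal specification of the control logic
        data_tasks: Dict[str, str],           # natural-language descriptions of the data functions
        synthesize: Callable[[str], str],     # hypothetical synthesis tool: spec -> controller code
        llm_generate: Callable[[str], str],   # hypothetical LLM call: description -> function body
    ) -> str:
        """Two-part pipeline sketch: synthesis handles control, the LLM handles data.

        The controller is synthesized from the specification (and so is correct by
        construction with respect to it), while the LLM fills in the data functions
        it is well suited to writing. The pieces are then joined into one program.
        """
        controller = synthesize(spec)
        functions = [llm_generate(desc) for desc in data_tasks.values()]
        return "\n\n".join(functions + [controller])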
Guiding LLM Temporal Logic Generation with Explicit Separation of Data and Control
Murphy, William, Holzer, Nikolaus, Koenig, Nathan, Cui, Leyi, Rothkopf, Raven, Qiao, Feitong, Santolucito, Mark
Temporal logics are powerful tools that are widely used for the synthesis and verification of reactive systems. Recent progress on Large Language Models (LLMs) has the potential to make the process of writing such specifications more accessible. However, writing specifications in temporal logics remains challenging for all but the most expert users. A key question in using LLMs for temporal logic specification engineering is what kind of guidance best helps the LLM and its users produce specifications. Looking specifically at the problem of reactive program synthesis, we explore the impact of providing an LLM with guidance on the separation of control and data: making explicit for the LLM which functionality is relevant to the specification, and treating the remaining functionality as an implementation detail behind a series of pre-defined functions and predicates. We present a benchmark set and find that this separation of concerns improves specification generation. Our benchmark provides a test set against which to verify future work on LLM generation of temporal logic specifications.
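As a hypothetical illustration (the names tooClose, retreat, and advance are placeholders, not taken from the benchmark), the guidance would tell the LLM that tooClose is a pre-defined predicate and retreat and advance are pre-defined functions over a position variable pos, whose implementations are out of scope; the LLM then only has to produce the control-level specification, for instance in a TSL-like style:

    \square \big( \mathit{tooClose}(pos) \rightarrow [\![\, pos \leftarrow \mathit{retreat}(pos) \,]\!] \big)
    \;\wedge\;
    \square \big( \neg\mathit{tooClose}(pos) \rightarrow [\![\, pos \leftarrow \mathit{advance}(pos) \,]\!] \big)

The data-level meaning of tooClose, retreat, and advance is deliberately deferred to a later implementation step, which is exactly the separation of concerns the guidance makes explicit.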
Succinct Explanations With Cascading Decision Trees
Zhang, Jialu, Santolucito, Mark, Piskac, Ruzica
Classic decision tree learning is a binary classification algorithm that constructs models with first-class transparency - every classification has a directly derivable explanation. However, learning decision trees on modern datasets generates large trees, which in turn generate decision paths of excessive depth, obscuring the explanation of classifications. To improve the comprehensibility of classifications, we propose a new decision tree model that we call Cascading Decision Trees. Cascading Decision Trees shorten the explanations of classifications without sacrificing overall model performance. Our key insight is to separate the notion of a decision path from that of an explanation path. Using this insight, instead of building one monolithic decision tree, we build several smaller decision subtrees and cascade them in sequence. Our cascading decision subtrees are designed to specifically target explanations for positive classifications: each subtree identifies the smallest set of features that can classify as many positive samples as possible without misclassifying any negative samples. Applying cascading decision trees to new samples results in a significantly shorter and more succinct explanation whenever one of the subtrees detects a positive classification. In that case, we immediately stop and report the decision path of only the current subtree to the user as the explanation for the classification. We evaluate our algorithm on standard datasets as well as new real-world applications, and find that our model shortens the explanation depth by over 40.8% for positive classifications compared to the classic decision tree model.
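A minimal sketch of the inference side of this idea, assuming the cascade has already been trained (the training procedure that targets positives without misclassifying negatives is the paper's contribution and is not shown here); scikit-learn decision trees are used purely for illustration:

    import numpy as np
    from sklearn.tree import DecisionTreeClassifier

    def cascade_predict(subtrees, x):
        """Apply small decision subtrees in sequence and stop at the first positive.

        subtrees: list of fitted DecisionTreeClassifier objects, each trained to
        flag positives without misclassifying negatives. Returns (label, explanation),
        where the explanation is the decision path through only the subtree that
        fired, typically much shorter than a path through one monolithic tree.
        """
        x = np.asarray(x).reshape(1, -1)
        for tree in subtrees:
            if tree.predict(x)[0] == 1:
                indicator = tree.decision_path(x)          # sparse node-indicator matrix
                nodes = indicator.indices[indicator.indptr[0]:indicator.indptr[1]]
                explanation = [
                    (tree.tree_.feature[n], tree.tree_.threshold[n])
                    for n in nodes
                    if tree.tree_.feature[n] >= 0           # drop leaf nodes
                ]
                return 1, explanation
        return 0, None  # no subtree fired: report a negative classification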