Goto

Collaborating Authors

 Logic & Formal Reasoning






Unifying Inference-Time Planning Language Generation

arXiv.org Artificial Intelligence

A line of work in planning uses LLM not to generate a plan, but to generate a formal representation in some planning language, which can be input into a symbolic solver to deterministically find a plan. While showing improved trust and promising performance, dozens of recent publications have proposed scattered methods on a variety of benchmarks under different experimental settings. We attempt to unify the inference-time LLM-as-formalizer methodology for classical planning by proposing a unifying framework based on intermediate representations. We thus systematically evaluate more than a dozen pipelines that subsume most existing work, while proposing novel ones that involve syntactically similar but high resource intermediate languages (such as a Python wrapper of PDDL). We provide recipes for planning language generation pipelines, draw a series of conclusions showing the efficacy of their various components, and evidence their robustness against problem complexity.


The analogy theorem in Hoare logic

arXiv.org Machine Learning

The introduction of machine learning methods has led to significant advances in automation, optimization, and discoveries in various fields of science and technology. However, their widespread application faces a fundamental limitation: the transfer of models between data domains generally lacks a rigorous mathematical justification. The key problem is the lack of formal criteria to guarantee that a model trained on one type of data will retain its properties on another.This paper proposes a solution to this problem by formalizing the concept of analogy between data sets and models using first-order logic and Hoare logic.We formulate and rigorously prove a theorem that sets out the necessary and sufficient conditions for analogy in the task of knowledge transfer between machine learning models. Practical verification of the analogy theorem on model data obtained using the Monte Carlo method, as well as on MNIST and USPS data, allows us to achieving F1 scores of 0.84 and 0.88 for convolutional neural networks and random forests, respectively.The proposed approach not only allows us to justify the correctness of transfer between domains but also provides tools for comparing the applicability of models to different types of data.The main contribution of the work is a rigorous formalization of analogy at the level of program logic, providing verifiable guarantees of the correctness of knowledge transfer, which opens new opportunities for both theoretical research and the practical use of machine learning models in previously inaccessible areas.


Making Logic a First-Class Citizen in Network Data Generation with ML

arXiv.org Artificial Intelligence

Generative ML models are increasingly popular in networking for tasks such as telemetry imputation, prediction, and synthetic trace generation. Despite their capabilities, they suffer from two shortcomings: (i) their output is often visibly violating well-known networking rules, which undermines their trustworthiness; and (ii) they are difficult to control, frequently requiring retraining even for minor changes. To address these limitations and unlock the benefits of generative models for networking, we propose a new paradigm for integrating explicit network knowledge in the form of first-order logic rules into ML models used for networking tasks. Rules capture well-known relationships among used signals, e.g., that increased latency precedes packet loss. While the idea is conceptually straightforward, its realization is challenging: networking knowledge is rarely formalized into rules, and naively injecting them into ML models often hampers ML's effectiveness. This paper introduces NetNomos a multi-stage framework that (1) learns rules directly from data (e.g., measurements); (2) filters them to distinguish semantically meaningful ones; and (3) enforces them through a collaborative generation between an ML model and an SMT solver.


LLM-Guided Evolutionary Program Synthesis for Quasi-Monte Carlo Design

arXiv.org Artificial Intelligence

Low-discrepancy point sets and digital sequences underpin quasi-Monte Carlo (QMC) methods for high-dimensional integration. We cast two long-standing QMC design problems as program synthesis and solve them with an LLM-guided evolutionary loop that mutates and selects code under task-specific fitness: (i) constructing finite 2D/3D point sets with low star discrepancy, and (ii) choosing Sobol' direction numbers that minimize randomized QMC error on downstream integrands. Our two-phase procedure combines constructive code proposals with iterative numerical refinement. On finite sets, we rediscover known optima in small 2D cases and set new best-known 2D benchmarks for N >= 40, while matching most known 3D optima up to the proven frontier (N <= 8) and reporting improved 3D benchmarks beyond. On digital sequences, evolving Sobol' parameters yields consistent reductions in randomized quasi-Monte Carlo (rQMC) mean-squared error for several 32-dimensional option-pricing tasks relative to widely used Joe--Kuo parameters, while preserving extensibility to any sample size and compatibility with standard randomizations. Taken together, the results demonstrate that LLM-driven evolutionary program synthesis can automate the discovery of high-quality QMC constructions, recovering classical designs where they are optimal and improving them where finite-N structure matters. Data and code are available at https://github.com/hockeyguy123/openevolve-star-discrepancy.git.


Learning Representations Through Contrastive Neural Model Checking

arXiv.org Artificial Intelligence

Model checking is a key technique for verifying safety-critical systems against formal specifications, where recent applications of deep learning have shown promise. However, while ubiquitous for vision and language domains, representation learning remains underexplored in formal verification. We introduce Contrastive Neural Model Checking (CNML), a novel method that leverages the model checking task as a guiding signal for learning aligned representations. CNML jointly embeds logical specifications and systems into a shared latent space through a self-supervised contrastive objective. On industry-inspired retrieval tasks, CNML considerably outperforms both algorithmic and neural baselines in cross-modal and intra-modal settings. We further show that the learned representations effectively transfer to downstream tasks and generalize to more complex formulas. These findings demonstrate that model checking can serve as an objective for learning representations for formal languages.


A Concept of Possibility for Real-World Events

arXiv.org Artificial Intelligence

This paper offers a new concept of {\it possibility} as an alternative to the now-a-days standard concept originally introduced by L.A. Zadeh in 1978. This new version was inspired by the original but, formally, has nothing in common with it other than that they both adopt the Łukasiewicz multivalent interpretation of the logical connectives. Moreover, rather than seeking to provide a general notion of possibility, this focuses specifically on the possibility of a real-world event. An event is viewed as having prerequisites that enable its occurrence and constraints that may impede its occurrence, and the possibility of the event is computed as a function of the probabilities that the prerequisites hold and the constraints do not. This version of possibility might appropriately be applied to problems of planning. When there are multiple plans available for achieving a goal, this theory can be used to determine which plan is most possible, i.e., easiest or most feasible to complete. It is speculated that this model of reasoning correctly captures normal human reasoning about plans. The theory is elaborated and an illustrative example for vehicle route planning is provided. There is also a suggestion of potential future applications.