Mathematics is undergoing the biggest change in its history

New Scientist

The speed at which artificial intelligence is gaining in mathematical ability has taken many by surprise. Are the days of handwritten mathematics coming to an end? In March 2025, the mathematician Daniel Litt made a bet. Despite the rapid progress of artificial intelligence in many fields, he believed his subject was safe, wagering with a colleague that there was only a 25 per cent chance an AI could write a mathematical paper at the level of the best human mathematicians by 2030. Only a year later, he thinks he was wrong.



Multi-language Diversity Benefits Autoformalization

Neural Information Processing Systems

Autoformalization is the task of translating natural language materials into machine-verifiable formalisations. Progress in autoformalization research is hindered by the lack of a sizeable dataset consisting of informal-formal pairs expressing the same essence.
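To make the notion of an informal-formal pair concrete, here is a minimal illustrative example (not drawn from the paper's dataset): a natural-language statement alongside a Lean 4 formalisation of the same fact, assuming Mathlib's `Even` predicate.

```lean
-- Informal statement: "The sum of two even natural numbers is even."
-- One possible Lean 4 / Mathlib formalisation of the same statement:
theorem even_add_even (a b : ℕ) (ha : Even a) (hb : Even b) :
    Even (a + b) := by
  obtain ⟨x, hx⟩ := ha   -- a = x + x
  obtain ⟨y, hy⟩ := hb   -- b = y + y
  exact ⟨x + y, by omega⟩
```

An autoformalization system must learn exactly this kind of mapping, from free-form prose to a machine-checkable statement and proof.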


The mechanization of science illustrated by the Lean formalization of the multi-graded Proj construction

Mayeux, Arnaud, Zhang, Jujian

arXiv.org Artificial Intelligence

Efforts to mechanize aspects of scientific reasoning have been intertwined with the development of science from its earliest days. C1 "Whenever we have a long, difficult piece of algebra, and we have them more and more often these days, we could at least get the machine to check that the algebra was right before we went on and built further stages of derivation on top. Some people are working on such programs for algebra checking right now." C2 "Now I leave the region of known processes and enter the land of speculation. We can, I believe, reasonably expect that an algebra checking routine would not be around very long before someone would adapt the methods of heuristics that are presently being developed to the problem of doing algebra in a more creative way. The machine could supply several steps at a time, and be given only a guiding thread of a proof. The more successful the heuristics, the fewer steps we would have to supply."


AI could be about to completely change the way we do mathematics

New Scientist

Is an artificial intelligence revolution about to transform mathematics? Some prominent mathematicians think so, thanks to automated proof-writing tools that are suddenly showing impressive leaps in capability, with the potential to change the way maths research is done. Around 100 of the world's top mathematicians gathered at the University of Cambridge in June for a conference on whether computers might help them resolve a long-standing problem: how to check that their proofs are correct. This process, known as formalisation, doesn't necessarily have to involve artificial intelligence; indeed, a similar meeting held at Cambridge in 2017 made no mention of AI. But eight years on, AI has come on by leaps and bounds, most notably with the success of the large language models powering tools like ChatGPT.


A Reasoning-Based Approach to Cryptic Crossword Clue Solving

Andrews, Martin, Witteveen, Sam

arXiv.org Artificial Intelligence

Cryptic crossword clues are challenging language tasks for which major newspapers around the world release new test sets daily. Each cryptic clue contains both the definition of the answer to be placed in the crossword grid (as in regular crosswords) and 'wordplay' that proves the answer is correct (i.e. a human solver can be confident an answer is correct without needing crossing words as confirmation). This work describes an LLM-based reasoning system, built from open-licensed components, that solves cryptic clues by (i) hypothesising answers; (ii) proposing wordplay explanations; and (iii) using a verifier system that operates on codified reasoning steps. Overall, this system establishes a new state-of-the-art on the challenging Cryptonite dataset of clues from The Times and The Telegraph newspapers in the UK. Because each proved solution is expressed in Python, interpretable wordplay reasoning for proven answers is available for inspection.
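The abstract describes a verifier that checks codified wordplay steps in Python. A minimal sketch of what one such check might look like for a single clue type (an anagram) follows; the step encoding and function names here are illustrative assumptions, not the paper's actual schema.

```python
# Minimal sketch of a wordplay verifier for one clue type (anagram).
# The step encoding and names here are illustrative, not the paper's schema.

def verify_anagram(fodder: str, answer: str) -> bool:
    """An anagram step is valid iff the fodder's letters rearrange into the answer."""
    norm = lambda s: sorted(c for c in s.lower() if c.isalpha())
    return norm(fodder) == norm(answer)

def verify_steps(steps: list[dict], answer: str) -> bool:
    """Check every codified reasoning step and that the final result is the answer."""
    for step in steps:
        if step["op"] == "anagram":
            if not verify_anagram(step["fodder"], step["result"]):
                return False
        elif step["op"] == "definition":
            # A real system would check the definition against the answer's meaning;
            # here we only record it.
            pass
    return steps[-1]["result"].lower() == answer.lower()

# Example: "Doctor listen (6)" -> SILENT ("doctor" signals an anagram of "listen")
steps = [{"op": "anagram", "fodder": "listen", "result": "silent"}]
print(verify_steps(steps, "SILENT"))  # True
```

The point of such a verifier is that an LLM's proposed wordplay is only accepted when each mechanical check passes, so a proved answer carries an inspectable certificate.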


The Strong, Weak and Benign Goodhart's law. An independence-free and paradigm-agnostic formalisation

Majka, Adrien, El-Mhamdi, El-Mahdi

arXiv.org Machine Learning

Goodhart's law is a famous adage in policy-making stating that ``When a measure becomes a target, it ceases to be a good measure''. As machine learning models, and the optimisation capacity used to train them, have grown, mounting empirical evidence has reinforced belief in the validity of this law, which nonetheless long remained unformalised. Recently, a few attempts were made to formalise Goodhart's law, either by categorising its variants or by studying how optimising a proxy metric affects the optimisation of an intended goal. In this work, we relax the simplifying independence assumption made in previous works, and the assumption on the learning paradigm made in most of them, to study how the coupling between the proxy metric and the intended goal affects Goodhart's law. Our results show that when both the goal and the discrepancy are light-tailed, dependence does not change the nature of Goodhart's effect. However, in the case of a light-tailed goal and a heavy-tailed discrepancy, we exhibit an example where over-optimisation occurs at a rate inversely proportional to the heavy-tailedness of the discrepancy between the goal and the metric.
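The light-tailed versus heavy-tailed distinction can be illustrated numerically. The following is a toy sketch under an assumed additive model (metric = goal + discrepancy), which is a simplification and not the paper's exact setting: we select the candidate that maximises the proxy metric and record the goal value actually achieved.

```python
# Toy illustration of Goodhart's law with an additive proxy metric.
# Assumed model (illustrative only): metric = goal + discrepancy.
import numpy as np

def goal_under_proxy_optimisation(disc_sampler, n=50_000, trials=200, seed=0):
    """Mean true-goal value of the candidate that maximises the proxy metric."""
    rng = np.random.default_rng(seed)
    achieved = []
    for _ in range(trials):
        goal = rng.standard_normal(n)          # light-tailed true goal
        metric = goal + disc_sampler(rng, n)   # observable proxy
        achieved.append(goal[np.argmax(metric)])
    return float(np.mean(achieved))

light = goal_under_proxy_optimisation(lambda r, n: r.standard_normal(n))
heavy = goal_under_proxy_optimisation(lambda r, n: r.standard_cauchy(n))
print(light, heavy)
# With a light-tailed (Gaussian) discrepancy, maximising the proxy still selects
# candidates with high true goal; with a heavy-tailed (Cauchy) discrepancy, the
# proxy maximum is dominated by discrepancy outliers and the achieved goal stays
# near its unconditional mean of 0.
```

This reproduces the qualitative phenomenon the abstract describes: heavy-tailed discrepancy makes proxy optimisation stop tracking the goal.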


Formally Verified Neurosymbolic Trajectory Learning via Tensor-based Linear Temporal Logic on Finite Traces

Chevallier, Mark, Smola, Filip, Schmoetten, Richard, Fleuriot, Jacques D.

arXiv.org Artificial Intelligence

We present a novel formalisation of tensor semantics for linear temporal logic on finite traces (LTLf), with formal proofs of correctness carried out in the theorem prover Isabelle/HOL. We demonstrate that this formalisation can be integrated into a neurosymbolic learning process by defining and verifying a differentiable loss function for the LTLf constraints, and automatically generating an implementation that integrates with PyTorch. We show that, by using this loss, the process learns to satisfy pre-specified logical constraints. Our approach offers a fully rigorous framework for constrained training, eliminating many of the inherent risks of ad-hoc, manual implementations of logical aspects directly in an "unsafe" programming language such as Python, while retaining efficiency in implementation.
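To give a flavour of what a differentiable LTLf loss looks like, here is a minimal sketch of "soft" semantics for two temporal operators over a finite trace, using a logsumexp smooth maximum. This is an illustrative assumption on our part, not the paper's Isabelle/HOL-verified tensor semantics; in a real pipeline the same operations would be expressed in PyTorch so gradients flow through the constraint.

```python
# Sketch of differentiable ("soft") LTLf semantics on a finite trace.
# Illustrative only; the paper's verified tensor semantics is more general.
# A trace is a sequence of truth degrees in [0, 1] for one atomic proposition.
import numpy as np

def soft_max(x, k=10.0):
    """Smooth, differentiable upper bound of max (scaled logsumexp)."""
    return np.log(np.sum(np.exp(k * np.asarray(x)))) / k

def soft_min(x, k=10.0):
    return -soft_max(-np.asarray(x), k)

def eventually(trace, k=10.0):   # F phi: phi holds at some step
    return soft_max(trace, k)

def always(trace, k=10.0):       # G phi: phi holds at every step
    return soft_min(trace, k)

def loss(satisfaction):
    """Training loss: penalise low satisfaction of the constraint."""
    return 1.0 - satisfaction

sat_trace = [0.1, 0.2, 0.9]   # phi eventually (nearly) holds
bad_trace = [0.1, 0.2, 0.1]
print(loss(eventually(sat_trace)) < loss(eventually(bad_trace)))  # True
```

Because `soft_max` and `soft_min` are smooth, minimising this loss by gradient descent pushes trajectories toward satisfying the temporal constraint, which is the training signal the abstract describes.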


Discerning and Characterising Types of Competency Questions for Ontologies

Keet, C. Maria, Khan, Zubeida Casmod

arXiv.org Artificial Intelligence

Competency Questions (CQs) are widely used in ontology development, guiding, among others, the scoping and validation stages. However, very limited guidance exists for formulating CQs and for assessing whether they are good CQs, leading to issues such as ambiguity and unusable formulations. Solving this requires insight into the nature of CQs for ontologies and their constituent parts, as well as into which questions are not CQs. We aim to contribute such theoretical foundations in this paper, informed by an analysis of questions, their uses, and the myriad of ontology development tasks. This resulted in a first Model for Competency Questions, which comprises five main types of CQ, each with a different purpose: Scoping (SCQ), Validating (VCQ), Foundational (FCQ), Relationship (RCQ), and Metaproperty (MpCQ) questions. This model enhances the clarity of CQs and thereby aims to improve their effectiveness in ontology development, thanks to the identifiable, distinct constituent elements of each type. We illustrate and evaluate the types with a user story and demonstrate where each type can be used in ontology development tasks. To foster use and research, we created an annotated repository of 438 CQs, the Repository of Ontology Competency QuestionS (ROCQS), incorporating an existing CQ dataset as well as new CQs and CQ templates, which further demonstrates the distinctions among types of CQs.
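The five CQ types named in the abstract can be summarised in a small lookup structure. The example questions below are invented for illustration (they are not drawn from the ROCQS repository).

```python
# The five CQ types from the Model for Competency Questions, each paired with
# an invented example question (illustrative, not from the ROCQS repository).
CQ_TYPES = {
    "SCQ":  ("Scoping",      "Should the ontology cover wine-production processes?"),
    "VCQ":  ("Validating",   "Does the ontology entail that every wine has a producer?"),
    "FCQ":  ("Foundational", "Is a vintage an attribute of a wine or an entity in its own right?"),
    "RCQ":  ("Relationship", "How does a winery relate to the region it is located in?"),
    "MpCQ": ("Metaproperty", "Is the class 'Winery' rigid or anti-rigid?"),
}

for abbrev, (name, example) in CQ_TYPES.items():
    print(f"{abbrev} ({name}): {example}")
```

Each type serves a different development stage: SCQs during scoping, VCQs during validation, and FCQs, RCQs, and MpCQs during modelling decisions.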


An action language-based formalisation of an abstract argumentation framework

Munro, Yann, Sarmiento, Camilo, Bloch, Isabelle, Bourgne, Gauvain, Pelachaud, Catherine, Lesot, Marie-Jeanne

arXiv.org Artificial Intelligence

An abstract argumentation framework is a commonly used formalism for providing a static representation of a dialogue. However, the order in which the arguments of an argumentative dialogue are enunciated is very important and can affect the dialogue's outcome. In this paper, we propose a new framework for modelling abstract argumentation graphs that incorporates the order of enunciation of the arguments. Taking this order into account gives us the means to deduce a unique outcome for each dialogue, called an extension. We also establish several properties, such as termination and correctness, and discuss two notions of completeness. In particular, we propose a modification of the previous transformation based on a "last enunciated, last updated" strategy, which satisfies the second form of completeness.
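For readers unfamiliar with the underlying formalism, the following sketch computes the grounded extension of a standard (order-free) Dung framework, which is the baseline semantics such order-aware frameworks refine. This is the textbook construction, not the paper's "last enunciated, last updated" procedure.

```python
# Grounded extension of a Dung abstract argumentation framework
# (the standard, order-free construction; the paper's order-aware update
# refines semantics of this kind).

def grounded_extension(args, attacks):
    """Least fixed point of the characteristic function F(S) = {a | S defends a}."""
    attackers = {a: {x for (x, y) in attacks if y == a} for a in args}
    S = set()
    while True:
        # a is defended by S if every attacker of a is itself attacked by S
        defended = {
            a for a in args
            if all(any((s, b) in attacks for s in S) for b in attackers[a])
        }
        if defended == S:
            return S
        S = defended

# a attacks b, b attacks c: the grounded extension is {a, c},
# since a is unattacked and a defends c against b.
print(grounded_extension({"a", "b", "c"}, {("a", "b"), ("b", "c")}))
```

The iteration is monotone (each pass only adds defended arguments), so it terminates; the paper's contribution is that adding enunciation order yields a unique extension per dialogue rather than a family of them.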