Guaranteed Optimal Compositional Explanations for Neurons
La Rosa, Biagio, Gilpin, Leilani H.
While neurons are the basic units of deep neural networks, it is still unclear what they learn and if their knowledge is aligned with that of humans. Compositional explanations aim to answer this question by describing the spatial alignment between neuron activations and concepts through logical rules. These logical descriptions are typically computed via a search over all possible concept combinations. Since computing the spatial alignment over the entire state space is computationally infeasible, the literature commonly adopts beam search to restrict the space. However, beam search cannot provide any theoretical guarantees of optimality, and it remains unclear how close current explanations are to the true optimum. In this theoretical paper, we address this gap by introducing the first framework for computing guaranteed optimal compositional explanations. Specifically, we propose: (i) a decomposition that identifies the factors influencing the spatial alignment, (ii) a heuristic to estimate the alignment at any stage of the search, and (iii) the first algorithm that can compute optimal compositional explanations within a feasible time. Using this framework, we analyze the differences between optimal and non-optimal explanations in the most popular settings for compositional explanations, the computer vision domain and Convolutional Neural Networks. In these settings, we demonstrate that 10-40 percent of explanations obtained with beam search are suboptimal when overlapping concepts are involved. Finally, we evaluate a beam-search variant guided by our proposed decomposition and heuristic, showing that it matches or improves runtime over prior methods while offering greater flexibility in hyperparameters and computational resources.
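The beam search the abstract refers to can be sketched in a few lines. This is an illustrative toy, not the paper's implementation: masks are sets of spatial positions, the alignment score is intersection-over-union, and the concept names and operator set are invented for the example. It shows concretely why beam search carries no optimality guarantee: the true optimum can be pruned out of the beam at any step.

```python
def iou(a, b):
    """Intersection-over-union of two sets of activated positions."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0

def beam_search_explanation(neuron_mask, concept_masks, max_len=3, beam=5):
    """Beam search over AND / OR / AND NOT combinations of concept masks.

    Illustrative sketch only: keeps the `beam` best formulas at each
    length, so a formula outside the beam can never be recovered, which
    is exactly the source of suboptimality discussed in the abstract.
    """
    candidates = [((name,), mask) for name, mask in concept_masks.items()]
    candidates.sort(key=lambda c: -iou(neuron_mask, c[1]))
    frontier = candidates[:beam]
    best = frontier[0]
    for _ in range(max_len - 1):
        expanded = []
        for formula, mask in frontier:
            for name, cmask in concept_masks.items():
                expanded.append((formula + ('AND', name), mask & cmask))
                expanded.append((formula + ('OR', name), mask | cmask))
                expanded.append((formula + ('AND NOT', name), mask - cmask))
        expanded.sort(key=lambda c: -iou(neuron_mask, c[1]))
        frontier = expanded[:beam]
        if frontier and iou(neuron_mask, frontier[0][1]) > iou(neuron_mask, best[1]):
            best = frontier[0]
    return ' '.join(best[0]), iou(neuron_mask, best[1])
```

For example, with a neuron mask `{0, 1, 2, 3}` and concepts `A = {0, 1, 2}`, `B = {3, 4}`, the length-2 search finds `A OR B` with IoU 0.8, beating any single concept.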
Saturation-Driven Dataset Generation for LLM Mathematical Reasoning in the TPTP Ecosystem
Quesnel, Valentin, Sileo, Damien
The scarcity of high-quality, logically sound data is a critical bottleneck for advancing the mathematical reasoning of Large Language Models (LLMs). Our work confronts this challenge by turning decades of automated theorem proving research into a scalable data engine. Rather than relying on error-prone LLMs or complex proof-assistant syntax like Lean and Isabelle, our framework leverages E-prover's saturation capabilities on the vast TPTP axiom library to derive a massive, guaranteed-valid corpus of theorems. Our pipeline is principled and simple: saturate axioms, filter for "interesting" theorems, and generate tasks. With no LLMs in the loop, we eliminate factual errors by construction. This purely symbolic data is then transformed into three difficulty-controlled challenges: entailment verification, premise selection, and proof reconstruction. Our zero-shot experiments on frontier models reveal a clear weakness: performance collapses on tasks requiring deep, structural reasoning. Our framework provides both the diagnostic tool to measure this gap and a scalable source of symbolic training data to address it. We make the code and data publicly available. https://github.com/sileod/reasoning_core https://hf.co/datasets/reasoning-core/rc1
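The "filter for interesting theorems" stage of the pipeline can be illustrated with a toy post-processing pass over saturation output. The thresholds and the notion of "interesting" below are our own illustrative assumptions, not the paper's actual criteria: keep derived clauses that needed a non-trivial derivation but are still short enough to turn into a readable task.

```python
def select_interesting(derived, max_symbols=12, min_depth=2):
    """Toy filter over saturation output.

    `derived` is a list of (clause_string, proof_depth) pairs, e.g. as
    parsed from a prover's derivation log. Hypothetical heuristic: drop
    shallow consequences (near-restatements of axioms) and clauses too
    long to make a clean task.
    """
    keep = []
    for clause, depth in derived:
        # Crude symbol count: split on parentheses and commas.
        cleaned = clause.replace('(', ' ').replace(')', ' ').replace(',', ' ')
        n_symbols = len(cleaned.split())
        if depth >= min_depth and n_symbols <= max_symbols:
            keep.append(clause)
    return keep
```

A real pipeline would parse E-prover's derivation output to obtain the depth; here the pairs are supplied directly to keep the sketch self-contained.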
Complexity in finitary argumentation (extended version)
Abstract argumentation frameworks (AFs) provide a formal setting for analyzing many forms of reasoning with conflicting information. While the expressiveness of general infinite AFs makes them a tempting tool for modeling many kinds of reasoning scenarios, the computational intractability of solving infinite AFs limits their use, even in many theoretical applications. We investigate the complexity of computational problems related to infinite but finitary argumentation frameworks, that is, infinite AFs in which each argument is attacked by only finitely many others. Our results reveal a surprising scenario. On the one hand, the finitary assumption does not automatically guarantee a drop in complexity. However, for the admissibility-based semantics, we find a remarkable combinatorial constraint that entails a dramatic decrease in complexity. We conclude that for many forms of reasoning, finitary infinite AFs provide a natural setting that balances the competing goals of being expressive enough to apply to many reasoning scenarios and computationally tractable enough for analysis within the framework to be useful.
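The admissibility condition at the heart of the complexity results is easy to state and, for a finite AF, to check directly: a set of arguments is admissible if it is conflict-free and defends each of its members against every attacker. A minimal sketch (argument and attack names are invented for illustration):

```python
def is_admissible(args, attacks, S):
    """Check admissibility of S in a finite AF (args, attacks).

    `attacks` is a set of (attacker, target) pairs. S is admissible iff
    (i) no member of S attacks another member (conflict-free), and
    (ii) every attacker of a member of S is counter-attacked by S.
    """
    attacks = set(attacks)
    # (i) Conflict-free: no attack internal to S.
    if any((a, b) in attacks for a in S for b in S):
        return False
    # (ii) Defense: each attacker of s in S must be attacked from S.
    for s in S:
        for a in args:
            if (a, s) in attacks and not any((d, a) in attacks for d in S):
                return False
    return True
```

In the infinite finitary case the inner loop over attackers stays finite for each argument, which is the structural property the paper exploits.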
An Empirical Investigation of Domain Generalization with Empirical Risk Minimizers (Appendix)
See Table 1 for the results. We next perform regression in the Joint setting (Sec. 5.3, main paper), where we fit a regression model across all environments with 5 features instead of the 2 reported in the main paper; we find that it is again possible to obtain a high Spearman's correlation. We considered a set of 40 metrics overall and report only a small subset of them in the main paper; in Table 2 we provide detailed results for all the measures we study. Figure 1 provides details of the canonicalization performed on each of the measures, as explained in the main paper; this canonicalization is used to report the results in Sec. 5. We also develop measures based on (Ben-David et al., 2007) and follow-up theoretical work in (Ben-David et al., 2010) on divergence measures using the symmetric difference hypothesis space, and here we summarize a result from (Ben-David et al., 2010). To compute the divergence measure for hypotheses H: Z → P(Y), we follow the steps in Algorithm 1 (Computing the H-divergence measure). As explained in the main paper, this divergence measure was proposed in (Ben-David et al., 2010).
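Algorithm 1 is not reproduced here, but divergences of this family are commonly estimated via the proxy A-distance: train a classifier to tell the two environments apart and rescale its error as d_A = 2(1 - 2·err). The sketch below is our simplification, not the appendix's Algorithm 1; it stands in a learned hypothesis with a 1-D mean-midpoint threshold classifier.

```python
def proxy_a_distance(xs_p, xs_q):
    """Proxy A-distance between two 1-D samples.

    Illustrative stand-in for an H-divergence estimate: fit a threshold
    at the midpoint of the two sample means, measure its error at
    classifying which environment each point came from, and rescale.
    Perfectly separable environments give 2.0; identical ones give 0.0.
    """
    mu_p = sum(xs_p) / len(xs_p)
    mu_q = sum(xs_q) / len(xs_q)
    thr = (mu_p + mu_q) / 2.0
    # Count misclassified points, predicting each side of the threshold
    # as belonging to the environment whose mean lies on that side.
    if mu_p <= mu_q:
        errs = sum(x > thr for x in xs_p) + sum(x <= thr for x in xs_q)
    else:
        errs = sum(x <= thr for x in xs_p) + sum(x > thr for x in xs_q)
    err = errs / (len(xs_p) + len(xs_q))
    return 2.0 * (1.0 - 2.0 * err)
```

With richer hypothesis classes (the H of the appendix), the same 2(1 - 2·err) rescaling applies; only the domain classifier changes.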