exception
The Rules-and-Facts Model for Simultaneous Generalization and Memorization in Neural Networks
Farné, Gabriele, Boncoraglio, Fabrizio, Zdeborová, Lenka
A key capability of modern neural networks is their capacity to simultaneously learn underlying rules and memorize specific facts or exceptions. Yet, theoretical understanding of this dual capability remains limited. We introduce the Rules-and-Facts (RAF) model, a minimal solvable setting that enables precise characterization of this phenomenon by bridging two classical lines of work in the statistical physics of learning: the teacher-student framework for generalization and Gardner-style capacity analysis for memorization. In the RAF model, a fraction $1 - \varepsilon$ of training labels is generated by a structured teacher rule, while a fraction $\varepsilon$ consists of unstructured facts with random labels. We characterize when the learner can simultaneously recover the underlying rule - allowing generalization to new data - and memorize the unstructured examples. Our results quantify how overparameterization enables the simultaneous realization of these two objectives: sufficient excess capacity supports memorization, while regularization and the choice of kernel or nonlinearity control the allocation of capacity between rule learning and memorization. The RAF model provides a theoretical foundation for understanding how modern neural networks can infer structure while storing rare or non-compressible information.
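The data-generating process the abstract describes can be sketched in a few lines. A linear sign teacher and Gaussian inputs are assumptions here, made for concreteness; the paper's exact teacher architecture and scalings may differ:

```python
import numpy as np

def raf_dataset(n=1000, d=50, eps=0.1, seed=0):
    """Toy instance of the Rules-and-Facts data model: a fraction 1 - eps
    of labels follows a teacher rule; a fraction eps gets random labels."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, d)) / np.sqrt(d)   # inputs
    w_star = rng.standard_normal(d)                # teacher weights (the "rule")
    y = np.sign(X @ w_star)                        # rule-generated labels
    n_facts = int(eps * n)                         # unstructured "facts"
    y[:n_facts] = rng.choice([-1.0, 1.0], size=n_facts)
    return X, y, w_star

X, y, w_star = raf_dataset()
```

A learner fit on `(X, y)` must then both recover `w_star` (generalization) and interpolate the first `n_facts` random labels (memorization), which is the trade-off the paper analyzes.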
- North America (0.14)
- Europe > Switzerland > Vaud > Lausanne (0.04)
- Europe > France (0.04)
- Health & Medicine (0.67)
- Education (0.67)
- North America > United States > California > San Diego County > San Diego (0.04)
- North America > United States > California > Los Angeles County > Long Beach (0.04)
- Asia > China > Hong Kong (0.04)
- Education (0.67)
- Government > Regional Government > North America Government > United States Government (0.46)
A QLoRA vs Standard Finetuning Experimental Setup Details
A.1 Hyperparameters for QLoRA
We do a hyperparameter search for LoRA over the following variables: LoRA dropout {0.0, 0.05, …}. LoRA α is always proportional to the learning rate. We find that LoRA dropout 0.05 is useful for small models (7B, 13B), but not for larger models (33B, …). We use the same preprocessing of the Super-Natural Instructions dataset as Wang et al. for the LoRA finetuning experiments outlined in Section 5. This limits the dataset to 9,209 examples. HH-RLHF: this is a human preference dataset about helpfulness and harmlessness.
Rubio rules out military action in Venezuela, with an exception
The Trump administration does not "intend or expect" to again take military action in Venezuela, Secretary of State and National Security Adviser Marco Rubio told the US Congress, but theoretical threats like an "Iranian drone factory" could change the government's thinking.
Trump says US ready to attack Iran with 'speed and violence'
- North America > United States (1.00)
- Asia > Middle East > Iran (0.27)
Executable Governance for AI: Translating Policies into Rules Using LLMs
Datla, Gautam Varma, Vurity, Anudeep, Dash, Tejaswani, Ahmad, Tazeem, Adnan, Mohd, Rafi, Saima
AI policy guidance is predominantly written as prose, which practitioners must first convert into executable rules before frameworks can evaluate or enforce them. This manual step is slow, error-prone, difficult to scale, and often delays the use of safeguards in real-world deployments. To address this gap, we present Policy-to-Tests (P2T), a framework that converts natural-language policy documents into normalized, machine-readable rules. The framework comprises a pipeline and a compact domain-specific language (DSL) that encodes hazards, scope, conditions, exceptions, and required evidence, yielding a canonical representation of extracted rules. To test the framework beyond a single policy, we apply it across general frameworks, sector guidance, and enterprise standards, extracting obligation-bearing clauses and converting them into executable rules. These AI-generated rules closely match strong human baselines on span-level and rule-level metrics, with robust inter-annotator agreement on the gold set. To evaluate downstream behavioral and safety impact, we add HIPAA-derived safeguards to a generative agent and compare it with an otherwise identical agent without guardrails. An LLM-based judge, aligned with gold-standard criteria, measures violation rates and robustness to obfuscated and compositional prompts. Detailed results are provided in the appendix. We release the codebase, DSL, prompts, and rule sets as open-source resources to enable reproducible evaluation.
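The abstract does not reproduce the P2T DSL itself, so the rule fields and the toy evaluator below are illustrative assumptions about what a rule encoding hazards, scope, conditions, exceptions, and required evidence might look like once normalized:

```python
# Hypothetical rule shape; field names and the id are illustrative, not the actual DSL.
rule = {
    "id": "phi-disclosure-001",
    "hazard": "unauthorized PHI disclosure",
    "scope": {"actors": ["generative agent"]},
    "conditions": ["output contains PHI"],
    "exceptions": ["recipient is the data subject"],
    "required_evidence": ["audit log entry"],
}

def violates(rule, facts):
    """A rule fires when every condition holds and no exception applies."""
    conditions_met = all(c in facts for c in rule["conditions"])
    exception_applies = any(e in facts for e in rule["exceptions"])
    return conditions_met and not exception_applies

print(violates(rule, {"output contains PHI"}))                                   # True
print(violates(rule, {"output contains PHI", "recipient is the data subject"}))  # False
```

Making exceptions first-class, as above, is what lets an executable guardrail distinguish a genuine violation from a permitted disclosure.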
- Europe (0.94)
- North America > United States (0.94)
- Health & Medicine (0.88)
- Government (0.69)
- Information Technology > Security & Privacy (0.47)
- Law > Statutes (0.47)
Training and Evaluation of Guideline-Based Medical Reasoning in LLMs
Staniek, Michael, Sokolov, Artem, Riezler, Stefan
Machine learning for early prediction in medicine has recently shown breakthrough performance, however, the focus on improving prediction accuracy has led to a neglect of faithful explanations that are required to gain the trust of medical practitioners. The goal of this paper is to teach LLMs to follow medical consensus guidelines step-by-step in their reasoning and prediction process. Since consensus guidelines are ubiquitous in medicine, instantiations of verbalized medical inference rules to electronic health records provide data for fine-tuning LLMs to learn consensus rules and possible exceptions thereof for many medical areas. Consensus rules also enable an automatic evaluation of the model's inference process regarding its derivation correctness (evaluating correct and faithful deduction of a conclusion from given premises) and value correctness (comparing predicted values against real-world measurements). We exemplify our work using the complex Sepsis-3 consensus definition. Our experiments show that small fine-tuned models outperform one-shot learning of considerably larger LLMs that are prompted with the explicit definition and models that are trained on medical texts including consensus definitions. Since fine-tuning on verbalized rule instantiations of a specific medical area yields nearly perfect derivation correctness for rules (and exceptions) on unseen patient data in that area, the bottleneck for early prediction is not out-of-distribution generalization, but the orthogonal problem of generalization into the future by forecasting sparsely and irregularly sampled clinical variables. We show that the latter results can be improved by integrating the output representations of a time series forecasting model with the LLM in a multimodal setup.
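As a concrete example of a verbalizable consensus rule, the core Sepsis-3 criterion (suspected infection plus an acute SOFA increase of at least 2 points) can be written directly as code. This sketch deliberately omits the guideline's further detail, such as the septic-shock criteria:

```python
def sepsis3_flag(suspected_infection: bool, sofa_now: int, sofa_baseline: int = 0) -> bool:
    """Simplified Sepsis-3 rule: flag sepsis when infection is suspected
    and the SOFA score has risen by >= 2 points over baseline."""
    return suspected_infection and (sofa_now - sofa_baseline) >= 2

print(sepsis3_flag(True, sofa_now=4, sofa_baseline=1))   # True: +3 points
print(sepsis3_flag(True, sofa_now=2, sofa_baseline=1))   # False: only +1 point
print(sepsis3_flag(False, sofa_now=6, sofa_baseline=0))  # False: no suspected infection
```

Instantiating such a rule over electronic health records yields the premise-conclusion pairs the paper fine-tunes on, and makes derivation correctness mechanically checkable.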
SymLoc: Symbolic Localization of Hallucination across HaluEval and TruthfulQA
Lamba, Naveen, Tiwari, Sanju, Gaur, Manas
LLMs still struggle with hallucination, especially when confronted with symbolic triggers like modifiers, negation, numbers, exceptions, and named entities. Yet, we lack a clear understanding of where these symbolic hallucinations originate, making it crucial to systematically handle such triggers and localize the emergence of hallucination inside the model. While prior work explored localization using statistical techniques like LSC and activation variance analysis, these methods treat all tokens equally and overlook the role symbolic linguistic knowledge plays in triggering hallucinations. So far, no approach has investigated how symbolic elements specifically drive hallucination failures across model layers, nor has symbolic linguistic knowledge been used as the foundation for a localization framework. We propose the first symbolic localization framework that leverages symbolic linguistic and semantic knowledge to meaningfully trace the development of hallucinations across all model layers. By focusing on how models process symbolic triggers, we analyze five models using HaluEval and TruthfulQA. Our symbolic knowledge approach reveals that attention variance for these linguistic elements explodes to critical instability in early layers (2-4), with negation triggering catastrophic variance levels, demonstrating that symbolic semantic processing breaks down from the very beginning. Through the lens of symbolic linguistic knowledge, despite larger model sizes, hallucination rates remain consistently high (78.3%-83.7% across Gemma variants), with steep attention drops for symbolic semantic triggers throughout deeper layers. Our findings demonstrate that hallucination is fundamentally a symbolic linguistic processing failure, not a general generation problem, revealing that symbolic semantic knowledge provides the key to understanding and localizing hallucination mechanisms in LLMs.
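The layerwise measurement the abstract describes can be approximated as follows. The tensor layout and the choice to measure attention received by trigger-token positions are assumptions for illustration, not the paper's exact procedure:

```python
import numpy as np

def trigger_attention_variance(attn, trigger_idx):
    """attn: array of shape (layers, heads, seq, seq) with attention weights.
    Returns one value per layer: the variance of attention paid *to* the
    symbolic-trigger positions (e.g. a negation word), a rough proxy for
    the early-layer instability the paper reports."""
    received = attn[:, :, :, trigger_idx]   # (layers, heads, seq, n_triggers)
    return received.var(axis=(1, 2, 3))     # collapse all but the layer axis

# Random stand-in for real attention maps (no model is loaded here).
rng = np.random.default_rng(0)
attn = rng.random((12, 8, 16, 16))
attn /= attn.sum(axis=-1, keepdims=True)    # rows sum to 1, like softmax
per_layer_var = trigger_attention_variance(attn, trigger_idx=[3])
```

With real attention maps, plotting `per_layer_var` against layer index is one way to surface the spike in layers 2-4 that the paper associates with symbolic triggers.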
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Six dead as Russia hits energy and residential sites in Ukraine
At least six people have died after Russia launched hundreds of missile and drone attacks on energy infrastructure and residential targets in Ukraine overnight. A strike on an apartment building in the city of Dnipro killed two people and wounded 12, while three died in Zaporizhzhia. In all, 25 locations across Ukraine, including the capital city Kyiv, were hit, leaving many areas without electricity and heating. Prime Minister Yulia Svyrydenko said on Telegram that major energy facilities were damaged in the Poltava, Kharkiv and Kyiv regions, and work was under way to restore power. In Russia, the defence ministry said its forces had shot down 79 Ukrainian drones overnight. The Ukrainian air force said Russia had launched more than 450 exploding bomber drones and 45 missiles.
- Asia > Russia (1.00)
- Europe > Ukraine > Kyiv Oblast > Kyiv (0.48)
- North America > United States (0.30)
- (22 more...)
- Energy (1.00)
- Government > Regional Government > Europe Government (0.73)