Generalization or Memorization: Dynamic Decoding for Mode Steering
arXiv.org Artificial Intelligence
Large Language Models (LLMs) exhibit a troubling duality: they are capable of both remarkable generalization and brittle, verbatim memorization of their training data. This unpredictability undermines their reliability in high-stakes applications. In this work, we propose a unified framework to understand, identify, and control these distinct reasoning modes. First, we introduce a theoretical model based on the Information Bottleneck (IB) principle, formalizing generalization as the learning of a compressed, task-relevant representation and memorization as a failure to compress. Building on this theory, we develop Dynamic Mode Steering (DMS), a novel inference-time algorithm comprising two components: (1) a lightweight, causally grounded linear probe that identifies the model's instantaneous reliance on memorization, and (2) a dynamic activation steering mechanism that nudges the model's computation toward pre-identified generalization circuits. We frame DMS as a form of adaptive, self-contrastive decoding. Experiments on reasoning and faithfulness tasks demonstrate that DMS significantly improves logical consistency and factual accuracy, thereby offering a principled approach to enhancing LLM reliability.
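The two DMS components described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the function names (`memorization_score`, `steer`), the probe parameters `w` and `b`, the generalization direction `v_gen`, and the scaling factor `alpha` are all hypothetical stand-ins for whatever the authors actually learn from model activations.

```python
import numpy as np

def memorization_score(h, w, b):
    """Linear probe (illustrative): sigmoid of w . h + b estimates the
    model's instantaneous reliance on memorization from activation h."""
    return 1.0 / (1.0 + np.exp(-(np.dot(w, h) + b)))

def steer(h, w, b, v_gen, alpha=1.0):
    """Dynamic steering (illustrative): add a 'generalization direction'
    v_gen to the activation, scaled by the probe's memorization score,
    so stronger detected memorization yields a stronger nudge."""
    s = memorization_score(h, w, b)
    return h + alpha * s * v_gen, s

# Toy usage: with a zero probe the score is exactly 0.5, so the
# activation is nudged by 0.5 * alpha * v_gen.
h = np.zeros(4)          # hidden activation at the current token
w = np.zeros(4)          # probe weights (would be trained in practice)
v_gen = np.ones(4)       # steering vector toward generalization circuits
steered, score = steer(h, w, 0.0, v_gen, alpha=2.0)
```

In a real setting the probe would be trained on labeled activations and the steering vector extracted from identified circuits; here both are placeholders to show the control flow of score-then-steer at inference time.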
Oct-28-2025