Generalization or Hallucination? Understanding Out-of-Context Reasoning in Transformers
Large language models (LLMs) can acquire new knowledge through fine-tuning, but this process exhibits a puzzling duality: models can generalize remarkably from new facts, yet are also prone to hallucinating incorrect information. However, the reasons for this phenomenon remain poorly understood. In this work, we argue that both behaviors stem from a single mechanism known as out-of-context reasoning (OCR): the ability to deduce implications by associating concepts, even those without a causal link. Our experiments across five prominent LLMs confirm that OCR indeed drives both generalization and hallucination, depending on whether the associated concepts are causally related. To build a rigorous theoretical understanding of this phenomenon, we then formalize OCR as a synthetic factual recall task. We empirically show that a one-layer single-head attention-only transformer with factorized output and value matrices can learn to solve this task, while a model with combined weights cannot, highlighting the crucial role of matrix factorization. Our theoretical analysis shows that the OCR capability can be attributed to the implicit bias of gradient descent, which favors solutions that minimize the nuclear norm of the combined output-value matrix. This structure explains why the model learns to associate facts and implications with high sample efficiency, regardless of whether the correlation is causal or merely spurious. Ultimately, our work provides a theoretical foundation for understanding the OCR phenomenon, offering a new lens for analyzing and mitigating undesirable behaviors from knowledge injection.
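The contrast between factorized and combined output-value weights can be made concrete with a small NumPy sketch. This is an illustration under assumptions, not the authors' implementation: the dimensions, the random weights, the combined key-query matrix `W_KQ`, and the helper names `attention_logits` and `nuclear_norm` are all hypothetical. The two parameterizations compute the same function at any given set of weights; they differ only in which parameters gradient descent updates. The final check uses a standard identity (not necessarily the paper's argument): the nuclear norm of a product is at most half the sum of the squared Frobenius norms of its factors.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over the last axis.
    x = x - x.max(axis=-1, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=-1, keepdims=True)

def attention_logits(X, W_KQ, W_OV):
    """One-layer, single-head, attention-only forward pass (sketch).

    X:    (T, d) token embeddings; the last token acts as the query.
    W_KQ: (d, d) combined key-query matrix (an assumption of this sketch).
    W_OV: (d_out, d) combined output-value matrix.
    Returns a (d_out,) vector of output logits.
    """
    scores = X @ W_KQ @ X[-1]   # (T,) attention scores against the query
    attn = softmax(scores)      # (T,) attention weights
    context = attn @ X          # (d,) attention-weighted average of tokens
    return W_OV @ context       # (d_out,)

def nuclear_norm(M):
    # Sum of singular values.
    return np.linalg.svd(M, compute_uv=False).sum()

rng = np.random.default_rng(0)
d, d_h, d_out, T = 8, 4, 8, 5   # hypothetical sizes
X = rng.normal(size=(T, d))
W_KQ = rng.normal(size=(d, d))

# Factorized parameterization: W_O and W_V are separate trainable matrices.
W_V = rng.normal(size=(d_h, d))
W_O = rng.normal(size=(d_out, d_h))

# Combined parameterization: one trainable matrix, here set to the same product.
W_OV = W_O @ W_V

logits_fact = attention_logits(X, W_KQ, W_O @ W_V)
logits_comb = attention_logits(X, W_KQ, W_OV)

# ||W_O W_V||_* <= (||W_O||_F^2 + ||W_V||_F^2) / 2
bound = 0.5 * (np.sum(W_O ** 2) + np.sum(W_V ** 2))
print(nuclear_norm(W_OV) <= bound)
```

Because the forward pass is identical, any difference in what the two models learn has to come from the training dynamics over the chosen parameters, which is where the abstract locates the implicit bias.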
Probabilistic Reasoning with LLMs for Privacy Risk Estimation
Probabilistic reasoning is a key aspect of both human and artificial intelligence that allows for handling uncertainty and ambiguity in decision-making. In this paper, we introduce a new numerical reasoning task under uncertainty for large language models, focusing on estimating the privacy risk of user-generated documents containing privacy-sensitive information. We propose BRANCH, a new LLM methodology that estimates the k-privacy value of a text--the size of the population matching the given information.
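To make the k-privacy value concrete, the following is a deliberately naive sketch that estimates the size of the matching population by multiplying a base population by the probability of each disclosed attribute. It is not the BRANCH method: it assumes the attributes are independent, and the function name `estimate_k`, the attribute names, and all numbers are hypothetical.

```python
from math import prod

def estimate_k(population_size, attribute_probs):
    """Expected number of people matching every disclosed attribute.

    Simplifying assumption of this sketch: attributes are independent,
    so the joint probability is the product of the marginals.
    """
    for name, p in attribute_probs.items():
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"probability for {name!r} must be in [0, 1]")
    return population_size * prod(attribute_probs.values())

# Hypothetical example: a post reveals a city, a profession, and a pet.
k = estimate_k(
    1_000_000,
    {"lives_in_city_A": 0.1, "is_a_nurse": 0.01, "owns_a_dog": 0.5},
)
print(round(k))  # about 500 people match all three attributes
```

A smaller k means fewer people share the disclosed combination, so the text carries a higher privacy risk. Real attributes are usually correlated, which is why the independence assumption here is only a starting point.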
Self-Adapting Language Models
Large language models (LLMs) are powerful but static; they lack mechanisms to adapt their weights in response to new tasks, knowledge, or examples. We introduce Self-Adapting LLMs (SEAL), a framework that enables LLMs to self-adapt by generating their own finetuning data and update directives. Given a new input, the model produces a self-edit--a generation that may restructure the information in different ways, specify optimization hyperparameters, or invoke tools for data augmentation and gradient-based updates.
What do Ukraine's robot soldiers mean for the future of warfare?
In a scene reminiscent of a computer war game, three battle-fatigued soldiers, dressed in white snow camouflage, emerge from a war-torn alley with their hands raised above their heads. They crouch down, following the orders being blasted at them, fear and shock etched across their faces as they stare down the barrel of a machinegun mounted on a so-called ground robot. In April, Ukrainian President Volodymyr Zelenskyy said that, for the "first time in the history of this war, an enemy position was taken exclusively by unmanned platforms - ground systems and drones". "Ground robotic systems have already carried out more than 22,000 missions on the front in just three months," he wrote in a post on X, alongside images of green machines with tank tracks and weapons mounted on top.
Implicit Multimodal Alignment: On the Generalization of Frozen LLMs to Multimodal Inputs
Large Language Models (LLMs) have demonstrated impressive performance on multimodal tasks without any multimodal finetuning. They are the de facto building block for Large Multimodal Models (LMMs), yet we still lack a proper understanding of their success. In this work, we expose frozen LLMs to image, video, audio and text inputs and analyse their internal representations in an attempt to understand their generalization beyond textual inputs. Our work provides the following findings. Perceptual tokens (1) are easily distinguishable from textual ones inside LLMs, with significantly different representations (e.g.
A Omitted Proofs
Taking = p / gives the desired claim. Claim 2.7, we know that the multicalibration violation for The inequalities follow by Hölder's inequality and the assumed bound on the weight of Recall that Cov[y, z] = E[yz] - E[y] E[z]. Here, we give a high-level overview of the MCBoost algorithm of [20] and weak agnostic learning. Algorithm 2 MCBoost Parameters: hypothesis class C and > 0 Given: Dataset S sampled from D Initialize: p(x) = 1/2. By Lemma 3.8, we know that In this Appendix, we give a full account of the definitions and results stated in Section 4.
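The covariance recalled above has the standard form Cov[y, z] = E[yz] - E[y] E[z]. As a quick sanity check, the sketch below compares that identity with the centered definition E[(y - E[y])(z - E[z])] on a small sample. The data and the function names are hypothetical and chosen only for illustration.

```python
def mean(values):
    return sum(values) / len(values)

def covariance_by_definition(ys, zs):
    # E[(y - E[y]) (z - E[z])] over the empirical distribution.
    my, mz = mean(ys), mean(zs)
    return mean([(y - my) * (z - mz) for y, z in zip(ys, zs)])

def covariance_by_identity(ys, zs):
    # E[yz] - E[y] E[z] over the empirical distribution.
    return mean([y * z for y, z in zip(ys, zs)]) - mean(ys) * mean(zs)

# Hypothetical sample: binary outcomes y and predicted probabilities z.
ys = [0, 1, 1, 0, 1]
zs = [0.2, 0.9, 0.6, 0.1, 0.7]

cov_def = covariance_by_definition(ys, zs)
cov_id = covariance_by_identity(ys, zs)
print(abs(cov_def - cov_id) < 1e-12)
```

Both forms agree up to floating-point error, so either can be used when bounding correlation terms of this kind.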
AI 'vibe-coding' platform's flaws allow BBC reporter to be hacked
The BBC has been shown a significant - and unfixed - cyber-security risk in a popular AI coding platform. Orchids is a so-called vibe-coding tool, meaning people without technical skills can use it to build apps and games by typing a text prompt into a chatbot. Such platforms have exploded in popularity in recent months, and are often heralded as an early example of how various professional services could be done quickly and cheaply by AI. But experts say the ease with which Orchids can be hacked demonstrates the risks of allowing AI bots deep access to our computers in exchange for the convenience of allowing them to carry out tasks autonomously. The BBC has repeatedly asked the company for comment but it has not replied.