Connecting the Dots: LLMs can Infer and Verbalize Latent Structure from Disparate Training Data

May-27-2025, 22:05:16 GMT–Neural Information Processing Systems

One way to address safety risks from large language models (LLMs) is to censor dangerous knowledge from their training data. While this removes the explicit information, implicit information can remain scattered across various training documents. Could an LLM infer the censored knowledge by piecing together these implicit hints? As a step towards answering this question, we study inductive out-of-context reasoning (OOCR), a type of generalization in which LLMs infer latent information from evidence distributed across training documents and apply it to downstream tasks without in-context learning. Using a suite of five tasks, we demonstrate that frontier LLMs can perform inductive OOCR.

artificial intelligence, large language model, natural language

Technology:
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)