A Unified Representation Underlying the Judgment of Large Language Models

Nov-5-2025–arXiv.org Artificial Intelligence

A central architectural question for both biological and artificial intelligence is whether judgment relies on specialized modules or a unified, domain-general resource. While the discovery of decodable neural representations for distinct concepts in Large Language Models (LLMs) has suggested a modular architecture, whether these representations are truly independent systems remains an open question. Here we provide evidence for a convergent architecture for evaluative judgment. Across a range of LLMs, we find that diverse evaluative judgments are computed along a dominant dimension, which we term the Valence-Assent Axis (VAA). This axis jointly encodes subjective valence ("what is good") and the model's assent to factual claims ("what is true"). Through direct interventions, we demonstrate this axis drives a critical mechanism, which is identified as the subordination of reasoning: the VAA functions as a control signal that steers the generative process to construct a rationale consistent with its evaluative state, even at the cost of factual accuracy. Our discovery offers a mechanistic account for response bias and hallucination, revealing how an architecture that promotes coherent judgment can systematically undermine faithful reasoning.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

Nov-5-2025

arXiv.org PDF

Add feedback

Country:
- North America > United States (1.00)
- Asia (0.67)

Genre:
- Questionnaire & Opinion Survey (1.00)
- Research Report
  - New Finding (1.00)
  - Experimental Study (1.00)

Industry:
- Law (1.00)
- Education (1.00)
- Banking & Finance (1.00)
- Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.92)
- Health & Medicine > Therapeutic Area
  - Psychiatry/Psychology (1.00)
- Government
  - Foreign Policy (0.98)
  - Regional Government > North America Government
    - United States Government (1.00)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (0.68)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found