From Internal Representations to Text Quality: A Geometric Approach to LLM Evaluation

Yusupov, Viacheslav, Maksimov, Danil, Alaeva, Ameliia, Vasileva, Anna, Antipina, Anna, Zaitseva, Tatyana, Ermilova, Alina, Burnaev, Evgeny, Shvetsov, Egor

Oct-1-2025–arXiv.org Artificial Intelligence

This paper bridges internal and external analysis approaches to large language models (LLMs) by demonstrating that geometric properties of internal model representations serve as reliable proxies for evaluating generated text quality. We validate a set of metrics, including Maximum Explainable Variance, Effective Rank, Intrinsic Dimensionality, MAUVE score, and Schatten Norms, measured across different layers of LLMs, and demonstrate that Intrinsic Dimensionality and Effective Rank can serve as universal assessments of text naturalness and quality. Our key finding is that different models consistently rank text from various sources in the same order based on these geometric properties, indicating that these metrics reflect inherent text characteristics rather than model-specific artifacts. This enables reference-free text quality evaluation that does not require human-annotated datasets, offering practical advantages for automated evaluation pipelines.

The rapid advancement of large language models (LLMs) has necessitated the development of methods for analyzing their internal mechanisms and the properties of generated text. Approaches to studying the geometric properties of representations in language models fall broadly into two categories: internal, or mechanistic, methods, which investigate the model's intermediate representations, and external methods, which analyze the properties of text embeddings produced by a separate embedding model. Internal evaluation mainly considers properties of the model in which the measurements are made (Yin et al., 2024; Viswanathan et al., 2025; Roy & Vetterli, 2007), while external measures mainly focus on evaluating text properties given text embeddings (Zhao et al., 2019; Tulchinskii et al., 2023; Kuznetsov et al., 2024).
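To make two of the metrics named above concrete, here is a minimal sketch of effective rank and a Schatten p-norm computed from the singular values of a representation matrix. It assumes the standard definitions (effective rank as the exponential of the Shannon entropy of the normalized singular values, following Roy & Vetterli, 2007; the Schatten p-norm as the l_p norm of the singular values) and a hypothetical matrix of shape (tokens, hidden_dim). The function names, the random test matrices, and the lack of centering or layer selection are illustrative assumptions, not the authors' pipeline.

```python
import numpy as np


def effective_rank(X, eps=1e-12):
    """Effective rank: exp of the Shannon entropy of normalized singular values."""
    s = np.linalg.svd(X, compute_uv=False)
    p = s / (s.sum() + eps)          # singular values as a probability distribution
    p = p[p > eps]                   # drop numerically zero entries before the log
    return float(np.exp(-(p * np.log(p)).sum()))


def schatten_norm(X, p=2.0):
    """Schatten p-norm: the l_p norm of the singular values of X."""
    s = np.linalg.svd(X, compute_uv=False)
    return float((s ** p).sum() ** (1.0 / p))


# Hypothetical stand-ins for layer activations (tokens x hidden_dim).
rng = np.random.default_rng(0)
full = rng.standard_normal((200, 16))                                # spreads over all 16 dims
low = rng.standard_normal((200, 2)) @ rng.standard_normal((2, 16))   # rank-2 by construction

er_full = effective_rank(full)
er_low = effective_rank(low)
print(f"effective rank (full): {er_full:.2f}")
print(f"effective rank (low):  {er_low:.2f}")
```

The rank-2 matrix yields an effective rank of at most 2, while the unstructured Gaussian matrix stays close to its hidden dimension. With p = 2 the Schatten norm coincides with the Frobenius norm, which gives a quick sanity check.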
