 geometric property


Less is More: Local Intrinsic Dimensions of Contextual Language Models

Neural Information Processing Systems

Understanding the internal mechanisms of large language models (LLMs) remains challenging. Even fundamental questions, such as how fine-tuning affects model behavior, often require extensive empirical evaluation. In this paper, we introduce a novel perspective based on the geometric properties of contextual latent embeddings to study the effects of training and fine-tuning. To that end, we measure the local dimensions of a contextual language model's latent space and analyze how they shift during training and fine-tuning. We show that the local dimensions provide insights into the model's training dynamics and generalization ability. Specifically, the mean of the local dimensions predicts when the model's training capabilities are exhausted (as exemplified in a dialogue state tracking task), when the model overfits (as demonstrated in an emotion recognition task), and when it groks (as illustrated with an arithmetic task). Furthermore, our experiments suggest a practical heuristic: reductions in the mean local dimension tend to accompany and predict subsequent performance gains. Through this exploration, we aim to give practitioners a deeper understanding of how fine-tuning affects embedding spaces, facilitating informed decisions when configuring models for specific applications. The results of this work contribute to the ongoing discourse on the interpretability, adaptability, and generalizability of LLMs by bridging the gap between intrinsic model mechanisms and the geometric properties of the corresponding embeddings.
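To make "local dimension" concrete, here is a minimal sketch of one common estimator, the Levina-Bickel maximum-likelihood estimate computed from k-nearest-neighbor distances. The abstract does not say which estimator the authors use, so the choice of estimator, the function name `local_intrinsic_dimension`, and the default `k` are all assumptions made for illustration.

```python
import numpy as np

def local_intrinsic_dimension(points, k=10):
    """Levina-Bickel MLE of the local dimension at every point.

    Illustrative assumption: the paper's own estimator may differ.
    Assumes all points are distinct (zero distances would break the logs).
    """
    points = np.asarray(points, dtype=float)
    # Dense pairwise Euclidean distances; fine for a few thousand points.
    diff = points[:, None, :] - points[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    # Sort each row; column 0 is the zero distance of a point to itself.
    knn = np.sort(dist, axis=1)[:, 1:k + 1]
    # Log-ratios of the k-th neighbor distance to the k-1 nearer ones.
    log_ratio = np.log(knn[:, -1:] / knn[:, :-1])
    # Inverse of the mean log-ratio is the dimension estimate.
    return (k - 1) / np.sum(log_ratio, axis=1)
```

Applied to the token embeddings of one checkpoint, `local_intrinsic_dimension(embeddings).mean()` would give a single number that can be tracked across training steps, which is the kind of summary statistic the abstract describes.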


RMLR: Extending Multinomial Logistic Regression into General Geometries

Neural Information Processing Systems

Riemannian neural networks, which extend deep learning techniques to Riemannian spaces, have gained significant attention in machine learning. To better classify the manifold-valued features, researchers have started extending Euclidean multinomial logistic regression (MLR) into Riemannian manifolds. However, existing approaches suffer from limited applicability due to their strong reliance on specific geometric properties. This paper proposes a framework for designing Riemannian MLR over general geometries, referred to as RMLR. Our framework only requires minimal geometric properties, thus exhibiting broad applicability and enabling its use with a wide range of geometries.


From Internal Representations to Text Quality: A Geometric Approach to LLM Evaluation

arXiv.org Artificial Intelligence

This paper bridges internal and external analysis approaches to large language models (LLMs) by demonstrating that geometric properties of internal model representations serve as reliable proxies for evaluating generated text quality. We validate a set of metrics, including Maximum Explainable Variance, Effective Rank, Intrinsic Dimensionality, MAUVE score, and Schatten Norms, measured across different layers of LLMs, and demonstrate that Intrinsic Dimensionality and Effective Rank can serve as universal assessments of text naturalness and quality. Our key finding is that different models consistently rank text from various sources in the same order based on these geometric properties, indicating that these metrics reflect inherent text characteristics rather than model-specific artifacts. This allows a reference-free text quality evaluation that does not require human-annotated datasets, offering practical advantages for automated evaluation pipelines. The rapid advancement of large language models (LLMs) has necessitated the development of methods for analyzing their internal mechanisms and the properties of generated text. Approaches to studying the geometric properties of representations in language models fall broadly into two categories: internal or mechanistic methods, which investigate the model's intermediate representations, and external methods, which analyze the properties of text embeddings produced by a separate embedding model. Internal evaluation mainly considers properties of the model within which the measurements were made (Yin et al., 2024; Viswanathan et al., 2025; Roy & Vetterli, 2007), while external measures mainly focus on evaluating text properties from text embeddings (Zhao et al., 2019; Tulchinskii et al., 2023; Kuznetsov et al., 2024).
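Two of the metrics named above have short standard definitions in terms of singular values, sketched here with NumPy. Effective rank follows the Roy & Vetterli (2007) definition, the exponential of the Shannon entropy of the normalized singular values, and the Schatten p-norm is the p-norm of the singular value vector. The function names and the choice to apply them to a raw (tokens x hidden size) matrix are assumptions; this excerpt does not say whether the paper centers or normalizes representations first.

```python
import numpy as np

def effective_rank(hidden_states, eps=1e-12):
    """Effective rank: exp of the entropy of normalized singular values."""
    s = np.linalg.svd(np.asarray(hidden_states, dtype=float), compute_uv=False)
    p = s / (s.sum() + eps)   # normalize singular values to a distribution
    p = p[p > 0]              # drop exact zeros so log(p) stays finite
    entropy = -np.sum(p * np.log(p))
    return float(np.exp(entropy))

def schatten_norm(hidden_states, p=1.0):
    """Schatten p-norm: the vector p-norm of the singular values."""
    s = np.linalg.svd(np.asarray(hidden_states, dtype=float), compute_uv=False)
    return float(np.sum(s ** p) ** (1.0 / p))
```

For a layer's hidden states `H`, `effective_rank(H)` lies between 1 and `min(H.shape)`, and `schatten_norm(H, p=2)` coincides with the Frobenius norm of `H`.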