LayerFlow: Layer-wise Exploration of LLM Embeddings using Uncertainty-aware Interlinked Projections
Sevastjanova, Rita, Gerling, Robin, Spinner, Thilo, El-Assady, Mennatallah
arXiv.org Artificial Intelligence
Figure 1: LayerFlow supports the analysis of contextual word embedding properties. To increase the awareness of the potential uncertainty within the transformation, representation, and interpretation steps of the used processing pipeline, we utilize multiple visual components such as cluster convex hulls, pairwise distances, cluster summaries, projection quality metrics, and connections of k-nearest neighbors.

Abstract

Large language models (LLMs) represent words through contextual word embeddings encoding different language properties like semantics and syntax. Understanding these properties is crucial, especially for researchers investigating language model capabilities, employing embeddings for tasks related to text similarity, or evaluating the reasons behind token importance as measured through attribution methods. Applications for embedding exploration frequently involve dimensionality reduction techniques, which reduce high-dimensional vectors to two dimensions used as coordinates in a scatterplot. This data transformation step introduces uncertainty that can be propagated to the visual representation and influence users' interpretation of the data. To communicate such uncertainties, we present LayerFlow - a visual analytics workspace that displays embeddings in an interlinked projection design and communicates the transformation, representation, and interpretation uncertainty. In particular, to hint at potential data distortions and uncertainties, the workspace includes several visual components, such as convex hulls showing 2D and HD clusters, data point pairwise distances, cluster summaries, and projection quality metrics. We show the usability of the presented workspace through replication and expert case studies that highlight the need to communicate uncertainty through multiple visual components and different data perspectives.
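To make the idea of projection-induced uncertainty concrete, the following is a minimal sketch, not LayerFlow's implementation. It assumes PCA as the dimensionality reduction technique and a simple k-nearest-neighbor preservation score as the projection quality metric; the abstract names neither choice. The data is synthetic, and the function names `pca_2d`, `knn_indices`, and `knn_preservation` are hypothetical.

```python
import numpy as np


def pca_2d(X):
    """Project the rows of X onto their first two principal components."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by explained variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:2].T


def knn_indices(X, k):
    """Indices of the k nearest Euclidean neighbors of each row, excluding itself."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    return np.argsort(d, axis=1)[:, :k]


def knn_preservation(X_hd, X_2d, k=5):
    """Per-point fraction of HD k-nearest neighbors that stay neighbors in 2D."""
    hd = knn_indices(X_hd, k)
    ld = knn_indices(X_2d, k)
    return np.array([len(set(a) & set(b)) / k for a, b in zip(hd, ld)])


# Synthetic stand-in for contextual word embeddings (assumption: 60 tokens, 32 dims).
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(60, 32))

coords = pca_2d(embeddings)                       # scatterplot coordinates
scores = knn_preservation(embeddings, coords, 5)  # 1.0 = neighborhood fully preserved
```

Points with a low score have 2D neighborhoods that differ from their high-dimensional ones, which is the kind of distortion a workspace might surface through quality metrics or neighbor connections.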
CCS Concepts: Human-centered computing → Visual analytics; Mathematics of computing → Dimensionality reduction

1 Introduction

In recent years, a large number of deep-learning-based language models (e.g., BERT [DCLT19]) have emerged, demonstrating remarkable performance in natural language processing (NLP) and understanding tasks. These models learn from large text datasets, acquiring language structures in an unsupervised manner. Thereby, they produce contextual word embeddings, representing words through vectors encoding different language properties. Extensive research has been conducted to understand the linguistic properties embedded in these vectors. For instance, research indicates that BERT's middle layers capture syntactic features like dependency trees while early layers encode lexical features [RKR20]. Analyzing these properties helps researchers better understand how language models process data and aids in developing models that generalize well, reducing biases and improving inclusivity.
Apr-16-2025