AttentionViz: A Global View of Transformer Attention

Catherine Yeh, Yida Chen, Aoyu Wu, Cynthia Chen, Fernanda Viégas, Martin Wattenberg

Aug-9-2023, arXiv.org Artificial Intelligence

Figure 1: AttentionViz, our interactive visualization tool, allows users to explore transformer self-attention at scale by creating a joint embedding space for queries and keys. Each point in the scatterplot represents the query or key version of a word, as denoted by point color. Users can explore individual attention heads (left) or zoom out for a "global" view of attention (right).

Abstract: Transformer models are revolutionizing machine learning, but their inner workings remain mysterious. In this work, we present a new visualization technique designed to help researchers understand the self-attention mechanism in transformers that allows these models to learn rich, contextual relationships between elements of a sequence. The main idea behind our method is to visualize a joint embedding of the query and key vectors used by transformer models to compute attention. Unlike previous attention visualization techniques, our approach enables the analysis of global patterns across multiple input sequences. We create an interactive visualization tool, AttentionViz (demo: http://attentionviz.com), based on these joint query-key embeddings, and use it to study attention mechanisms in both language and vision transformers. We demonstrate the utility of our approach in improving model understanding and offering new insights about query-key interactions through several application scenarios and expert feedback.

The transformer neural network architecture [52] is having a major impact on fields ranging from natural language processing (NLP) [13, 42] [...] Indeed, transformers are now deployed in [...] However, the mechanisms behind this success remain somewhat mysterious, especially as [...]

In this work, we describe a new visualization technique aimed at better comprehending how transformers operate. [...] introduction to transformers in Sec. [...] these models to learn and use a rich set of relationships between input elements.
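The abstract's central idea, placing the query and key vectors of an attention head in one shared space, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the function name `joint_query_key_projection` is hypothetical, the random matrices stand in for a trained model's embeddings and weights, and PCA is used only as a simple stand-in projection. The excerpt does not say which projection or normalization AttentionViz itself applies.

```python
import numpy as np


def joint_query_key_projection(X, W_q, W_k, n_components=2):
    """Project one head's queries and keys into a shared low-dimensional space.

    X:   (n_tokens, d_model) token embeddings (hypothetical input)
    W_q: (d_model, d_head) query weights
    W_k: (d_model, d_head) key weights
    Returns (coords, labels, attn).
    """
    Q = X @ W_q  # query vectors, one per token
    K = X @ W_k  # key vectors, one per token

    # Standard scaled dot-product attention weights, for reference.
    logits = Q @ K.T / np.sqrt(Q.shape[1])
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    attn = np.exp(logits)
    attn = attn / attn.sum(axis=1, keepdims=True)

    # Joint embedding: stack queries and keys so they share one coordinate system.
    joint = np.vstack([Q, K])
    centered = joint - joint.mean(axis=0)

    # PCA via SVD (an assumed stand-in for the tool's actual projection).
    _, _, Vt = np.linalg.svd(centered, full_matrices=False)
    coords = centered @ Vt[:n_components].T

    labels = ["query"] * len(Q) + ["key"] * len(K)
    return coords, labels, attn


# Toy data: random values stand in for a real model's activations and weights.
rng = np.random.default_rng(0)
n_tokens, d_model, d_head = 6, 16, 8
X = rng.normal(size=(n_tokens, d_model))
W_q = rng.normal(size=(d_model, d_head))
W_k = rng.normal(size=(d_model, d_head))

coords, labels, attn = joint_query_key_projection(X, W_q, W_k)
```

Each row of `coords` is one point of a scatterplot like the one described in Figure 1, and `labels` supplies the query/key coloring. Stacking vectors from many input sequences before projecting would be one way to approach the cross-sequence view the abstract describes, though that step is not shown here.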

large language model, machine learning, natural language


