Visualizing and Measuring the Geometry of BERT

Reif, Emily, Yuan, Ann, Wattenberg, Martin, Viegas, Fernanda B., Coenen, Andy, Pearce, Adam, Kim, Been

Mar-19-2020, 00:03:22 GMT–Neural Information Processing Systems

Transformer architectures show significant promise for natural language processing. Given that a single pretrained model can be fine-tuned to perform well on many different tasks, these networks appear to extract generally useful linguistic features. A natural question is how such networks represent this information internally. This paper describes qualitative and quantitative investigations of one particularly effective model, BERT. At a high level, linguistic features seem to be represented in separate semantic and syntactic subspaces.

artificial intelligence, geometry, natural language

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning > Neural Networks
    - Deep Learning (0.31)
  - Natural Language > Large Language Model (0.31)