A Variational AutoEncoder for Transformers with Nonparametric Variational Information Bottleneck

Aug-12-2022–arXiv.org Artificial Intelligence

Attention-based deep learning models, such as Transformers (Vaswani et al., 2017; Devlin et al., 2019), have achieved unprecedented empirical success in a wide range of cognitive tasks, in particular in natural language processing (NLP). On the other hand, deep variational Bayesian approaches to representation learning, such as variational autoencoders (VAEs) (Kingma and Welling, 2014), have also been very influential, especially due to their variational information bottleneck (VIB) (Alemi et al., 2017; Kingma and Welling, 2014) for regularising the induced latent representations. Previous VIB methods only apply to a vector space, and Transformers crucially do not use a single vector as their latent representation, instead using a set of vectors (Lin et al., 2020; Fang et al., 2021; Park and Lee, 2021). This allows the number of vectors in a Transformer embedding to grow with the size of the input, which is essential for embedding natural language text (Bahdanau et al., 2015), where the size of the input can range from a single word to thousands of words. In this paper, we propose a variational information bottleneck regulariser for set-of-vector latent representations, and use it to regularise the induced latent representation of a Transformer encoder-decoder variational autoencoder.
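For orientation, the vector-space regulariser that the abstract contrasts against can be sketched as follows. This is the standard closed-form KL term of a diagonal-Gaussian VAE for a single latent vector, not the nonparametric method proposed in the paper. The function name `gaussian_kl_to_standard_normal` and the toy dimensions are assumptions made only for illustration.

```python
import numpy as np


def gaussian_kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ) for ONE latent vector.

    Hypothetical helper: the usual closed-form KL regulariser of a
    diagonal-Gaussian VAE, shown only as the vector-space baseline.
    """
    mu = np.asarray(mu, dtype=float)
    log_var = np.asarray(log_var, dtype=float)
    return 0.5 * float(np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var))


# A single-vector latent: the formula applies directly.
kl_zero = gaussian_kl_to_standard_normal(np.zeros(4), np.zeros(4))    # posterior equals prior
kl_shifted = gaussian_kl_to_standard_normal(np.ones(4), np.zeros(4))  # mean moved away from 0

# A Transformer encoder instead returns one vector per input token, so the
# latent is an (n, d) array whose n changes with the input length.
short_latent = np.zeros((1, 4))     # e.g. a one-word input
long_latent = np.zeros((1000, 4))   # e.g. a thousand-word input

print(kl_zero, kl_shifted)
print(short_latent.shape, long_latent.shape)
```

The helper assumes a fixed-size vector, whereas the two toy latents at the end differ in their number of rows. That mismatch is the gap the abstract says a set-of-vectors bottleneck is meant to address.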



