Partially Randomizing Transformer Weights for Dialogue Response Diversity
Lee, Jing Yang, Lee, Kong Aik, Gan, Woon-Seng
arXiv.org Artificial Intelligence
Despite recent progress in generative open-domain dialogue, the issue of low response diversity persists. Prior work has addressed this issue via novel objective functions, alternative learning approaches such as variational frameworks, or architectural extensions such as the Randomized Link (RL) Transformer. However, these approaches typically entail either additional difficulties during training and inference or a significant increase in model size and complexity. Hence, we propose the Partially Randomized transFormer (PaRaFormer), a simple extension of the transformer in which the weights of selected layers are frozen after random initialization. Experimental results reveal that the performance of the PaRaFormer is comparable to that of the aforementioned approaches, despite not entailing any additional training difficulty or increase in model complexity.
Nov-17-2023
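
The core mechanism the abstract describes, freezing the weights of selected layers at their random initialization so only the remaining layers are trained, can be illustrated with a minimal PyTorch sketch. This is an assumption-laden illustration, not the authors' implementation: the use of nn.TransformerEncoder, the helper name partially_randomize, and the chosen layer indices are all hypothetical, and the paper's actual layer-selection scheme may differ.

import torch.nn as nn

def partially_randomize(model: nn.TransformerEncoder, frozen_layers: set) -> None:
    """Freeze the (randomly initialized) parameters of the chosen layers.

    `frozen_layers` holds hypothetical layer indices; which layers the
    PaRaFormer actually freezes is defined in the paper, not here.
    """
    for idx, layer in enumerate(model.layers):
        if idx in frozen_layers:
            for param in layer.parameters():
                # These weights keep their random initialization for the
                # whole of training; the optimizer never updates them.
                param.requires_grad = False

# Usage sketch: a 6-layer encoder whose 2nd and 4th layers stay random and frozen.
encoder_layer = nn.TransformerEncoderLayer(d_model=512, nhead=8, batch_first=True)
encoder = nn.TransformerEncoder(encoder_layer, num_layers=6)
partially_randomize(encoder, frozen_layers={1, 3})

Unlike the RL Transformer's architectural extension, this adds no parameters and no extra training machinery, which is consistent with the abstract's claim of no increase in model size or complexity.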