Regularizing Recurrent Neural Networks via Sequence Mixup

Armin Karamzade, Amir Najafi, Seyed Abolfazl Motahari

arXiv.org Machine Learning 

Recurrent neural networks are the basis of state-of-the-art models in natural language processing, including language modeling (Mikolov et al., 2011), machine translation (Cho et al., 2014) and named entity recognition (Lample et al., 2016). Complex learning tasks typically require relatively large networks with millions of parameters. However, large neural networks need more data and/or strong regularization techniques to be trained successfully and avoid overfitting. When collecting more data is not an option, as is the case in the majority of real-world problems, data augmentation and regularization methods are the standard practices for overcoming this barrier. Data augmentation in natural language processing is limited and often task-specific (Kobayashi, 2018; Kafle et al., 2017). On the other hand, adopting regularization methods originally proposed for feed-forward (non-recurrent) networks requires extra care to avoid hurting the network's information flow between consecutive time steps. To overcome such limitations, we present Sequence Mixup: a set of training methods, regularization techniques, and data augmentation procedures for RNNs. Sequence Mixup can be considered the RNN generalization of input mixup (Zhang et al., 2017) and manifold mixup (Verma et al., 2018), which were originally introduced for feed-forward neural networks.
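For intuition, the sketch below illustrates the generic input-mixup idea of Zhang et al. (2017) applied to a batch of sequences and their labels; it is a minimal illustration of the underlying interpolation principle, not the paper's exact Sequence Mixup procedure, and the function name `mixup_sequences` and the Beta parameter `alpha` are illustrative assumptions.

```python
# Minimal sketch of input mixup (Zhang et al., 2017) on a batch of sequences.
# This is NOT the paper's Sequence Mixup algorithm; it only shows the convex
# interpolation of inputs and labels that Sequence Mixup generalizes to RNNs.
import torch


def mixup_sequences(x, y, alpha=0.2):
    """Convexly combine a batch of sequences and their (soft) labels.

    x: float tensor of shape (batch, time, features), e.g. embedded tokens.
    y: float tensor of shape (batch, num_classes), one-hot or soft labels.
    alpha: parameter of the Beta(alpha, alpha) distribution for the mixing weight.
    """
    # Sample a single mixing coefficient lambda ~ Beta(alpha, alpha).
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    # Pair each example with a randomly permuted partner from the same batch.
    idx = torch.randperm(x.size(0))
    x_mixed = lam * x + (1.0 - lam) * x[idx]
    y_mixed = lam * y + (1.0 - lam) * y[idx]
    return x_mixed, y_mixed


if __name__ == "__main__":
    batch, time, feat, classes = 4, 10, 8, 3
    x = torch.randn(batch, time, feat)                       # toy embedded sequences
    y = torch.eye(classes)[torch.randint(0, classes, (batch,))]  # toy one-hot labels
    xm, ym = mixup_sequences(x, y, alpha=0.2)
    print(xm.shape, ym.shape)  # torch.Size([4, 10, 8]) torch.Size([4, 3])
```

Manifold mixup applies the same interpolation to intermediate hidden representations rather than raw inputs; extending either variant to RNNs must respect the recurrence across time steps, which is the focus of the paper.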
