EM-Network: Oracle Guided Self-distillation for Sequence Learning
Yoon, Ji Won, Ahn, Sunghwan, Lee, Hyeonseung, Kim, Minchan, Kim, Seok Min, Kim, Nam Soo
arXiv.org Artificial Intelligence
We introduce EM-Network, a novel self-distillation approach that effectively leverages target information for supervised sequence-to-sequence (seq2seq) learning. In contrast to conventional methods, it is trained with oracle guidance, which is derived from the target sequence. Since the oracle guidance compactly represents the target-side context that can assist the sequence model in solving the task, the EM-Network predicts more accurately than a model that uses only the source input. To allow the sequence model to inherit this capability, we propose a new self-distillation strategy in which the original sequence model benefits from the knowledge of the EM-Network in a one-stage manner. We conduct comprehensive experiments on two types of seq2seq models: connectionist temporal classification (CTC) for speech recognition and attention-based encoder-decoder (AED) for machine translation. Experimental results demonstrate that the EM-Network significantly advances the current state-of-the-art approaches, improving over the best prior work on speech recognition and establishing state-of-the-art performance on WMT'14 and IWSLT'14.
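The one-stage self-distillation idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a single output position with a plain cross-entropy task loss (the paper works with CTC and AED objectives), a KL-divergence distillation term, and a hypothetical weight `alpha`. Here `teacher_logits` stands in for the EM-Network output (source plus oracle guidance) and `student_logits` for the sequence model output (source only); all function and parameter names are illustrative.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, target_index):
    # Negative log-likelihood of the correct class.
    return -math.log(probs[target_index])

def kl_divergence(p, q):
    # KL(p || q) for two discrete distributions over the same classes.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)

def one_stage_loss(student_logits, teacher_logits, target_index, alpha=1.0):
    # Assumption: both branches are trained jointly in one stage.
    # teacher = EM-Network branch (sees source + oracle guidance)
    # student = sequence model branch (sees source only)
    student = softmax(student_logits)
    teacher = softmax(teacher_logits)
    task_student = cross_entropy(student, target_index)
    task_teacher = cross_entropy(teacher, target_index)
    # Distillation term pulls the student toward the teacher's prediction.
    distill = kl_divergence(teacher, student)
    return task_student + task_teacher + alpha * distill

# Hypothetical logits for a 3-class output position.
loss = one_stage_loss([0.2, 0.1, 0.0], [2.0, 0.1, -1.0], target_index=0)
```

When the two branches agree exactly, the distillation term vanishes and only the two task losses remain; the further the source-only prediction drifts from the oracle-guided one, the larger the penalty.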
Jun-14-2023
- Country:
- North America > United States
- Hawaii > Honolulu County > Honolulu (0.04)
- Asia > South Korea
- Genre:
- Research Report
- New Finding (0.48)
- Promising Solution (0.34)
- Industry:
- Education (1.00)
- Technology: