Data Augmentation for Neural Machine Translation using Generative Language Model
Oh, Seokjin, Lee, Su Ah, Jung, Woohwan
–arXiv.org Artificial Intelligence
Neural Machine Translation (NMT) is the task of converting a sentence written in a source language into a target language sentence by using a translation model. NMT models usually require vast amounts of parallel data for training, but high-quality parallel data is often scarce. Since generating parallel synthetic data demands substantial time and cost, especially for low-resource languages or domains, the problem becomes particularly severe in such cases.

Through experiments, we show that appropriate prompts can reduce the generation cost of the synthetic data and facilitate the easy transfer of knowledge from large-scale language models. We also validate the effectiveness of the three proposed prompts by measuring the diversity of the synthetic data generated by each method. By comparing the diversity, we demonstrate that generating varied data is a crucial factor in synthetic data ...
Nov-13-2023
- Country:
  - North America > Dominican Republic (0.04)
  - Oceania > Australia
  - Europe
    - Germany > Berlin (0.04)
    - Belgium > Brussels-Capital Region
      - Brussels (0.05)
  - Asia > China
    - Hong Kong (0.04)
- Genre:
  - Research Report (0.64)
- Technology: