TransLLaMa: LLM-based Simultaneous Translation System
Roman Koshkin, Katsuhito Sudoh, Satoshi Nakamura
arXiv.org Artificial Intelligence
Decoder-only large language models (LLMs) have recently demonstrated impressive capabilities in text generation and reasoning. Nonetheless, they have limited applications in simultaneous machine translation (SiMT), currently dominated by encoder-decoder transformers. This study demonstrates that, after fine-tuning on a small dataset comprising causally aligned source and target sentence pairs, a pre-trained open-source LLM can control input segmentation directly by generating a special "wait" token. This obviates the need for a separate policy and enables the LLM to perform English-German and English-Russian SiMT tasks with BLEU scores that are comparable to those of specific state-of-the-art baselines. We also evaluated closed-source models such as GPT-4, which displayed encouraging results in performing the SiMT task without prior training (zero-shot), indicating a promising avenue for enhancing future SiMT systems.
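The wait-token mechanism described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the token spellings `<WAIT>` and `<EOS>`, the function names, the toy lexicon, and the choice to stop when the model asks to wait after the source is exhausted are all assumptions made for illustration. A stub function stands in for the fine-tuned LLM.

```python
from typing import Callable, List, Sequence

WAIT = "<WAIT>"  # hypothetical spelling of the special "wait" token
EOS = "<EOS>"    # hypothetical end-of-sequence marker


def simultaneous_translate(
    source_words: Sequence[str],
    next_token: Callable[[Sequence[str], Sequence[str]], str],
    max_steps: int = 1000,
) -> List[str]:
    """Interleave READ and WRITE actions, letting the model decide when to read.

    `next_token` stands in for the LLM: given the source words revealed so far
    and the target tokens produced so far, it returns the next token.
    """
    read = 0                # number of source words revealed to the model
    target: List[str] = []  # target tokens written so far
    for _ in range(max_steps):
        token = next_token(source_words[:read], target)
        if token == WAIT:
            if read < len(source_words):
                read += 1   # READ: reveal one more source word
                continue
            break           # assumption: nothing left to read, so stop
        if token == EOS:
            break
        target.append(token)  # WRITE: commit the generated token
    return target


# Toy stand-in for the model (purely illustrative): translate word by word
# and ask to wait whenever no untranslated source word is visible.
TOY_LEXICON = {"the": "die", "cat": "Katze", "runs": "rennt"}


def toy_next_token(visible_source: Sequence[str], target_so_far: Sequence[str]) -> str:
    if len(target_so_far) < len(visible_source):
        return TOY_LEXICON[visible_source[len(target_so_far)]]
    return WAIT


if __name__ == "__main__":
    print(simultaneous_translate(["the", "cat", "runs"], toy_next_token))
```

In this sketch no separate read/write policy exists: the only signal that triggers reading more input is the model's own output, which is the property the abstract highlights.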
Feb-7-2024
- Country:
  - Antarctica (0.04)
  - North America
    - Dominican Republic (0.04)
    - United States
      - Pennsylvania (0.04)
      - Minnesota > Hennepin County
        - Minneapolis (0.04)
      - Louisiana > Orleans Parish
        - New Orleans (0.04)
    - Canada > Ontario
      - Toronto (0.04)
  - Europe
    - France (0.04)
    - Spain
      - Galicia > Madrid (0.04)
      - Valencian Community > Valencia Province
        - Valencia (0.04)
    - Italy > Tuscany
      - Florence (0.04)
    - Ireland > Leinster
      - County Dublin > Dublin (0.04)
    - Belgium > Brussels-Capital Region
      - Brussels (0.04)
  - Asia
    - South Korea (0.04)
    - Middle East
    - Japan > Kyūshū & Okinawa
      - Okinawa (0.04)
    - India > Karnataka
      - Bengaluru (0.04)
- Genre:
  - Research Report (1.00)
- Technology: