MERLIN: Multi-Stage Curriculum Alignment for Multilingual Encoder-LLM Integration in Cross-Lingual Reasoning
Kosei Uemura, David Guzmán, Quang Phuoc Nguyen, Jesujoba Oluwadara Alabi, En-shiun Annie Lee, David Ifeoluwa Adelani
arXiv.org Artificial Intelligence
Large language models excel in English but still struggle with complex reasoning in many low-resource languages (LRLs). Existing encoder-plus-decoder methods such as LangBridge and MindMerger raise accuracy on mid- and high-resource languages, yet they leave a large gap on LRLs. We present MERLIN, a two-stage model-stacking framework that applies a curriculum learning strategy -- from general bilingual bitext to task-specific data -- and adapts only a small set of DoRA weights. On the AfriMGSM benchmark, MERLIN improves exact-match accuracy by +12.9 percentage points (pp) over MindMerger and outperforms GPT-4o-mini. It also yields consistent gains on MGSM and MSVAMP (+0.9 and +2.8 pp), demonstrating effectiveness across both low- and high-resource settings.
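The abstract notes that MERLIN adapts only a small set of DoRA weights. As background, DoRA (Weight-Decomposed Low-Rank Adaptation) reparameterizes a weight matrix into a learned per-column magnitude and a direction obtained from the base weight plus a LoRA-style low-rank update. The following is a minimal NumPy sketch of that forward reparameterization, not the paper's implementation; the function name and toy dimensions are illustrative assumptions.

```python
import numpy as np

def dora_forward(W0, A, B, m):
    """Sketch of the DoRA weight reconstruction.

    W0: frozen base weight (out_dim x in_dim)
    A:  low-rank factor (rank x in_dim), B: (out_dim x rank)
    m:  learned per-column magnitude vector (1 x in_dim)
    """
    W = W0 + B @ A  # LoRA-style low-rank update merged into the base weight
    col_norm = np.linalg.norm(W, axis=0, keepdims=True)  # column-wise norms
    return m * (W / col_norm)  # rescale unit-norm directions by learned magnitudes

# Toy example: out_dim=4, in_dim=3, rank=2.
rng = np.random.default_rng(0)
W0 = rng.standard_normal((4, 3))
A = rng.standard_normal((2, 3))
B = np.zeros((4, 2))  # B starts at zero, as in LoRA, so the initial update is zero
m0 = np.linalg.norm(W0, axis=0, keepdims=True)  # magnitudes initialised to base norms

# At initialisation (B = 0, m = base column norms) the adapted weight equals W0.
W_adapted = dora_forward(W0, A, B, m0)
```

Because only `A`, `B`, and `m` are trained, the number of updated parameters stays far below full fine-tuning, which is what makes adapting "only a small set of DoRA weights" cheap across both curriculum stages.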
Nov-12-2025