MindMerger: Efficiently Boosting LLM Reasoning in non-English Languages

May-26-2025, 21:58:29 GMT–Neural Information Processing Systems

Reasoning capabilities are crucial for Large Language Models (LLMs), yet a notable gap exists between English and non-English languages. To bridge this disparity, some works fine-tune LLMs to relearn reasoning capabilities in non-English languages, while others replace non-English inputs with an external model's outputs such as English translation text to circumvent the challenge of LLM understanding non-English. In order to better utilize the minds of reasoning and language understanding in LLMs, we propose a new method, namely MergeMinds, which merges LLMs with the external language understanding capabilities from multilingual models to boost the multilingual reasoning performance. Furthermore, a two-step training scheme is introduced to first train to embeded the external capabilities into LLMs and then train the collaborative utilization of the external capabilities and the built-in capabilities in LLMs. Experiments on three multilingual reasoning datasets and a language understanding dataset demonstrate that MergeMinds consistently outperforms all baselines, especially in low-resource languages.

artificial intelligence, large language model, natural language, (4 more...)

Neural Information Processing Systems

May-26-2025, 21:58:29 GMT

Conferences Web Page

Add feedback

Technology:
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)