Efficient Training of Robust Traditional Chinese LLaMA-1B on a Single Consumer GPU: Continual Pre-training, SFT, and DPO

Chih, Yu-Cheng, Duan, Ming-Tao, Hou, Yong-Hao

Oct-3-2025–arXiv.org Artificial Intelligence

Small Language Models (SLMs) enable cost-effective, on-device, and latency-sensitive AI applications, yet their deployment in Traditional Chinese (TC) remains hindered by token-level instability: models unpredictably emit non-TC characters or code-switch into other languages. We address this practical reliability gap by creating PureTC-1B, a three-stage stabilization pipeline for Llama-3.2-1B-Instruct (an open-weight, instruction-tuned model released by Meta) [1] using parameter-efficient LoRA adapters [2]. Our method combines Continual Pre-Training (CPT) on TC-centric corpora, Supervised Fine-Tuning (SFT) with instruction data, and Direct Preference Optimization (DPO) [3] using TC-adherence preferences to improve monolingual robustness without full-model retraining. On a benchmark designed to simulate real-world usage, PureTC-1B achieves a 51.3% relative reduction (micro-average) in non-TC output tokens versus the base model. On a Named Entity Translation (NET) task, PureTC-1B further reduces incorrect-language tokens by 77.2% relative to Llama-3B and 57.2% relative to Qwen-1.5B, indicating that robust TC adherence is attainable even at the 1B scale. The pipeline is reproducible, adapter-only, and hardware-friendly, offering practitioners a practical recipe to enhance language stability for TC and potentially other non-English languages.
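To make the "relative reduction (micro-average)" figure concrete, the following is a minimal sketch of how such a number could be computed. The abstract does not give the exact metric definition, so the formulation here is an assumption: the non-TC rate is taken as pooled non-TC tokens over pooled output tokens across all tasks, and the relative reduction compares that rate between the base and tuned models. The function names and the token counts are hypothetical and chosen only for illustration; they are not results from the paper.

```python
from typing import Iterable, Tuple

# Each entry is (non_tc_tokens, total_output_tokens) for one task.
Counts = Iterable[Tuple[int, int]]


def micro_non_tc_rate(counts: Counts) -> float:
    """Pool tokens over all tasks, then divide (micro-average).

    Assumption: 'micro-average' means tasks are weighted by their
    token volume rather than averaged one rate per task.
    """
    pairs = list(counts)
    non_tc = sum(bad for bad, _ in pairs)
    total = sum(tot for _, tot in pairs)
    if total == 0:
        raise ValueError("no output tokens to score")
    return non_tc / total


def relative_reduction(base_rate: float, tuned_rate: float) -> float:
    """Fraction of the base model's non-TC rate that was removed."""
    if base_rate == 0:
        raise ValueError("base rate is zero; reduction is undefined")
    return (base_rate - tuned_rate) / base_rate


# Hypothetical counts for two tasks (illustrative only).
base_counts = [(120, 1000), (80, 3000)]
tuned_counts = [(50, 1000), (50, 3000)]

base_rate = micro_non_tc_rate(base_counts)
tuned_rate = micro_non_tc_rate(tuned_counts)
reduction = relative_reduction(base_rate, tuned_rate)
print(f"base={base_rate:.4f} tuned={tuned_rate:.4f} reduction={reduction:.1%}")
```

Under this reading, a task that produces many tokens influences the result more than a short one; a macro-average would instead compute one rate per task and average those rates.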

large language model, llama, machine learning

Country:
- Asia > Taiwan (0.16)

Genre:
- Research Report > New Finding (0.46)

Industry:
- Law > Statutes (0.46)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language
    - Large Language Model (1.00)
    - Chatbot (0.91)
  - Machine Learning > Neural Networks
    - Deep Learning (0.69)