Tele-FLM Technical Report

Li, Xiang; Yao, Yiqun; Jiang, Xin; Fang, Xuezhi; Wang, Chao; Liu, Xinzhang; Wang, Zihan; Zhao, Yu; Wang, Xin; Huang, Yuyao; Song, Shuangyong; Li, Yongxiang; Zhang, Zheng; Zhao, Bo; Sun, Aixin; Wang, Yequan; He, Zhongjiang; Wang, Zhongyuan; Li, Xuelong; Huang, Tiejun

Apr-25-2024–arXiv.org Artificial Intelligence

Large language models (LLMs) have shown strong capabilities in language understanding and generation, enabling a wide array of applications. However, there is a notable paucity of detailed, open-sourced methodologies for efficiently scaling LLMs beyond 50 billion parameters with minimal trial-and-error cost and computational resources. In this report, we introduce Tele-FLM (aka FLM-2), a 52B open-sourced multilingual large language model that features a stable, efficient pre-training paradigm and enhanced factual judgment capabilities. Tele-FLM demonstrates superior multilingual language modeling abilities, as measured by bits-per-byte (BPB) on textual corpora. Moreover, in both English and Chinese foundation model evaluations, it is comparable to strong open-sourced models that involve larger pre-training FLOPs, such as Llama2-70B and DeepSeek-67B. Beyond the model weights, we share the core designs, engineering practices, and training details, which we expect to benefit both the academic and industrial communities.
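
The abstract reports language modeling quality in BPB (bits per byte). As a minimal sketch of how such a number is commonly computed, the snippet below converts per-token negative log-likelihoods into bits and normalizes by the UTF-8 byte length of the scored text, which makes the result independent of the tokenizer. The function name `bits_per_byte`, the input format, and the choice of UTF-8 byte counting are assumptions made for illustration; the report's exact evaluation protocol is not described in this abstract.

```python
import math


def bits_per_byte(token_nll_nats, text):
    """Compute bits-per-byte from per-token negative log-likelihoods.

    token_nll_nats: iterable of per-token NLL values in nats (hypothetical
        model output; any model that scores the text could produce these).
    text: the exact string that was scored.
    """
    n_bytes = len(text.encode("utf-8"))  # assumption: bytes counted in UTF-8
    if n_bytes == 0:
        raise ValueError("text must be non-empty")
    # Convert total loss from nats to bits, then normalize by byte count.
    total_bits = sum(token_nll_nats) / math.log(2)
    return total_bits / n_bytes


# Toy example: 4 bytes of text scored as 2 tokens at 2 bits each.
example_text = "abcd"
example_nll = [math.log(4), math.log(4)]
example_bpb = bits_per_byte(example_nll, example_text)
print(f"BPB = {example_bpb:.3f}")
```

Because the denominator is bytes and not tokens, models with different vocabularies can be compared on the same corpus, which is presumably why the metric suits a multilingual comparison.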

Genre:
- Research Report (0.64)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language
    - Large Language Model (1.00)
    - Chatbot (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (1.00)
