ChuXin: 1.6B Technical Report

Zhuang, Xiaomin; Jiang, Yufan; He, Qiaozhi; Wu, Zhihua

arXiv.org Artificial Intelligence 

Unlike the majority of works that only open-source the model weights and architecture, we have made everything needed to train a model available, including the training data, the training process, and the evaluation code. Our goal is to empower and strengthen the open research community, fostering transparency and enabling a new wave of innovation in the field of language modeling. Furthermore, we extend the context length to 1M tokens through lightweight continual pretraining and demonstrate strong needle-in-a-haystack retrieval performance. Countless models have been open-sourced on AI communities like HuggingFace to facilitate their use by researchers (Bai et al., 2023; Singer et al., 2024; Zhang et al., 2024). These models can broadly be divided into two categories: 1) open-source model weights and data sources, which constitute the vast majority.
