Panda LLM: Training Data and Evaluation for Open-Sourced Chinese Instruction-Following Large Language Models

Fangkai Jiao, Bosheng Ding, Tianze Luo, Zhanfeng Mo (equal contribution, order decided by coin flip)

arXiv.org Artificial Intelligence 

Large language models have become ubiquitous across industries due to their exceptional versatility in natural language processing tasks such as code writing and article editing, significantly enhancing people's productivity (Ding et al., 2022; Zhao et al., 2023). However, current off-the-shelf instruction-following large language models have limitations: generated results lack trustworthiness, the lack of transparency about the model used raises concerns about data security, and the training recipe is unknown, making the models difficult to reproduce.

Panda LLM is the first released LLM of the Dandelion Project. It has been trained on the Chinese-Wiki-2019, Chinese-News-2016, Chinese-Baike-2018, Chinese-Webtext-2019, and Translation-2019 corpora (Xu, 2019) and the COIG datasets (Zhang et al., 2023) with instruction tuning (Wei et al., 2021), building on the LLaMA model (Touvron et al., 2023). Anticipated future releases include progressively larger models such as Panda-13B and Panda-33B, with expected release dates in the near future.
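The abstract does not spell out the training procedure, so the following is a minimal, hypothetical sketch of what instruction tuning (Wei et al., 2021) on a LLaMA checkpoint typically looks like with the Hugging Face transformers library. The model path, the Alpaca-style prompt template, the toy Chinese example, and the hyperparameters are illustrative assumptions, not the authors' actual recipe.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_PATH = "path/to/llama-7b"  # assumption: a locally available LLaMA checkpoint

tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH)
model = AutoModelForCausalLM.from_pretrained(MODEL_PATH)

# Toy instruction-response pair in an Alpaca-style template (an assumption;
# the paper does not specify its prompt format).
prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n请用一句话介绍大语言模型。\n\n"
    "### Response:\n"
)
response = "大语言模型是在海量文本上训练、能够理解并生成自然语言的神经网络。"

# Tokenize the prompt and the full sequence; mask prompt tokens with -100 so
# the cross-entropy loss is computed only on the response tokens.
prompt_len = tokenizer(prompt, return_tensors="pt").input_ids.shape[1]
input_ids = tokenizer(prompt + response + tokenizer.eos_token,
                      return_tensors="pt").input_ids
labels = input_ids.clone()
labels[:, :prompt_len] = -100  # -100 is ignored by the loss

# One supervised fine-tuning step (illustrative learning rate).
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
loss = model(input_ids=input_ids, labels=labels).loss
loss.backward()
optimizer.step()

In practice, training a 7B-parameter model over the corpora listed above would repeat such steps over many batches, typically with mixed precision and distributed training rather than a single-example loop.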
