Baize: An Open-Source Chat Model with Parameter-Efficient Tuning on Self-Chat Data

Xu, Canwen; Guo, Daya; Duan, Nan; McAuley, Julian

Dec-2-2023–arXiv.org Artificial Intelligence

Chat models, such as ChatGPT, have shown impressive capabilities and have been rapidly adopted across numerous domains. However, these models are only accessible through a restricted API, creating barriers for new research and progress in the field. We propose a pipeline that can automatically generate a high-quality multi-turn chat corpus by leveraging ChatGPT to engage in a conversation with itself. Subsequently, we employ parameter-efficient tuning to enhance LLaMA, an open-source large language model. The resulting model, named Baize, demonstrates good performance in multi-turn dialogues with guardrails that minimize potential risks. Furthermore, we propose a new technique called Self-Distill with Feedback, to further improve the performance of the Baize models with feedback from ChatGPT. The Baize models and data are released for research purposes only at https://github.com/project-baize/baize-chatbot. An online demo is also available at https://huggingface.co/spaces/project-baize/chat-with-baize.
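The self-chat idea in the abstract, one model conversing with itself to produce a multi-turn corpus, can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the abstract does not specify prompts, stopping rules, or how turns are produced, so the turn-by-turn loop, the `self_chat` function, the `echo_model` stand-in, and the `max_turns` parameter are all assumptions introduced here. A real run would replace `echo_model` with a call to an actual chat model.

```python
from typing import Callable, Dict, List

Turn = Dict[str, str]


def self_chat(
    seed: str,
    generate: Callable[[List[Turn]], str],
    max_turns: int = 4,
) -> List[Turn]:
    """Let a single model play both sides of a conversation started by `seed`.

    Hypothetical helper: `generate` receives the transcript so far and
    returns the text of the next turn.
    """
    transcript: List[Turn] = [{"role": "human", "text": seed}]
    while len(transcript) < max_turns:
        # Alternate speakers: the AI answers a human turn, and vice versa.
        role = "ai" if transcript[-1]["role"] == "human" else "human"
        transcript.append({"role": role, "text": generate(transcript)})
    return transcript


def echo_model(history: List[Turn]) -> str:
    """Stand-in for a real chat model call (assumption, for illustration only)."""
    return f"reply {len(history)} to: {history[-1]['text']}"


# Build a tiny corpus from a list of seed questions.
seeds = ["How do I sort a list in Python?"]
corpus = [self_chat(seed, echo_model) for seed in seeds]
```

Each transcript in `corpus` is a list of alternating human and AI turns, which is the general shape a multi-turn fine-tuning set would need before being passed to a parameter-efficient tuning step.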

baize, chatgpt, language model

Country:
- Asia (0.04)
- North America > United States
  - New York (0.04)
  - California > San Diego County
    - San Diego (0.04)
- Europe > Romania
  - Sud - Muntenia Development Region > Giurgiu County > Giurgiu (0.04)

Genre:
- Research Report (0.64)

Industry:
- Banking & Finance (1.00)
- Law (0.93)
- Health & Medicine > Therapeutic Area (0.93)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language
    - Large Language Model (1.00)
    - Chatbot (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (1.00)