FoodGPT: A Large Language Model in Food Testing Domain with Incremental Pre-training and Knowledge Graph Prompt

Zhixiao Qi, Yijiong Yu, Meiqi Tu, Junyi Tan, Yongfeng Huang

arXiv.org Artificial Intelligence 

Large language models (LLMs) [1] have gained significant research attention in the field of natural language processing. Models such as ChatGPT, LLaMA [2], GPT-4, ChatGLM [3], and PaLM [4] have demonstrated outstanding performance on downstream tasks. The strong ability of LLMs to understand human instructions has spurred continuous research on LLMs in various vertical domains. ChatLaw [5] is based on Ziya-LLaMA-13B and performs instruction fine-tuning on legal data, incorporating vector database retrieval to create a legal LLM. DoctorGLM [6] is built upon ChatGLM-6B and fine-tuned on Chinese medical dialogue datasets to create a Chinese medical consultation model. BenTsao is based on LLaMA-7B and builds a Chinese medical LLM by leveraging a medical knowledge graph and the GPT-3.5 API to construct a Chinese medical instruction dataset. Cornucopia, in turn, is based on LLaMA-7B and constructs an instruction dataset from Chinese financial public data and crawled financial data, focusing on question answering in the financial domain. These previous studies assume that the base model has already absorbed the corresponding domain knowledge, and therefore perform no incremental pre-training on the base model.
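The distinction matters because incremental (continued) pre-training uses the unsupervised next-token objective on raw domain text, whereas the instruction fine-tuning used in the works above trains on supervised prompt-response pairs. Below is a minimal sketch of incremental pre-training with the Hugging Face Transformers Trainer; the base model ID, corpus path, and hyperparameters are illustrative placeholders, not the paper's actual configuration.

```python
# Minimal sketch: incremental (continued) pre-training of a causal LM on
# raw domain text. Model ID, file path, and hyperparameters are
# placeholders for illustration, not the paper's actual setup.
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

base_model = "meta-llama/Llama-2-7b-hf"  # placeholder base model
tokenizer = AutoTokenizer.from_pretrained(base_model)
if tokenizer.pad_token is None:          # LLaMA tokenizers ship no pad token
    tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(base_model)

# Unlabeled domain corpus (e.g., food-testing standards and reports);
# "food_corpus.txt" is a hypothetical path.
raw = load_dataset("text", data_files={"train": "food_corpus.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=1024)

train_set = raw["train"].map(tokenize, batched=True, remove_columns=["text"])

# mlm=False selects the next-token (causal LM) objective: the model keeps
# learning language modeling on domain text rather than instruction pairs.
collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="incremental-pretrain",
        per_device_train_batch_size=2,
        gradient_accumulation_steps=8,
        num_train_epochs=1,
        learning_rate=2e-5,
        bf16=True,
    ),
    train_dataset=train_set,
    data_collator=collator,
)
trainer.train()
```

After such continued pre-training, the resulting checkpoint can still be instruction-tuned, so knowledge injection and instruction following are handled in separate stages.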
