FoodGPT: A Large Language Model in Food Testing Domain with Incremental Pre-training and Knowledge Graph Prompt

Zhixiao Qi, Yijiong Yu, Meiqi Tu, Junyi Tan, Yongfeng Huang

arXiv.org Artificial Intelligence 

Large language models (LLMs) [1] have gained significant research attention in the field of natural language processing. Models such as ChatGPT, LLaMA [2], GPT-4, ChatGLM [3], and PaLM [4] have demonstrated outstanding performance on downstream tasks. The strong ability of LLMs to understand human instructions has spurred continuous research on LLMs in various vertical domains. ChatLaw [5] is based on Ziya-LLaMA-13B and performs instruction fine-tuning on legal data, incorporating vector database retrieval to create a legal LLM. DoctorGLM [6] is built upon ChatGLM-6B and fine-tuned on Chinese medical dialogue datasets to create a Chinese medical consultation model. BenTsao is based on LLaMA-7B and builds a Chinese medical LLM by leveraging a medical knowledge graph and the GPT-3.5 API to construct a Chinese medical instruction dataset. Cornucopia, in turn, is based on LLaMA-7B and constructs an instruction dataset from Chinese financial public data and crawled financial data, focusing on question answering in the financial domain. These previous studies assume that the base model has already absorbed the corresponding domain knowledge, and therefore perform no incremental pre-training on the base model.
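The distinction matters because incremental (continued) pre-training uses the unsupervised next-token objective on raw domain text, whereas the instruction fine-tuning used in the works above trains on supervised prompt-response pairs. Below is a minimal sketch of incremental pre-training with the Hugging Face Transformers Trainer; the base model ID, corpus path, and hyperparameters are illustrative placeholders, not the paper's actual configuration.

```python
# Minimal sketch: incremental (continued) pre-training of a causal LM on
# raw domain text. Model ID, file path, and hyperparameters are
# placeholders for illustration, not the paper's actual setup.
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

base_model = "meta-llama/Llama-2-7b-hf"  # placeholder base model
tokenizer = AutoTokenizer.from_pretrained(base_model)
if tokenizer.pad_token is None:          # LLaMA tokenizers ship no pad token
    tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(base_model)

# Unlabeled domain corpus (e.g., food-testing standards and reports);
# "food_corpus.txt" is a hypothetical path.
raw = load_dataset("text", data_files={"train": "food_corpus.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=1024)

train_set = raw["train"].map(tokenize, batched=True, remove_columns=["text"])

# mlm=False selects the next-token (causal LM) objective: the model keeps
# learning language modeling on domain text rather than instruction pairs.
collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="incremental-pretrain",
        per_device_train_batch_size=2,
        gradient_accumulation_steps=8,
        num_train_epochs=1,
        learning_rate=2e-5,
        bf16=True,
    ),
    train_dataset=train_set,
    data_collator=collator,
)
trainer.train()
```

After such continued pre-training, the resulting checkpoint can still be instruction-tuned, so knowledge injection and instruction following are handled in separate stages.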
