KAG: Boosting LLMs in Professional Domains via Knowledge Augmented Generation

Lei Liang, Mengshu Sun, Zhengke Gui, Zhongshu Zhu, Zhouyu Jiang, Ling Zhong, Yuan Qu, Peilong Zhao, Zhongpu Bo, Jin Yang, Huaidong Xiong, Lin Yuan, Jun Xu, Zaoyang Wang, Zhiqiang Zhang, Wen Zhang, Huajun Chen, Wenguang Chen, Jun Zhou

arXiv.org Artificial Intelligence 

The recently developed retrieval-augmented generation (RAG) technology has enabled the efficient construction of domain-specific applications. However, it also has limitations, including the gap between vector similarity and the relevance required for knowledge reasoning, and insensitivity to knowledge logic such as numerical values, temporal relations, and expert rules, which hinders the effectiveness of professional knowledge services. In this work, we introduce a professional domain knowledge service framework called Knowledge Augmented Generation (KAG). KAG is designed to address these challenges by making full use of the complementary strengths of knowledge graphs (KGs) and vector retrieval, and to improve generation and reasoning performance by bidirectionally enhancing large language models (LLMs) and KGs through five key aspects: (1) LLM-friendly knowledge representation, (2) mutual-indexing between knowledge graphs and original chunks, (3) a logical-form-guided hybrid reasoning engine, (4) knowledge alignment with semantic reasoning, and (5) model capability enhancement for KAG. We compared KAG with existing RAG methods on multi-hop question answering and found that it significantly outperforms state-of-the-art methods, achieving relative F1 improvements of 19.6% on HotpotQA and 33.5% on 2Wiki. We have successfully applied KAG to two professional knowledge Q&A tasks at Ant Group, E-Government Q&A and E-Health Q&A, achieving significant improvements in professionalism over RAG methods. Furthermore, we will soon support KAG natively in the open-source KG engine OpenSPG, allowing developers to more easily build rigorous knowledge decision-making or convenient information retrieval services. This will facilitate the localized development of KAG, enabling developers to build domain knowledge services with higher accuracy and efficiency.
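
To make the "mutual-indexing between knowledge graphs and original chunks" idea concrete, the sketch below shows one possible interpretation: extracted KG entities and their source text chunks reference each other, so retrieval can hop from graph structure to supporting passages and back. This is an illustrative assumption, not the authors' implementation; all class and method names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    chunk_id: str
    text: str
    entity_ids: set = field(default_factory=set)   # chunk -> extracted entities


@dataclass
class Entity:
    entity_id: str
    name: str
    chunk_ids: set = field(default_factory=set)    # entity -> supporting chunks


class MutualIndex:
    """Bidirectional index between KG entities and original text chunks (hypothetical)."""

    def __init__(self) -> None:
        self.chunks: dict = {}
        self.entities: dict = {}

    def add_chunk(self, chunk: Chunk) -> None:
        self.chunks[chunk.chunk_id] = chunk

    def add_entity(self, entity: Entity) -> None:
        self.entities[entity.entity_id] = entity

    def link(self, chunk_id: str, entity_id: str) -> None:
        # Record the extraction provenance in both directions.
        self.chunks[chunk_id].entity_ids.add(entity_id)
        self.entities[entity_id].chunk_ids.add(chunk_id)

    def chunks_for_entity(self, entity_id: str) -> list:
        # Graph -> text: fetch supporting passages for an entity.
        return [self.chunks[c] for c in self.entities[entity_id].chunk_ids]

    def entities_for_chunk(self, chunk_id: str) -> list:
        # Text -> graph: fetch entities mentioned in a retrieved passage.
        return [self.entities[e] for e in self.chunks[chunk_id].entity_ids]


if __name__ == "__main__":
    index = MutualIndex()
    index.add_chunk(Chunk("c1", "Ant Group is headquartered in Hangzhou."))
    index.add_entity(Entity("e1", "Ant Group"))
    index.add_entity(Entity("e2", "Hangzhou"))
    index.link("c1", "e1")
    index.link("c1", "e2")
    print([e.name for e in index.entities_for_chunk("c1")])  # ['Ant Group', 'Hangzhou']
```

In such a design, a multi-hop query could first retrieve a chunk by vector similarity, pivot to the entities it mentions, traverse KG relations, and then return to the chunks that support the final answer.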