Enhancing LLM's Cognition via Structurization

Liu, Kai, Fu, Zhihang, Chen, Chao, Zhang, Wei, Jiang, Rongxin, Zhou, Fan, Chen, Yaowu, Wu, Yue, Ye, Jieping

Jul-23-2024–arXiv.org Artificial Intelligence

When reading long-form text, human cognition is complex and structurized. While large language models (LLMs) process input contexts through a causal and sequential perspective, this approach can potentially limit their ability to handle intricate and complex inputs effectively. To enhance LLM's cognition capability, this paper presents a novel concept of context structurization. Specifically, we transform the plain, unordered contextual sentences into well-ordered and hierarchically structurized elements. By doing so, LLMs can better grasp intricate and extended contexts through precise attention and information-seeking along the organized structures. Extensive evaluations are conducted across various model architectures and sizes (including several 7B- to 72B-size auto-regressive LLMs as well as BERT-like masking models) on a diverse set of NLP tasks (e.g., context-based question-answering, exhaustive hallucination evaluation, and passage-level dense retrieval). Empirical results show consistent and significant performance gains afforded by a single-round structurization. In particular, we boost a 72B-parameter open-source model to achieve comparable performance against GPT-3.5-Turbo as the hallucination evaluator. Besides, we show the feasibility of distilling advanced LLMs' language processing abilities to a smaller yet effective StruXGPT-7B to execute structurization, addressing the practicality of our approach. Code will be made public soon.

arxiv preprint arxiv, language model, structurization, (15 more...)

arXiv.org Artificial Intelligence

Jul-23-2024

arXiv.org PDF

Add feedback

Country:
- North America > United States
  - California > Santa Clara County > Palo Alto (0.04)
- Europe > Italy
  - Calabria > Catanzaro Province > Catanzaro (0.04)
- Asia
  - Singapore (0.04)
  - Middle East > Jordan (0.04)
  - Indonesia > Bali (0.04)
  - China (0.04)

Genre:
- Research Report > New Finding (0.66)

Industry:
- Information Technology > Security & Privacy (0.46)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language > Large Language Model (1.00)
  - Machine Learning > Neural Networks
    - Deep Learning (0.93)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found