Guiding Large Language Models to Generate Computer-Parsable Content

Wang, Jiaye

arXiv.org Artificial Intelligence 

Large language models (LLMs) have demonstrated remarkable capabilities in learning patterns from massive text corpora, including word relationships, sentence structures, and even complex semantic and pragmatic information. However, it remains challenging to induce pre-trained language models to generate structured content that strictly follows specific conventions. We propose a scheme for guiding LLMs to generate highly usable content for computers, without fine-tuning or additional neural network inference, by introducing coroutine-based content generation constraints through a pre-agreed context-free grammar (CFG). The CFG guides the autoregressive Transformer to sample the correct tokens during its decoding phase, so that the output forms a formal language conforming to the program's conventions. This effectively improves the stability and consistency of LLMs in generating target data structures, types, or instructions, and reduces the difficulty of application development and integration. We first conducted a matching-bracket-pairs experiment, verifying that the error rate of models such as GPT-2 and Gemma reaches 95% when the generated DSLs exceed 36 and 282 characters in length, respectively.
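
To illustrate the core idea, below is a minimal, self-contained Python sketch (not the paper's implementation) of coroutine-based constrained decoding for a toy balanced-bracket grammar: a generator coroutine tracks the grammar state and, at each decoding step, restricts sampling to the tokens the grammar still allows. The three-symbol vocabulary, the uniform stand-in for the model's logits, and names such as `bracket_constraint` and `constrained_decode` are illustrative assumptions.

```python
# Minimal sketch: a coroutine enforces a balanced-bracket CFG during decoding.
# All names here are illustrative; a real system would mask the LLM's logits
# with `allowed` before sampling instead of drawing uniformly at random.
import random


def bracket_constraint(max_depth=8):
    """Coroutine: receive the token just emitted, yield the set of tokens
    that keep the partial output a valid prefix of a balanced-bracket string."""
    depth = 0
    token = None
    while True:
        if token == "(":
            depth += 1
        elif token == ")":
            depth -= 1
        allowed = set()
        if depth < max_depth:
            allowed.add("(")
        if depth > 0:
            allowed.add(")")
        else:
            allowed.add("<eos>")  # only allow stopping once brackets are balanced
        token = yield allowed


def constrained_decode(steps=20, seed=0):
    """Sample tokens step by step, keeping only what the grammar coroutine allows."""
    rng = random.Random(seed)
    constraint = bracket_constraint()
    allowed = constraint.send(None)  # prime the coroutine
    output = []
    for _ in range(steps):
        # Stand-in for the LLM: sample uniformly from the allowed tokens.
        token = rng.choice(sorted(allowed))
        if token == "<eos>":
            break
        output.append(token)
        allowed = constraint.send(token)
    return "".join(output)


if __name__ == "__main__":
    # The constraint guarantees ")" is never emitted at depth 0 and that
    # "<eos>" is only reachable when every "(" has been closed.
    print(constrained_decode())
```

In a full system, the same coroutine interface would presumably sit between the CFG's parser state and the model's decoding loop, zeroing out the logits of disallowed tokens before sampling, so the generated text is guaranteed to remain within the agreed formal language.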
