Generative AI Toolkit -- a framework for increasing the quality of LLM-based applications over their whole life cycle
Kohl, Jens, Gloger, Luisa, Costa, Rui, Kruse, Otto, Luitz, Manuel P., Katz, David, Barbeito, Gonzalo, Schweier, Markus, French, Ryan, Schroeder, Jonas, Riedl, Thomas, Perri, Raphael, Mostafa, Youssef
arXiv.org Artificial Intelligence
Since their introduction, LLMs have gained widespread traction across different domains. They can be used as stand-alone products, but also integrated into existing software, for example as agents (also called LLM-based agents or agentic functions), to increase their capabilities. In this section, we show challenges during the development and operation of LLM-based applications using three examples. Users interact with LLM-based applications by entering input into the LLM, the so-called prompt. Jang et al. showed in 2023 that the LLM's output is very sensitive to variations of the prompt [1]. Thus, the task of finding the best prompt to generate the expected or best output leads to manual, trial-and-error prompt experimenting - a method well known as prompt engineering (cf. White et al. in 2023 for ChatGPT [2], or a survey of prompting techniques by Schulhoff et al. in 2024 [3]). Additionally, the outputs of an LLM-based application can not only vary, but can also be wrong without any indication to the user ("hallucination", cf.
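The trial-and-error loop of prompt engineering described above can be made systematic by scoring each prompt variant against an expected result instead of eyeballing outputs. The sketch below is illustrative only: `call_llm` is a hypothetical stub standing in for a real model call, and the keyword-based `score` is a deliberately simple stand-in for a proper evaluation metric.

```python
# Minimal sketch of systematic prompt-variant evaluation,
# replacing manual trial-and-error prompt engineering.

def call_llm(prompt: str) -> str:
    # Hypothetical stub: a real implementation would query an LLM API.
    # It simulates prompt sensitivity: only prompts asking for a
    # concise answer receive the short factual reply.
    if "concise" in prompt:
        return "Munich is the capital of Bavaria."
    return "Well, that is an interesting question about geography..."

def score(output: str, expected_keyword: str) -> float:
    # Illustrative metric: 1.0 if the expected fact appears, else 0.0.
    return 1.0 if expected_keyword in output else 0.0

def best_prompt(variants: list[str], expected_keyword: str) -> str:
    # Evaluate every variant and return the highest-scoring one.
    scored = [(score(call_llm(p), expected_keyword), p) for p in variants]
    scored.sort(reverse=True)
    return scored[0][1]

variants = [
    "What is the capital of Bavaria?",
    "Give a concise answer: what is the capital of Bavaria?",
]
print(best_prompt(variants, "Munich"))
```

In practice the loop would run each variant several times (outputs vary between calls) and aggregate the scores, but the structure - generate variants, score, select - stays the same.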
Dec-18-2024