Generative AI Toolkit -- a framework for increasing the quality of LLM-based applications over their whole life cycle
Kohl, Jens, Gloger, Luisa, Costa, Rui, Kruse, Otto, Luitz, Manuel P., Katz, David, Barbeito, Gonzalo, Schweier, Markus, French, Ryan, Schroeder, Jonas, Riedl, Thomas, Perri, Raphael, Mostafa, Youssef
arXiv.org Artificial Intelligence
Since their introduction, LLMs have gained widespread traction across different domains. They can be used as stand-alone products, but also integrated into existing software, for example as agents (also called LLM-based agents or agentic functions), to increase their capabilities. In this section, we show challenges during the development and operation of LLM-based applications using three examples. Users interact with LLM-based applications by entering input into the LLM, the so-called prompt. Jang et al. showed in 2023 that the LLM's output is very sensitive to variations of the prompt [1]. Thus, the task of finding the best prompt to generate the expected or best output leads to manual, trial-and-error prompt experimenting - a method well known as prompt engineering (cf. White et al. in 2023 for ChatGPT [2], or a survey of prompting techniques by Schulhoff et al. in 2024 [3]). Additionally, the outputs of an LLM-based application can not only vary, but can also be wrong without any indication to the user ("hallucination", cf.
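The trial-and-error loop of prompt engineering described above can be made systematic by scoring each prompt variant against an expected result instead of eyeballing outputs. The sketch below is illustrative only: `call_llm` is a hypothetical stub standing in for a real model call, and the keyword-based `score` is a deliberately simple stand-in for a proper evaluation metric.

```python
# Minimal sketch of systematic prompt-variant evaluation,
# replacing manual trial-and-error prompt engineering.

def call_llm(prompt: str) -> str:
    # Hypothetical stub: a real implementation would query an LLM API.
    # It simulates prompt sensitivity: only prompts asking for a
    # concise answer receive the short factual reply.
    if "concise" in prompt:
        return "Munich is the capital of Bavaria."
    return "Well, that is an interesting question about geography..."

def score(output: str, expected_keyword: str) -> float:
    # Illustrative metric: 1.0 if the expected fact appears, else 0.0.
    return 1.0 if expected_keyword in output else 0.0

def best_prompt(variants: list[str], expected_keyword: str) -> str:
    # Evaluate every variant and return the highest-scoring one.
    scored = [(score(call_llm(p), expected_keyword), p) for p in variants]
    scored.sort(reverse=True)
    return scored[0][1]

variants = [
    "What is the capital of Bavaria?",
    "Give a concise answer: what is the capital of Bavaria?",
]
print(best_prompt(variants, "Munich"))
```

In practice the loop would run each variant several times (outputs vary between calls) and aggregate the scores, but the structure - generate variants, score, select - stays the same.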
Dec-18-2024