Optimizing Token Usage on Large Language Model Conversations Using the Design Structure Matrix

Alarcia, Ramon Maria Garcia, Golkar, Alessandro

arXiv.org Artificial Intelligence 

The recent, rapid development and popularization of Large Language Models (LLMs) have transformed the landscape of Natural Language Processing (NLP) and, more generally, of Artificial Intelligence (AI), permeating society and changing how many tasks are performed, now either supported or automated with the help of LLM-based tools. Alongside the challenges of hallucinations, limited reasoning capabilities, inability to perform numerical calculations, natural aging of the training data, and improper traceability and citation of information sources, another intrinsic challenge of LLMs, tightly related to their architecture and training, concerns their limited context window and maximum token output (Kaddour et al., 2023). Indeed, the context window is the cornerstone of LLM-based applications that require previous interactions in the conversation to be preserved and considered by the LLM. While this holds for any long conversation, it is of particular importance in the engineering design field, where an LLM supports engineers in the design of a system, from high-level concept generation down to lower-level system requirements or technical specifications. This application requires previous decisions, as well as the decision-making process behind them, to be considered in later stages.
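To illustrate the context-window constraint the abstract describes, the following is a minimal sketch (not from the paper) of keeping a conversation history within a fixed token budget by dropping the oldest turns first. All function names and the example messages are illustrative assumptions; token counts are approximated by whitespace word count, whereas a real system would use the model's own tokenizer.

```python
# Minimal sketch (illustrative, not the paper's method): trim a
# conversation so its estimated token count fits a fixed budget,
# discarding the oldest turns first.

def count_tokens(text: str) -> int:
    """Crude token estimate: one token per whitespace-separated word.
    A production system would use the target model's tokenizer."""
    return len(text.split())

def trim_history(messages, budget: int):
    """Return the most recent messages whose total estimated token
    count fits within `budget`, preserving chronological order."""
    kept = []
    total = 0
    for msg in reversed(messages):  # walk newest to oldest
        cost = count_tokens(msg["content"])
        if total + cost > budget:
            break  # adding this older turn would exceed the budget
        kept.append(msg)
        total += cost
    return list(reversed(kept))  # restore chronological order

# Hypothetical engineering-design conversation turns
history = [
    {"role": "user", "content": "Define the high level mission concept"},
    {"role": "assistant", "content": "A small Earth observation satellite"},
    {"role": "user", "content": "Derive the system requirements"},
]
trimmed = trim_history(history, budget=10)
```

This naive oldest-first truncation is exactly what loses earlier design decisions in long conversations, which motivates smarter selection of which prior turns to retain.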