Faster, Cheaper, Better: Multi-Objective Hyperparameter Optimization for LLM and RAG Systems
Matthew Barker, Andrew Bell, Evan Thomas, James Carr, Thomas Andrews, Umang Bhatt
While Retrieval Augmented Generation (RAG) has emerged as a popular technique for improving Large Language Model (LLM) systems, it introduces a large number of choices, parameters, and hyperparameters that must be made or tuned. These include the LLM, embedding, and ranker models themselves, as well as hyperparameters governing individual RAG components. Yet collectively optimizing the entire configuration of a RAG or LLM system remains under-explored, especially in multi-objective settings, due to intractably large solution spaces, noisy objective evaluations, and the high cost of evaluations. In this work, we introduce the first approach for multi-objective parameter optimization of cost, latency, safety, and alignment over entire LLM and RAG systems. We find that Bayesian optimization methods significantly outperform baseline approaches, obtaining a superior Pareto front on two new RAG benchmark tasks. We conclude with important considerations for practitioners designing multi-objective RAG systems, highlighting nuances such as how optimal configurations may not generalize across tasks and objectives.

Retrieval Augmented Generation (RAG) has emerged as a popular technique for improving the performance of Large Language Models (LLMs) on question-answering tasks over specific datasets. A benefit of using RAG pipelines is that they can often achieve high performance on specific tasks without the need for extensive alignment and fine-tuning (Gupta et al., 2024), a costly and time-consuming process. However, the end-to-end pipeline of a RAG system depends on many parameters that span different components (or modules) of the system, such as the choice of LLM, the embedding model used in retrieval, the number of chunks retrieved, and the hyperparameters governing a reranking model. Examples of choices, parameters, and hyperparameters that are often made or tuned when implementing a RAG pipeline are listed in Table 1.
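To make the setup concrete, below is a minimal sketch (not the authors' code) of multi-objective optimization over a small RAG configuration space using Optuna's TPE sampler, a Bayesian-optimization-style method, as a stand-in for the approaches studied in the paper. The model names, the `evaluate_rag_pipeline` helper, and its toy scoring are illustrative placeholders; in practice they would be replaced by a real evaluation harness measuring cost, latency, and answer quality on a benchmark.

```python
# Sketch: multi-objective search over a RAG configuration with Optuna.
# The evaluation function is a toy placeholder, not a real RAG benchmark.
import optuna


def evaluate_rag_pipeline(llm, embedding_model, top_k, rerank):
    # Stand-in scoring: a real harness would run the configured pipeline on a
    # benchmark and measure actual dollar cost, latency, and answer quality.
    cost = (0.5 if llm == "gpt-4o-mini" else 0.1) * top_k
    latency = 0.2 * top_k + (1.0 if rerank else 0.0)
    quality = min(1.0, 0.05 * top_k + (0.1 if rerank else 0.0))
    return cost, latency, quality


def objective(trial):
    # Search space spanning several RAG components (Table 1-style choices).
    llm = trial.suggest_categorical("llm", ["gpt-4o-mini", "llama-3-8b"])
    embedding_model = trial.suggest_categorical(
        "embedding_model", ["text-embedding-3-small", "bge-large"]
    )
    top_k = trial.suggest_int("top_k", 1, 20)  # number of chunks retrieved
    rerank = trial.suggest_categorical("rerank", [True, False])

    cost, latency, quality = evaluate_rag_pipeline(llm, embedding_model, top_k, rerank)
    # Three objectives: minimize cost and latency, maximize quality.
    return cost, latency, quality


study = optuna.create_study(
    directions=["minimize", "minimize", "maximize"],
    sampler=optuna.samplers.TPESampler(seed=0),
)
study.optimize(objective, n_trials=50)

# study.best_trials holds the Pareto-optimal configurations found so far.
for t in study.best_trials:
    print(t.params, t.values)
```

Because the objectives conflict (e.g. retrieving more chunks tends to raise both quality and cost), the result is not a single best configuration but a Pareto front, from which practitioners pick a trade-off suited to their task.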
arXiv.org Artificial Intelligence
Feb-25-2025