A bi-objective $\epsilon$-constrained framework for quality-cost optimization in language model ensembles

Singla, Aditi, Singh, Aditya, Kukreja, Kanishk

arXiv.org Artificial Intelligence 

We propose an ensembling framework that uses diverse open-source Large Language Models (LLMs) to achieve high response quality while maintaining cost efficiency. We formulate a bi-objective optimization problem to capture the quality-cost tradeoff and then introduce an additional budget constraint that reduces the problem to a straightforward 0/1 knapsack problem. We empirically demonstrate that our framework outperforms existing ensembling approaches in response quality while significantly reducing costs.

Large Language Models (LLMs) excel at traditional NLP tasks (OpenAI (2023)), but their high inference costs hinder deployment in high-throughput applications (Anonymous (2023a)). Meanwhile, open-source models are less performant than their closed-source counterparts (Beeching et al. (2023)), but they typically offer lower inference costs (Kaplan et al. (2020)).
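To make the knapsack reduction concrete, here is a minimal sketch of how an ensemble could be selected under a cost budget once the problem takes 0/1 knapsack form. The model names, quality scores, and integer costs below are illustrative assumptions, not values from the paper; the dynamic program itself is the standard 0/1 knapsack recurrence.

```python
# Hypothetical sketch: choosing which LLMs to include in an ensemble,
# maximizing total quality subject to a total inference-cost budget.
# Model names, qualities, and costs are illustrative, not from the paper.

def select_models(models, budget):
    """0/1 knapsack DP over models.

    models: list of (name, quality, cost) with non-negative integer costs.
    Returns (best_total_quality, list_of_chosen_names).
    """
    # dp[c] = (best quality achievable with total cost exactly <= c, chosen names)
    dp = [(0.0, [])] * (budget + 1)
    for name, quality, cost in models:
        # iterate costs downward so each model is used at most once
        for c in range(budget, cost - 1, -1):
            candidate = dp[c - cost][0] + quality
            if candidate > dp[c][0]:
                dp[c] = (candidate, dp[c - cost][1] + [name])
    return max(dp, key=lambda t: t[0])

# Illustrative pool: (name, quality score, relative inference cost)
models = [("llama-7b", 0.62, 1), ("mistral-7b", 0.70, 1),
          ("llama-70b", 0.85, 8), ("mixtral-8x7b", 0.80, 4)]
best_quality, chosen = select_models(models, budget=6)
```

With a budget of 6, the DP skips the expensive 70B model (cost 8) and combines the three cheaper models, illustrating how the budget constraint trades a single strong model for a diverse, affordable ensemble.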