A bi-objective $\epsilon$-constrained framework for quality-cost optimization in language model ensembles
