Determine-Then-Ensemble: Necessity of Top-k Union for Large Language Model Ensembling

Yao, Yuxuan, Wu, Han, Liu, Mingyang, Luo, Sichun, Han, Xiongwei, Liu, Jie, Guo, Zhijiang, Song, Linqi

Oct-3-2024–arXiv.org Artificial Intelligence

Large language models (LLMs) exhibit varying strengths and weaknesses across different tasks, prompting recent studies to explore the benefits of ensembling models to leverage their complementary advantages. However, existing LLM ensembling methods often overlook model compatibility and struggle with inefficient alignment of probabilities across the entire vocabulary. In this study, we empirically investigate the factors influencing ensemble performance, identifying model performance, vocabulary size, and response style as key determinants, and revealing that compatibility among models is essential for effective ensembling. This analysis leads to a simple yet effective model selection strategy that identifies compatible models. Additionally, we introduce Union Top-k Ensembling (UniTE), a novel approach that efficiently combines models by focusing on the union of the top-k tokens from each model, thereby avoiding the need for full vocabulary alignment and reducing computational overhead. UniTE significantly enhances performance compared to existing methods, offering a more efficient framework for LLM ensembling.

Large language models (LLMs) have demonstrated remarkable performance across a wide range of tasks and have shown promising results in real-world applications (OpenAI, 2023; Yang et al., 2024; Dubey et al., 2024). Given the diversity in data sources, model architectures, and training methods, LLMs exhibit varying strengths and weaknesses depending on the task at hand. Consequently, rather than relying solely on training an LLM from scratch, an alternative approach is to create an ensemble of LLMs. This method allows for leveraging the complementary advantages of different LLMs (Jiang et al., 2023b; Lu et al., 2024; Yu et al., 2024b). Existing model ensembling methods can be broadly categorized into three types: output-level, probability-level, and training-level approaches.
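
As a rough illustration of the top-k union idea mentioned in the abstract, the sketch below combines next-token distributions from several models over the union of their top-k tokens only, rather than over a fully aligned vocabulary. It is a minimal sketch under stated assumptions, not the authors' implementation: each distribution is assumed to be a plain mapping from token string to probability, tokens are matched by exact string, a token missing from a model's vocabulary is assumed to have probability 0, and the models are averaged with equal weight. The names `top_k` and `union_top_k_ensemble` are hypothetical, and this excerpt does not describe how the paper handles mismatched tokenizations.

```python
from typing import Dict, List


def top_k(dist: Dict[str, float], k: int) -> Dict[str, float]:
    """Return the k highest-probability tokens of one next-token distribution."""
    ranked = sorted(dist.items(), key=lambda item: item[1], reverse=True)
    return dict(ranked[:k])


def union_top_k_ensemble(dists: List[Dict[str, float]], k: int) -> Dict[str, float]:
    """Average distributions over the union of each model's top-k tokens."""
    # Candidate set: the union of every model's top-k tokens.
    candidates = set()
    for dist in dists:
        candidates.update(top_k(dist, k))

    combined = {token: 0.0 for token in candidates}
    for dist in dists:
        # Assumption: a token absent from this model's vocabulary has probability 0.
        restricted = {token: dist.get(token, 0.0) for token in candidates}
        mass = sum(restricted.values())
        if mass == 0.0:
            continue  # this model puts no mass on any candidate token
        for token, prob in restricted.items():
            # Renormalize within the candidate set, then accumulate.
            combined[token] += prob / mass

    total = sum(combined.values())
    if total == 0.0:
        raise ValueError("no model assigns probability to any candidate token")
    # Final normalization makes this an equal-weight average that sums to 1.
    return {token: prob / total for token, prob in combined.items()}


# Toy distributions (illustrative values, not data from the paper).
model_a = {"Paris": 0.6, "London": 0.3, "Rome": 0.1}
model_b = {"Paris": 0.5, "Berlin": 0.4, "London": 0.1}

result = union_top_k_ensemble([model_a, model_b], k=2)
next_token = max(result, key=result.get)
```

With k=2 the candidate set here contains three tokens, so each model is renormalized over three entries instead of its whole vocabulary, which is where the claimed reduction in alignment cost would come from.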
