MERGE$^3$: Efficient Evolutionary Merging on Consumer-grade GPUs
Tommaso Mencattini, Adrian Robert Minut, Donato Crisostomi, Andrea Santilli, Emanuele Rodolà
arXiv.org Artificial Intelligence
Evolutionary model merging enables the creation of high-performing multi-task models but remains computationally prohibitive for consumer hardware. We introduce MERGE$^3$, an efficient framework that makes evolutionary merging feasible on a single GPU by reducing fitness computation costs 50$\times$ while preserving performance. MERGE$^3$ achieves this by Extracting a reduced dataset for evaluation, Estimating model abilities using Item Response Theory (IRT), and Evolving optimal merges via IRT-based performance estimators. Our method enables state-of-the-art multilingual and cross-lingual merging, transferring knowledge across languages with significantly lower computational overhead. We provide theoretical guarantees and an open-source library, democratizing high-quality model merging.
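To make the "Estimating" step more concrete, the following is a minimal sketch of how an Item Response Theory model can turn a handful of graded answers into an estimate of full-benchmark accuracy. It assumes a standard two-parameter logistic (2PL) IRT model with already-fitted item parameters and a plain grid-search maximum-likelihood fit. The function names, the item parameters, and the grid search are illustrative assumptions, not the estimators or the API of the MERGE$^3$ library.

```python
import math


def p_correct(theta, a, b):
    """2PL IRT: probability that a model with ability `theta` answers an
    item with discrimination `a` and difficulty `b` correctly."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def estimate_ability(responses, items, grid=None):
    """Maximum-likelihood ability estimate from 0/1 responses on a small
    item subset. Grid search is used only for simplicity (an assumption)."""
    if grid is None:
        grid = [i / 100.0 for i in range(-400, 401)]  # theta in [-4, 4]

    def log_lik(theta):
        total = 0.0
        for y, (a, b) in zip(responses, items):
            # Clamp to avoid log(0) for extreme abilities.
            p = min(max(p_correct(theta, a, b), 1e-12), 1.0 - 1e-12)
            total += math.log(p) if y else math.log(1.0 - p)
        return total

    return max(grid, key=log_lik)


def estimate_accuracy(theta, all_items):
    """Predicted accuracy on the full benchmark: the mean probability of a
    correct answer over every item, including items never evaluated."""
    return sum(p_correct(theta, a, b) for a, b in all_items) / len(all_items)


# Hypothetical item parameters (discrimination, difficulty) for a benchmark.
full_benchmark = [(1.0, -2.0), (1.0, -1.0), (1.2, 0.0), (1.0, 1.0), (0.8, 2.0)]
# Only a reduced subset is actually run through the merged model.
subset = [full_benchmark[1], full_benchmark[3]]
subset_responses = [1, 0]  # correct on the easier item, wrong on the harder

theta_hat = estimate_ability(subset_responses, subset)
print("estimated ability:", theta_hat)
print("estimated full accuracy:", estimate_accuracy(theta_hat, full_benchmark))
```

In an evolutionary search, an estimate of this kind could stand in for the full benchmark score as the fitness of each candidate merge, which is where the savings from evaluating only a reduced dataset would come from.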
Feb-9-2025
- Country:
  - Asia > Middle East
    - Jordan (0.04)
  - Europe
    - Ireland > Leinster
      - County Dublin > Dublin (0.04)
    - Italy > Tuscany
      - Florence (0.04)
    - Switzerland > Vaud
      - Lausanne (0.04)
  - North America > United States
    - Florida > Miami-Dade County
      - Miami (0.04)
    - New York > New York County
      - New York City (0.04)
  - South America > Colombia
    - Meta Department > Villavicencio (0.04)
- Genre:
  - Research Report > New Finding (0.93)
- Industry:
  - Education (0.46)
- Technology: