AITopics | mergekit

Collaborating Authors

mergekit

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Mergenetic: a Simple Evolutionary Model Merging Library

Minut, Adrian Robert, Mencattini, Tommaso, Santilli, Andrea, Crisostomi, Donato, Rodolà, Emanuele

arXiv.org Artificial IntelligenceAug-25-2025

Model merging allows combining the capabilities of existing models into a new one - post hoc, without additional training. This has made it increasingly popular thanks to its low cost and the availability of libraries that support merging on consumer GPUs. Recent work shows that pairing merging with evolutionary algorithms can boost performance, but no framework currently supports flexible experimentation with such strategies in language models. We introduce Mergenetic, an open-source library for evolutionary model merging. Mergenetic enables easy composition of merging methods and evolutionary algorithms while incorporating lightweight fitness estimators to reduce evaluation costs. We describe its design and demonstrate that Mergenetic produces competitive results across tasks and languages using modest hardware.

evolutionary algorithm, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

doi: 10.18653/v1/2025.acl-demo.55

2505.11427

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)

Add feedback

Arcee's MergeKit: A Toolkit for Merging Large Language Models

Goddard, Charles, Siriwardhana, Shamane, Ehghaghi, Malikeh, Meyers, Luke, Karpukhin, Vlad, Benedict, Brian, McQuade, Mark, Solawetz, Jacob

arXiv.org Artificial IntelligenceMar-20-2024

The rapid expansion of the open-source language model landscape presents an opportunity to merge the competencies of these model checkpoints by combining their parameters. Advances in transfer learning, the process of fine-tuning pretrained models for specific tasks, has resulted in the development of vast amounts of task-specific models, typically specialized in individual tasks and unable to utilize each other's strengths. Model merging facilitates the creation of multitask models without the need for additional training, offering a promising avenue for enhancing model performance and versatility. By preserving the intrinsic capabilities of the original models, model merging addresses complex challenges in AI - including the difficulties of catastrophic forgetting and multitask learning. To support this expanding area of research, we introduce MergeKit, a comprehensive, open-source library designed to facilitate the application of model merging strategies. MergeKit offers an extensible framework to efficiently merge models on any hardware, providing utility to researchers and practitioners. To date, thousands of models have been merged by the open-source community, leading to the creation of some of the worlds most powerful open-source model checkpoints, as assessed by the Open LLM Leaderboard. The library is accessible at https://github.com/arcee-ai/MergeKit.

arxiv preprint arxiv, checkpoint, mergekit, (12 more...)

arXiv.org Artificial Intelligence

2403.13257

Country:

Asia > Middle East > Jordan (0.04)
North America > United States (0.04)
Asia > China > Hong Kong (0.04)

Genre: Research Report > Promising Solution (0.66)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback