AITopics | bitdelta

Collaborating Authors

bitdelta

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

BitDelta: Your Fine-Tune May Only Be Worth One Bit

Neural Information Processing SystemsMar-18-2026, 13:55:01 GMT

Large Language Models (LLMs) are typically trained in two phases: pre-training on large internet-scale datasets, and fine-tuning for downstream tasks. Given the higher computational demand of pre-training, it is intuitive to assume that fine-tuning adds less new information to the model, and is thus more compressible. We explore this assumption by decomposing the weights of fine-tuned models into their pre-trained components and an additional delta. We introduce a simple method, BitDelta, which successfully quantizes this delta down to 1 bit without compromising performance. This interesting finding not only highlights the potential redundancy of information added during fine-tuning, but also has significant implications for the multi-tenant serving and multi-tenant storage of fine-tuned models. By enabling the use of a single high-precision base model accompanied by multiple 1-bit deltas, BitDelta dramatically reduces GPU memory requirements by more than 10x, thus reducing per-user generation latency by more than 10x in multi-tenant settings. We validate BitDelta through experiments across Llama-2, Mistral and MPT model families, and on models up to 70B parameters, showcasing minimal performance degradation in all tested settings.

large language model, machine learning, natural language, (8 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.97)

Add feedback

Delta-CoMe: Training-Free Delta-Compression with Mixed-Precision for Large Language Models Bowen Ping

Neural Information Processing SystemsFeb-11-2026, 03:39:13 GMT

Recent studies suggest decomposing a fine-tuned LLM into a base model and corresponding delta weights, which are then compressed using low-rank or low-bit approaches to reduce costs. In this work, we observe that existing low-rank and low-bit compression methods can significantly harm the model performance for task-specific fine-tuned LLMs (e.g., WizardMath for math problems).

delta-compression method, large language model, machine learning, (18 more...)

Neural Information Processing Systems

Country:

Asia > China > Beijing > Beijing (0.04)
Asia > Thailand > Bangkok > Bangkok (0.04)
Asia > China > Shanghai > Shanghai (0.04)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.88)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

BitDelta: YourFine-TuneMayOnlyBeWorthOneBit

Neural Information Processing SystemsFeb-8-2026, 15:46:13 GMT

Thisinteresting finding notonlyhighlights the potential redundancy of information added during fine-tuning, but also has significant implications for the multi-tenant serving and multi-tenant storage of fine-tuned models.

bitdelta, large language model, machine learning, (22 more...)

Neural Information Processing Systems

Country: North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

37664246a1e07e212ddacea6e5a523f2-Paper-Conference.pdf

Neural Information Processing SystemsOct-9-2025, 23:16:53 GMT

delta-compression method, experiment, llm, (14 more...)

Neural Information Processing Systems

Country:

Asia > China > Beijing > Beijing (0.04)
Asia > Thailand > Bangkok > Bangkok (0.04)
Asia > China > Shanghai > Shanghai (0.04)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

BitDelta: Y our Fine-Tune May Only Be Worth One Bit James Liu

Neural Information Processing SystemsOct-9-2025, 19:43:18 GMT

Realizing this vision is challenging due to two key reasons: 1) Expensive Storage.

base model, bitdelta, fine-tuned model, (14 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > California > Santa Clara County > Santa Clara (0.04)
Europe > Romania > Sud - Muntenia Development Region > Giurgiu County > Giurgiu (0.04)

Genre: Research Report > Experimental Study (0.93)

Industry: Information Technology (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.96)

Add feedback

BitDelta: Your Fine-Tune May Only Be Worth One Bit

Neural Information Processing SystemsMay-26-2025, 17:27:55 GMT

bitdelta, large language model, machine learning, (5 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.85)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.81)

Add feedback

A Unified View of Delta Parameter Editing in Post-Trained Large-Scale Models

Tang, Qiaoyu, Yu, Le, Yu, Bowen, Lin, Hongyu, Lu, Keming, Lu, Yaojie, Han, Xianpei, Sun, Le

arXiv.org Artificial IntelligenceOct-17-2024

Post-training has emerged as a crucial paradigm for adapting large-scale pretrained models to various tasks, whose effects are fully reflected by delta parameters (i.e., the disparity between post-trained and pre-trained parameters). While numerous studies have explored delta parameter properties via operations like pruning, quantization, low-rank approximation, and extrapolation, a unified framework for systematically examining these characteristics has been lacking. In this paper, we propose a novel perspective based on Riemann sum approximation of the loss function to elucidate delta parameter editing operations. Our analysis categorizes existing methods into three classes based on their post-editing performance: competitive, decreased, and improved, explaining how they are expressed by the Riemann sum approximation term and how they alter the model performance. Extensive experiments on both visual and language models, including ViT, LLaMA 3, Qwen 2, and Mistral, corroborate our theoretical findings. Furthermore, we introduce extensions to existing techniques like DARE and BitDelta, highlighting their limitations in leveraging the properties of delta parameters and reorganizing them into general expressions to enhance the applicability and effectiveness of delta parameter editing in post-trained models. With the remarkable success of large-scale pre-trained models, post-training has emerged as the de facto standard paradigm for effective adaptations to various tasks (Han et al., 2024; Xin et al., 2024; Dodge et al., 2020; Zhao et al., 2023).

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2410.13841

Country:

North America > United States > Virginia (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
Europe > Romania > Sud - Muntenia Development Region > Giurgiu County > Giurgiu (0.04)
Asia > British Indian Ocean Territory > Diego Garcia (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.91)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Add feedback

Delta-CoMe: Training-Free Delta-Compression with Mixed-Precision for Large Language Models

Ping, Bowen, Wang, Shuo, Wang, Hanqing, Han, Xu, Xu, Yuzhuang, Yan, Yukun, Chen, Yun, Chang, Baobao, Liu, Zhiyuan, Sun, Maosong

arXiv.org Artificial IntelligenceJun-13-2024

Fine-tuning is a crucial process for adapting large language models (LLMs) to diverse applications. In certain scenarios, such as multi-tenant serving, deploying multiple LLMs becomes necessary to meet complex demands. Recent studies suggest decomposing a fine-tuned LLM into a base model and corresponding delta weights, which are then compressed using low-rank or low-bit approaches to reduce costs. In this work, we observe that existing low-rank and low-bit compression methods can significantly harm the model performance for task-specific fine-tuned LLMs (e.g., WizardMath for math problems). Motivated by the long-tail distribution of singular values in the delta weights, we propose a delta quantization approach using mixed-precision. This method employs higher-bit representation for singular vectors corresponding to larger singular values. We evaluate our approach on various fine-tuned LLMs, including math LLMs, code LLMs, chat LLMs, and even VLMs. Experimental results demonstrate that our approach performs comparably to full fine-tuned LLMs, surpassing both low-rank and low-bit baselines by a considerable margin. Additionally, we show that our method is compatible with various backbone LLMs, such as Llama-2, Llama-3, and Mistral, highlighting its generalizability.

delta weight, delta-compression method, llm, (12 more...)

arXiv.org Artificial Intelligence

2406.08903

Country:

Asia > China > Shanghai > Shanghai (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report > New Finding (0.86)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Add feedback

BitDelta: Your Fine-Tune May Only Be Worth One Bit

Liu, James, Xiao, Guangxuan, Li, Kai, Lee, Jason D., Han, Song, Dao, Tri, Cai, Tianle

arXiv.org Artificial IntelligenceFeb-27-2024

Large Language Models (LLMs) are typically trained in two phases: pre-training on large internet-scale datasets, and fine-tuning for downstream tasks. Given the higher computational demand of pre-training, it's intuitive to assume that fine-tuning adds less new information to the model, and is thus more compressible. We explore this assumption by decomposing the weights of fine-tuned models into their pre-trained components and an additional delta. We introduce a simple method, BitDelta, which successfully quantizes this delta down to 1 bit without compromising performance. This interesting finding not only highlights the potential redundancy of information added during fine-tuning, but also has significant implications for the multi-tenant serving and multi-tenant storage of fine-tuned models. By enabling the use of a single high-precision base model accompanied by multiple 1-bit deltas, BitDelta dramatically reduces GPU memory requirements by more than 10x, which can also be translated to enhanced generation latency in multi-tenant settings. We validate BitDelta through experiments across Llama-2 and Mistral model families, and on models up to 70B parameters, showcasing minimal performance degradation over all tested settings.

bitdelta, delta, fine-tuned model, (15 more...)

arXiv.org Artificial Intelligence

2402.10193

Country:

Europe > Romania > Sud - Muntenia Development Region > Giurgiu County > Giurgiu (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback