On Giant's Shoulders: Effortless Weak to Strong by Dynamic Logits Fusion

Neural Information Processing Systems 

Efficient fine-tuning of large language models for task-specific applications is imperative, yet the vast number of parameters in these models makes their training increasingly challenging. Although many efficient methods have been proposed, gradient computation during updates still incurs a substantial memory overhead.