Steering Information Utility in Key-Value Memory for Language Model Post-Training

Jun-13-2026, 08:08:25 GMT–Neural Information Processing Systems

Recent advances in language models (LMs) have shifted attention toward post-training. Yet post-training approaches such as supervised fine-tuning (SFT) do not guarantee effective use of the knowledge acquired during pretraining. We therefore introduce InfoSteer, a lightweight method that encourages the use of parametric information in LMs during post-training. Specifically, InfoSteer treats the feed-forward network (FFN) layer as an associative key-value memory and promotes the use of stored memory vectors via forward-pass interventions or regularization during backpropagation.
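To make the key-value view concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes a ReLU FFN whose first projection holds the keys and whose second projection holds the values (memory vectors). The function names, the `floor` intervention, and the entropy-based usage signal are hypothetical illustrations of what a "forward-pass intervention" and a "regularization" target could look like; the abstract does not specify either.

```python
import math

def memory_coefficients(x, keys):
    """Coefficient m_i = relu(x . k_i): how strongly input x activates key i."""
    return [max(0.0, sum(xj * kj for xj, kj in zip(x, k))) for k in keys]

def combine_values(coeffs, values):
    """FFN output as a coefficient-weighted sum of value (memory) vectors."""
    dim = len(values[0])
    return [sum(m * v[d] for m, v in zip(coeffs, values)) for d in range(dim)]

def steer_coefficients(coeffs, floor=0.1):
    """Hypothetical forward-pass intervention (an assumption, not from the
    source): raise every coefficient to at least `floor` so that otherwise
    unused memory vectors still contribute to the output."""
    return [max(m, floor) for m in coeffs]

def usage_entropy(coeffs, eps=1e-12):
    """Hypothetical regularization signal (an assumption, not from the source):
    entropy of the normalized coefficient magnitudes. Higher entropy means
    usage is spread over more memory vectors."""
    total = sum(abs(m) for m in coeffs) + eps
    probs = [abs(m) / total for m in coeffs]
    return -sum(p * math.log(p + eps) for p in probs)

# Toy memory with three key-value pairs in a 2-dimensional hidden space.
x = [1.0, -1.0]
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

coeffs = memory_coefficients(x, keys)
plain_out = combine_values(coeffs, values)
steered = steer_coefficients(coeffs)
steered_out = combine_values(steered, values)
```

In this toy setup only the first key fires for `x`, so the unsteered output uses a single memory vector. After the hypothetical intervention all three vectors contribute, and `usage_entropy` rises accordingly. A training-time variant could add the negative of such an entropy term to the loss instead of editing the forward pass.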

artificial intelligence, machine learning, proceedings

Technology:
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.60)