Curvature Tuning: Provable Training-free Model Steering From a Single Parameter
Hu, Leyang, Balestriero, Randall
arXiv.org Artificial Intelligence
Scaling model size and data size has reshaped the paradigm of AI. As a result, the common protocol for leveraging the latest models is to steer them towards a specific downstream task of interest through fine-tuning. Despite its importance, the main fine-tuning methods remain limited to full fine-tuning or low-rank adapters, which involve numerous hyper-parameters and lack interpretability. In this paper, we take a step back and demonstrate how novel and explainable post-training steering solutions can be derived theoretically from spline operators, a rich mathematical framing of Deep Networks that was recently developed. Our method, coined Curvature Tuning (CT), has a single parameter that provably modulates the curvature of the model's decision boundary, thereby allowing training-free steering. This makes CT both more efficient and more interpretable than conventional fine-tuning methods. We empirically validate its effectiveness in improving the generalization and robustness of pretrained models. For example, CT improves the out-of-distribution transfer performance of ResNet-18/50 by 2.57%/1.74% across seventeen downstream datasets, and improves RobustBench robust accuracy by 11.76%/348.44%. Additionally, we apply CT to ReLU-based Swin-T/S, improving their generalization on nine downstream datasets by 2.43%/3.33%. Our code is available at https://github.com/Leon-Leyang/curvature-tuning.
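To make the idea of steering curvature with a single parameter concrete, here is a minimal sketch. It is an illustration under stated assumptions, not the formula from the paper: the abstract does not give CT's parameterization, so the function name `smooth_relu`, the parameter name `beta`, and the choice of a temperature-scaled softplus are all hypothetical. The sketch only shows how one scalar can move an activation continuously between a smooth curve and the piecewise-linear ReLU without any training.

```python
import math


def smooth_relu(x: float, beta: float) -> float:
    """Hypothetical one-parameter activation (not the paper's formula).

    Computes t * log(1 + exp(x / t)) with t = 1 - beta.
    beta = 1 returns the exact ReLU; smaller beta gives a smoother curve.
    """
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must lie in [0, 1]")
    relu = max(x, 0.0)
    if beta == 1.0:
        return relu
    t = 1.0 - beta
    # Numerically stable rewrite of t * log(1 + exp(x / t)).
    return relu + t * math.log1p(math.exp(-abs(x) / t))


def gap_to_relu(x: float, beta: float) -> float:
    """Distance between the smoothed activation and ReLU at x."""
    return smooth_relu(x, beta) - max(x, 0.0)


if __name__ == "__main__":
    # The gap at the kink (x = 0) shrinks as beta approaches 1.
    for beta in (0.0, 0.5, 0.9, 1.0):
        print(f"beta={beta:.1f}  gap at 0 = {gap_to_relu(0.0, beta):.4f}")
```

In this sketch the gap at the origin equals `(1 - beta) * log(2)`, so sweeping `beta` is a one-dimensional search that needs no gradient updates to the pretrained weights. How CT actually defines its parameter and chooses its value is specified in the paper and the linked repository, not here.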
Feb-11-2025
- Country:
- North America > United States (0.28)
- Genre:
- Research Report > New Finding (0.67)
- Technology:
- Information Technology > Artificial Intelligence
- Machine Learning
- Neural Networks > Deep Learning (0.68)
- Statistical Learning (0.88)
- Natural Language (1.00)
- Vision (1.00)