Does Instruction Tuning Make LLMs More Consistent?
Constanza Fierro, Jiaang Li, Anders Søgaard
arXiv.org Artificial Intelligence
The purpose of instruction tuning is to enable zero-shot performance, but instruction tuning has also been shown to improve chain-of-thought reasoning and value alignment (Si et al., 2023). Here we consider its impact on $\textit{consistency}$, i.e., the sensitivity of language models to small perturbations in the input. We compare 10 instruction-tuned LLaMA models to the original LLaMA-7b model and show that, almost across the board, they become more consistent, both in their representations and in their predictions on zero-shot and downstream tasks. We explain these improvements through mechanistic analyses of factual recall.
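To make the notion of prediction consistency concrete, here is a minimal sketch (not the paper's evaluation protocol) that scores a model by how often paraphrased prompts receive the same top next-token prediction. The checkpoint name and the example prompts are illustrative assumptions.

```python
# Minimal sketch of paraphrase consistency: the fraction of
# paraphrase pairs for which a model produces the same top
# next-token prediction. Assumes `transformers` and `torch`.
from itertools import combinations

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer


def top_prediction(model, tokenizer, prompt: str) -> int:
    """Return the id of the most likely next token after `prompt`."""
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits  # [batch, seq_len, vocab]
    return int(logits[0, -1].argmax())


def pairwise_consistency(model, tokenizer, paraphrases: list[str]) -> float:
    """Fraction of paraphrase pairs that get the same top prediction."""
    preds = [top_prediction(model, tokenizer, p) for p in paraphrases]
    pairs = list(combinations(preds, 2))
    return sum(a == b for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    name = "huggyllama/llama-7b"  # assumed checkpoint name, for illustration
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModelForCausalLM.from_pretrained(name)
    prompts = [  # hypothetical paraphrases of one factual query
        "The capital of Denmark is",
        "Denmark's capital city is",
        "The capital city of Denmark is",
    ]
    score = pairwise_consistency(model, tokenizer, prompts)
    print(f"paraphrase consistency: {score:.2f}")
```

Running the same score over an instruction-tuned checkpoint versus the base model would then compare their sensitivity to these input perturbations, which is the comparison the abstract describes.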
Apr-30-2024