Evaluating the Prompt Steerability of Large Language Models