Stress-Testing Model Specs Reveals Character Differences among Language Models