Beyond Instruction Following: Evaluating Rule Following of Large Language Models