Beyond Instruction Following: Evaluating Rule Following of Large Language Models
