Towards Better Evaluation of Instruction-Following: A Case-Study in Summarization