COMPASS: A Multi-Dimensional Benchmark for Evaluating Code Generation in Large Language Models