C$^2$LEVA: Toward Comprehensive and Contamination-Free Language Model Evaluation