Easy2Hard-Bench: Standardized Difficulty Labels for Profiling LLM Performance and Generalization