GAOKAO-Eval: Do high scores truly reflect strong capabilities in LLMs?