L-CiteEval: Do Long-Context Models Truly Leverage Context for Responding?