CONE: An Efficient COarse-to-fiNE Alignment Framework for Long Video Temporal Grounding