CFSum: A Coarse-to-Fine Contribution Network for Multimodal Summarization