Learning Summary-Worthy Visual Representation for Abstractive Summarization in Video