Multi-document Summarization with Maximal Marginal Relevance-guided Reinforcement Learning