Beyond N-Grams: Rethinking Evaluation Metrics and Strategies for Multilingual Abstractive Summarization