Contextual Similarity Aggregation with Self-attention for Visual Re-ranking