Augmented Transformers with Adaptive n-grams Embedding for Multilingual Scene Text Recognition