Improving Length-Generalization in Transformers via Task Hinting