Why do language models perform worse for morphologically complex languages?