What Is Missing in Multilingual Visual Reasoning and How to Fix It