Behind Maya: Building a Multilingual Vision Language Model