Violet: A Vision-Language Model for Arabic Image Captioning with Gemini Decoder