ILLUME: Rationalizing Vision-Language Models through Human Interactions