Vision Language Model for Interpretable and Fine-grained Detection of Safety Compliance in Diverse Workplaces