On Domain-Specific Post-Training for Multimodal Large Language Models