Test-Time Warmup for Multimodal Large Language Models