Investigating Inference-time Scaling for Chain of Multi-modal Thought: A Preliminary Study