MM-ACT: Learn from Multimodal Parallel Generation to Act