MATT-Diff: Multimodal Active Target Tracking by Diffusion Policy