MMUTF: Multimodal Multimedia Event Argument Extraction with Unified Template Filling