Multimedia Verification Through Multi-Agent Deep Research Multimodal Large Language Models

Le, Huy Hoan, Nguyen, Van Sy Thinh, Dang, Thi Le Chi, Nguyen, Vo Thanh Khang, Nguyen, Truong Thanh Hung, Cao, Hung

Jul-8-2025–arXiv.org Artificial Intelligence

This paper presents our submission to the ACMMM25 - Grand Challenge on Multimedia Verification. We developed a multi-agent verification system that combines Multimodal Large Language Models (MLLMs) with specialized verification tools to detect multimedia misinformation. Our system operates through six stages: raw data processing, planning, information extraction, deep research, evidence collection, and report generation. The core Deep Researcher Agent employs four tools: reverse image search, metadata analysis, fact-checking databases, and verified news processing that extracts spatial, temporal, attribution, and motivational context. We demonstrate our approach on a challenge dataset sample involving complex multimedia content. Our system successfully verified content authenticity, extracted precise geolocation and timing information, and traced source attribution across multiple platforms, effectively addressing real-world multimedia verification scenarios.
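The six-stage flow and the four Deep Researcher tools described above can be pictured with a minimal sketch. Only the stage and tool names come from the abstract; the `CaseState` class, the `stub_tool` helper, and the `run_pipeline` function are hypothetical names introduced for illustration, and the tools are stubs rather than the authors' implementation.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# Stage names as listed in the abstract, in order.
STAGES = [
    "raw_data_processing",
    "planning",
    "information_extraction",
    "deep_research",
    "evidence_collection",
    "report_generation",
]

# The four tools the abstract attributes to the Deep Researcher Agent.
TOOLS = [
    "reverse_image_search",
    "metadata_analysis",
    "fact_checking_databases",
    "verified_news_processing",
]


@dataclass
class CaseState:
    """Hypothetical container for one verification case."""
    case_id: str
    trace: List[str] = field(default_factory=list)
    findings: Dict[str, str] = field(default_factory=dict)


def stub_tool(name: str) -> Callable[[CaseState], str]:
    """Return a placeholder tool; a real tool would call an external service."""
    def run(state: CaseState) -> str:
        return f"{name}: no result (stub) for case {state.case_id}"
    return run


def run_pipeline(
    case_id: str,
    tools: Optional[Dict[str, Callable[[CaseState], str]]] = None,
) -> CaseState:
    """Walk the six stages in order; the deep research stage runs every tool."""
    if tools is None:
        tools = {name: stub_tool(name) for name in TOOLS}
    state = CaseState(case_id)
    for stage in STAGES:
        state.trace.append(stage)
        if stage == "deep_research":
            for name, tool in tools.items():
                state.findings[name] = tool(state)
    return state
```

Calling `run_pipeline("demo")` returns a state whose `trace` lists the six stages in order and whose `findings` holds one entry per tool. Passing a custom `tools` dictionary is one way a real tool could replace a stub without changing the stage loop.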

artificial intelligence, large language model, natural language

Country:
- North America > Canada > New Brunswick (0.29)

Genre:
- Research Report (1.00)

Industry:
- Information Technology > Security & Privacy (0.91)
- Media > News (0.70)

Technology:
- Information Technology > Artificial Intelligence
  - Representation & Reasoning > Agents (1.00)
  - Natural Language > Large Language Model (1.00)
