CoMix: A Comprehensive Benchmark for Multi-Task Comic Understanding
Marco Bertini
Neural Information Processing Systems
The comic domain is rapidly advancing with the development of single-page analysis and synthesis models. However, evaluation metrics and datasets lag behind, often limited to small-scale or single-style test sets. We introduce a novel benchmark, CoMix, designed to evaluate the multi-task capabilities of models in comic analysis. Unlike existing benchmarks that focus on isolated tasks such as object detection or text recognition, CoMix addresses a broader range of tasks including object detection, speaker identification, character re-identification, reading order, and multi-modal reasoning tasks like character naming and dialogue generation. Our benchmark comprises three existing datasets with expanded annotations to support multi-task evaluation.
May-25-2025, 21:40:01 GMT