NOVA: A Benchmark for Rare Anomaly Localization and Clinical Reasoning in Brain MRI
Neural Information Processing Systems
In many real-world applications, deployed models encounter inputs that differ from the data seen during training. Open-world recognition ensures that such systems remain robust as ever-emerging, previously _unknown_ categories appear and must be addressed without retraining. Foundation and vision-language models are pre-trained on large and diverse datasets with the expectation of broad generalization across domains, including medical imaging. However, benchmarking these models on test sets with only a few common outlier types silently collapses the evaluation back to a closed-set problem, masking failures on rare or truly novel conditions encountered in clinical use. We therefore present NOVA, a challenging, real-life _evaluation-only_ benchmark of $\sim$900 brain MRI scans that span 281 rare pathologies and heterogeneous acquisition protocols. Each case includes rich clinical narratives and double-blinded expert bounding-box annotations. Together, these enable joint assessment of anomaly localization, visual captioning, and diagnostic reasoning. Because NOVA is never used for training, it serves as an _extreme_ stress-test of out-of-distribution generalization: models must bridge a distribution gap both in sample appearance and in semantic space.
Jun-13-2026, 09:17:20 GMT
- Industry:
  - Health & Medicine > Diagnostic Medicine > Imaging (0.97)
- Technology:
  - Information Technology > Artificial Intelligence
    - Natural Language (0.62)
    - Vision (0.61)