Increasing Textual Context Size Boosts Medical Image-Text Matching
arXiv.org Artificial Intelligence
Pretrained image-text matching models, such as OpenAI's CLIP [1], use natural language processing (NLP) approaches to find semantic relations between images and textual descriptions. This emerging technology has seen rapid adoption in the general domain and is attracting increasing interest in the medical domain [2, 3], where imaging data often consists of images paired with textual descriptions. For example, MIMIC-CXR [4] is a dataset of chest radiographs accompanied by free-text radiology reports. It paved the way for works such as BioViL [2], which used the dataset's images and captions to train an image-text matching model for chest X-rays and chest-related diseases. ROCO [5] is a dataset of radiology images drawn from publications in the PubMed biomedical paper repository. ROCO covers several medical imaging modalities beyond X-ray, such as CT, ultrasound and MRI.
Mar-23-2023
- Country:
  - Asia > Middle East
    - Israel > Jerusalem District > Jerusalem (0.05)
  - Europe
    - Italy > Calabria
      - Catanzaro Province > Catanzaro (0.04)
    - Slovenia > Drava
      - Municipality of Benedikt > Benedikt (0.04)
    - Switzerland (0.04)
- Genre:
  - Research Report > New Finding (0.31)
- Industry:
  - Health & Medicine > Diagnostic Medicine > Imaging (1.00)
- Technology: