PARROT: An Open Multilingual Radiology Reports Dataset
Guellec, Bastien Le; Adambounou, Kokou; Adams, Lisa C; Agripnidis, Thibault; Ahn, Sung Soo; Chalal, Radhia Ait; Antonoli, Tugba Akinci D; Amouyel, Philippe; Andersson, Henrik; Bentegeac, Raphael; Benzoni, Claudio; Blandino, Antonino Andrea; Busch, Felix; Can, Elif; Cau, Riccardo; Cavallo, Armando Ugo; Chavihot, Christelle; Chiquete, Erwin; Cuocolo, Renato; Divjak, Eugen; Ivanac, Gordana; Macek, Barbara Dziadkowiec; Elogne, Armel; Fanni, Salvatore Claudio; Ferrarotti, Carlos; Fossataro, Claudia; Fossataro, Federica; Fulek, Katarzyna; Fulek, Michal; Gac, Pawel; Gachowska, Martyna; Juarez, Ignacio Garcia; Gatti, Marco; Gorelik, Natalia; Goulianou, Alexia Maria; Hamroun, Aghiles; Herinirina, Nicolas; Kraik, Krzysztof; Krupka, Dominik; Holay, Quentin; Kitamura, Felipe; Klontzas, Michail E; Kompanowska, Anna; Kompanowski, Rafal; Lefevre, Alexandre; Lemke, Tristan; Lindholz, Maximilian; Muller, Lukas; Macek, Piotr; Makowski, Marcus; Mannacio, Luigi; Meddeb, Aymen; Natale, Antonio; Edzang, Beatrice Nguema; Ojeda, Adriana; Park, Yae Won; Piccione, Federica; Ponsiglione, Andrea; Poreba, Malgorzata; Poreba, Rafal; Prucker, Philipp; Pruvo, Jean Pierre; Pugliesi, Rosa Alba; Rabemanorintsoa, Feno Hasina; Rafailidis, Vasileios; Resler, Katarzyna; Rotkegel, Jan; Saba, Luca; Siebert, Ezann; Stanzione, Arnaldo; Tekin, Ali Fuat; Yanchapaxi, Liz Toapanta; Triantafyllou, Matthaios; Tsaoulia, Ekaterini; Vassalou, Evangelia; Vernuccio, Federica; Wasselius, Johan; Wang, Weilang; Urban, Szymon; Wlodarczak, Adrian; Wlodarczak, Szymon; Wysocki, Andrzej; Xu, Lina; Zatonski, Tomasz; Zhang, Shuhang; Ziegelmayer, Sebastian; Kuchcinski, Gregory; Bressem, Keno K
–arXiv.org Artificial Intelligence
Rationale and Objectives: To develop and validate PARROT (Polyglottal Annotated Radiology Reports for Open Testing), a large, multicentric, open-access dataset of fictional radiology reports spanning multiple languages for testing natural language processing applications in radiology.
Materials and Methods: From May to September 2024, radiologists were invited to contribute fictional radiology reports following their standard reporting practices. Contributors provided at least 20 reports with associated metadata including anatomical region, imaging modality, clinical context, and for non-English reports, English translations. All reports were assigned ICD-10 codes. A human vs. AI report differentiation study was conducted with 154 participants (radiologists, healthcare professionals, and non-healthcare professionals) assessing whether reports were human-authored or AI-generated.
Results: The dataset comprises 2,658 radiology reports from 76 authors across 21 countries and 13 languages. Reports cover multiple imaging modalities (CT: 36.1%, MRI: 22.8%, radiography: 19.0%, ultrasound: 16.8%) and anatomical regions, with chest (19.9%), abdomen (18.6%), head (17.3%), and pelvis (14.1%) being most prevalent. In the differentiation study, participants achieved 53.9% accuracy (95% CI: 50.7%-57.1%) in distinguishing between human and AI-generated reports, with radiologists performing significantly better (56.9%, 95% CI: 53.3%-60.6%, p<0.05) than other groups.
Conclusion: PARROT represents the largest open multilingual radiology report dataset, enabling development and validation of natural language processing applications across linguistic, geographic, and clinical boundaries without privacy constraints.
Aug-26-2025
- Country:
- Africa
- Côte d'Ivoire > Abidjan
- Abidjan (0.04)
- Gabon > Estuaire
- Libreville (0.04)
- Madagascar
- Atsinanana > Toamasina (0.04)
- Diana > Antsiranana (0.04)
- Middle East > Algeria
- Algiers Province > Algiers (0.04)
- El Oued Province > El Oued (0.04)
- Togo > Maritime Region
- Lome (0.04)
- Asia
- China > Jiangsu Province
- Nanjing (0.04)
- Middle East > Republic of Türkiye
- Istanbul Province > Istanbul (0.04)
- South Korea > Seoul
- Seoul (0.04)
- Europe
- Middle East > Republic of Türkiye
- Istanbul Province > Istanbul (0.04)
- Poland > Lower Silesia Province
- Wroclaw (0.07)
- Belgium > Flanders
- West Flanders > Bruges (0.04)
- Switzerland > Basel-City
- Basel (0.05)
- Italy
- France
- Hauts-de-France > Nord
- Lille (0.05)
- Provence-Alpes-Côte d'Azur > Bouches-du-Rhône
- Marseille (0.04)
- Sweden > Skåne County
- Lund (0.04)
- Germany
- Baden-Württemberg > Freiburg (0.04)
- Bavaria > Upper Bavaria
- Munich (0.05)
- Berlin (0.04)
- Rheinland-Pfalz > Mainz (0.04)
- Saxony > Dresden (0.04)
- Croatia > Zagreb County
- Zagreb (0.04)
- Greece > Central Macedonia
- Thessaloniki (0.04)
- North America
- Canada > Quebec
- Montreal (0.14)
- Mexico > Mexico City
- Mexico City (0.04)
- United States > California
- San Francisco County > San Francisco (0.14)
- Oceania > Australia
- Western Australia > Perth (0.04)
- South America
- Argentina > Pampas
- Buenos Aires F.D. > Buenos Aires (0.04)
- Brazil > São Paulo (0.04)
- Genre:
- Research Report
- Experimental Study (1.00)
- New Finding (1.00)
- Industry:
- Health & Medicine
- Diagnostic Medicine > Imaging (1.00)
- Nuclear Medicine (1.00)
- Technology: