Estuaire
PARROT: An Open Multilingual Radiology Reports Dataset
Guellec, Bastien Le, Adambounou, Kokou, Adams, Lisa C, Agripnidis, Thibault, Ahn, Sung Soo, Chalal, Radhia Ait, Antonoli, Tugba Akinci D, Amouyel, Philippe, Andersson, Henrik, Bentegeac, Raphael, Benzoni, Claudio, Blandino, Antonino Andrea, Busch, Felix, Can, Elif, Cau, Riccardo, Cavallo, Armando Ugo, Chavihot, Christelle, Chiquete, Erwin, Cuocolo, Renato, Divjak, Eugen, Ivanac, Gordana, Macek, Barbara Dziadkowiec, Elogne, Armel, Fanni, Salvatore Claudio, Ferrarotti, Carlos, Fossataro, Claudia, Fossataro, Federica, Fulek, Katarzyna, Fulek, Michal, Gac, Pawel, Gachowska, Martyna, Juarez, Ignacio Garcia, Gatti, Marco, Gorelik, Natalia, Goulianou, Alexia Maria, Hamroun, Aghiles, Herinirina, Nicolas, Kraik, Krzysztof, Krupka, Dominik, Holay, Quentin, Kitamura, Felipe, Klontzas, Michail E, Kompanowska, Anna, Kompanowski, Rafal, Lefevre, Alexandre, Lemke, Tristan, Lindholz, Maximilian, Muller, Lukas, Macek, Piotr, Makowski, Marcus, Mannacio, Luigi, Meddeb, Aymen, Natale, Antonio, Edzang, Beatrice Nguema, Ojeda, Adriana, Park, Yae Won, Piccione, Federica, Ponsiglione, Andrea, Poreba, Malgorzata, Poreba, Rafal, Prucker, Philipp, Pruvo, Jean Pierre, Pugliesi, Rosa Alba, Rabemanorintsoa, Feno Hasina, Rafailidis, Vasileios, Resler, Katarzyna, Rotkegel, Jan, Saba, Luca, Siebert, Ezann, Stanzione, Arnaldo, Tekin, Ali Fuat, Yanchapaxi, Liz Toapanta, Triantafyllou, Matthaios, Tsaoulia, Ekaterini, Vassalou, Evangelia, Vernuccio, Federica, Wasselius, Johan, Wang, Weilang, Urban, Szymon, Wlodarczak, Adrian, Wlodarczak, Szymon, Wysocki, Andrzej, Xu, Lina, Zatonski, Tomasz, Zhang, Shuhang, Ziegelmayer, Sebastian, Kuchcinski, Gregory, Bressem, Keno K
Rationale and Objectives: To develop and validate PARROT (Polyglottal Annotated Radiology Reports for Open Testing), a large, multicentric, open-access dataset of fictional radiology reports spanning multiple languages for testing natural language processing applications in radiology. Materials and Methods: From May to September 2024, radiologists were invited to contribute fictional radiology reports following their standard reporting practices. Contributors provided at least 20 reports with associated metadata including anatomical region, imaging modality, clinical context, and for non-English reports, English translations. All reports were assigned ICD-10 codes. A human vs. AI report differentiation study was conducted with 154 participants (radiologists, healthcare professionals, and non-healthcare professionals) assessing whether reports were human-authored or AI-generated. Results: The dataset comprises 2,658 radiology reports from 76 authors across 21 countries and 13 languages. Reports cover multiple imaging modalities (CT: 36.1%, MRI: 22.8%, radiography: 19.0%, ultrasound: 16.8%) and anatomical regions, with chest (19.9%), abdomen (18.6%), head (17.3%), and pelvis (14.1%) being most prevalent. In the differentiation study, participants achieved 53.9% accuracy (95% CI: 50.7%-57.1%) in distinguishing between human and AI-generated reports, with radiologists performing significantly better (56.9%, 95% CI: 53.3%-60.6%, p<0.05) than other groups. Conclusion: PARROT represents the largest open multilingual radiology report dataset, enabling development and validation of natural language processing applications across linguistic, geographic, and clinical boundaries without privacy constraints.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > Canada > Quebec > Montreal (0.14)
- Europe > Poland > Lower Silesia Province > Wroclaw (0.07)
- (34 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Health & Medicine > Nuclear Medicine (1.00)
- Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Do Large Language Models Latently Perform Multi-Hop Reasoning?
Yang, Sohee, Gribovskaya, Elena, Kassner, Nora, Geva, Mor, Riedel, Sebastian
We study whether Large Language Models (LLMs) latently perform multi-hop reasoning with complex prompts such as "The mother of the singer of 'Superstition' is". We look for evidence of a latent reasoning pathway where an LLM (1) latently identifies "the singer of 'Superstition'" as Stevie Wonder, the bridge entity, and (2) uses its knowledge of Stevie Wonder's mother to complete the prompt. We analyze these two hops individually and consider their co-occurrence as indicative of latent multi-hop reasoning. For the first hop, we test if changing the prompt to indirectly mention the bridge entity instead of any other entity increases the LLM's internal recall of the bridge entity. For the second hop, we test if increasing this recall causes the LLM to better utilize what it knows about the bridge entity. We find strong evidence of latent multi-hop reasoning for the prompts of certain relation types, with the reasoning pathway used in more than 80% of the prompts. However, the utilization is highly contextual, varying across different types of prompts. Also, on average, the evidence for the second hop and the full multi-hop traversal is rather moderate and only substantial for the first hop. Moreover, we find a clear scaling trend with increasing model size for the first hop of reasoning but not for the second hop. Our experimental findings suggest potential challenges and opportunities for future development and applications of LLMs.
- Africa > Middle East > Somalia (0.14)
- North America > Nicaragua (0.14)
- Europe > Finland (0.14)
- (30 more...)
- Research Report > New Finding (0.66)
- Research Report > Experimental Study (0.46)
- Media (1.00)
- Leisure & Entertainment (1.00)
- Government > Regional Government > North America Government (0.46)
Flickr Africa: Examining Geo-Diversity in Large-Scale, Human-Centric Visual Data
Naggita, Keziah, LaChance, Julienne, Xiang, Alice
Biases in large-scale image datasets are known to influence the performance of computer vision models as a function of geographic context. To investigate the limitations of standard Internet data collection methods in low- and middle-income countries, we analyze human-centric image geo-diversity on a massive scale using geotagged Flickr images associated with each nation in Africa. We report the quantity and content of available data with comparisons to population-matched nations in Europe as well as the distribution of data according to fine-grained intra-national wealth estimates. Temporal analyses are performed at two-year intervals to expose emerging data trends. Furthermore, we present findings for an ``othering'' phenomenon as evidenced by a substantial number of images from Africa being taken by non-local photographers. The results of our study suggest that further work is required to capture image data representative of African people and their environments and, ultimately, to improve the applicability of computer vision models in a global context.
- Asia > Brunei (0.14)
- North America > Canada > Quebec > Montreal (0.06)
- Africa > Sierra Leone (0.06)
- (142 more...)
- Health & Medicine (0.92)
- Information Technology > Services (0.75)
- Government > Regional Government (0.46)
- Information Technology > Communications > Social Media (1.00)
- Information Technology > Artificial Intelligence > Vision (1.00)
Modelling spatio-temporal trends of air pollution in Africa
Gahungu, Paterne, Kubwimana, Jean Remy, Muhimpundu, Lionel Jean Marie Benjamin, Ndamuzi, Egide
Atmospheric pollution remains one of the major public health threat worldwide with an estimated 7 millions deaths annually. In Africa, rapid urbanization and poor transport infrastructure are worsening the problem. In this paper, we have analysed spatio-temporal variations of PM2.5 across different geographical regions in Africa. The West African region remains the most affected by the high levels of pollution with a daily average of 40.856 $\mu g/m^3$ in some cities like Lagos, Abuja and Bamako. In East Africa, Uganda is reporting the highest pollution level with a daily average concentration of 56.14 $\mu g/m^3$ and 38.65 $\mu g/m^3$ for Kigali. In countries located in the central region of Africa, the highest daily average concentration of PM2.5 of 90.075 $\mu g/m^3$ was recorded in N'Djamena. We compare three data driven models in predicting future trends of pollution levels. Neural network is outperforming Gaussian processes and ARIMA models.
- North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.30)
- Africa > Mali > Bamako > Bamako (0.26)
- Africa > Chad > Chari-Baguirmi > N'Djamena (0.26)
- (29 more...)
- Energy (0.93)
- Law > Environmental Law (0.86)
- Health & Medicine > Public Health (0.68)
- Transportation > Ground > Road (0.46)