Africa
Dealing with the Hard Facts of Low-Resource African NLP
Diarra, Yacouba, Coulibaly, Nouhoum Souleymane, Kamaté, Panga Azazia, Tall, Madani Amadou, Koné, Emmanuel Élisé, Dembélé, Aymane, Leventhal, Michael
Creating speech datasets, models, and evaluation frameworks for low-resource languages remains challenging given the lack of a broad base of pertinent experience to draw from. This paper reports on the field collection of 612 hours of spontaneous speech in Bambara, a low-resource West African language; the semi-automated annotation of that dataset with transcriptions; the creation of several monolingual ultra-compact and small models using the dataset; and the automatic and human evaluation of their output. We offer practical suggestions for data collection protocols, annotation, and model design, as well as evidence for the importance of performing human evaluation. In addition to the main dataset, multiple evaluation datasets, models, and code are made publicly available.
The SA-FARI Dataset: Segment Anything in Footage of Animals for Recognition and Identification
Wasmuht, Dante Francisco, Brookes, Otto, Schall, Maximillian, Palencia, Pablo, Beirne, Chris, Burghardt, Tilo, Mirmehdi, Majid, Kühl, Hjalmar, Arandjelovic, Mimi, Pottie, Sam, Bermant, Peter, Asheim, Brandon, Toh, Yi Jin, Elzinga, Adam, Holmberg, Jason, Whitworth, Andrew, Flatt, Eleanor, Gustafson, Laura, Ryali, Chaitanya, Hu, Yuan-Ting, Guo, Baishan, Westbury, Andrew, Saenko, Kate, Suris, Didac
Automated video analysis is critical for wildlife conservation. A foundational task in this domain is multi-animal tracking (MAT), which underpins applications such as individual re-identification and behavior recognition. However, existing datasets are limited in scale, constrained to a few species, or lack sufficient temporal and geographical diversity, leaving no suitable benchmark for training general-purpose MAT models applicable across wild animal populations. To address this, we introduce SA-FARI, the largest open-source MAT dataset for wild animals. It comprises 11,609 camera trap videos collected over approximately 10 years (2014-2024) from 741 locations across 4 continents, spanning 99 species categories. Each video is exhaustively annotated, culminating in ~46 hours of densely annotated footage containing 16,224 masklet identities and 942,702 individual bounding boxes, segmentation masks, and species labels. Alongside the task-specific annotations, we publish anonymized camera trap locations for each video. Finally, we present comprehensive benchmarks on SA-FARI using state-of-the-art vision-language models for detection and tracking, including SAM 3, evaluated with both species-specific and generic animal prompts. We also compare against vision-only methods developed specifically for wildlife analysis. SA-FARI is the first large-scale dataset to combine high species diversity, multi-region coverage, and high-quality spatio-temporal annotations, offering a new foundation for advancing generalizable multi-animal tracking in the wild. The dataset is available at https://www.conservationxlabs.com/sa-fari.
ReefNet: A Large scale, Taxonomically Enriched Dataset and Benchmark for Hard Coral Classification
Battach, Yahia, Felemban, Abdulwahab, Khan, Faizan Farooq, Radwan, Yousef A., Li, Xiang, Marchese, Fabio, Beery, Sara, Jones, Burton H., Benzoni, Francesca, Elhoseiny, Mohamed
Coral reefs are rapidly declining due to anthropogenic pressures such as climate change, underscoring the urgent need for scalable, automated monitoring. We introduce ReefNet, a large public coral reef image dataset with point-label annotations mapped to the World Register of Marine Species (WoRMS). ReefNet aggregates imagery from 76 curated CoralNet sources and an additional site from Al Wajh in the Red Sea, totaling approximately 925,000 genus-level hard coral annotations with expert-verified labels. Unlike prior datasets, which are often limited by size, geography, or coarse labels and are not ML-ready, ReefNet offers fine-grained labels at a global scale, taxonomically mapped to WoRMS. We propose two evaluation settings: (i) a within-source benchmark that partitions each source's images for localized evaluation, and (ii) a cross-source benchmark that withholds entire sources to test domain generalization. We analyze both supervised and zero-shot classification performance on ReefNet and find that while supervised within-source performance is promising, supervised performance drops sharply across domains, and performance is low across the board for zero-shot models, especially for rare and visually similar genera. This provides a challenging benchmark intended to catalyze advances in domain generalization and fine-grained coral classification. We will release our dataset, benchmarking code, and pretrained models to advance robust, domain-adaptive, global coral reef monitoring and conservation.
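The two evaluation settings described above can be sketched as follows. This is a minimal illustration, not ReefNet's released benchmarking code: the record layout (a dict with hypothetical `source` and `image` keys), the function names, and the 20% test fraction are all assumptions made for the example.

```python
import random
from collections import defaultdict


def within_source_split(samples, test_fraction=0.2, seed=0):
    """Partition each source's images into train and test.

    At least one image per source goes to test; a source with a single
    image therefore contributes nothing to train.
    """
    rng = random.Random(seed)
    by_source = defaultdict(list)
    for sample in samples:
        by_source[sample["source"]].append(sample)

    train, test = [], []
    for source in sorted(by_source):
        items = by_source[source][:]
        rng.shuffle(items)
        n_test = max(1, int(len(items) * test_fraction))
        test.extend(items[:n_test])
        train.extend(items[n_test:])
    return train, test


def cross_source_split(samples, held_out_sources):
    """Withhold entire sources so test images come from unseen domains."""
    held_out = set(held_out_sources)
    train = [s for s in samples if s["source"] not in held_out]
    test = [s for s in samples if s["source"] in held_out]
    return train, test


# Hypothetical toy data: three sources with ten images each.
samples = [
    {"source": name, "image": f"{name}_{i}.jpg"}
    for name in ("a", "b", "c")
    for i in range(10)
]
ws_train, ws_test = within_source_split(samples)
cs_train, cs_test = cross_source_split(samples, ["c"])
```

In the within-source setting every source appears on both sides of the split, so a model is tested on domains it has seen; in the cross-source setting the held-out sources never appear in training, which is what makes it a test of domain generalization.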
PsychiatryBench: A Multi-Task Benchmark for LLMs in Psychiatry
Fouda, Aya E., Hassan, Abdelrahamn A., Hanafy, Radwa J., Fouda, Mohammed E.
Large language models (LLMs) offer significant potential in enhancing psychiatric practice, from improving diagnostic accuracy to streamlining clinical documentation and therapeutic support. However, existing evaluation resources rely heavily on small clinical interview corpora, social media posts, or synthetic dialogues, which limits their clinical validity and fails to capture the full complexity of diagnostic reasoning. In this work, we introduce PsychiatryBench, a rigorously curated benchmark grounded exclusively in authoritative, expert-validated psychiatric textbooks and casebooks. PsychiatryBench comprises eleven distinct question-answering tasks ranging from diagnostic reasoning and treatment planning to longitudinal follow-up, management planning, clinical approach, sequential case analysis, and multiple-choice/extended matching formats, totaling 5,188 expert-annotated items. We evaluate a diverse set of frontier LLMs (including Google Gemini, DeepSeek, Sonnet 4.5, and GPT 5) alongside leading open-source medical models such as MedGemma using both conventional metrics and an "LLM-as-judge" similarity scoring framework. Our results reveal substantial gaps in clinical consistency and safety, particularly in multi-turn follow-up and management tasks, underscoring the need for specialized model tuning and more robust evaluation paradigms. PsychiatryBench offers a modular, extensible platform for benchmarking and improving LLM performance in mental health applications.
Learning to Call: A Field Trial of a Collaborative Bandit Algorithm for Improved Message Delivery in Mobile Maternal Health
Dasgupta, Arpan, Maniyar, Mizhaan, Srivastava, Awadhesh, Kumar, Sanat, Mahale, Amrita, Hegde, Aparna, Suggala, Arun, Shanmugam, Karthikeyan, Taneja, Aparna, Tambe, Milind
Mobile health (mHealth) programs use automated voice messages to deliver health information, particularly to underserved communities. Such programs demonstrate the effectiveness of mobile technology for disseminating crucial health information to these populations and improve health outcomes through increased awareness and behavioral change. India's Kilkari program delivers vital maternal health information via weekly voice calls to millions of mothers. However, the current random call scheduling often results in missed calls and reduced message delivery. This study presents a field trial of a collaborative bandit algorithm designed to optimize call timing by learning individual mothers' preferred call times. We deployed the algorithm with around 6,500 Kilkari participants as a pilot study, comparing its performance to the baseline random calling approach. Our results demonstrate a statistically significant improvement in call pick-up rates with the bandit algorithm, indicating its potential to enhance message delivery and impact millions of mothers across India. This research highlights the efficacy of personalized scheduling in mobile health interventions and underscores the potential of machine learning to improve maternal health outreach at scale.
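To make the idea of learning a preferred call time concrete, here is a minimal sketch of a per-recipient epsilon-greedy bandit over call time slots. It is a deliberately simpler stand-in, not the collaborative bandit algorithm trialled in the study, which shares information across mothers. The class name, the slot count, the epsilon value, and the simulated pick-up probabilities are all hypothetical.

```python
import random


class EpsilonGreedyCallScheduler:
    """Epsilon-greedy bandit: each arm is a candidate call time slot."""

    def __init__(self, n_slots, epsilon=0.1, seed=0):
        self.n_slots = n_slots
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.calls = [0] * n_slots    # calls placed in each slot
        self.pickups = [0] * n_slots  # calls answered in each slot

    def choose_slot(self):
        # Explore: with probability epsilon, try a uniformly random slot.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_slots)
        # Exploit: pick the slot with the best observed pick-up rate.
        # Untried slots get an optimistic rate of 1.0 so each is tried once.
        rates = [
            self.pickups[i] / self.calls[i] if self.calls[i] else 1.0
            for i in range(self.n_slots)
        ]
        return max(range(self.n_slots), key=rates.__getitem__)

    def update(self, slot, picked_up):
        # Record the outcome of one call.
        self.calls[slot] += 1
        self.pickups[slot] += int(picked_up)


def simulate(pickup_probs, n_rounds=1000, seed=0):
    """Run the scheduler against assumed (made-up) pick-up probabilities."""
    rng = random.Random(seed)
    scheduler = EpsilonGreedyCallScheduler(len(pickup_probs), seed=seed)
    for _ in range(n_rounds):
        slot = scheduler.choose_slot()
        scheduler.update(slot, rng.random() < pickup_probs[slot])
    return scheduler


# Hypothetical recipient who mostly answers in the second of three slots.
result = simulate([0.2, 0.7, 0.3])
print(result.calls)
```

The random baseline described in the abstract corresponds to setting `epsilon=1.0`, where every call time is drawn uniformly and nothing is learned from past pick-ups.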
A Cross-Cultural Assessment of Human Ability to Detect LLM-Generated Fake News about South Africa
Schlippe, Tim, Wölfel, Matthias, Mabokela, Koena Ronny
This study investigates how cultural proximity affects the ability to detect AI-generated fake news by comparing South African participants with participants of other nationalities. As large language models increasingly enable the creation of sophisticated fake news, understanding human detection capabilities becomes crucial, particularly across different cultural contexts. We conducted a survey in which 89 participants (56 South Africans, 33 of other nationalities) evaluated 10 true South African news articles and 10 AI-generated fake versions. Results reveal an asymmetric pattern: South Africans demonstrated superior performance in detecting true news about their country (40% deviation from ideal rating) compared to other participants (52%), but performed worse at identifying fake news (62% vs. 55%). This difference may reflect South Africans' higher overall trust in news sources. Our analysis further shows that South Africans relied more on content knowledge and contextual understanding when judging credibility, while participants from other countries emphasised formal linguistic features such as grammar and structure. Overall, the deviation from ideal rating was similar between groups (51% vs. 53%), suggesting that cultural familiarity aids verification of authentic information but may also introduce bias when evaluating fabricated content. These insights contribute to understanding cross-cultural dimensions of misinformation detection and inform strategies for combating AI-generated fake news in increasingly globalised information ecosystems where content crosses cultural and geographical boundaries.
An improved clustering-based multi-swarm PSO using local diversification and topology information
Matanga, Yves, Sun, Yanxia, Wang, Zenghui
Multi-swarm particle optimisation algorithms are gaining popularity due to their ability to locate multiple optimum points concurrently. In this family of algorithms, clustering-based multi-swarm algorithms are among the most effective techniques: they join the closest particles together to form independent niche swarms that exploit potentially promising regions. However, most clustering-based multi-swarms are Euclidean distance-based and only inquire about the potential of one peak within a cluster, and thus can lose multiple peaks due to poor resolution. In a bid to improve the peak detection ratio, the current study proposes two enhancements. First, a preliminary local search across initial particles is proposed to ensure that each local region is sufficiently scouted prior to particle collaboration. Second, an investigative clustering approach that performs concavity analysis is proposed to evaluate the potential for several sub-niches within a single cluster. An improved clustering-based multi-swarm PSO (TImPSO) has resulted from these enhancements and has been tested against three competing algorithms in the same family using the IEEE CEC2013 niching datasets, resulting in an improved peak ratio for almost all the test functions.
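For readers unfamiliar with the peak ratio reported above, the sketch below shows how the metric is commonly computed for the CEC2013 niching benchmark: the total number of known optima found across all runs, divided by the number of known optima times the number of runs. The abstract does not spell out the formula, so this definition, the function name, and the example counts are assumptions for illustration only.

```python
def peak_ratio(peaks_found_per_run, n_known_peaks):
    """Average fraction of known optima located across independent runs."""
    if n_known_peaks <= 0:
        raise ValueError("n_known_peaks must be positive")
    if not peaks_found_per_run:
        raise ValueError("at least one run is required")
    n_runs = len(peaks_found_per_run)
    return sum(peaks_found_per_run) / (n_known_peaks * n_runs)


# Hypothetical example: four runs on a function with five known optima.
ratio = peak_ratio([4, 5, 5, 3], n_known_peaks=5)
print(ratio)
```

A ratio of 1.0 means every known optimum was located in every run, so a method that misses peaks inside a cluster scores lower on this measure.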
Zelensky warns against giving away territory as latest Ukraine talks end
Talks in Geneva between the US and Ukraine aimed at ending the war with Russia have concluded, with officials from both sides reporting progress and an intention to continue working. However, no details have emerged on how to bridge the considerable divide between Moscow and Kyiv over territorial issues and security guarantees for Ukraine. Ukraine's president Volodymyr Zelensky welcomed the important steps that had been made but warned that the main problem facing the peace talks was Vladimir Putin's demand for legal recognition of Russian-occupied territories in eastern Ukraine. This would break the principle of territorial integrity and sovereignty, he said, highlighting concerns that Moscow could be rewarded for its aggression with land it seized by force. Meanwhile, President Donald Trump suggested on social media that something good just may be happening, but with the caveat: Don't believe it until you see it.
US and Ukraine announce revised peace plan: this is what we know
Russian drones attacked targets in Ukraine hours after the US and Kyiv announced revisions to a controversial peace plan proposed by Donald Trump. Speaking after talks in Geneva, US and Ukrainian officials agreed any deal should "fully uphold" Ukraine's sovereignty.
The Download: how to fix a tractor, and living among conspiracy theorists
You live in a house you designed and built yourself. You rely on the sun for power, heat your home with a woodstove, and farm your own fish and vegetables. This is the life of Marcin Jakubowski, the 53-year-old founder of Open Source Ecology, an open collaborative of engineers, producers, and builders developing what they call the Global Village Construction Set (GVCS). It's a set of 50 machines (everything from a tractor to an oven to a circuit maker) that are capable of building civilization from scratch and can be reconfigured however you see fit. It's all part of his ethos that life-changing technology should be available to all, not controlled by a select few.
What it's like to find yourself in the middle of a conspiracy theory
Last week, we held a subscribers-only Roundtables discussion exploring how to cope in this new age of conspiracy theories.