 malawi


Application and Validation of Geospatial Foundation Model Data for the Prediction of Health Facility Programmatic Outputs -- A Case Study in Malawi

Metz, Lynn, Haggard, Rachel, Moszczynski, Michael, Asbah, Samer, Mwase, Chris, Khomani, Patricia, Smith, Tyler, Cooper, Hannah, Mwale, Annie, Muslim, Arbaaz, Prasad, Gautam, Sun, Mimi, Shekel, Tomer, Paul, Joydeep, Carter, Anna, Shetty, Shravya, Green, Dylan

arXiv.org Artificial Intelligence

The reliability of routine health data in low- and middle-income countries (LMICs) is often constrained by reporting delays and incomplete coverage, necessitating the exploration of novel data sources and analytics. Geospatial Foundation Models (GeoFMs) offer a promising avenue by synthesizing diverse spatial, temporal, and behavioral data into mathematical embeddings that can be efficiently used for downstream prediction tasks. This study evaluated the predictive performance of three GeoFM embedding sources - Google Population Dynamics Foundation Model (PDFM), Google AlphaEarth (derived from satellite imagery), and mobile phone call detail records (CDR) - for modeling 15 routine health programmatic outputs in Malawi, and compared their utility to traditional geospatial interpolation methods. We used XGBoost models on data from 552 health catchment areas (January 2021-May 2023), assessing performance with R² under an 80/20 train/test split, with 5-fold cross-validation applied within the training set. While predictive performance was mixed, the embedding-based approaches improved upon baseline geostatistical methods in 13 of 15 (87%) indicators tested. A Multi-GeoFM model integrating all three embedding sources produced the most robust predictions, achieving average 5-fold cross-validated R² values of 0.63 for population density, 0.57 for new HIV cases, and 0.47 for child vaccinations, with test-set R² values of 0.64, 0.68, and 0.55, respectively. Performance was poor for targets with low primary data availability, such as TB and malnutrition cases. These results demonstrate that GeoFM embeddings provide a modest predictive improvement for select health and demographic outcomes in an LMIC context. We conclude that the integration of multiple GeoFM sources is an efficient and valuable tool for supplementing and strengthening constrained routine health information systems.
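As a rough illustration of the evaluation protocol the abstract describes (an 80/20 split, 5-fold cross-validation inside the training portion, and R² as the metric), the sketch below uses only the Python standard library. The function names, the random seed, and the way folds are formed are assumptions made for illustration; the abstract does not specify the authors' actual splitting procedure, and no model is fitted here (an XGBoost regressor would be trained on each `fit` subset in the study's setting).

```python
import random


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def train_test_split_indices(n, test_fraction=0.2, seed=0):
    """Shuffle indices 0..n-1 and hold out a fraction as the test set.

    The seed and shuffling scheme are illustrative assumptions.
    """
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_test = round(n * test_fraction)
    return idx[n_test:], idx[:n_test]


def k_fold(indices, k=5):
    """Yield (fit, validation) index lists for k-fold cross-validation."""
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        fit = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield fit, validation


# 552 catchment areas, as reported in the abstract.
train_idx, test_idx = train_test_split_indices(552)
cv_splits = list(k_fold(train_idx, k=5))
```

Averaging `r2_score` over the five validation folds would correspond to the cross-validated figure, while scoring once on `test_idx` would correspond to the test-set figure.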


mwBTFreddy: A Dataset for Flash Flood Damage Assessment in Urban Malawi

Chapuma, Evelyn, Mengezi, Grey, Msasa, Lewis, Taylor, Amelia

arXiv.org Artificial Intelligence

This paper describes the mwBTFreddy dataset, a resource developed to support flash flood damage assessment in urban Malawi, specifically focusing on the impacts of Cyclone Freddy in 2023. The dataset comprises paired pre- and post-disaster satellite images sourced from Google Earth Pro, accompanied by JSON files containing labelled building annotations with geographic coordinates and damage levels (no damage, minor, major, or destroyed). Developed by the Kuyesera AI Lab at the Malawi University of Business and Applied Sciences, this dataset is intended to facilitate the development of machine learning models tailored to building detection and damage classification in African urban contexts. It also supports flood damage visualisation and spatial analysis to inform decisions on relocation, infrastructure planning, and emergency response in climate-vulnerable regions.


Amplify Initiative: Building A Localized Data Platform for Globalized AI

Rashid, Qazi Mamunur, van Liemt, Erin, Shih, Tiffany, Ebinama, Amber, Ramos, Karla Barrios, Maji, Madhurima, Verma, Aishwarya, Kalia, Charu, Smith-Loud, Jamila, Nakatumba-Nabende, Joyce, Baguma, Rehema, Katumba, Andrew, Mutebi, Chodrine, Marvin, Jagen, Wairagala, Eric Peter, Bruce, Mugizi, Oketta, Peter, Nderu, Lawrence, Obiajunwa, Obichi, Oppong, Abigail, Zimba, Michael, Authors, Data

arXiv.org Artificial Intelligence

Current AI models often fail to account for local context and language, given the predominance of English and Western internet content in their training data. This hinders the global relevance, usefulness, and safety of these models as they gain more users around the globe. Amplify Initiative, a data platform and methodology, leverages expert communities to collect diverse, high-quality data to address the limitations of these models. The platform is designed to enable co-creation of datasets, provide access to high-quality multilingual datasets, and offer recognition to data authors. This paper presents the approach to co-creating datasets with domain experts (e.g., health workers, teachers) through a pilot conducted in Sub-Saharan Africa (Ghana, Kenya, Malawi, Nigeria, and Uganda). In partnership with local researchers situated in these countries, the pilot demonstrated an end-to-end approach to co-creating data with 155 experts in sensitive domains (e.g., physicians, bankers, anthropologists, human and civil rights advocates). This approach, implemented with an Android app, resulted in an annotated dataset of 8,091 adversarial queries in seven languages (e.g., Luganda, Swahili, Chichewa), capturing nuanced and contextual information related to key themes such as misinformation and public interest topics. This dataset in turn can be used to evaluate models for their safety and cultural relevance within the context of these languages.


Using Machine Learning to Detect Fraudulent SMSs in Chichewa

Taylor, Amelia, Robert, Amoss

arXiv.org Artificial Intelligence

SMS-enabled fraud is of great concern globally. Building classifiers based on machine learning for SMS fraud requires the use of suitable datasets for model training and validation. Most research has centred on the use of datasets of SMSs in English. This paper introduces a first dataset for SMS fraud detection in Chichewa, a major language in Africa, and reports on experiments with machine learning algorithms for classifying SMSs in Chichewa as fraud or non-fraud. We answer the broader research question of how feasible it is to develop machine learning classification models for Chichewa SMSs. To do that, we created three datasets. A small dataset of SMSs in Chichewa was collected through primary research from a segment of the young population. We applied label-preserving text transformations to increase its size. The enlarged dataset was translated into English using two approaches: human translation and machine translation. The Chichewa and the translated datasets were subjected to machine classification using random forest and logistic regression. Our findings indicate that both models achieved a promising accuracy of over 96% on the Chichewa dataset. There was a drop in performance when moving from the Chichewa to the translated dataset. This highlights the importance of data preprocessing, especially in multilingual or cross-lingual NLP tasks, and shows the challenges of relying on machine-translated text for training machine learning models. Our results underscore the importance of developing language-specific models for SMS fraud detection to optimise accuracy and performance. Since most machine learning models require data preprocessing, it is essential to investigate the impact of the reliance on English-specific tools for data preprocessing.


How AI monitoring is cutting stillbirths and neonatal deaths in a clinic in Malawi

The Guardian

When Ellen Kaphamtengo felt a sharp pain in her lower abdomen, she thought she might be in labour. It was the ninth month of her first pregnancy and she wasn't taking any chances. With the help of her mother, the 18-year-old climbed on to a motorcycle taxi and rushed to a hospital in Malawi's capital, Lilongwe, a 20-minute ride away. At the Area 25 health centre, they told her it was a false alarm and took her to the maternity ward. But things escalated quickly when a routine ultrasound revealed that her baby was much smaller than expected for her pregnancy stage, which can cause asphyxia – a condition that limits blood flow and oxygen to the baby.


Chilly Drone Supply: Swoop Aero in Malawi - Channel969

#artificialintelligence

Australian drone-based logistics firm Swoop Aero has succeeded in transporting critical Pfizer vaccines in Malawi. The air delivery of the vaccines, which require ultra-cold chain conditions, marks a milestone for Malawi, as well as for Swoop Aero and medical air deliveries in general, showcasing the potential the technology has to support public health. Over 17,280 COVID-19 vaccine doses have been successfully delivered across the Southern districts of Malawi so far, with manufacturers such as AstraZeneca and Johnson and Johnson making use of the existing Swoop Aero drone network to quickly distribute critical vaccines to remote communities. Swoop Aero intends to deliver hundreds more vaccines as they become available. "The delivery of Pfizer COVID-19 vaccines underscores the novel value of bi-directional drone networks in Malawi," said Swoop Aero CEO Eric Peck.


Using AI In Malawi To Save Elephants

NPR Technology

Poachers killed almost a third of the African elephant population between 2007 and 2014, a recent census found. Researchers hope artificial intelligence can help stop poachers and other threats, too.


Facebook uses AI to create density maps of Africa

#artificialintelligence

Facebook is working closely with key non-profit and research partners to use artificial intelligence (AI) and big data to address large-scale social, health and infrastructure challenges in sub-Saharan Africa. These efforts range from rural electrification in Tanzania to vaccinating people in remote corners of Malawi. Facebook is applying its compute power, its extensive data science skills and its expertise in AI and machine learning to create the world's most detailed and accurate maps of local populations. Facebook also partners with Columbia University's Center for International Earth Science Information Network (CIESIN, http://www.ciesin.org/) to ensure that this effort leverages the best available administrative data for all countries involved. The Boston-based Facebook team uses advanced computer vision and machine learning to combine satellite imagery from DigitalGlobe with public census data and other sources to create detailed population density maps of Africa.


Drone rangers: Thousands of lives will be saved by drones in the next five years

#artificialintelligence

ONCE THOUGHT OF AS A NICHE TOY for early adopters, drones can now be found buzzing over parks, in select cities, and are even being increasingly used for video production as the popularity of aerial photography soars. However, drones aren't only for fun and entertainment, and the high-pitched hum of their spinning propellers could replace the wail of ambulance sirens for global citizens as drones are put to work for humanitarian purposes. In March of 2017, DJI, the manufacturers of the most popular commercial drones, published a report about drones' life-saving capabilities, citing cases in which drones manned by volunteers or bystanders were used in emergency situations like floods and avalanches, resulting in 59 life-saving rescues in China, Canada, the U.S., and Turkey. Given that it takes 25 people 35 hours to search one square mile for missing persons, compared to the 30 minutes it takes a drone to cover the same area, regardless of treacherous conditions on the ground, drones are uniquely suited for search and rescue, even when piloted by hobbyists. Based on the increasing trend of drone use in the last 10 months covered by the report, DJI estimated that drones would be directly responsible for saving at least one person per week in the future.


Towards Inference-Oriented Reading Comprehension: ParallelQA

Wadhwa, Soumya, Embar, Varsha, Grabmair, Matthias, Nyberg, Eric

arXiv.org Artificial Intelligence

In this paper, we investigate the tendency of end-to-end neural Machine Reading Comprehension (MRC) models to match shallow patterns rather than perform inference-oriented reasoning on RC benchmarks. We aim to test the ability of these systems to answer questions which focus on referential inference. We propose ParallelQA, a strategy to formulate such questions using parallel passages. We also demonstrate that existing neural models fail to generalize well to this setting.