Building a usable automatic speech recognition (ASR) system for radio monitoring is a challenging task for under-resourced languages, yet it is paramount in societies where radio is the main medium of public communication and discussion. Initial efforts by the United Nations in Uganda have shown how important it is for national planning to understand the perceptions of rural people who are excluded from social media. However, these efforts are being challenged by the absence of transcribed speech datasets. In this paper, the Makerere Artificial Intelligence research lab releases a Luganda radio speech corpus of 155 hours. To our knowledge, this is the first publicly available radio dataset in sub-Saharan Africa.
It has been estimated that 1.7 million people die from tuberculosis (TB) and more than 10.4 million new cases are reported every year worldwide. The global 'End TB' strategy aims to eliminate the disease by 2030. However, realizing this goal will be challenging if gaps remain in adherence to prescribed medication. In the context of TB and HIV coinfection, non-adherence to medication has been associated with the incidence of drug resistance, prolonged infection, unsuccessful treatment, and death. Africa experiences a severe shortage of healthcare workers, which makes delivering proper healthcare difficult.
Microsoft continues to make significant investments in deep learning, computer vision, and AI. The Microsoft Maps Team has been leveraging that investment to identify map features at scale and produce high-quality building footprint data sets with the overall goal to add to the OpenStreetMap and MissingMaps humanitarian efforts. As of this post, the following locations are available and Microsoft offers access to this data under the Open Data Commons Open Database License (ODbL).

Country/Region              Million buildings
United States of America    129.6
Nigeria and Kenya           50.5
South America               44.5
Uganda and Tanzania         17.9
Canada                      11.8
Australia                   11.3

As you might expect, the vintage of the footprints depends on the collection date of the underlying imagery. Bing Maps Imagery is a composite of multiple sources with different capture dates (ranging 2012 to 2021).
When Shamim Nabuuma Kaliisa first had chest pain, she was in the second year of her medical degree at Makerere University (Kampala, Uganda). She was diagnosed with breast cancer when she was barely in her 20s. "Being told that you have cancer is one of the worst things anyone can hear", she told The Lancet Oncology. "It comes with a feeling of not having a future, with the imagination of pain until death." Luckily, at stage I, her breast cancer was treatable, but the pain she went through during the long treatment process was unbearable.
It was while fleeing the civil war in South Sudan that Lual Mayen's mother gave birth to him 28 years ago. She had four children in tow and was near to the border with Uganda, in a town called Aswa. The journey was difficult; Mayen's two sisters died on the way and he became sick. No one thought he would survive. "I can't imagine what she had to go through. There was no food, no water, nothing," says Mayen. "I remember she said she was not the only woman who gave birth on the way. Other women abandoned their children because they didn't want them to suffer. But my mother thought: 'He is a gift for me, I have to keep him.'" Mayen's mother made it to northern Uganda with her newborn son and reunited with her husband in a refugee camp that remained their home for the next 22 years. Mayen grew up there, and although life was a struggle, he was happy and grateful for what he had. There wasn't much to do but Mayen says he found creative ways to keep himself entertained. Then, one day he had the chance to play the video game Grand Theft Auto, which mostly revolves around driving and shooting. "While I was playing, this thought came into my mind," he remembers. "In South Sudan, most of the population is under 30.
The quest for national AI success has electrified the world: at last count, 44 countries have entered the race by creating their own national AI strategic plan. While the inclusion of countries like China, India, and the U.S. is expected, unexpected countries, including Uganda, Armenia, and Latvia, have also drafted national plans in hopes of realizing the promise. Our earlier posts, entitled "How different countries view artificial intelligence" and "Analyzing artificial intelligence plans in 34 countries", detailed how countries are approaching national AI plans, as well as how to interpret those plans. In this piece, we go a step further by examining indicators of future AI needs. Clearly, having a national AI plan is a necessary but not sufficient condition to achieve the goals of the various AI plans circulating around the world; 44 countries currently have such plans. In previous posts, we noted that AI plans were largely aspirational, and that moving from this aspiration to successful implementation required substantial public-private investments and efforts.
We demonstrate how advancements in satellite imagery and machine learning can help ameliorate these data and inference challenges. In the context of an expansion of the electrical grid across Uganda, we show how a combination of satellite imagery and computer vision can be used to develop local-level livelihood measurements appropriate for inferring the causal impact of electricity access on livelihoods. We then show how ML-based inference techniques deliver more reliable estimates of the causal impact of electrification than traditional alternatives when applied to these data. We estimate that grid access improves village-level asset wealth in rural Uganda by 0.17 standard deviations, more than doubling the growth rate over our study period relative to untreated areas. Our results provide country-scale evidence on the impact of a key infrastructure investment, and provide a low-cost, generalizable approach to future policy evaluation in data-sparse environments.
Africa has over 2000 languages, but these languages are not well-represented in the existing Natural Language Processing ecosystem. One challenge is the lack of useful African language datasets that we can use to solve different social and economic problems. In this article, I have compiled a list of African language datasets from across the web. You can use these datasets in various NLP tasks such as text classification, named entity recognition, machine translation, sentiment analysis, speech recognition, and topic modeling. I've made this collection of datasets public to give you an opportunity to use your skills and help solve different challenges.
Africa has over 2000 languages; however, these languages are not well represented in the existing natural language processing (NLP) ecosystem. One of the challenges is the lack of useful African language datasets that can be used to solve different social and economic problems. In this article, I have compiled a list of African language datasets from across the web. These datasets can be used in numerous NLP tasks such as text classification, named entity recognition, machine translation, sentiment analysis, speech recognition, and topic modeling. This collection of datasets has been made public to give you an opportunity to use your skills and help solve different challenges.
Devices and tools activated through speaking will soon be the primary way people interact with technology, yet none of the main voice assistants, including Amazon's Alexa, Apple's Siri and Google Assistant, supports a single native African language. Mozilla has sought to address this problem through the Common Voice project, which is now working to expand voice technology to the 100 million people who speak Kiswahili across Kenya, Uganda, Tanzania, Rwanda, Burundi and South Sudan. The open source project makes it easy for anyone to donate their voice to a publicly available database that can then be used to train voice-enabled devices. Over the past two years, more than 840 Rwandans have donated over 1,700 hours of voice data in Kinyarwanda, a language with over 12 million speakers. That voice data is now being used to help train voice chatbots with speech-to-text and text-to-speech functionality that provide important information about COVID-19, according to Chenai Chair, special advisor for Africa Innovation at the Mozilla Foundation. A handful of major tech companies control the voice data that is currently used to train machine learning algorithms, posing a challenge for companies seeking to develop high-quality speech recognition technologies while also exacerbating the voice recognition divide between English speakers and the rest of the world.