Olaleye, Kayode
AI and the Future of Work in Africa White Paper
O'Neill, Jacki, Marivate, Vukosi, Glover, Barbara, Karanu, Winnie, Tadesse, Girmaw Abebe, Gyekye, Akua, Makena, Anne, Rosslyn-Smith, Wesley, Grollnek, Matthew, Wayua, Charity, Baguma, Rehema, Maduke, Angel, Spencer, Sarah, Kandie, Daniel, Maari, Dennis Ndege, Mutangana, Natasha, Axmed, Maxamed, Kamau, Nyambura, Adamu, Muhammad, Swaniker, Frank, Gatuguti, Brian, Donner, Jonathan, Graham, Mark, Mumo, Janet, Mbindyo, Caroline, N'Guessan, Charlette, Githinji, Irene, Makhafola, Lesego, Kruger, Sean, Etyang, Olivia, Onando, Mulang, Sevilla, Joe, Sambuli, Nanjira, Mbaya, Martin, Breloff, Paul, Anapey, Gideon M., Mogaleemang, Tebogo L., Nghonyama, Tiyani, Wanyoike, Muthoni, Mbuli, Bhekani, Nderu, Lawrence, Nyabero, Wambui, Alam, Uzma, Olaleye, Kayode, Njenga, Caroline, Sellen, Abigail, Kairo, David, Chabikwa, Rutendo, Abdulhamid, Najeeb G., Kubasu, Ketry, Okolo, Chinasa T., Akpo, Eugenia, Budu, Joel, Karambal, Issa, Berkoh, Joseph, Wasswa, William, Njagwi, Muchai, Burnet, Rob, Ochanda, Loise, de Bod, Hanlie, Ankrah, Elizabeth, Kinyunyu, Selemani, Kariuki, Mutembei, Maduke, Angel, Kiyimba, Kizito, Eleshin, Farida, Madeje, Lillian Secelela, Muraga, Catherine, Nganga, Ida, Gichoya, Judy, Maina, Tabbz, Maina, Samuel, Mercy, Muchai, Ochieng, Millicent, Nyairo, Stephanie
This white paper is the output of a multidisciplinary workshop held in Nairobi in November 2023 and led by a cross-organisational team including Microsoft Research, NEPAD, Lelapa AI, and the University of Oxford. The workshop brought together diverse thought-leaders from various sectors and backgrounds to discuss the implications of Generative AI for the future of work in Africa. Discussions centred around four key themes: Macroeconomic Impacts; Jobs, Skills and Labour Markets; Workers' Perspectives; and Africa-Centric AI Platforms. The white paper provides an overview of the current state and trends of generative AI and its applications in different domains, as well as the challenges and risks associated with its adoption and regulation. It represents a diverse set of perspectives to create a set of insights and recommendations which aim to encourage debate and collaborative action towards creating a dignified future of work for everyone across Africa.
1000 African Voices: Advancing inclusive multi-speaker multi-accent speech synthesis
Ogun, Sewade, Owodunni, Abraham T., Olatunji, Tobi, Alese, Eniola, Oladimeji, Babatunde, Afonja, Tejumade, Olaleye, Kayode, Etori, Naome A., Adewumi, Tosin
Recent advances in speech synthesis have enabled many useful applications like audio directions in Google Maps, screen readers, and automated content generation on platforms like TikTok. However, these systems are mostly dominated by voices sourced from data-rich geographies with personas representative of their source data. Although 3000 of the world's languages are domiciled in Africa, African voices and personas are under-represented in these systems. As speech synthesis becomes increasingly democratized, it is desirable to increase the representation of African English accents. We present Afro-TTS, the first pan-African accented English speech synthesis system able to generate speech in 86 African accents, with 1000 personas representing the rich phonological diversity across the continent for downstream application in Education, Public Health, and Automated Content Creation. Speaker interpolation retains naturalness and accentedness, enabling the creation of new voices.
Prompting Towards Alleviating Code-Switched Data Scarcity in Under-Resourced Languages with GPT as a Pivot
Terblanche, Michelle, Olaleye, Kayode, Marivate, Vukosi
Many multilingual communities, including numerous in Africa, frequently engage in code-switching during conversations. This behaviour stresses the need for natural language processing technologies adept at processing code-switched text. However, data scarcity, particularly in African languages, poses a significant challenge, as many are low-resourced and under-represented. In this study, we prompted GPT 3.5 to generate Afrikaans-English and Yoruba-English code-switched sentences, enhancing diversity using topic-keyword pairs, linguistic guidelines, and few-shot examples. Our findings indicate that the quality of generated sentences for languages using non-Latin scripts, like Yoruba, is considerably lower when compared with the high Afrikaans-English success rate. There is therefore a notable opportunity to refine prompting guidelines to yield sentences suitable for the fine-tuning of language models. We propose a framework for augmenting the diversity of synthetically generated code-switched data using GPT and propose leveraging this technology to mitigate data scarcity in low-resourced languages, underscoring the essential role of native speakers in this process.
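The prompting setup described in the abstract above (topic-keyword pairs, linguistic guidelines, and few-shot examples) could be assembled along the following lines. This is a minimal sketch and not the authors' implementation: the names `PromptSpec`, `build_prompt`, and `build_prompts`, the prompt wording, and the sample topics and guidelines are all hypothetical, and no model call is made. The few-shot entries are placeholders that native speakers would need to supply.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class PromptSpec:
    """One generation request (hypothetical structure, not from the paper)."""
    language_pair: str
    topic: str
    keyword: str


def build_prompt(spec: PromptSpec,
                 guidelines: Sequence[str],
                 few_shot: Sequence[str]) -> str:
    """Combine a topic-keyword pair, guidelines, and few-shot examples
    into a single text prompt. The wording is an illustrative assumption."""
    lines = [
        f"Write one {spec.language_pair} code-switched sentence.",
        f"Topic: {spec.topic}",
        f"Keyword to include: {spec.keyword}",
        "Guidelines:",
    ]
    lines += [f"- {g}" for g in guidelines]
    lines.append("Examples:")
    lines += [f"{i}. {ex}" for i, ex in enumerate(few_shot, start=1)]
    lines.append("Sentence:")
    return "\n".join(lines)


def build_prompts(language_pair: str,
                  topic_keywords: Sequence[Tuple[str, str]],
                  guidelines: Sequence[str],
                  few_shot: Sequence[str]) -> List[str]:
    """Build one prompt per topic-keyword pair to vary the generated content."""
    return [
        build_prompt(PromptSpec(language_pair, topic, keyword), guidelines, few_shot)
        for topic, keyword in topic_keywords
    ]


if __name__ == "__main__":
    prompts = build_prompts(
        language_pair="Afrikaans-English",
        # Hypothetical topic-keyword pairs, chosen only for illustration.
        topic_keywords=[("sport", "rugby"), ("weather", "rain")],
        # Hypothetical guideline, not taken from the paper.
        guidelines=["Switch languages at a natural phrase boundary."],
        # Placeholders: real examples should come from native speakers.
        few_shot=["<native-speaker example 1>", "<native-speaker example 2>"],
    )
    print(prompts[0])
```

Each resulting string would then be sent to the chosen language model, and the generated sentences reviewed by native speakers, in line with the role the abstract assigns them.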