AITopics | Olatunji, Tobi

Afrispeech-Dialog: A Benchmark Dataset for Spontaneous English Conversations in Healthcare and Beyond

Sanni, Mardhiyah, Abdullahi, Tassallah, Kayande, Devendra D., Ayodele, Emmanuel, Etori, Naome A., Mollel, Michael S., Yekini, Moshood, Okocha, Chibuzor, Ismaila, Lukman E., Omofoye, Folafunmi, Adewale, Boluwatife A., Olatunji, Tobi

arXiv.org Artificial Intelligence, Feb-6-2025

Speech technologies are transforming interactions across various sectors, from healthcare to call centers and robots, yet their performance on African-accented conversations remains underexplored. We introduce Afrispeech-Dialog, a benchmark dataset of 50 simulated medical and non-medical African-accented English conversations, designed to evaluate automatic speech recognition (ASR) and related technologies. We assess state-of-the-art (SOTA) speaker diarization and ASR systems on long-form, accented speech, compare their performance with that on native accents, and find a performance degradation of more than 10%. Additionally, we explore the medical conversation summarization capabilities of large language models (LLMs) to demonstrate the impact of ASR errors on downstream medical summaries, providing insights into the challenges and opportunities for speech technologies in the Global South. Our work highlights the need for more inclusive datasets to advance conversational AI in low-resource settings.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2502.03945

Country:

Africa (0.93)
North America > United States (0.28)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Health Care Technology (0.68)
Information Technology > Security & Privacy (0.68)
Health & Medicine > Therapeutic Area > Gastroenterology (0.48)
Health & Medicine > Therapeutic Area > Oncology (0.47)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

The Multicultural Medical Assistant: Can LLMs Improve Medical ASR Errors Across Borders?

Adedeji, Ayo, Sanni, Mardhiyah, Ayodele, Emmanuel, Joshi, Sarita, Olatunji, Tobi

arXiv.org Artificial Intelligence, Jan-25-2025

The global adoption of Large Language Models (LLMs) in healthcare shows promise to enhance clinical workflows and improve patient outcomes. However, Automatic Speech Recognition (ASR) errors in critical medical terms remain a significant challenge. These errors can compromise patient care and safety if not detected. This study investigates the prevalence and impact of ASR errors in medical transcription in Nigeria, the United Kingdom, and the United States. By evaluating raw and LLM-corrected transcriptions of accented English in these regions, we assess the potential and limitations of LLMs to address challenges related to accents and medical terminology in ASR. Our findings highlight significant disparities in ASR accuracy across regions and identify specific conditions under which LLM corrections are most effective.

gemini 1, large language model, machine learning, (21 more...)

arXiv.org Artificial Intelligence

2501.1531

Country:

Europe (1.00)
North America > United States > California > San Francisco County > San Francisco (0.14)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.68)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

AfriMed-QA: A Pan-African, Multi-Specialty, Medical Question-Answering Benchmark Dataset

Olatunji, Tobi, Nimo, Charles, Owodunni, Abraham, Abdullahi, Tassallah, Ayodele, Emmanuel, Sanni, Mardhiyah, Aka, Chinemelu, Omofoye, Folafunmi, Yuehgoh, Foutse, Faniran, Timothy, Dossou, Bonaventure F. P., Yekini, Moshood, Kemp, Jonas, Heller, Katherine, Omeke, Jude Chidubem, MD, Chidi Asuzu, Etori, Naome A., Ndiaye, Aimérou, Okoh, Ifeoma, Ocansey, Evans Doe, Kinara, Wendy, Best, Michael, Essa, Irfan, Moore, Stephen Edward, Fourie, Chris, Asiedu, Mercy Nyamewaa

arXiv.org Artificial Intelligence, Jan-14-2025

Recent advancements in large language model (LLM) performance on medical multiple choice question (MCQ) benchmarks have stimulated interest from healthcare providers and patients globally. Particularly in low- and middle-income countries (LMICs) facing acute physician shortages and a lack of specialists, LLMs offer a potentially scalable pathway to enhance healthcare access and reduce costs. However, their effectiveness in the Global South, especially across the African continent, remains to be established. In this work, we introduce AfriMed-QA, the first large-scale Pan-African English multi-specialty medical Question-Answering (QA) dataset, comprising 15,000 questions (open and closed-ended) sourced from over 60 medical schools across 16 countries, covering 32 medical specialties. We further evaluate 30 LLMs across multiple axes including correctness and demographic bias. Our findings show significant performance variation across specialties and geographies, and MCQ performance clearly lags USMLE (MedQA). We find that biomedical LLMs underperform general models and that smaller edge-friendly LLMs struggle to achieve a passing score. Interestingly, human evaluations show a consistent consumer preference for LLM answers and explanations when compared with clinician answers.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2411.1564

Country:

Africa (1.00)
Asia > China (0.28)
North America > United States (0.28)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

1000 African Voices: Advancing inclusive multi-speaker multi-accent speech synthesis

Ogun, Sewade, Owodunni, Abraham T., Olatunji, Tobi, Alese, Eniola, Oladimeji, Babatunde, Afonja, Tejumade, Olaleye, Kayode, Etori, Naome A., Adewumi, Tosin

arXiv.org Artificial Intelligence, Jun-27-2024

Recent advances in speech synthesis have enabled many useful applications like audio directions in Google Maps, screen readers, and automated content generation on platforms like TikTok. However, these systems are mostly dominated by voices sourced from data-rich geographies with personas representative of their source data. Although 3000 of the world's languages are domiciled in Africa, African voices and personas are under-represented in these systems. As speech synthesis becomes increasingly democratized, it is desirable to increase the representation of African English accents. We present Afro-TTS, the first pan-African accented English speech synthesis system able to generate speech in 86 African accents, with 1000 personas representing the rich phonological diversity across the continent for downstream application in Education, Public Health, and Automated Content Creation. Speaker interpolation retains naturalness and accentedness, enabling the creation of new voices.

african voice, artificial intelligence, inclusive multi-speaker multi-accent speech synthesis

arXiv.org Artificial Intelligence

2406.11727

Country: Africa (0.24)

Genre: Research Report (0.69)

Technology: Information Technology > Artificial Intelligence > Speech > Speech Synthesis (1.00)

AccentFold: A Journey through African Accents for Zero-Shot ASR Adaptation to Target Accents

Owodunni, Abraham Toluwase, Yadavalli, Aditya, Emezue, Chris Chinenye, Olatunji, Tobi, Mbataku, Clinton C

arXiv.org Artificial Intelligence, Feb-5-2024

Despite advancements in speech recognition, accented speech remains challenging. While previous approaches have focused on modeling techniques or creating accented speech datasets, gathering sufficient data for the multitude of accents, particularly in the African context, remains impractical due to their sheer diversity and associated budget constraints. To address these challenges, we propose AccentFold, a method that exploits spatial relationships between learned accent embeddings to improve downstream Automatic Speech Recognition (ASR). Our exploratory analysis of speech embeddings representing 100+ African accents reveals interesting spatial accent relationships highlighting geographic and genealogical similarities, capturing consistent phonological and morphological regularities, all learned empirically from speech. Furthermore, we discover accent relationships previously uncharacterized by the Ethnologue. Through empirical evaluation, we demonstrate the effectiveness of AccentFold by showing that, for out-of-distribution (OOD) accents, sampling accent subsets for training based on AccentFold information outperforms strong baselines, with a relative WER improvement of 4.6%. AccentFold presents a promising approach for improving ASR performance on accented speech, particularly in the context of African accents, where data scarcity and budget constraints pose significant challenges. Our findings emphasize the potential of leveraging linguistic relationships to improve zero-shot ASR adaptation to target accents.
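
For readers unfamiliar with the metric, the relative WER improvement quoted in the abstract above can be illustrated with a minimal sketch. It assumes the standard word-level edit-distance definition of word error rate (WER); the function names and the sample strings are hypothetical and are not taken from the paper or its code.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = word-level edit distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def relative_wer_improvement(baseline_wer: float, new_wer: float) -> float:
    """Fraction by which new_wer is lower than baseline_wer."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - new_wer) / baseline_wer


if __name__ == "__main__":
    # Hypothetical transcripts, for illustration only.
    wer = word_error_rate("the patient has malaria", "the patient as malaria")
    print(f"WER: {wer:.2f}")
    print(f"Relative improvement: {relative_wer_improvement(0.20, 0.10):.1%}")
```

Under this definition, a relative improvement is measured against the baseline WER rather than as an absolute difference in percentage points.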

artificial intelligence, natural language, speech recognition, (14 more...)

arXiv.org Artificial Intelligence

2402.01152

Country:

Africa (1.00)
North America > United States > Washington > King County > Seattle (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)

Adapting Pretrained ASR Models to Low-resource Clinical Speech using Epistemic Uncertainty-based Data Selection

Dossou, Bonaventure F. P., Tonja, Atnafu Lambebo, Emezue, Chris Chinenye, Olatunji, Tobi, Etori, Naome A, Osei, Salomey, Adewumi, Tosin, Singh, Sahib

arXiv.org Artificial Intelligence, Oct-8-2023

While there has been significant progress in ASR, African-accented clinical ASR has been understudied due to a lack of training datasets. Building robust ASR systems in this domain requires large amounts of annotated or labeled data, for a wide variety of linguistically and morphologically rich accents, which are expensive to create. Our study aims to address this problem by reducing annotation expenses through informative uncertainty-based data selection. We show that incorporating epistemic uncertainty into our adaptation rounds outperforms several baseline results, established using state-of-the-art (SOTA) ASR models, while reducing the required amount of labeled data, and hence reducing annotation costs. Our approach also improves out-of-distribution generalization for very low-resource accents, demonstrating the viability of our approach for building generalizable ASR models in the context of accented African clinical ASR, where training datasets are predominantly scarce.

artificial intelligence, deep learning, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2306.02105

Country:

Africa (1.00)
North America > United States (0.46)
North America > Canada > Quebec (0.14)

Genre: Research Report > Experimental Study (0.68)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Health Care Technology (0.68)
Health & Medicine > Consumer Health (0.68)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

AfriSpeech-200: Pan-African Accented Speech Dataset for Clinical and General Domain ASR

Olatunji, Tobi, Afonja, Tejumade, Yadavalli, Aditya, Emezue, Chris Chinenye, Singh, Sahib, Dossou, Bonaventure F. P., Osuchukwu, Joanne, Osei, Salomey, Tonja, Atnafu Lambebo, Etori, Naome, Mbataku, Clinton

arXiv.org Artificial Intelligence, Sep-30-2023

Africa has a very low doctor-to-patient ratio. At very busy clinics, doctors could see 30+ patients per day, a heavy patient burden compared with developed countries, but productivity tools such as clinical automatic speech recognition (ASR) are lacking for these overworked clinicians. However, clinical ASR is mature, even ubiquitous, in developed nations, and clinician-reported performance of commercial clinical ASR systems is generally satisfactory. Furthermore, the recent performance of general domain ASR is approaching human accuracy. However, several gaps exist. Several publications have highlighted racial bias in speech-to-text algorithms, and performance on minority accents lags significantly. To our knowledge, there is no publicly available research or benchmark on accented African clinical ASR, and speech data is non-existent for the majority of African accents. We release AfriSpeech, 200 hours of Pan-African English speech comprising 67,577 clips from 2,463 unique speakers across 120 indigenous accents from 13 countries for clinical and general domain ASR, together with a benchmark test set and publicly available pre-trained models with SOTA performance on the AfriSpeech benchmark.

artificial intelligence, machine learning, natural language, (15 more...)

arXiv.org Artificial Intelligence

2310.00274

Country:

Africa (1.00)
North America > United States > Colorado (0.14)
North America > Canada > Quebec (0.14)

Genre:

Research Report > Experimental Study (0.67)
Research Report > New Finding (0.46)

Industry:

Health & Medicine > Therapeutic Area (0.67)
Health & Medicine > Pharmaceuticals & Biotechnology (0.67)
Health & Medicine > Health Care Providers & Services (0.46)
Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

AfriNames: Most ASR models "butcher" African Names

Olatunji, Tobi, Afonja, Tejumade, Dossou, Bonaventure F. P., Tonja, Atnafu Lambebo, Emezue, Chris Chinenye, Rufai, Amina Mardiyyah, Singh, Sahib

arXiv.org Artificial Intelligence, Jun-2-2023

Useful conversational agents must accurately capture named entities to minimize error for downstream tasks, for example, asking a voice assistant to play a track from a certain artist, initiating navigation to a specific location, or documenting a laboratory result for a patient. However, where named entities such as "Ukachukwu" (Igbo), "Lakicia" (Swahili), or "Ingabire" (Rwandan) are spoken, automatic speech recognition (ASR) models' performance degrades significantly, propagating errors to downstream systems. We model this problem as a distribution shift and demonstrate that such model bias can be mitigated through multilingual pre-training, intelligent data augmentation strategies to increase the representation of African-named entities, and fine-tuning multilingual ASR models on multiple African accents. The resulting fine-tuned models show an 81.5% relative WER improvement compared with the baseline on samples with African-named entities.

artificial intelligence, natural language, recognition, (18 more...)

arXiv.org Artificial Intelligence

2306.00253

Country:

Africa (0.48)
North America > Canada > Quebec (0.14)
North America > United States (0.14)
Europe > France (0.14)

Genre: Research Report > New Finding (0.46)

Industry:

Media (0.46)
Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
