AITopics | Marivate, Vukosi

Marivate, Vukosi

HausaNLP at SemEval-2025 Task 2: Entity-Aware Fine-tuning vs. Prompt Engineering in Entity-Aware Machine Translation

Abubakar, Abdulhamid, Abdulkadir, Hamidatu, Abdullahi, Ibrahim Rabiu, Khalid, Abubakar Auwal, Wali, Ahmad Mustapha, Umar, Amina Aminu, Bala, Maryam, Sani, Sani Abdullahi, Ahmad, Ibrahim Said, Muhammad, Shamsuddeen Hassan, Abdulmumin, Idris, Marivate, Vukosi

arXiv.org Artificial Intelligence, Mar-25-2025

This paper presents our findings for SemEval 2025 Task 2, a shared task on entity-aware machine translation (EA-MT). The goal of this task is to develop translation models that can accurately translate English sentences into target languages, with a particular focus on handling named entities, which often pose challenges for MT systems. The task covers 10 target languages with English as the source. In this paper, we describe the different systems we employed, detail our results, and discuss insights gained from our experiments.

large language model, natural language, translation, (17 more...)

2503.19702

Country:

North America > United States (0.47)
Asia (0.46)

Genre: Research Report > New Finding (0.55)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

MAGE: Multi-Head Attention Guided Embeddings for Low Resource Sentiment Classification

Vashisht, Varun, Singh, Samar, Konduskar, Mihir, Walia, Jaskaran Singh, Marivate, Vukosi

arXiv.org Artificial Intelligence, Feb-25-2025

The lack of quality data for low-resource Bantu languages presents significant challenges for text classification and other practical applications. In this paper, we introduce an advanced model combining Language-Independent Data Augmentation (LiDA) with Multi-Head Attention based weighted embeddings to selectively enhance critical data points and improve text classification performance. This integration allows us to create robust data augmentation strategies that are effective across various linguistic contexts, ensuring that our model can handle the unique syntactic and semantic features of Bantu languages. This approach not only addresses the data scarcity issue but also sets a foundation for future research in low-resource language processing and classification tasks.

computational linguistic, large language model, machine learning, (19 more...)

2502.17987

Country:

North America > United States > Texas (0.14)
Asia > Middle East > UAE (0.14)

Genre:

Research Report > New Finding (1.00)
Research Report > Promising Solution (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.32)

AfroXLMR-Comet: Multilingual Knowledge Distillation with Attention Matching for Low-Resource languages

Raju, Joshua Sakthivel, S, Sanjay, Walia, Jaskaran Singh, Raghav, Srinivas, Marivate, Vukosi

arXiv.org Artificial Intelligence, Feb-25-2025

Language model compression through knowledge distillation has emerged as a promising approach for deploying large language models in resource-constrained environments. However, existing methods often struggle to maintain performance when distilling multilingual models, especially for low-resource languages. In this paper, we present a novel hybrid distillation approach that combines traditional knowledge distillation with a simplified attention matching mechanism, specifically designed for multilingual contexts. Our method introduces an extremely compact student model architecture, significantly smaller than conventional multilingual models. We evaluate our approach on five African languages: Kinyarwanda, Swahili, Hausa, Igbo, and Yoruba. The distilled student model, AfroXLMR-Comet, successfully captures both the output distribution and internal attention patterns of a larger teacher model (AfroXLMR-Large) while reducing the model size by over 85%. Experimental results demonstrate that our hybrid approach achieves competitive performance compared to the teacher model, maintaining an accuracy within 85% of the original model's performance while requiring substantially fewer computational resources. Our work provides a practical framework for deploying efficient multilingual models in resource-constrained environments, particularly benefiting applications involving African languages.

large language model, machine learning, student model, (21 more...)

2502.18020

Country:

Europe (0.68)
North America > United States (0.25)

Genre: Research Report > Promising Solution (0.68)

Industry: Education (0.62)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

The Esethu Framework: Reimagining Sustainable Dataset Governance and Curation for Low-Resource Languages

Rajab, Jenalea, Aremu, Anuoluwapo, Chimoto, Everlyn Asiko, Dunbar, Dale, Morrissey, Graham, Thior, Fadel, Potgieter, Luandrie, Ojo, Jessico, Tonja, Atnafu Lambebo, Chetty, Maushami, Nekoto, Onyothi, Moiloa, Pelonomi, Abbott, Jade, Marivate, Vukosi, Rosman, Benjamin

arXiv.org Artificial Intelligence, Feb-21-2025

This paper presents the Esethu Framework, a sustainable data curation framework specifically designed to empower local communities and ensure equitable benefit-sharing from their linguistic resources. This framework is supported by the Esethu license, a novel community-centric data license. As a proof of concept, we introduce the Vuk'uzenzele isiXhosa Speech Dataset (ViXSD), an open-source corpus developed under the Esethu Framework and License. The dataset, containing read speech from native isiXhosa speakers enriched with demographic and linguistic metadata, demonstrates how community-driven licensing and curation principles can bridge resource gaps in automatic speech recognition (ASR) for African languages while safeguarding the interests of data creators. We describe the framework guiding dataset development, outline the Esethu license provisions, present the methodology for ViXSD, and present ASR experiments validating ViXSD's usability in building and refining voice-driven applications for isiXhosa.

artificial intelligence, natural language, speech recognition, (12 more...)

2502.15916

Country: Africa > South Africa > Gauteng (0.14)

Genre: Research Report (1.00)

Industry: Government > Regional Government > Africa Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)

BRIGHTER: BRIdging the Gap in Human-Annotated Textual Emotion Recognition Datasets for 28 Languages

Muhammad, Shamsuddeen Hassan, Ousidhoum, Nedjma, Abdulmumin, Idris, Wahle, Jan Philip, Ruas, Terry, Beloucif, Meriem, de Kock, Christine, Surange, Nirmal, Teodorescu, Daniela, Ahmad, Ibrahim Said, Adelani, David Ifeoluwa, Aji, Alham Fikri, Ali, Felermino D. M. A., Alimova, Ilseyar, Araujo, Vladimir, Babakov, Nikolay, Baes, Naomi, Bucur, Ana-Maria, Bukula, Andiswa, Cao, Guanqun, Cardenas, Rodrigo Tufino, Chevi, Rendi, Chukwuneke, Chiamaka Ijeoma, Ciobotaru, Alexandra, Dementieva, Daryna, Gadanya, Murja Sani, Geislinger, Robert, Gipp, Bela, Hourrane, Oumaima, Ignat, Oana, Lawan, Falalu Ibrahim, Mabuya, Rooweither, Mahendra, Rahmad, Marivate, Vukosi, Piper, Andrew, Panchenko, Alexander, Ferreira, Charles Henrique Porto, Protasov, Vitaly, Rutunda, Samuel, Shrivastava, Manish, Udrea, Aura Cristina, Wanzare, Lilian Diana Awuor, Wu, Sophie, Wunderlich, Florian Valentin, Zhafran, Hanif Muhammad, Zhang, Tianhui, Zhou, Yi, Mohammad, Saif M.

arXiv.org Artificial Intelligence, Feb-17-2025

People worldwide use language in subtle and complex ways to express emotions. While emotion recognition -- an umbrella term for several NLP tasks -- significantly impacts different applications in NLP and other fields, most work in the area is focused on high-resource languages. This has led to major disparities in research and proposed solutions, especially for low-resource languages that suffer from the lack of high-quality datasets. In this paper, we present BRIGHTER -- a collection of multilabeled emotion-annotated datasets in 28 different languages. BRIGHTER covers predominantly low-resource languages from Africa, Asia, Eastern Europe, and Latin America, with instances from various domains annotated by fluent speakers. We describe the data collection and annotation processes and the challenges of building these datasets. Then, we report different experimental results for monolingual and crosslingual multi-label emotion identification, as well as intensity-level emotion recognition. We investigate results with and without using LLMs and analyse the large variability in performance across languages and text domains. We show that BRIGHTER datasets are a step towards bridging the gap in text-based emotion recognition and discuss their impact and utility.

artificial intelligence, large language model, natural language, (18 more...)

2502.11926

Country:

North America > United States (1.00)
Europe (1.00)
Asia (0.88)
(2 more...)

Genre: Research Report (0.82)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Emotion (1.00)

Analysing Public Transport User Sentiment on Low Resource Multilingual Data

Myoya, Rozina L., Marivate, Vukosi, Abdulmumin, Idris

arXiv.org Artificial Intelligence, Dec-9-2024

Public transport systems in many Sub-Saharan countries often receive less attention compared to other sectors, underscoring the need for innovative solutions to improve the Quality of Service (QoS) and overall user experience. This study explored commuter opinion mining to understand sentiments toward existing public transport systems in Kenya, Tanzania, and South Africa. We used a qualitative research design, analysing data from X (formerly Twitter) to assess sentiments across rail, mini-bus taxis, and buses. By leveraging Multilingual Opinion Mining techniques, we addressed the linguistic diversity and code-switching present in our dataset, thus demonstrating the application of Natural Language Processing (NLP) in extracting insights from under-resourced languages. We employed PLMs such as AfriBERTa, AfroXLMR, AfroLM, and PuoBERTa to conduct the sentiment analysis. The results revealed predominantly negative sentiments in South Africa and Kenya, while the Tanzanian dataset showed mainly positive sentiments due to the advertising nature of the tweets. Furthermore, feature extraction using the Word2Vec model and K-Means clustering illuminated semantic relationships and primary themes found within the different datasets. By prioritising the analysis of user experiences and sentiments, this research paves the way for developing more responsive, user-centered public transport systems in Sub-Saharan countries, contributing to the broader goal of improving urban mobility and sustainability.

artificial intelligence, dataset, natural language, (16 more...)

2412.06951

Country: Africa > South Africa (1.00)

Genre: Research Report > New Finding (0.68)

Industry:

Transportation > Infrastructure & Services (1.00)
Transportation > Ground > Rail (1.00)
Transportation > Ground > Road (0.94)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.92)

AI and the Future of Work in Africa White Paper

O'Neill, Jacki, Marivate, Vukosi, Glover, Barbara, Karanu, Winnie, Tadesse, Girmaw Abebe, Gyekye, Akua, Makena, Anne, Rosslyn-Smith, Wesley, Grollnek, Matthew, Wayua, Charity, Baguma, Rehema, Maduke, Angel, Spencer, Sarah, Kandie, Daniel, Maari, Dennis Ndege, Mutangana, Natasha, Axmed, Maxamed, Kamau, Nyambura, Adamu, Muhammad, Swaniker, Frank, Gatuguti, Brian, Donner, Jonathan, Graham, Mark, Mumo, Janet, Mbindyo, Caroline, N'Guessan, Charlette, Githinji, Irene, Makhafola, Lesego, Kruger, Sean, Etyang, Olivia, Onando, Mulang, Sevilla, Joe, Sambuli, Nanjira, Mbaya, Martin, Breloff, Paul, Anapey, Gideon M., Mogaleemang, Tebogo L., Nghonyama, Tiyani, Wanyoike, Muthoni, Mbuli, Bhekani, Nderu, Lawrence, Nyabero, Wambui, Alam, Uzma, Olaleye, Kayode, Njenga, Caroline, Sellen, Abigail, Kairo, David, Chabikwa, Rutendo, Abdulhamid, Najeeb G., Kubasu, Ketry, Okolo, Chinasa T., Akpo, Eugenia, Budu, Joel, Karambal, Issa, Berkoh, Joseph, Wasswa, William, Njagwi, Muchai, Burnet, Rob, Ochanda, Loise, de Bod, Hanlie, Ankrah, Elizabeth, Kinyunyu, Selemani, Kariuki, Mutembei, Maduke, Angel, Kiyimba, Kizito, Eleshin, Farida, Madeje, Lillian Secelela, Muraga, Catherine, Nganga, Ida, Gichoya, Judy, Maina, Tabbz, Maina, Samuel, Mercy, Muchai, Ochieng, Millicent, Nyairo, Stephanie

arXiv.org Artificial Intelligence, Nov-15-2024

This white paper is the output of a multidisciplinary workshop in Nairobi (Nov 2023), led by a cross-organisational team including Microsoft Research, NEPAD, Lelapa AI, and University of Oxford. The workshop brought together diverse thought-leaders from various sectors and backgrounds to discuss the implications of Generative AI for the future of work in Africa. Discussions centred around four key themes: Macroeconomic Impacts; Jobs, Skills and Labour Markets; Workers' Perspectives; and Africa-Centric AI Platforms. The white paper provides an overview of the current state and trends of generative AI and its applications in different domains, as well as the challenges and risks associated with its adoption and regulation. It represents a diverse set of perspectives to create a set of insights and recommendations which aim to encourage debate and collaborative action towards creating a dignified future of work for everyone across Africa.

generative ai, machine learning, natural language, (18 more...)

2411.10091

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.34)
Africa > Kenya > Nairobi City County > Nairobi (0.24)

Genre: Research Report (1.00)

Industry:

Social Sector (1.00)
Law (1.00)
Information Technology > Security & Privacy (1.00)
(6 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.96)

From N-grams to Pre-trained Multilingual Models For Language Identification

Sindane, Thapelo, Marivate, Vukosi

arXiv.org Artificial Intelligence, Oct-11-2024

In this paper, we investigate the use of N-gram models and Large Pre-trained Multilingual models for Language Identification (LID) across 11 South African languages. For N-gram models, this study shows that effective data size selection remains crucial for establishing frequency distributions that efficiently model each target language, thus improving language ranking. For pre-trained multilingual models, we conduct extensive experiments covering a diverse set of massively pre-trained multilingual (PLM) models -- mBERT, RemBERT, XLM-r, and Afri-centric multilingual models -- AfriBERTa, Afro-XLMr, AfroLM, and Serengeti. We further compare these models with available large-scale Language Identification tools: Compact Language Detector v3 (CLD V3), AfroLID, GlotLID, and OpenLID to highlight the importance of focused-based LID. From these, we show that, on average, Serengeti is the superior model across all models considered, from N-grams to Transformers. Moreover, we propose a lightweight BERT-based LID model (za_BERT_lid) trained with NHCLT + Vukzenzele corpus, which performs on par with our best-performing Afri-centric models.

artificial intelligence, machine learning, natural language, (16 more...)

2410.08728

Country: Asia > Middle East (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.31)

Prompting Towards Alleviating Code-Switched Data Scarcity in Under-Resourced Languages with GPT as a Pivot

Terblanche, Michelle, Olaleye, Kayode, Marivate, Vukosi

arXiv.org Artificial Intelligence, Apr-26-2024

Many multilingual communities, including numerous in Africa, frequently engage in code-switching during conversations. This behaviour stresses the need for natural language processing technologies adept at processing code-switched text. However, data scarcity, particularly in African languages, poses a significant challenge, as many are low-resourced and under-represented. In this study, we prompted GPT 3.5 to generate Afrikaans--English and Yoruba--English code-switched sentences, enhancing diversity using topic-keyword pairs, linguistic guidelines, and few-shot examples. Our findings indicate that the quality of generated sentences for languages using non-Latin scripts, like Yoruba, is considerably lower when compared with the high Afrikaans-English success rate. There is therefore a notable opportunity to refine prompting guidelines to yield sentences suitable for the fine-tuning of language models. We propose a framework for augmenting the diversity of synthetically generated code-switched data using GPT and propose leveraging this technology to mitigate data scarcity in low-resourced languages, underscoring the essential role of native speakers in this process.

large language model, machine learning, natural language, (22 more...)

2404.17216

Country:

Africa > South Africa (0.14)
Asia > Japan (0.14)

Genre:

Overview (0.93)
Research Report > New Finding (0.68)

Industry: Information Technology (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.92)

Multimodal Misinformation Detection in a South African Social Media Environment

De Jager, Amica, Marivate, Vukosi, Modupe, Abioudun

arXiv.org Artificial Intelligence, Dec-7-2023

With the constant spread of misinformation on social media networks, a need has arisen to continuously assess the veracity of digital content. This need has inspired numerous research efforts on the development of misinformation detection (MD) models. However, many models do not use all information available to them, and existing research lacks relevant datasets to train the models, specifically within the South African social media environment. The aim of this paper is to investigate the transferability of knowledge of an MD model between different contextual environments. This research contributes a multimodal MD model capable of functioning in the South African social media environment, and introduces a South African misinformation dataset. The model makes use of multiple sources of information for misinformation detection, namely textual and visual elements. It uses bidirectional encoder representations from transformers (BERT) as the textual encoder and a residual network (ResNet) as the visual encoder. The model is trained and evaluated on the Fakeddit dataset and a South African misinformation dataset. Results show that using South African samples in the training of the model increases model performance in a South African contextual environment, and that a multimodal model retains significantly more knowledge than both the textual and visual unimodal models. Our study suggests that the performance of a misinformation detection model is influenced by the cultural nuances of its operating environment and that multimodal models assist in the transferability of knowledge between different contextual environments. Therefore, local data should be incorporated into the training process of a misinformation detection model in order to optimize model performance.

large language model, machine learning, natural language, (15 more...)

doi: 10.1007/978-3-031-49002-6_19

2312.04052

Country:

Europe (0.46)
Africa > South Africa (0.30)
North America > United States (0.28)

Genre:

Research Report > New Finding (0.48)
Research Report > Experimental Study (0.48)

Industry: Media > News (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.88)