AITopics | Uganda

PaliGemma-CXR: A Multi-task Multimodal Model for TB Chest X-ray Interpretation

Musinguzi, Denis, Katumba, Andrew, Murindanyi, Sudi

arXiv.org Artificial Intelligence, Feb-28-2025

Tuberculosis (TB) is an infectious disease and a global health challenge. Chest X-rays are a standard method for TB screening, yet many countries face a critical shortage of radiologists capable of interpreting these images. Machine learning offers an alternative, as it can automate tasks such as disease diagnosis and report generation. However, traditional approaches rely on task-specific models, which cannot utilize the interdependence between tasks. Building a single model capable of performing multiple tasks poses additional challenges such as scarcity of multimodal data, dataset imbalance, and negative transfer. To address these challenges, we propose PaliGemma-CXR, a multi-task multimodal model capable of performing TB diagnosis, object detection, segmentation, report generation, and visual question answering (VQA). Starting with a dataset of chest X-ray images annotated with TB diagnosis labels and segmentation masks, we curated a multimodal dataset to support additional tasks. By finetuning PaliGemma on this dataset and sampling data with ratios inversely proportional to the size of each task's dataset, we achieved the following results across all tasks: 90.32% accuracy on TB diagnosis and 98.95% on close-ended VQA, 41.3 BLEU score on report generation, and a mAP of 19.4 and 16.0 on object detection and segmentation, respectively. These results demonstrate that PaliGemma-CXR effectively leverages the interdependence between multiple image interpretation tasks to enhance performance.
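
The sampling scheme mentioned in the abstract can be illustrated with a minimal sketch. It assumes that "ratios of the inverse of the size of task datasets" means each task is drawn with probability proportional to 1 / (dataset size); the abstract does not spell out the exact formula. The dataset sizes and the helper name `inverse_size_weights` are hypothetical and do not come from the paper.

```python
import random
from collections import Counter

def inverse_size_weights(task_sizes):
    """Return per-task sampling probabilities proportional to 1 / dataset size."""
    inverse = {task: 1.0 / n for task, n in task_sizes.items()}
    total = sum(inverse.values())
    return {task: w / total for task, w in inverse.items()}

# Hypothetical dataset sizes; the abstract does not report them.
task_sizes = {
    "diagnosis": 4000,
    "detection": 1000,
    "segmentation": 1000,
    "report": 500,
    "vqa": 8000,
}

weights = inverse_size_weights(task_sizes)

# Draw which task each training example comes from.
rng = random.Random(0)
tasks = list(weights)
draws = rng.choices(tasks, weights=[weights[t] for t in tasks], k=10_000)
counts = Counter(draws)

for task in tasks:
    print(f"{task:>12}: p={weights[task]:.3f}, drawn {counts[task]} times")
```

Under this assumption the smallest dataset is sampled most often, which is one common way to counter the dataset imbalance the abstract lists as a challenge.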

artificial intelligence, machine learning, natural language, (15 more...)

arXiv: 2503.00171

Country: Africa > Uganda (0.14)

Genre: Research Report > New Finding (0.66)

Industry:

Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.90)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Deep Learning-Based Transfer Learning for Classification of Cassava Disease

Junior, Ademir G. Costa, da Silva, Fábio S., Rios, Ricardo

arXiv.org Artificial Intelligence, Feb-26-2025

This paper presents a performance comparison among four Convolutional Neural Network architectures (EfficientNet-B3, InceptionV3, ResNet50, and VGG16) for classifying cassava disease images. The images were sourced from an imbalanced dataset from a competition. Appropriate metrics were employed to address class imbalance. The results indicate that EfficientNet-B3 achieved an accuracy of 87.7%, precision of 87.8%, recall of 87.8%, and F1-score of 87.7% on this task. These findings suggest that EfficientNet-B3 could be a valuable tool to support Digital Agriculture.
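
To show why accuracy alone can mislead on an imbalanced dataset, here is a minimal sketch that computes accuracy next to macro-averaged precision, recall and F1. Macro averaging is an assumption; the abstract does not say which averaging the authors used. The labels and predictions are hypothetical toy data, not results from the paper.

```python
def per_class_counts(y_true, y_pred, label):
    """Count true positives, false positives and false negatives for one class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
    return tp, fp, fn

def macro_metrics(y_true, y_pred):
    """Return macro-averaged (precision, recall, F1) over all observed classes."""
    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp, fp, fn = per_class_counts(y_true, y_pred, label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(labels)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n

# Hypothetical imbalanced toy data: a classifier that always predicts the majority class.
y_true = ["healthy"] * 8 + ["diseased"] * 2
y_pred = ["healthy"] * 10

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
precision, recall, f1 = macro_metrics(y_true, y_pred)
print(f"accuracy={accuracy:.3f} precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

In this toy case the classifier never detects the minority class, yet its accuracy still looks respectable; the macro-averaged scores expose the failure.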

artificial intelligence, conjunto, machine learning, (16 more...)

doi: 10.5753/eniac.2024.244378

arXiv: 2502.19351

Country:

South America > Brazil (0.15)
Africa > Uganda (0.14)

Genre: Research Report (0.70)

Industry: Food & Agriculture > Agriculture (0.35)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Multi-task Learning for Aggregated Data using Gaussian Processes

Fariba Yousefi, Michael T. Smith, Mauricio Álvarez

Neural Information Processing Systems, Jan-24-2025, 06:24:34 GMT

Aggregated data is commonplace in areas such as epidemiology and demography. For example, census data for a population is usually given as averages defined over time periods or spatial resolutions (cities, regions or countries). In this paper, we present a novel multi-task learning model based on Gaussian processes for joint learning of variables that have been aggregated at different input scales. Our model represents each task as the linear combination of the realizations of latent processes that are integrated at a different scale per task. We are then able to compute the cross-covariance between the different tasks either analytically or numerically. We also allow each task to have a potentially different likelihood model and provide a variational lower bound that can be optimised in a stochastic fashion, making our model suitable for larger datasets. We demonstrate the model on a synthetic example, a fertility dataset and an air pollution prediction application.
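
The numerical route to the cross-covariance can be sketched in a heavily simplified setting: one latent process with an RBF kernel, one-dimensional inputs, and observations defined as averages of the process over intervals of different widths. None of these choices come from the paper; the kernel, the interval bounds, the midpoint quadrature and all function names are assumptions made for illustration.

```python
import math

def rbf(x, y, lengthscale=1.0):
    """Squared-exponential kernel between two scalar inputs."""
    return math.exp(-0.5 * ((x - y) / lengthscale) ** 2)

def midpoints(lo, hi, n):
    """Midpoints of n equal-width cells covering [lo, hi]."""
    step = (hi - lo) / n
    return [lo + (i + 0.5) * step for i in range(n)]

def aggregated_cov(interval_a, interval_b, n=50, lengthscale=1.0):
    """Covariance between the averages of the latent process over two intervals,
    approximated by averaging the kernel over a midpoint grid (double integral
    divided by the product of the interval lengths)."""
    xs = midpoints(*interval_a, n)
    ys = midpoints(*interval_b, n)
    total = sum(rbf(x, y, lengthscale) for x in xs for y in ys)
    return total / (n * n)

# Hypothetical supports: one task observed at a fine scale, another at a coarse scale.
fine = (0.0, 1.0)
coarse = (0.0, 4.0)

var_fine = aggregated_cov(fine, fine)
var_coarse = aggregated_cov(coarse, coarse)
cross = aggregated_cov(fine, coarse)
print(var_fine, var_coarse, cross)
```

Averaging over a wider interval smooths the process more, so the coarse-scale variance comes out smaller than the fine-scale one. The model described in the abstract additionally combines several latent processes linearly and allows per-task likelihoods, which this sketch leaves out.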

artificial intelligence, inductive learning, machine learning, (17 more...)

Country:

North America > Canada (0.14)
Africa > Uganda (0.14)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.46)

Optimizing Vital Sign Monitoring in Resource-Constrained Maternal Care: An RL-Based Restless Bandit Approach

Boehmer, Niclas, Zhao, Yunfan, Xiong, Guojun, Rodriguez-Diaz, Paula, Cibrian, Paola Del Cueto, Ngonzi, Joseph, Boatin, Adeline, Tambe, Milind

arXiv.org Artificial Intelligence, Oct-10-2024

Maternal mortality remains a significant global public health challenge. One promising approach to reducing maternal deaths occurring during facility-based childbirth is through early warning systems, which require the consistent monitoring of mothers' vital signs after giving birth. Wireless vital sign monitoring devices offer a labor-efficient solution for continuous monitoring, but their scarcity raises the critical question of how to allocate them most effectively. We devise an allocation algorithm for this problem by modeling it as a variant of the popular Restless Multi-Armed Bandit (RMAB) paradigm. In doing so, we identify and address novel, previously unstudied constraints unique to this domain, which render previous approaches for RMABs unsuitable and significantly increase the complexity of the learning and planning problem. To overcome these challenges, we adopt the popular Proximal Policy Optimization (PPO) algorithm from reinforcement learning to learn an allocation policy by training a policy and value function network. We demonstrate in simulations that our approach outperforms the best heuristic baseline by up to a factor of 4.

artificial intelligence, data mining, machine learning, (19 more...)

arXiv: 2410.08377

Country:

Africa > Uganda (0.17)
North America > United States (0.14)

Genre: Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Obstetrics/Gynecology (1.00)
Health & Medicine > Public Health > Maternal Health (1.00)
Health & Medicine > Diagnostic Medicine > Vital Signs (0.98)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.48)

Lions' record-breaking swim across channel captured by drone camera

New Scientist, Jul-10-2024, 20:00:18 GMT

A pair of lion brothers has made the longest swim ever recorded for their species – about 1.5 kilometres across hippo- and crocodile-infested waters. The massive swim – equivalent to the aquatic leg of an Olympic triathlon – was the pair's fourth attempt to cross the Kazinga Channel in Queen Elizabeth National Park, Uganda, and was recorded by a drone-mounted thermal camera at night. The lions had to abort earlier attempts after encountering large animals, most likely hippos or Nile crocodiles, which are also visible in the footage. Making the effort even more extraordinary, one of the lions, named Jacob, has only three legs. Jacob has had an extremely challenging life, says Alexander Braczkowski at Griffith University in Australia: he has been gored by a buffalo, his family was poisoned for the lion body-part trade, he was caught in a poacher's snare and he eventually lost his leg after it was stuck in a poacher's steel trap.

artificial intelligence, braczkowski, queen elizabeth national park, (7 more...)

Country:

Africa > Uganda (0.28)
Oceania > Australia (0.26)

Industry: Media > Photography (0.40)

Technology: Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (0.40)

Decoding moral judgement from text: a pilot study

Gherman, Diana E., Zander, Thorsten O.

arXiv.org Artificial Intelligence, May-28-2024

Moral judgement is a complex human reaction that engages cognitive and emotional dimensions. While some neural correlates of morality are known, it is currently unclear whether moral violation can be detected at a single-trial level. In this pilot study, we explore the feasibility of decoding moral judgement from text stimuli with passive brain-computer interfaces. For effective moral judgement elicitation, we use video-audio affective priming prior to text stimuli presentation and attribute the text to moral agents. Our results show that further efforts are necessary to achieve reliable classification between moral congruency and incongruency states. We obtain good accuracy for neutral vs. morally-charged trials. With this research, we try to pave the way towards neuroadaptive human-computer interaction and more human-compatible large language models (LLMs).

classification, large language model, machine learning, (20 more...)

arXiv: 2407.00039

Country:

Africa > Uganda (0.30)
North America > United States (0.28)

Genre: Research Report > New Finding (0.69)

Industry:

Law (0.69)
Health & Medicine > Health Care Technology (0.68)
Health & Medicine > Therapeutic Area > Neurology (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.88)

Building a Luganda Text-to-Speech Model From Crowdsourced Data

Kagumire, Sulaiman, Katumba, Andrew, Nakatumba-Nabende, Joyce, Quinn, John

arXiv.org Artificial Intelligence, May-16-2024

Text-to-speech (TTS) development for African languages such as Luganda is still limited, primarily due to the scarcity of high-quality, single-speaker recordings essential for training TTS models. Prior work has focused on utilizing the Luganda Common Voice recordings of multiple speakers aged 20 to 49. Although the generated speech is intelligible, it is still of lower quality than that of a model trained on studio-grade recordings. This is due to the insufficient data preprocessing methods applied to improve the quality of the Common Voice recordings. Furthermore, speech convergence is more difficult to achieve due to varying intonations, as well as background noise. In this paper, we show that the quality of Luganda TTS from Common Voice can improve by training on multiple speakers of close intonation in addition to further preprocessing of the training data. Specifically, we selected six female speakers with close intonation, determined by subjectively listening to and comparing their voice recordings. In addition to trimming out silent portions from the beginning and end of the recordings, we applied a pre-trained speech enhancement model to reduce background noise and enhance audio quality. We also utilized a pre-trained, non-intrusive, self-supervised Mean Opinion Score (MOS) estimation model to keep only recordings with an estimated MOS over 3.5, indicating high perceived quality. Subjective MOS evaluations from nine native Luganda speakers demonstrate that our TTS model achieves a significantly better MOS of 3.55 compared to the reported 2.5 MOS of the existing model. Moreover, for a fair comparison, our model trained on six speakers outperforms models trained on a single speaker (3.13 MOS) or two speakers (3.22 MOS). This showcases the effectiveness of compensating for the lack of data from one speaker with data from multiple speakers of close intonation to improve TTS quality.
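
The MOS-based filtering step can be sketched as a simple threshold on precomputed scores. This sketch assumes the estimated MOS values have already been produced by some estimation model, which is not shown; the file names, scores and the function name `keep_high_quality` are hypothetical, and only the 3.5 threshold comes from the abstract.

```python
def keep_high_quality(recordings, threshold=3.5):
    """Keep recordings whose estimated MOS is strictly above the threshold."""
    return [rec for rec in recordings if rec["estimated_mos"] > threshold]

# Hypothetical file names and scores, for illustration only.
recordings = [
    {"path": "clip_001.wav", "estimated_mos": 3.9},
    {"path": "clip_002.wav", "estimated_mos": 3.5},
    {"path": "clip_003.wav", "estimated_mos": 2.8},
    {"path": "clip_004.wav", "estimated_mos": 4.2},
]

kept = keep_high_quality(recordings)
for rec in kept:
    print(rec["path"], rec["estimated_mos"])
```

The comparison is strict because the abstract says "over 3.5"; whether a clip scoring exactly 3.5 was kept in the original work is not stated.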

artificial intelligence, machine learning, multiple speaker, (15 more...)

arXiv: 2405.10211

Country: Africa > Uganda (0.15)

Genre: Research Report (0.65)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Synthesis (0.73)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)
Information Technology > Artificial Intelligence > Vision > Optical Character Recognition (0.62)

Luganda Speech Intent Recognition for IoT Applications

Katumba, Andrew, Murindanyi, Sudi, Kasule, John Trevor, Mugume, Elvis

arXiv.org Artificial Intelligence, May-16-2024

The advent of Internet of Things (IoT) technology has generated massive interest in voice-controlled smart homes. While many voice-controlled smart home systems are designed to understand and support widely spoken languages like English, speakers of low-resource languages like Luganda may need more support. This research project aimed to develop a Luganda speech intent classification system for IoT applications to integrate local languages into smart home environments. The project uses hardware components such as Raspberry Pi, Wio Terminal, and ESP32 nodes as microcontrollers. The Raspberry Pi processes Luganda voice commands, the Wio Terminal serves as a display device, and the ESP32 nodes control the IoT devices. The ultimate objective of this work was to enable voice control using Luganda, which was accomplished through a natural language processing (NLP) model deployed on the Raspberry Pi. The NLP model utilized Mel Frequency Cepstral Coefficients (MFCCs) as acoustic features and a Convolutional Neural Network (Conv2D) architecture for speech intent classification. A dataset of Luganda voice commands was curated for this purpose and has been released as open source. This work addresses the localization challenges and linguistic diversity in IoT applications by incorporating Luganda voice commands, enabling users to interact with smart home devices without English proficiency, especially in regions where local languages are predominant.

machine learning, natural language, raspberry pi, (20 more...)

arXiv: 2405.19343

Country: Africa > Uganda (0.15)

Genre: Research Report (0.50)

Industry: Information Technology > Smart Houses & Appliances (1.00)

Technology:

Information Technology > Internet of Things (1.00)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(2 more...)

A batch of updates from Meta puts AI front and center when you use its apps

Mashable, Apr-18-2024, 16:33:52 GMT

Meta has a batch of generative AI updates that users will notice on Facebook, Instagram, Messenger, and WhatsApp. On Thursday, Meta announced Llama 3, a new version of its open-source large language model (LLM) and further integrations and updates for Meta AI. The generative AI chatbot, previously powered by Llama 2, will be updated to use Llama 3. Meta AI also has expanded to 12 new countries (Australia, Canada, Ghana, Jamaica, Malawi, New Zealand, Nigeria, Pakistan, Singapore, South Africa, Uganda, Zambia and Zimbabwe) although still only in English for now. Meta unveiled Meta AI at its Meta Connect event last September. Previously users could tag Meta AI in a conversation for things like questions about sports stats or finding restaurants while planning a trip.

large language model, machine learning, natural language, (17 more...)

Country:

Oceania > New Zealand (0.25)
Oceania > Australia (0.25)
North America > Jamaica (0.25)
(10 more...)

Genre: Press Release (0.31)

Industry:

Information Technology (0.52)
Media (0.32)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.62)

Enhanced Labeling Technique for Reddit Text and Fine-Tuned Longformer Models for Classifying Depression Severity in English and Luganda

Kimera, Richard, Rim, Daniela N., Kirabira, Joseph, Udomah, Ubong Godwin, Choi, Heeyoul

arXiv.org Artificial Intelligence, Jan-25-2024

Depression is a global burden and one of the most challenging mental health conditions to control. Experts can detect its severity early using the Beck Depression Inventory (BDI) questionnaire, administer appropriate medication to patients, and impede its progression. Due to the fear of potential stigmatization, many patients turn to social media platforms like Reddit for advice and assistance at various stages of their journey. This research extracts text from Reddit to facilitate the diagnostic process. It employs a proposed labeling approach to categorize the text and subsequently fine-tunes the Longformer model. The model's performance is compared against baseline models, including Naive Bayes, Random Forest, Support Vector Machines, and Gradient Boosting. Our findings reveal that the Longformer model outperforms the baseline models in both English (48%) and Luganda (45%) on a custom-made dataset.

artificial intelligence, depression, machine learning, (12 more...)

doi: 10.1109/ICTC58733.2023.10393433

arXiv: 2401.1424

Country:

Asia > South Korea (0.15)
Africa > Uganda (0.15)

Genre: Research Report (0.84)

Industry:

Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)
Health & Medicine > Consumer Health (1.00)
Media > News (0.94)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.55)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.54)
