Tanzania
Interview with Erica Kimei: Using ML for studying greenhouse gas emissions from livestock
Greenhouse gas emissions are a key driver of climate change. We asked Erica about her work, and her experience at the AfriClimate AI workshop at the Deep Learning Indaba, where her research won an award. I am Erica Kimei, a PhD candidate at the Nelson Mandela African Institution of Science and Technology in Tanzania (NM-AIST), and an assistant lecturer at the National Institute of Transport. My research focuses on leveraging machine learning and remote sensing technology to monitor and forecast greenhouse gas emissions from ruminant livestock. This work aims to contribute to sustainable agricultural practices by enabling better management of emissions and addressing the climate impacts of livestock farming.
Optimized Quality of Service prediction in FSO Links over South Africa using Ensemble Learning
Adebusola, S. O., Owolawi, P. A., Ojo, J. S., Maswikaneng, P. S.
Fibre optic communication systems are expected to grow exponentially in application owing to their numerous advantages over copper wires. The evolution of optical networks brings several advantages, including long-distance transmission, low power requirements, higher carrying capacity, and high bandwidth, among others. Such network bandwidth surpasses that of transmission methods such as copper cables and microwaves. Despite these benefits, free-space optical communications are severely impacted by harsh weather situations like mist, precipitation, blizzard, fume, soil, and drizzle debris in the atmosphere, all of which have an impact on the Quality of Service (QoS) rendered by the systems. The primary goal of this article is to optimize the QoS using the ensemble learning models Random Forest, AdaBoost Regression, Stacking Regression, Gradient Boost Regression, and Multilayer Neural Network. To accomplish the stated goal, meteorological data, visibility, wind speed, and altitude were obtained from the South Africa Weather Services archive over a ten-year period (2010 to 2019) at four different locations: Polokwane, Kimberley, Bloemfontein, and George. We estimated the data rate, power received, fog-induced attenuation, bit error rate, and power penalty using the collected and processed data. The RMSE and R-squared values of the model across all the study locations, Polokwane, Kimberley, Bloemfontein, and George, are 0.0073 and 0.9951, 0.0065 and 0.9998, 0.0060 and 0.9941, and 0.0032 and 0.9906, respectively. The results showed that using ensemble learning techniques in transmission modeling can significantly enhance service quality and meet customer service level agreements. The ensemble method was successful in efficiently optimizing the signal-to-noise ratio, which in turn enhanced the QoS at the point of reception.
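The abstract reports model quality as RMSE and R-squared pairs. As a reminder of what those two numbers measure, here is a minimal sketch in plain Python. The function names and the sample values are illustrative assumptions and are not taken from the study.

```python
import math

def rmse(y_true, y_pred):
    """Root-mean-square error between observed and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Made-up values for illustration only; not data from the study.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
print(rmse(observed, predicted), r_squared(observed, predicted))
```

A lower RMSE means predictions sit closer to the observations, while an R-squared near 1 means the model explains most of the variance in the target.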
Active Exploration for Real-Time Haptic Training
Ketchum, Jake, Prabhakar, Ahalya, Murphey, Todd D.
Tactile perception is important for robotic systems that interact with the world through touch. Touch is an active sense in which tactile measurements depend on the contact properties of an interaction--e.g., velocity, force, acceleration--as well as properties of the sensor and object under test. These dependencies make training tactile perceptual models challenging. Additionally, the effects of limited sensor life and the near-field nature of tactile sensors preclude the practical collection of exhaustive data sets even for fairly simple objects. Active learning provides a mechanism for focusing on only the most informative aspects of an object during data collection. Here we employ an active learning approach that uses a data-driven model's entropy as an uncertainty measure and explore relative to that entropy conditioned on the sensor state variables. Using a coverage-based ergodic controller, we train perceptual models in near-real time. We demonstrate our approach using a biomimetic sensor, exploring "tactile scenes" composed of shapes, textures, and objects. Each learned representation provides a perceptual sensor model for a particular tactile scene. Models trained on actively collected data outperform their randomly collected counterparts in real-time training tests. Additionally, we find that the resulting network entropy maps can be used to identify high salience portions of a tactile scene.
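To make the idea of entropy as an uncertainty measure concrete, the sketch below ranks candidate sensor locations by the Shannon entropy of a model's predicted class distribution. It is only an illustration of the uncertainty signal: the location names and probabilities are hypothetical, and the paper's ergodic controller and conditioning on sensor state are not modeled here.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def most_uncertain(predictions):
    """Return the key whose predicted distribution has the highest entropy."""
    return max(predictions, key=lambda k: entropy(predictions[k]))

# Hypothetical class probabilities at three sensor locations.
predictions = {
    "location_a": [0.98, 0.01, 0.01],
    "location_b": [0.34, 0.33, 0.33],
    "location_c": [0.70, 0.20, 0.10],
}
print(most_uncertain(predictions))
```

In an active learning loop, the location with the highest entropy is where a new measurement would be expected to be most informative.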
Machine Learning-based NLP for Emotion Classification on a Cholera X Dataset
Recent social media posts on the cholera outbreak in Hammanskraal have highlighted the diverse range of emotions people experienced in response to such an event. The extent of people's opinions varies greatly depending on their level of knowledge and information about the disease. The documented research about Cholera lacks investigations into the classification of emotions. This study aims to examine the emotions expressed in social media posts about Cholera. A dataset of 23,000 posts was extracted and pre-processed. The Python Natural Language Toolkit (NLTK) sentiment analyzer library was applied to determine the emotional significance of each text. Additionally, Machine Learning (ML) models were applied for emotion classification, including Long short-term memory (LSTM), Logistic regression, Decision trees, and the Bidirectional Encoder Representations from Transformers (BERT) model. The results of this study demonstrated that LSTM achieved the highest accuracy of 75%. Emotion classification presents a promising tool for gaining a deeper understanding of the impact of Cholera on society. The findings of this study might contribute to the development of effective interventions in public health strategies.
Impacts of Color and Texture Distortions on Earth Observation Data in Deep Learning
Willbo, Martin, Pirinen, Aleksis, Martinsson, John, Zec, Edvin Listo, Mogren, Olof, Nilsson, Mikael
Land cover classification and change detection are two important applications of remote sensing and Earth observation (EO) that have benefited greatly from the advances in deep learning. Convolutional and transformer-based U-net models are the state-of-the-art architectures for these tasks, and their performances have been boosted by an increased availability of large-scale annotated EO datasets. However, the influence of different visual characteristics of the input EO data on a model's predictions is not well understood. In this work we systematically examine model sensitivities with respect to several color- and texture-based distortions on the input EO data during inference, given models that have been trained without such distortions. We conduct experiments with multiple state-of-the-art segmentation networks for land cover classification and show that they are in general more sensitive to texture than to color distortions. Beyond revealing intriguing characteristics of widely used land cover classification models, our results can also be used to guide the development of more robust models within the EO domain. Land cover classification is a key application for remote sensing and Earth observation (EO) data, as it provides essential information for various domains, such as urban planning, environmental monitoring, disaster management, and agriculture.
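The abstract does not list the specific distortions used, so the following is only a sketch of what one color distortion and one texture distortion could look like at inference time. Grayscale conversion and a 3x3 mean filter are assumptions chosen for illustration, as are the function names; images are represented as nested Python lists to keep the example self-contained.

```python
def to_grayscale(image):
    """Color distortion: replace each RGB pixel by its luminance (BT.601 weights)."""
    out = []
    for row in image:
        new_row = []
        for r, g, b in row:
            y = 0.299 * r + 0.587 * g + 0.114 * b
            new_row.append((y, y, y))
        out.append(new_row)
    return out

def box_blur(channel):
    """Texture distortion: 3x3 mean filter over in-bounds neighbors of a 2D list."""
    h, w = len(channel), len(channel[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            vals = [channel[a][b]
                    for a in range(max(0, i - 1), min(h, i + 2))
                    for b in range(max(0, j - 1), min(w, j + 2))]
            out[i][j] = sum(vals) / len(vals)
    return out

# A single bright pixel gets smeared across its neighborhood by the blur.
spike = [[0, 0, 0], [0, 9, 0], [0, 0, 0]]
print(box_blur(spike)[1][1])
```

A sensitivity study in this spirit would feed the distorted inputs to a model trained on undistorted data and compare its predictions against those on the original inputs.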
Introducing Syllable Tokenization for Low-resource Languages: A Case Study with Swahili
Atuhurra, Jesse, Shindo, Hiroyuki, Kamigaito, Hidetaka, Watanabe, Taro
Many attempts have been made in multilingual NLP to ensure that pre-trained language models, such as mBERT or GPT2, get better and become applicable to low-resource languages. To achieve multilingualism for pre-trained language models (PLMs), we need techniques to create word embeddings that capture the linguistic characteristics of any language. Tokenization is one such technique because it allows for the words to be split based on characters or subwords, creating word embeddings that best represent the structure of the language. Creating such word embeddings is essential to applying PLMs to languages on which the model was not trained, enabling multilingual NLP. However, most PLMs use generic tokenization methods like BPE, WordPiece, or unigram, which may not suit specific languages. We hypothesize that tokenization based on syllables within the input text, which we call syllable tokenization, should facilitate the development of syllable-aware language models. The syllable-aware language models make it possible to apply PLMs to languages that are rich in syllables, for instance, Swahili. Previous works introduced subword tokenization. Our work extends such efforts. Notably, we propose a syllable tokenizer and adopt an experiment-centric approach to validate the proposed tokenizer based on the Swahili language. We conducted text-generation experiments with GPT2 to evaluate the effectiveness of the syllable tokenizer. Our results show that the proposed syllable tokenizer generates syllable embeddings that effectively represent the Swahili language.
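To illustrate what splitting text into syllables might look like, here is a deliberately naive sketch that closes a syllable at every vowel, which matches the predominantly open syllables of Swahili. This is not the tokenizer proposed in the paper: the rule, the function names, and the example sentence are assumptions, and the sketch ignores syllabic nasals (such as the "m" in "mtu"), punctuation, and loanword patterns.

```python
VOWELS = set("aeiou")

def syllabify(word):
    """Split a word into open syllables, closing each one at a vowel."""
    syllables, current = [], ""
    for ch in word.lower():
        current += ch
        if ch in VOWELS:
            syllables.append(current)
            current = ""
    if current:  # trailing consonants: attach to the last syllable if any
        if syllables:
            syllables[-1] += current
        else:
            syllables.append(current)
    return syllables

def syllable_tokenize(text):
    """Tokenize whitespace-separated text into a flat list of syllables."""
    return [s for word in text.split() for s in syllabify(word)]

print(syllable_tokenize("habari ya asubuhi"))
# ['ha', 'ba', 'ri', 'ya', 'a', 'su', 'bu', 'hi']
```

A real tokenizer would also need a vocabulary that maps syllables to integer IDs and a way to mark word boundaries so the original text can be reconstructed.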
Bootstrapping Rare Object Detection in High-Resolution Satellite Imagery
Zaytar, Akram, Robinson, Caleb, Hacheme, Gilles Q., Tadesse, Girmaw A., Dodhia, Rahul, Ferres, Juan M. Lavista, Hughey, Lacey F., Stabach, Jared A., Amoke, Irene
Rare object detection is a fundamental task in applied geospatial machine learning, but it is often challenging due to large amounts of high-resolution satellite or aerial imagery and few or no labeled positive samples to start with. This paper addresses the problem of bootstrapping such a rare object detection task assuming there is no labeled data and no spatial prior over the area of interest. We propose novel offline and online cluster-based approaches for sampling patches that are significantly more efficient, in terms of exposing positive samples to a human annotator, than random sampling. We apply our methods for identifying bomas, or small enclosures for herd animals, in the Serengeti Mara region of Kenya and Tanzania. We demonstrate a significant enhancement in detection efficiency, achieving a positive sampling rate increase from 2% (random) to 30%. This advancement enables effective machine learning mapping even with minimal labeling budgets, exemplified by an F1 score on the boma detection task of 0.51 with a budget of 300 total patches.
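A quick back-of-the-envelope calculation shows why the sampling rate matters under a small labeling budget. The sketch below assumes, purely for illustration, that each reported positive rate applies uniformly across a 300-patch budget; the abstract does not state that the rates and the budget were measured together, and the function name is hypothetical.

```python
def expected_positives(budget, positive_rate):
    """Expected number of positive patches among `budget` labeled patches."""
    return budget * positive_rate

budget = 300  # labeling budget mentioned in the abstract
random_hits = expected_positives(budget, 0.02)  # random sampling rate
guided_hits = expected_positives(budget, 0.30)  # cluster-based sampling rate
print(round(random_hits), round(guided_hits))
```

Under that assumption, the annotator would see roughly 6 positive patches with random sampling versus roughly 90 with the higher rate, a fifteenfold difference in training signal for the same labeling effort.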
Machine Teaching for Building Modular AI Agents based on Zero-shot Learners
The recent advances in large language models (LLMs) have led to the creation of many modular AI agents. These agents employ LLMs as zero-shot learners to perform sub-tasks in order to solve complex tasks set forth by human users. We propose an approach to enhance the robustness and performance of modular AI agents that utilize LLMs as zero-shot learners. Our iterative machine teaching method offers an efficient way to teach AI agents over time with limited human feedback, addressing the limit posed by the quality of zero-shot learning. We advocate leveraging the data traces from initial deployments and outputs or annotations from the zero-shot learners to train smaller and task-specific substitute models which can reduce both the monetary costs and environmental impact. Our machine teaching process avails human expertise to correct examples with a high likelihood of misannotations. Results on three tasks, common to conversational AI agents, show that close-to-oracle performance can be achieved with supervision on 20-70% of the dataset depending upon the complexity of the task and performance of zero-shot learners.
Journey to the Center of the Knowledge Neurons: Discoveries of Language-Independent Knowledge Neurons and Degenerate Knowledge Neurons
Chen, Yuheng, Cao, Pengfei, Chen, Yubo, Liu, Kang, Zhao, Jun
Pre-trained language models (PLMs) contain vast amounts of factual knowledge, but how the knowledge is stored in the parameters remains unclear. This paper delves into the complex task of understanding how factual knowledge is stored in multilingual PLMs, and introduces the Architecture-adapted Multilingual Integrated Gradients method, which successfully localizes knowledge neurons more precisely compared to current methods, and is more universal across various architectures and languages. Moreover, we conduct an in-depth exploration of knowledge neurons, leading to the following two important discoveries: (1) The discovery of Language-Independent Knowledge Neurons, which store factual knowledge in a form that transcends language. We design cross-lingual knowledge editing experiments, demonstrating that the PLMs can accomplish this task based on language-independent neurons; (2) The discovery of Degenerate Knowledge Neurons, a novel type of neuron showing that different knowledge neurons can store the same fact. Its property of functional overlap endows the PLMs with a robust mastery of factual knowledge. We design fact-checking experiments, proving that the degenerate knowledge neurons can help the PLMs to detect wrong facts. Experiments corroborate these findings, shedding light on the mechanisms of factual knowledge storage in multilingual PLMs, and contribute valuable insights to the field. The code is available at https://github.com/heng840/AMIG.
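The method named in the abstract builds on integrated gradients. As background, the sketch below approximates plain integrated gradients for a toy function with a midpoint Riemann sum. It does not implement the paper's architecture-adapted multilingual variant; the toy function, its gradient, and all names are assumptions for illustration.

```python
def integrated_gradients(grad_fn, x, baseline, steps=50):
    """Approximate integrated gradients with a midpoint Riemann sum."""
    dims = len(x)
    totals = [0.0] * dims
    for k in range(steps):
        alpha = (k + 0.5) / steps  # midpoint of the k-th interval in [0, 1]
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        grads = grad_fn(point)
        for i in range(dims):
            totals[i] += grads[i]
    # Scale the averaged gradient by the input-minus-baseline difference.
    return [(xi - b) * t / steps for xi, b, t in zip(x, baseline, totals)]

# Toy differentiable function F(x) = x0^2 + 3*x1 and its analytic gradient.
def f(x):
    return x[0] ** 2 + 3.0 * x[1]

def grad_f(x):
    return [2.0 * x[0], 3.0]

x, baseline = [2.0, 1.0], [0.0, 0.0]
attributions = integrated_gradients(grad_f, x, baseline)
print(attributions, f(x) - f(baseline))
```

The attributions sum to F(x) minus F(baseline), the completeness property that makes integrated gradients useful for assigning credit to individual inputs or neurons.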