 Centre Region


From shrimp Jesus to erotic tractors: how viral AI slop took over the internet

The Guardian

Clockwise from top left: Shrimp Jesus, Nayib Bukele, Justin Bieber and Super Cat League. In the algorithm-driven economy of 2025, one man's shrimp Jesus is another man's side hustle. AI slop - the low-quality, surreal content flooding social media platforms, designed to farm views - is a phenomenon, some would say the phenomenon of the 2024 and 2025 internet. Merriam-Webster's word of the year this year is "slop", referring exclusively to the internet variety.


LLM-TOPLA: Efficient LLM Ensemble by Maximising Diversity

arXiv.org Artificial Intelligence

Combining large language models during training or at inference time has shown substantial performance gain over component LLMs. This paper presents LLM-TOPLA, a diversity-optimized LLM ensemble method with three unique properties: (i) We introduce the focal diversity metric to capture the diversity-performance correlation among component LLMs of an ensemble. (ii) We develop a diversity-optimized ensemble pruning algorithm to select the top-k sub-ensembles from a pool of N base LLMs. Our pruning method recommends top-performing LLM sub-ensembles of size S, often much smaller than N. (iii) We generate new output for each prompt query by utilizing a learn-to-ensemble approach, which learns to detect and resolve the output inconsistency among all component LLMs of an ensemble. Extensive evaluation on four different benchmarks shows good performance gain over the best LLM ensemble methods: (i) In constrained solution set problems, LLM-TOPLA outperforms the best-performing ensemble (Mixtral) by 2.2% in accuracy on MMLU and the best-performing LLM ensemble (MoreAgent) on GSM8k by 2.1%. (ii) In generative tasks, LLM-TOPLA outperforms the top-2 performers (Llama70b/Mixtral) on SearchQA by 3.9x in F1, and on XSum by more than 38 in ROUGE-1. Our code and dataset, which contains outputs of 8 modern LLMs on 4 benchmarks, are available at https://github.com/git-disl/llm-topla


5G NR PRACH Detection with Convolutional Neural Networks (CNN): Overcoming Cell Interference Challenges

arXiv.org Artificial Intelligence

In this paper, we present a novel approach to interference detection in 5G New Radio (5G-NR) networks using Convolutional Neural Networks (CNN). Interference in 5G networks challenges high-quality service due to dense user equipment deployment and increased wireless environment complexity. Our CNN-based model is designed to detect Physical Random Access Channel (PRACH) sequences amidst various interference scenarios, leveraging the spatial and temporal characteristics of PRACH signals to enhance detection accuracy and robustness. Comprehensive datasets of simulated PRACH signals under controlled interference conditions were generated to train and validate the model. Experimental results show that our CNN-based approach outperforms traditional PRACH detection methods in accuracy, precision, recall and F1-score. This study demonstrates the potential of AI/ML techniques in advancing interference management in 5G networks, providing a foundation for future research and practical applications in optimizing network performance and reliability.


A new approach for predicting the Quality of Experience in multimedia services using machine learning

arXiv.org Artificial Intelligence

In today's world, the Internet is recognized as one of the essentials of human life, playing a significant role in communications, business, and lifestyle. The quality of internet services can have widespread negative impacts on individual and social levels. Consequently, Quality of Service (QoS) has become a fundamental necessity for service providers in a competitive market aiming to offer superior services. The success and survival of these providers depend on their ability to maintain high service quality and ensure satisfaction. Alongside QoS, the concept of Quality of Experience (QoE) has emerged with the development of telephony networks. QoE focuses on the user's satisfaction with the service, helping operators adjust their services to meet user expectations. Recent research shows a trend towards utilizing machine learning and deep learning techniques to predict QoE. Researchers aim to develop accurate models by leveraging large volumes of data from network and user interactions, considering various real-world scenarios. Despite the complexity of network environments, this research provides a practical framework for improving and evaluating QoE. This study presents a comprehensive framework for evaluating QoE in multimedia services, adhering to the ITU-T P.1203 standard which includes automated data collection processes and uses machine learning algorithms to predict user satisfaction based on key network parameters. By collecting over 20,000 data records from different network conditions and users, the Random Forest model achieved a prediction accuracy of 95.8% for user satisfaction. This approach allows operators to dynamically allocate network resources in real-time, maintaining high levels of customer satisfaction with minimal costs.


Evaluation of Geographical Distortions in Language Models: A Crucial Step Towards Equitable Representations

arXiv.org Artificial Intelligence

Language models now constitute essential tools for improving efficiency for many professional tasks such as writing, coding, or learning. For this reason, it is imperative to identify inherent biases. In the field of Natural Language Processing, five sources of bias are well-identified: data, annotation, representation, models, and research design. This study focuses on biases related to geographical knowledge. We explore the connection between geography and language models by highlighting their tendency to misrepresent spatial information, thus leading to distortions in the representation of geographical distances. This study introduces four indicators to assess these distortions, by comparing geographical and semantic distances. Experiments are conducted using these four indicators with ten widely used language models. Results underscore the critical necessity of inspecting and rectifying spatial biases in language models to ensure accurate and equitable representations.
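The abstract does not define its four indicators, so the following is only a minimal sketch of the general idea of comparing geographical and semantic distances, not the paper's method. It assumes great-circle (haversine) distance for geography, cosine distance for semantics, and a Pearson correlation as the comparison. The embedding vectors are made-up toy values, and every function name is hypothetical.

```python
import math
from itertools import combinations

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = math.radians(a[0]), math.radians(a[1])
    lat2, lon2 = math.radians(b[0]), math.radians(b[1])
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def cosine_distance(u, v):
    """1 - cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return 1.0 - dot / (norm_u * norm_v)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# City coordinates (lat, lon) in degrees.
CITIES = {
    "Paris": (48.8566, 2.3522),
    "London": (51.5074, -0.1278),
    "Tokyo": (35.6762, 139.6503),
}

# Toy embeddings, invented for illustration; a real study would take
# these from a language model.
EMBEDDINGS = {
    "Paris": [1.0, 0.2, 0.0],
    "London": [0.9, 0.3, 0.1],
    "Tokyo": [0.1, 0.9, 0.4],
}


def distance_agreement(cities, embeddings):
    """Correlate geographic and semantic distances over all city pairs."""
    pairs = list(combinations(sorted(cities), 2))
    geo = [haversine_km(cities[a], cities[b]) for a, b in pairs]
    sem = [cosine_distance(embeddings[a], embeddings[b]) for a, b in pairs]
    return pearson(geo, sem)


if __name__ == "__main__":
    print(f"agreement: {distance_agreement(CITIES, EMBEDDINGS):.3f}")
```

Under these assumptions, a correlation near 1 would mean semantic distances track geographic ones, while a low or negative value would hint at the kind of distortion the abstract describes.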


A novel interface for adversarial trivia question-writing

arXiv.org Artificial Intelligence

A critical component when developing question-answering AIs is an adversarial dataset that challenges models to adapt to the complex syntax and reasoning underlying our natural language. Present techniques for procedurally generating adversarial texts are not robust enough for training on complex tasks such as answering multi-sentence trivia questions. We instead turn to human-generated data by introducing an interface for collecting adversarial human-written trivia questions. Our interface is aimed towards question writers and players of Quiz Bowl, a buzzer-based trivia competition where paragraph-long questions consist of a sequence of clues of decreasing difficulty. To incentivize usage, a suite of machine learning-based tools in our interface assist humans in writing questions that are more challenging to answer for Quiz Bowl players and computers alike. Not only does our interface gather training data for the groundbreaking Quiz Bowl AI project QANTA, but it is also a proof-of-concept of future adversarial data collection for question-answering systems. The results of performance-testing our interface with ten originally-composed questions indicate that, despite some flaws, our interface's novel question-writing features as well as its real-time exposure of useful responses from our machine models could facilitate and enhance the collection of adversarial questions. The code for our interface is available at: https://github.com/Zefan-Cai/QAML


Machine Intelligence in Africa: a survey

arXiv.org Artificial Intelligence

In the last 5 years, the availability of large audio datasets in African countries has opened unlimited opportunities to build machine intelligence (MI) technologies that are closer to the people and speak, learn, understand, and do business in local languages, including for those who cannot read and write. Unfortunately, these audio datasets are not fully exploited by current MI tools, leaving several Africans out of MI business opportunities. Additionally, many state-of-the-art MI models are not culture-aware, and the ethics of their adoption indexes are questionable. The lack thereof is a major drawback in many applications in Africa. This paper summarizes recent developments in machine intelligence in Africa from a multi-layer multiscale and culture-aware ethics perspective, showcasing MI use cases in 54 African countries through 400 articles on MI research, industry, government actions, as well as uses in art, music, the informal economy, and small businesses in Africa. The survey also opens discussions on the reliability of MI rankings and indexes in the African continent as well as algorithmic definitions of unclear terms used in MI.


New York watchdog accuses Burkina Faso of war crimes through drone strikes, citing civilian casualties

FOX News

Human Rights Watch said Thursday that Burkina Faso's security forces last year killed at least 60 civilians in three different drone strikes, which the group says may have constituted war crimes. The West African nation's government claimed the strikes targeted extremists, including jihadi fighters and rebel groups that have been operating in many remote communities. The accusations by the New York-based watchdog were the latest in a string of similar charges raised by various rights groups. "The government should urgently and impartially investigate these apparent war crimes, hold those responsible to account, and provide adequate support for the victims and their families," HRW said in a new report. A mural is seen in Ouagadougou, Burkina Faso, on March 1, 2023.


Flickr Africa: Examining Geo-Diversity in Large-Scale, Human-Centric Visual Data

arXiv.org Artificial Intelligence

Biases in large-scale image datasets are known to influence the performance of computer vision models as a function of geographic context. To investigate the limitations of standard Internet data collection methods in low- and middle-income countries, we analyze human-centric image geo-diversity on a massive scale using geotagged Flickr images associated with each nation in Africa. We report the quantity and content of available data with comparisons to population-matched nations in Europe as well as the distribution of data according to fine-grained intra-national wealth estimates. Temporal analyses are performed at two-year intervals to expose emerging data trends. Furthermore, we present findings for an "othering" phenomenon as evidenced by a substantial number of images from Africa being taken by non-local photographers. The results of our study suggest that further work is required to capture image data representative of African people and their environments and, ultimately, to improve the applicability of computer vision models in a global context.


Location-aware green energy availability forecasting for multiple time frames in smart buildings: The case of Estonia

arXiv.org Artificial Intelligence

Renewable Energies (RE) have gained more attention in recent years since they offer clean and sustainable energy. One of the major sustainable development goals (SDG-7) set by the United Nations (UN) is to achieve affordable and clean energy for everyone. Among all of the world's renewable resources, solar energy is considered the most abundant and can certainly fulfill the target of SDGs. Solar energy is converted into electrical energy through Photovoltaic (PV) panels with no greenhouse gas emissions. However, power generated by PV panels is highly dependent on solar radiation received at a particular location over a given time period. Therefore, it is challenging to forecast the amount of PV output power. Predicting the output power of PV systems is essential since several public or private institutes generate such green energy and need to maintain the balance between demand and supply. This research aims to forecast PV system output power based on weather and derived features using different machine learning models. The objective is to obtain the best-fitting model to precisely predict output power by inspecting the data. Moreover, different performance metrics are used to compare and evaluate the accuracy under different machine learning models such as random forest, XGBoost, KNN, etc.
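The abstract does not say which performance metrics the study uses. As a rough illustration of comparing forecasting models, the sketch below assumes three common regression metrics (MAE, RMSE, and R²) and ranks models by RMSE. The power readings, the model names, and the function names are all hypothetical.

```python
import math


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


def r2(actual, predicted):
    """Coefficient of determination (1.0 means a perfect fit)."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def rank_models(actual, forecasts):
    """Return model names ordered from lowest to highest RMSE."""
    return sorted(forecasts, key=lambda name: rmse(actual, forecasts[name]))


# Invented PV output readings (kW) and forecasts from two placeholder models.
ACTUAL_KW = [0.0, 1.2, 3.4, 4.1, 2.0]
FORECASTS = {
    "model_a": [0.1, 1.0, 3.5, 4.0, 2.2],
    "model_b": [0.5, 2.0, 2.5, 3.0, 3.0],
}

if __name__ == "__main__":
    for name in rank_models(ACTUAL_KW, FORECASTS):
        pred = FORECASTS[name]
        print(name, round(mae(ACTUAL_KW, pred), 3),
              round(rmse(ACTUAL_KW, pred), 3), round(r2(ACTUAL_KW, pred), 3))
```

RMSE penalizes large misses more heavily than MAE, which can matter when balancing supply and demand, so reporting both alongside R² gives a fuller picture than any single number.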