Niger Delta
Ev2R: Evaluating Evidence Retrieval in Automated Fact-Checking
Akhtar, Mubashara, Schlichtkrull, Michael, Vlachos, Andreas
Current automated fact-checking (AFC) approaches commonly evaluate evidence either implicitly via the predicted verdicts or by comparing retrieved evidence with a predefined closed knowledge source, such as Wikipedia. However, these methods suffer from limitations, resulting from their reliance on evaluation metrics developed for different purposes and constraints imposed by closed knowledge sources. Recent advances in natural language generation (NLG) evaluation offer new possibilities for evidence assessment. In this work, we introduce Ev2R, an evaluation framework for AFC that comprises three types of approaches for evidence evaluation: reference-based, proxy-reference, and reference-less. We evaluate their effectiveness through agreement with human ratings and adversarial tests, and demonstrate that prompt-based scorers, particularly those leveraging LLMs and reference evidence, outperform traditional evaluation approaches.
Long-Context LLMs Meet RAG: Overcoming Challenges for Long Inputs in RAG
Jin, Bowen, Yoon, Jinsung, Han, Jiawei, Arik, Sercan O.
Retrieval-augmented generation (RAG) empowers large language models (LLMs) to utilize external knowledge sources. The increasing capacity of LLMs to process longer input sequences opens up avenues for providing more retrieved information, potentially enhancing the quality of generated outputs. It is plausible to assume that a larger retrieval set would contain more relevant information (higher recall), which might result in improved performance. However, our empirical findings demonstrate that for many long-context LLMs, the quality of generated output initially improves but then declines as the number of retrieved passages increases. This paper investigates this phenomenon, identifying the detrimental impact of retrieved "hard negatives" as a key contributor. To mitigate this and enhance the robustness of long-context LLM-based RAG, we propose both training-free and training-based approaches. We first showcase the effectiveness of retrieval reordering as a simple yet powerful training-free optimization. Furthermore, we explore training-based methods, specifically RAG-specific implicit LLM fine-tuning and RAG-oriented fine-tuning with intermediate reasoning, demonstrating their capacity for substantial performance gains. Finally, we conduct a systematic analysis of design choices for these training-based methods, including data distribution, retriever selection, and training context length.
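The abstract above names retrieval reordering but does not spell out the ordering rule. The following is a minimal sketch under a stated assumption: that the passages arrive sorted by descending retriever score and that reordering places the highest-scored ones at the two ends of the context, leaving the weakest in the middle. The function name `reorder_for_context` is hypothetical and is not taken from the paper.

```python
def reorder_for_context(passages_by_score):
    """Place top-ranked passages at both ends of the context.

    Assumption (not stated in the abstract): input is sorted by
    descending retriever score, and models attend best to the start
    and end of a long input.
    """
    front, back = [], []
    for rank, passage in enumerate(passages_by_score):
        # Alternate: even ranks fill the front, odd ranks fill the back.
        (front if rank % 2 == 0 else back).append(passage)
    # Reverse the back half so the second-best passage ends up last.
    return front + back[::-1]


if __name__ == "__main__":
    ranked = ["p1", "p2", "p3", "p4", "p5"]  # p1 has the highest score
    print(reorder_for_context(ranked))
```

With five ranked passages, the two strongest land in the first and last positions and the weakest sits in the middle. Because this only permutes the retrieved list, it needs no model training.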
Could AI save Nigerians from devastating floods?
In the small village of Ogba-Ojibo in central Nigeria, sitting at the confluence of two of the nation's largest rivers (the Niger and Benue), 27-year-old Ako Prince Omali is counting the steps carved out of the dirt, which lead down the loam-coloured banks of the river Niger. This river bank, dotted with tufts of spiky grass, is where villagers come to fish or wash produce and laundry. Just last week, three of the steps were submerged during one night of rain, which raised the water level by about five metres. Normally, you can count seven steps down into the river. Now, only four remain above the surface of the water, the sticks bracing the muddy steps having washed away in the deluge.
NaijaHate: Evaluating Hate Speech Detection on Nigerian Twitter Using Representative Data
Tonneau, Manuel, de Castro, Pedro Vitor Quinta, Lasri, Karim, Farouq, Ibrahim, Subramanian, Lakshminarayanan, Orozco-Olvera, Victor, Fraiberger, Samuel P.
To address the global issue of online hate, hate speech detection (HSD) systems are typically developed on datasets from the United States, thereby failing to generalize to English dialects from the Majority World. Furthermore, HSD models are often evaluated on non-representative samples, raising concerns about overestimating model performance in real-world settings. In this work, we introduce NaijaHate, the first dataset annotated for HSD which contains a representative sample of Nigerian tweets. We demonstrate that HSD evaluated on biased datasets traditionally used in the literature consistently overestimates real-world performance by at least two-fold. We then propose NaijaXLM-T, a pretrained model tailored to the Nigerian Twitter context, and establish the key role played by domain-adaptive pretraining and finetuning in maximizing HSD performance. Finally, owing to the modest performance of HSD systems in real-world conditions, we find that content moderators would need to review about ten thousand Nigerian tweets flagged as hateful daily to moderate 60% of all hateful content, highlighting the challenges of moderating hate speech at scale as social media usage continues to grow globally. Taken together, these results pave the way towards robust HSD systems and a better protection of social media users from hateful content in low-resource settings.
Which Nigerian-Pidgin does Generative AI speak?: Issues about Representativeness and Bias for Multilingual and Low Resource Languages
Adelani, David Ifeoluwa, Doğruöz, A. Seza, Shode, Iyanuoluwa, Aremu, Anuoluwapo
Naija is the Nigerian-Pidgin spoken by approx. 120M speakers in Nigeria and it is a mixed language (e.g., English, Portuguese and Indigenous languages). Although it has mainly been a spoken language until recently, there are currently two written genres (BBC and Wikipedia) in Naija. Through statistical analyses and Machine Translation experiments, we prove that these two genres do not represent each other (i.e., there are linguistic differences in word order and vocabulary) and Generative AI operates only based on Naija written in the BBC genre. In other words, Naija written in Wikipedia genre is not represented in Generative AI.
Time Series Forecasting of HIV/AIDS in the Philippines Using Deep Learning: Does COVID-19 Epidemic Matter?
Aribe, Sales G. Jr., Gerardo, Bobby D., Medina, Ruji P.
With a 676% growth rate in HIV incidence between 2010 and 2021, the HIV/AIDS epidemic in the Philippines is the fastest-spreading in the Western Pacific. Although the full effects of COVID-19 on HIV services and development are still unknown, it is predicted that such disruptions could lead to a significant increase in HIV casualties. Therefore, the nation needs modeling and forecasting techniques to foresee the spread pattern and enhance the government's prevention, treatment, testing, and care program. In this study, the researcher uses a Multilayer Perceptron Neural Network to forecast time series during the period when the COVID-19 pandemic strikes the nation, using statistics taken from the HIV/AIDS and ART Registry of the Philippines. After training, validation, and testing of data, the study finds that the predicted cumulative cases in the nation by 2030 will reach 145,273. Additionally, there is very little difference between observed and anticipated HIV epidemic levels, as evidenced by reduced RMSE, MAE, and MAPE values as well as a greater coefficient of determination. Further research revealed that the Philippines seems far from achieving Sustainable Development Goal 3 of Project 2030 due to an increase in the nation's rate of new HIV infections. Despite the detrimental effects of COVID-19 spread on HIV/AIDS efforts nationwide, the Philippine government, under the Marcos administration, must continue to adhere to the United Nations' 90-90-90 targets by enhancing its ART program and ensuring that all vital health services are readily accessible and available.
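The four fit measures named in this abstract (RMSE, MAE, MAPE, and the coefficient of determination) have standard definitions, sketched below. The function name `forecast_errors` and the sample numbers are illustrative assumptions; they are not data or code from the study.

```python
import math


def forecast_errors(observed, predicted):
    """Return RMSE, MAE, MAPE (in percent) and R^2 for two sequences.

    MAPE divides by each observed value, so observed values must be
    nonzero; R^2 requires the observed values not to be all equal.
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(observed)
    residuals = [o - p for o, p in zip(observed, predicted)]
    ss_res = sum(r * r for r in residuals)
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return {
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(r) for r in residuals) / n,
        "mape": 100.0 * sum(abs(r / o) for r, o in zip(residuals, observed)) / n,
        "r2": 1.0 - ss_res / ss_tot,
    }


if __name__ == "__main__":
    # Made-up numbers for illustration only.
    observed = [100.0, 200.0, 300.0, 400.0]
    predicted = [110.0, 190.0, 310.0, 390.0]
    print(forecast_errors(observed, predicted))
```

Lower RMSE, MAE, and MAPE together with an R^2 closer to 1 indicate that forecasts track the observed series closely, which is the pattern the abstract reports.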
Language Model Decoding as Direct Metrics Optimization
Ji, Haozhe, Ke, Pei, Wang, Hongning, Huang, Minlie
Despite the remarkable advances in language modeling, current mainstream decoding methods still struggle to generate texts that align with human texts across different aspects. In particular, sampling-based methods produce less-repetitive texts which are often disjunctive in discourse, while search-based methods maintain topic coherence at the cost of increased repetition. Overall, these methods fall short in achieving holistic alignment across a broad range of aspects. In this work, we frame decoding from a language model as an optimization problem with the goal of strictly matching the expected performance with human texts measured by multiple metrics of desired aspects simultaneously. The resulting decoding distribution enjoys an analytical solution that scales the input language model distribution via a sequence-level energy function defined by these metrics. And most importantly, we prove that this induced distribution is guaranteed to improve the perplexity on human texts, which suggests a better approximation to the underlying distribution of human texts. To facilitate tractable sampling from this globally normalized distribution, we adopt the Sampling-Importance-Resampling technique. Experiments on various domains and model scales demonstrate the superiority of our method in metrics alignment with human texts and human evaluation over strong baselines.
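The abstract describes a decoding distribution that rescales the language model by a sequence-level energy function and is sampled with Sampling-Importance-Resampling (SIR). The sketch below shows generic SIR on a toy problem. It assumes the target has the form p(x) · exp(−E(x)) / Z. The proposal `toy_propose`, the repetition-based `toy_energy`, and the three candidate strings are all hypothetical stand-ins, not the paper's model or metrics.

```python
import math
import random


def sir_sample(propose, energy, num_candidates, rng):
    """Draw one sample approximately from p(x) * exp(-energy(x)) / Z.

    Candidates come from the proposal p, so each importance weight is
    target / proposal, which is proportional to exp(-energy(x)).
    """
    candidates = [propose(rng) for _ in range(num_candidates)]
    log_weights = [-energy(c) for c in candidates]
    shift = max(log_weights)  # subtract the max for numerical stability
    weights = [math.exp(lw - shift) for lw in log_weights]
    # Resample a single candidate in proportion to its weight.
    return rng.choices(candidates, weights=weights, k=1)[0]


# Toy stand-in for a language model: a uniform choice over three texts.
TOY_TEXTS = ["a a a a", "a b a b", "a b c d"]


def toy_propose(rng):
    return rng.choice(TOY_TEXTS)


def toy_energy(text):
    """Hypothetical energy: penalize repeated tokens."""
    tokens = text.split()
    return 5.0 * (1.0 - len(set(tokens)) / len(tokens))


if __name__ == "__main__":
    rng = random.Random(0)
    draws = [sir_sample(toy_propose, toy_energy, 8, rng) for _ in range(1000)]
    for text in TOY_TEXTS:
        print(text, draws.count(text) / len(draws))
```

Although the proposal picks each text equally often, resampling shifts most of the mass onto the least repetitive one. Increasing `num_candidates` brings the draw closer to the globally normalized target at the cost of more proposal samples.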
Comment: how ships can outwit piracy with AI
Deep learning is on the frontline in a new age of piracy, outwitting attacks with pre-emptive tech, explains Yarden Gross, CEO and co-founder of Orca AI. Almost a decade has passed since piracy raged off Somalia, and yet the danger posed by maritime hijackings is as present as ever. The global pandemic last year sparked a resurgence of attacks, with piracy incidents doubling across Asia, in a worrying uptick also seen in the Gulf of Mexico and West Africa. The fallout from coronavirus, including the loss of key security personnel, turned quarantined vessels into easy targets. This wave has since receded a little, with the International Maritime Bureau reporting a 44 per cent YoY dip in piracy and armed robbery incidents in 2021.
Meta-Heuristic Solutions to a Student Grouping Optimization Problem faced in Higher Education Institutions
Kenekayoro, Patrick, Fawei, Biralatei
Combinatorial problems which have been proven to be NP-hard are faced in Higher Education Institutions, and researchers have extensively investigated some of the well-known combinatorial problems such as the timetabling and student project allocation problems. However, NP-hard problems faced in Higher Education Institutions are not only confined to these categories of combinatorial problems. The majority of NP-hard problems faced in institutions involve grouping students and/or resources, albeit with each problem having its own unique set of constraints. Thus, it can be argued that techniques to solve NP-hard problems in Higher Education Institutions can be transferred across the different problem categories. As no method is guaranteed to outperform all others in all problems, it is necessary to investigate heuristic techniques for solving lesser-known problems in order to guide stakeholders or software developers to the most appropriate algorithm for each unique class of NP-hard problems faced in Higher Education Institutions. To this end, this study described an optimization problem faced in a real university that involved grouping students for the presentation of semester results. Ordering-based heuristics, a genetic algorithm, and the ant colony optimization algorithm, implemented in the Python programming language, were used to find feasible solutions to this problem, with the ant colony optimization algorithm performing better or equal in 75% of the test instances and the genetic algorithm producing better or equal results in 38% of the test instances.
Nigeria News today & Breaking news Read Nigerian newspapers 24/7
Artificial intelligence can earn for you too - Read how! How Tambuwal's first wife celebrated his birthday will make you jealous OPINION: OMG! popular pastor, banker, others revealed their first experiences What to eat to bring benefit to skin regeneration? How many members are there in the world's biggest family? Sad! Woman welcomes triplets, this happens (photos) Wickedness! Woman puts broomstick and pepper into boy's manhood Lionel Messi's statue destroyed same day Ronaldo won FIFA Player of the Year (photo) You need to read Ruggedman's reaction to viral 'Dog women' video on Instagram You won't believe how much Bobrisky is asking from people for his party Simi and Adekunle Gold's love nest uncovered (photos) Finally!