AITopics | Tuvalu

Tuvalu

A Appendix

Neural Information Processing Systems | Feb-17-2026, 07:56:21 GMT

The complete list may be seen in Table 8. Here are a few general notes about these strings: 1. Based on their recommendations, we did the following: 1. zh, zh_Latn: This resulted in the special filters described below. URLs) the corpora were in languages different from the LangID predictions. This is mainly mis-rendered PDFs and may have practical applications for denoising, or for decoding such garbled PDFs.

latn, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Country:

Oceania > Tonga (0.04)
North America > United States (0.04)
South America > Peru > Huánuco Department > Huánuco Province > Huánuco (0.04)
(24 more...)

Industry: Law (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Communications > Social Media (0.67)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.46)

Belgian police arrest three for plotting drone attack on prime minister

Al Jazeera | Oct-10-2025, 03:14:52 GMT

Belgian authorities say they have arrested three people in connection with a plot to attack Prime Minister Bart De Wever and other politicians using drone-mounted explosives. Federal prosecutor Ann Fransen announced the arrests on Thursday and said the group were under investigation for an "attempted terrorist murder and participation in the activities of a terrorist group", according to Belgian public broadcaster RTBF. "There are also indications that the suspects aimed to construct a drone to which a payload could be attached," she added. Fransen did not name their intended targets, but social media posts from senior figures in De Wever's government indicate that he was on the list. "The news of a planned attack targeting Prime Minister Bart De Wever is deeply shocking," wrote Deputy Prime Minister Maxime Prevot in a post on X. "I express my full support to the Prime Minister, his wife, and his family, as well as my gratitude to the security and justice services whose swift action prevented the worst."

drone attack, prime minister, wever, (5 more...)

Al Jazeera

Country:

North America > United States (0.17)
Asia > Middle East > Palestine > Gaza Strip > Gaza Governorate > Gaza (0.07)
South America > Ecuador (0.06)
(9 more...)

Industry:

Law Enforcement & Public Safety > Terrorism (1.00)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Government > Regional Government > Europe Government > Belgium Government (0.88)

Technology: Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (0.73)

A Appendix A.1 LangID Details

Neural Information Processing Systems | Oct-9-2025, 08:30:30 GMT

latn, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Country:

Oceania > Tonga (0.04)
North America > United States (0.04)
South America > Peru > Huánuco Department > Huánuco Province > Huánuco (0.04)
(24 more...)

Industry: Law (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Communications > Social Media (0.67)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.46)

Hell is not other people – it's being stuck in the ninth circle of an automated telephone service | Hilary Freeman

The Guardian | Apr-22-2025, 07:00:02 GMT

Life is about to change on the remote island nation of Tuvalu. To great fanfare, Tuvalu – an entirely cash-based society – has unveiled its first ever ATM, marking its move towards financial modernisation. But while the 10,000 people living in that country may be celebrating no longer having to queue at the bank, I fear their happiness will be short-lived. The world's first ATM was introduced in Britain in 1967, but for me the tyranny of machines that promise convenience but erode human contact really began about 20 years ago, in the form of self-checkouts in our local Sainsbury's. Having watched the Terminator movie franchise during my formative years, I railed prophetically against them, aware that it was just a small slippery slope from "unexpected item in the bagging area" to the extinction of the human race.

artificial intelligence, human contact, telephone service hilary freeman, (6 more...)

The Guardian

Country: Oceania > Tuvalu (0.82)

Industry:

Health & Medicine (0.73)
Retail (0.53)
Consumer Products & Services > Food, Beverage, Tobacco & Cannabis (0.37)

Technology:

Information Technology > Communications (0.32)
Information Technology > Artificial Intelligence (0.31)

An Expanded Massive Multilingual Dataset for High-Performance Language Technologies

Burchell, Laurie, de Gibert, Ona, Arefyev, Nikolay, Aulamo, Mikko, Bañón, Marta, Chen, Pinzhen, Fedorova, Mariia, Guillou, Liane, Haddow, Barry, Hajič, Jan, Helcl, Jindřich, Henriksson, Erik, Klimaszewski, Mateusz, Komulainen, Ville, Kutuzov, Andrey, Kytöniemi, Joona, Laippala, Veronika, Mæhlum, Petter, Malik, Bhavitvya, Mehryary, Farrokh, Mikhailov, Vladislav, Moghe, Nikita, Myntti, Amanda, O'Brien, Dayyán, Oepen, Stephan, Pal, Proyag, Piha, Jousia, Pyysalo, Sampo, Ramírez-Sánchez, Gema, Samuel, David, Stepachev, Pavel, Tiedemann, Jörg, Variš, Dušan, Vojtěchová, Tereza, Zaragoza-Bernabeu, Jaume

arXiv.org Artificial Intelligence | Mar-14-2025

Training state-of-the-art large language models requires vast amounts of clean and diverse textual data. However, building suitable multilingual datasets remains a challenge. In this work, we present HPLT v2, a collection of high-quality multilingual monolingual and parallel corpora. The monolingual portion of the data contains 8T tokens covering 193 languages, while the parallel data contains 380M sentence pairs covering 51 languages. We document the entire data pipeline and release the code to reproduce it. We provide extensive analysis of the quality and characteristics of our data. Finally, we evaluate the performance of language models and machine translation systems trained on HPLT v2, demonstrating its value.

artificial intelligence, machine translation, natural language, (18 more...)

arXiv.org Artificial Intelligence

2503.10267

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Canada > Ontario > Toronto (0.04)
Europe > Russia (0.04)
(66 more...)

Genre: Research Report > New Finding (0.67)

Industry:

Information Technology (0.67)
Education (0.46)
Media > News (0.46)
Leisure & Entertainment > Games (0.45)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

CultureVLM: Characterizing and Improving Cultural Understanding of Vision-Language Models for over 100 Countries

Liu, Shudong, Jin, Yiqiao, Li, Cheng, Wong, Derek F., Wen, Qingsong, Sun, Lichao, Chen, Haipeng, Xie, Xing, Wang, Jindong

arXiv.org Artificial Intelligence | Jan-2-2025

Vision-language models (VLMs) have advanced human-AI interaction but struggle with cultural understanding, often misinterpreting symbols, gestures, and artifacts due to biases in predominantly Western-centric training data. In this paper, we construct CultureVerse, a large-scale multimodal benchmark covering 19,682 cultural concepts, 188 countries/regions, 15 cultural concepts, and 3 question types, with the aim of characterizing and improving VLMs' multicultural understanding capabilities. Then, we propose CultureVLM, a series of VLMs fine-tuned on our dataset to achieve significant performance improvement in cultural understanding. Our evaluation of 16 models reveals significant disparities, with a stronger performance in Western concepts and weaker results in African and Asian contexts. Fine-tuning on our CultureVerse enhances cultural perception, demonstrating cross-cultural, cross-continent, and cross-dataset generalization without sacrificing performance on models' general VLM benchmarks. We further present insights on cultural generalization and forgetting. We hope that this work could lay the foundation for more equitable and culturally aware multimodal AI systems.

arxiv, cultural concept, language model, (16 more...)

arXiv.org Artificial Intelligence

2501.01282

Country:

Asia > Laos (0.14)
Europe > Russia (0.14)
Asia > Russia (0.14)
(187 more...)

Genre: Research Report > New Finding (0.46)

Industry:

Government > Regional Government > North America Government > United States Government (1.00)
Leisure & Entertainment (0.93)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(3 more...)

Untangling Hate Speech Definitions: A Semantic Componential Analysis Across Cultures and Domains

Korre, Katerina, Muti, Arianna, Ruggeri, Federico, Barrón-Cedeño, Alberto

arXiv.org Artificial Intelligence | Nov-11-2024

Hate speech relies heavily on cultural influences, leading to varying individual interpretations. For that reason, we propose a Semantic Componential Analysis (SCA) framework for a cross-cultural and cross-domain analysis of hate speech definitions. We create the first dataset of definitions derived from five domains: online dictionaries, research papers, Wikipedia articles, legislation, and online platforms, which are later analyzed into semantic components. Our analysis reveals that the components differ from definition to definition, yet many domains borrow definitions from one another without taking into account the target culture. We conduct zero-shot model experiments using our proposed dataset, employing three popular open-sourced LLMs to understand the impact of different definitions on hate speech detection. Our findings indicate that LLMs are sensitive to definitions: responses for hate speech detection change according to the complexity of definitions used in the prompt.

artificial intelligence, large language model, natural language, (18 more...)

arXiv.org Artificial Intelligence

2411.07417

Country:

Asia > Middle East > Syria (0.14)
North America > United States > Washington > King County > Seattle (0.14)
Asia > Brunei (0.14)
(146 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Law > Civil Rights & Constitutional Law (1.00)
Information Technology (1.00)
Government (1.00)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.68)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

How AI Is Being Used to Respond to Natural Disasters in Cities

TIME - Tech | Nov-4-2024, 16:01:22 GMT

The number of people living in urban areas has tripled in the last 50 years, meaning when a major natural disaster such as an earthquake strikes a city, more lives are in danger. Meanwhile, the strength and frequency of extreme weather events has increased--a trend set to continue as the climate warms. That is spurring efforts around the world to develop a new generation of earthquake monitoring and climate forecasting systems to make detecting and responding to disasters quicker, cheaper, and more accurate than ever. On Nov. 6, at the Barcelona Supercomputing Center in Spain, the Global Initiative on Resilience to Natural Hazards through AI Solutions will meet for the first time. The new United Nations initiative aims to guide governments, organizations, and communities in using AI for disaster management.

focus group, kuglitsch, sensor, (13 more...)

TIME - Tech

Country:

Europe > Spain (0.25)
Oceania > Tuvalu (0.05)
North America > United States > Florida > Leon County > Tallahassee (0.05)
(7 more...)

Industry:

Government > Intergovernmental Programs (0.35)
Materials > Construction Materials (0.31)

Technology:

Information Technology > Artificial Intelligence > Applied AI (0.52)
Information Technology > Artificial Intelligence > Machine Learning (0.50)

Rulebreakers Challenge: Revealing a Blind Spot in Large Language Models' Reasoning with Formal Logic

Chan, Jason, Gaizauskas, Robert, Zhao, Zhixue

arXiv.org Artificial Intelligence | Oct-21-2024

Formal logic has long been applied to natural language reasoning, but this approach can sometimes lead to conclusions that, while logically entailed, are factually inconsistent with the premises or are not typically inferred by humans. This study introduces the concept of "rulebreakers", which refers to instances where logical entailment diverges from factually acceptable inference. We present RULEBREAKERS, a novel dataset for evaluating Large Language Models' (LLMs) ability to distinguish between rulebreakers and non-rulebreakers. Focusing on modus tollens and disjunctive syllogism, we assess six state-of-the-art LLMs using RULEBREAKERS, measuring their performance in terms of token-level exact accuracy and model confidence. Our findings reveal that while most models perform poorly to moderately in recognizing rulebreakers, they demonstrate a latent ability to distinguish rulebreakers when assessed by their confidence levels. Further analysis suggests that the failure to recognize rulebreakers is potentially associated with the models' world knowledge and their attention distribution patterns. This research highlights the limitation of LLMs' reasoning capabilities, and contributes to the ongoing discussion on reasoning in LLMs.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2410.16502

Country:

Europe > France (0.05)
North America > Canada > Ontario > Toronto (0.04)
Europe > Sweden > Stockholm > Stockholm (0.04)
(24 more...)

Genre: Research Report > New Finding (1.00)

Industry: Leisure & Entertainment > Sports > Martial Arts (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.32)

MIRAI: Evaluating LLM Agents for Event Forecasting

Ye, Chenchen, Hu, Ziniu, Deng, Yihe, Huang, Zijie, Ma, Mingyu Derek, Zhu, Yanqiao, Wang, Wei

arXiv.org Artificial Intelligence | Jul-1-2024

Recent advancements in Large Language Models (LLMs) have empowered LLM agents to autonomously collect world information, over which to conduct reasoning to solve complex problems. Given this capability, increasing interests have been put into employing LLM agents for predicting international events, which can influence decision-making and shape policy development on an international scale. Despite such a growing interest, there is a lack of a rigorous benchmark of LLM agents' forecasting capability and reliability. To address this gap, we introduce MIRAI, a novel benchmark designed to systematically evaluate LLM agents as temporal forecasters in the context of international events. Our benchmark features an agentic environment with tools for accessing an extensive database of historical, structured events and textual news articles. We refine the GDELT event database with careful cleaning and parsing to curate a series of relational prediction tasks with varying forecasting horizons, assessing LLM agents' abilities from short-term to long-term forecasting. We further implement APIs to enable LLM agents to utilize different tools via a code-based interface. In summary, MIRAI comprehensively evaluates the agents' capabilities in three dimensions: 1) autonomously source and integrate critical information from large global databases; 2) write codes using domain-specific APIs and libraries for tool-use; and 3) jointly reason over historical knowledge from diverse formats and time to accurately predict future events. Through comprehensive benchmarking, we aim to establish a reliable framework for assessing the capabilities of LLM agents in forecasting international events, thereby contributing to the development of more accurate and trustworthy models for international relation analysis.

cameocode, isocode, relation, (15 more...)

arXiv.org Artificial Intelligence

2407.01231

Country:

Asia > North Korea (0.14)
Oceania > Australia > Australian Indian Ocean Territories > Territory of Cocos (Keeling) Islands (0.14)
North America > United States > California > Los Angeles County > Los Angeles (0.14)
(234 more...)

Genre: Research Report > New Finding (0.45)

Industry:

Law (1.00)
Government > Foreign Policy (1.00)
Government > Military (0.93)
Information Technology (0.92)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
