 South America


Exploring Performance Variations in Finetuned Translators of Ultra-Low Resource Languages: Do Linguistic Differences Matter?

arXiv.org Artificial Intelligence

Finetuning pre-trained language models with small amounts of data is a commonly used method to create translators for ultra-low resource languages such as endangered Indigenous languages. However, previous works have reported substantially different performances with translators created using similar methodology and data. In this work we systematically explored possible causes of the performance difference, aiming to determine whether it was a product of different cleaning procedures, limitations of the pre-trained models, the size of the base model, or the size of the training dataset, studying both directions of translation. Our studies, using two Brazilian Indigenous languages, related but with significant structural linguistic characteristics, indicated no or very limited influence from those training factors, suggesting differences between languages may play a significant role in the ability to produce translators by fine-tuning pre-trained models.


Invited to Develop: Institutional Belonging and the Counterfactual Architecture of Development

arXiv.org Artificial Intelligence

This paper examines how institutional belonging shapes long-term development by comparing Spain and Uruguay, two small democracies with similar historical endowments whose trajectories diverged sharply after the 1960s. While Spain integrated into dense European institutional architectures, Uruguay remained embedded within the Latin American governance regime, characterized by weaker coordination and lower institutional coherence. To assess how alternative institutional embeddings could have altered these paths, the study develops a generative counterfactual framework grounded in economic complexity, institutional path dependence, and a Wasserstein GAN trained on data from 1960-2020. The resulting Expected Developmental Shift (EDS) quantifies structural gains or losses from hypothetical re-embedding in different institutional ecosystems. Counterfactual simulations indicate that Spain would have experienced significant developmental decline under a Latin American configuration, while Uruguay would have achieved higher complexity and resilience within a European regime. These findings suggest that development is not solely determined by domestic reforms but emerges from a country's structural position within transnational institutional networks.


Towards a Foundation Model for Partial Differential Equations Across Physics Domains

arXiv.org Artificial Intelligence

We present PDE-FM, a modular foundation model for physics-informed machine learning that unifies spatial, spectral, and temporal reasoning across heterogeneous partial differential equation (PDE) systems. PDE-FM combines spatial-spectral tokenization, physics-aware conditioning, and a Mamba-based state-space backbone with an operator-theoretic decoder, enabling scalable and data-efficient modeling of complex physical dynamics. In contrast to task-specific neural operators, PDE-FM is pretrained once on diverse PDE datasets and can be transferred to new physical regimes without architectural or data-specific modifications. Evaluated on twelve 2D and 3D datasets from The Well benchmark - spanning hydrodynamic, radiative, elastic, and astrophysical phenomena - PDE-FM achieves state-of-the-art accuracy in six domains, reducing mean VRMSE by 46% relative to prior operator-learning baselines. The model demonstrates robust cross-physics generalization, excelling in turbulent and radiative systems while maintaining strong performance in linear and steady-state regimes. These results suggest that large-scale pretraining across diverse physical processes can yield transferable representations of dynamics, marking a step toward unified, foundation-level surrogates for multi-physics simulation and scientific discovery.


On the Cross-lingual Transferability of Pre-trained wav2vec2-based Models

arXiv.org Artificial Intelligence

Using representations provided by a large pre-trained model has become the primary strategy for achieving state-of-the-art results in a wide range of tasks. A recently proposed large pre-trained model, wav2vec 2.0, was seminal for several other works on pre-training large models on speech data. Many models are being pre-trained using the same architecture as wav2vec 2.0 and are achieving state-of-the-art results in various speech-related tasks. Previous work has demonstrated that the data used during the pre-training of these wav2vec2-based models can impact the model's performance in downstream tasks, and this should be taken into consideration before utilizing these models. However, few works have investigated how the knowledge transfer of these pre-trained models behaves across languages, even when the target language differs from the one used during the model's pre-training. Our work aims to investigate the cross-lingual transferability of these wav2vec2-based models. We performed several fine-tuning experiments on the speech recognition task in 18 languages using 15 large pre-trained models. Our experiments showed that the amount of data used during the pre-training of these models is not as important to the final performance as its diversity. We noticed that the performance of Indo-European languages is superior to that of non-Indo-European languages in the evaluated models. We observed a positive cross-lingual transfer of knowledge using monolingual models, which was evident in all the languages we used, but more pronounced when the language used during pre-training was more similar to the downstream task language. With these findings, we aim to assist the scientific community in utilizing existing wav2vec2-based pre-trained models, as well as facilitate the pre-training of new ones.


Adaptive Detection of Software Aging under Workload Shift

arXiv.org Artificial Intelligence

Software aging is a phenomenon that affects long-running systems, leading to progressive performance degradation and increasing the risk of failures. To mitigate this problem, this work proposes an adaptive approach based on machine learning for software aging detection in environments subject to dynamic workload conditions. We evaluate and compare a static model with adaptive models that incorporate adaptive detectors, specifically the Drift Detection Method (DDM) and Adaptive Windowing (ADWIN), originally developed for concept drift scenarios and applied in this work to handle workload shifts. Experiments with simulated sudden, gradual, and recurring workload transitions show that static models suffer a notable performance drop when applied to unseen workload profiles, whereas the adaptive model with ADWIN maintains high accuracy, achieving an F1-Score above 0.93 in all analyzed scenarios.
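
For readers unfamiliar with the Drift Detection Method named in the abstract above, the following is a minimal sketch of the textbook DDM rule, not the authors' implementation. It tracks the running error rate p of a classifier and its standard deviation s, remembers the lowest p + s seen so far, and signals a warning or a drift when the current p + s rises two or three minimum standard deviations above that minimum. The class name `DDM`, the helper `first_drift`, and the default thresholds (30 warm-up samples, levels 2 and 3) are assumptions chosen for illustration.

```python
import math


class DDM:
    """Illustrative Drift Detection Method sketch (hypothetical helper class)."""

    def __init__(self, min_samples=30, warn_level=2.0, drift_level=3.0):
        # Assumed defaults: 30 warm-up samples, 2-sigma warning, 3-sigma drift.
        self.min_samples = min_samples
        self.warn_level = warn_level
        self.drift_level = drift_level
        self.reset()

    def reset(self):
        self.n = 0             # samples seen since the last reset
        self.errors = 0        # misclassifications seen since the last reset
        self.p_min = math.inf  # error rate at the lowest p + s so far
        self.s_min = math.inf  # standard deviation at the lowest p + s so far

    def update(self, error):
        """Feed 1 for a wrong prediction, 0 for a correct one.

        Returns "stable", "warning", or "drift".
        """
        self.n += 1
        self.errors += int(error)
        p = self.errors / self.n
        s = math.sqrt(p * (1.0 - p) / self.n)
        if self.n < self.min_samples:
            return "stable"
        # Record a new minimum only once s is positive; with s == 0 the
        # thresholds would collapse onto p_min and fire spuriously.
        if s > 0 and p + s < self.p_min + self.s_min:
            self.p_min, self.s_min = p, s
        if p + s >= self.p_min + self.drift_level * self.s_min:
            self.reset()  # start fresh statistics after a detected drift
            return "drift"
        if p + s >= self.p_min + self.warn_level * self.s_min:
            return "warning"
        return "stable"


def first_drift(stream):
    """Return the 1-based index of the first drift signal, or None."""
    detector = DDM()
    for i, err in enumerate(stream, start=1):
        if detector.update(err) == "drift":
            return i
    return None
```

In an adaptive setup of the kind the abstract describes, a "drift" signal would typically trigger retraining or replacement of the aging detector on recent data; how the paper wires this up is not stated in the abstract.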


Top global arms producers' revenues surge as major wars rage: SIPRI report

Al Jazeera

Revenues from sales of weapons and military services by the 100 largest global arms-producing companies reached a record $679bn in 2024, according to new data released by the Stockholm International Peace Research Institute (SIPRI). The Gaza and Ukraine wars, as well as global and regional geopolitical tensions and ever-higher military expenditures, increased revenues generated by the companies from sales of military goods and services to customers domestic and abroad by 5.9 percent compared to the year before, the organisation said in a report published on Monday. Lockheed Martin, Northrop Grumman and General Dynamics led the pack in the US, where the combined arms revenues of arms companies in the top 100 grew by 3.8 percent in 2024 to reach $334bn, with 30 out of the 39 US companies in the ranking increasing their revenues. However, SIPRI said widespread delays and budget overruns continue to plague key projects such as the F-35 fighter jet, the Columbia and Virginia-class submarines, and the Sentinel intercontinental ballistic missile.


Polls open in Honduras presidential election marked by fraud accusations

Al Jazeera

Hondurans are heading to the polls to elect a new president in a tightly contested race that is taking place amid concerns over voter fraud in the impoverished Central American country. Polls opened on Sunday at 7am local time (13:00 GMT) for 10 hours of voting, with the first results expected late Sunday night. The elections, in which the 128 members of Congress, hundreds of mayors, and thousands of other public officials will also be chosen, are taking place in a highly polarised climate, with the three top candidates accusing each other of plotting fraud. Moncada has suggested that she will not recognise the official results. Incumbent President Xiomara Castro of the LIBRE party is limited by law to one term in office.


Russia-Ukraine war: List of key events, day 1,375

Al Jazeera

Here's where things stand on Sunday, November 30. A Russian drone attack killed one person and wounded 11, including a child, on the outskirts of the Ukrainian capital, Kyiv, regional Governor Mykola Kalashnyk said on Sunday.


It's time to lock in and let your winter arc begin

BBC News

Have you ever locked in? No, not finding yourself locked in a lift or a bathroom. We're talking about locking IN - the phrase you might have seen on social media or heard people saying lately. To lock in is to focus; to endure short-term pain for long-term gain - whether that be building your body or your business. Do it today - not tomorrow.


More than 70,000 killed in Gaza since Israel offensive began, Hamas-run health ministry says

BBC News

More than 70,000 Palestinians have been killed as a result of Israel's military campaign in Gaza, according to the territory's Hamas-run health ministry. The death toll has continued to rise since a ceasefire took effect on 10 October, with Israel carrying out air strikes for what it says are violations of the truce - while bodies continue to be recovered from under the rubble. Among those reportedly killed in an Israeli drone strike on Saturday were two young brothers, Fadi and Juma Abu Assi, whose family said they had been gathering firewood when they were killed. The Israel Defense Forces (IDF) told the BBC they had struck two suspects who had crossed the so-called yellow line. The line marks where the Israeli military agreed to withdraw to under a ceasefire brokered by the United States more than seven weeks ago.