 Morelos


What Drives Cross-lingual Ranking? Retrieval Approaches with Multilingual Language Models

Goworek, Roksana, Macmillan-Scott, Olivia, Özyiğit, Eda B.

arXiv.org Artificial Intelligence

Cross-lingual information retrieval (CLIR) enables access to multilingual knowledge but remains challenging due to disparities in resources, scripts, and weak cross-lingual semantic alignment in embedding models. Existing pipelines often rely on translation and monolingual retrieval heuristics, which add computational overhead and noise, degrading performance. This work systematically evaluates four intervention types, namely document translation, multilingual dense retrieval with pretrained encoders, contrastive learning at word, phrase, and query-document levels, and cross-encoder re-ranking, across three benchmark datasets. We find that dense retrieval models trained specifically for CLIR consistently outperform lexical matching methods and derive little benefit from document translation. Contrastive learning mitigates language biases and yields substantial improvements for encoders with weak initial alignment, and re-ranking can be effective, but depends on the quality of the cross-encoder training data. Although high-resource languages still dominate overall performance, gains over lexical and document-translated baselines are most pronounced for low-resource and cross-script pairs. These findings indicate that cross-lingual search systems should prioritise semantic multilingual embeddings and targeted learning-based alignment over translation-based pipelines, particularly for cross-script and under-resourced languages.


Hyperoctant Search Clustering: A Method for Clustering Data in High-Dimensional Hyperspheres

Toledo-Acosta, Mauricio, Ramos-García, Luis Ángel, Hermosillo-Valadez, Jorge

arXiv.org Artificial Intelligence

Clustering of high-dimensional data sets is a growing need in artificial intelligence, machine learning and pattern recognition. In this paper, we propose a new clustering method based on a combinatorial-topological approach applied to regions of space defined by signs of coordinates (hyperoctants). In high-dimensional spaces, this approach often reduces the size of the dataset while preserving sufficient topological features. According to a density criterion, the method builds clusters of data points based on the partitioning of a graph, whose vertices represent hyperoctants, and whose edges connect neighboring hyperoctants under the Levenshtein distance. We call this method HyperOctant Search Clustering. We prove some mathematical properties of the method. To assess its performance, we choose the application of topic detection, which is an important task in text mining. Our results suggest that our method is more stable under variations of the main hyperparameter, and remarkably, it is not only a clustering method, but also a tool to explore the dataset from a topological perspective, as it directly provides information about the number of hyperoctants where there are data points. We also discuss the possible connections between our clustering method and other research fields.
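
The hyperoctant idea in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names are hypothetical, zero coordinates are assumed to count as positive, and "neighboring" is assumed to mean sign patterns at distance 1 (for equal-length strings, a Levenshtein distance of 1 is a single substitution, so it coincides with a Hamming distance of 1). The density criterion and the graph partitioning step are not shown.

```python
from collections import Counter
from itertools import combinations


def hyperoctant_key(point):
    """Sign pattern of a point: '+' for non-negative coordinates, '-' otherwise.

    Treating zero as positive is an assumption made for this sketch.
    """
    return "".join("+" if x >= 0 else "-" for x in point)


def occupied_hyperoctants(points):
    """Count how many points fall into each occupied hyperoctant."""
    return Counter(hyperoctant_key(p) for p in points)


def neighbor_edges(keys):
    """Edges between hyperoctants whose sign patterns differ in exactly one coordinate."""
    edges = []
    for a, b in combinations(sorted(keys), 2):
        if sum(ca != cb for ca, cb in zip(a, b)) == 1:
            edges.append((a, b))
    return edges


# Toy data in three dimensions (illustrative values only).
points = [(0.5, 1.2, -0.3), (0.1, 0.4, -2.0), (-1.0, 0.7, -0.5), (-0.2, -0.9, 0.8)]
counts = occupied_hyperoctants(points)
edges = neighbor_edges(counts)
print(dict(counts))
print(edges)
```

Grouping points by sign pattern is what reduces the dataset: the graph has one vertex per occupied hyperoctant rather than one per point, and the number of occupied hyperoctants is available directly from `counts`.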


Dark energy reconstruction analysis with artificial neural networks: Application on simulated Supernova Ia data from Rubin Observatory

Mitra, Ayan, Gómez-Vargas, Isidro, Zarikas, Vasilios

arXiv.org Artificial Intelligence

In this paper, we present an analysis of Supernova Ia (SNIa) distance moduli $\mu(z)$ and dark energy using an Artificial Neural Network (ANN) reconstruction based on LSST simulated three-year SNIa data. The ANNs employed in this study utilize genetic algorithms for hyperparameter tuning and Monte Carlo Dropout for predictions. Our ANN reconstruction architecture is capable of modeling both the distance moduli and their associated statistical errors given redshift values. We compare the performance of the ANN-based reconstruction with two theoretical dark energy models: $\Lambda$CDM and Chevallier-Linder-Polarski (CPL). Bayesian analysis is conducted for these theoretical models using the LSST simulations and compared with observations from Pantheon and Pantheon+ SNIa real data. We demonstrate that our model-independent ANN reconstruction is consistent with both theoretical models. Performance metrics and statistical tests reveal that the ANN produces distance modulus estimates that align well with the LSST dataset and exhibit only minor discrepancies with $\Lambda$CDM and CPL.
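
For reference, the two theoretical models named in the abstract above are commonly written as follows. These are the standard textbook forms, assuming a spatially flat universe and neglecting radiation; the abstract does not state which conventions or priors the authors use.

$$
w(z) = w_0 + w_a \frac{z}{1+z}, \qquad
\frac{H^2(z)}{H_0^2} = \Omega_m (1+z)^3 + (1-\Omega_m)\,(1+z)^{3(1+w_0+w_a)} \exp\!\left(-\frac{3 w_a z}{1+z}\right)
$$

$$
d_L(z) = (1+z)\, c \int_0^z \frac{dz'}{H(z')}, \qquad
\mu(z) = 5 \log_{10}\!\left(\frac{d_L(z)}{1\,\mathrm{Mpc}}\right) + 25
$$

Setting $w_0 = -1$ and $w_a = 0$ in the CPL form recovers $\Lambda$CDM.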


Sustainable Visions: Unsupervised Machine Learning Insights on Global Development Goals

García-Rodríguez, Alberto, Núñez, Matias, Pérez, Miguel Robles, Govezensky, Tzipe, Barrio, Rafael A., Gershenson, Carlos, Kaski, Kimmo K., Tagüeña, Julia

arXiv.org Artificial Intelligence

The United Nations 2030 Agenda for Sustainable Development outlines 17 goals to address global challenges. However, progress has been slower than expected and, consequently, there is a need to investigate the reasons behind this. In this study, we used a novel data-driven methodology to analyze data from 107 countries (2000-2022) using unsupervised machine learning techniques. Our analysis reveals strong positive and negative correlations between certain SDGs. The findings show that progress toward the SDGs is heavily influenced by geographical, cultural and socioeconomic factors, with no country on track to achieve all goals by 2030. This highlights the need for a region-specific, systemic approach to sustainable development that acknowledges the complex interdependencies of the goals and the diverse capacities of nations. Our approach provides a robust framework for developing efficient and data-informed strategies to promote cooperative and targeted initiatives for sustainable progress.


Deep Learning and genetic algorithms for cosmological Bayesian inference speed-up

Gómez-Vargas, Isidro, Vázquez, J. Alberto

arXiv.org Machine Learning

In this paper, we present a novel approach to accelerate the Bayesian inference process, focusing specifically on the nested sampling algorithms. Bayesian inference plays a crucial role in cosmological parameter estimation, providing a robust framework for extracting theoretical insights from observational data. However, its computational demands can be substantial, primarily due to the need for numerous likelihood function evaluations. Our proposed method utilizes the power of deep learning, employing feedforward neural networks to approximate the likelihood function dynamically during the Bayesian inference process. Unlike traditional approaches, our method trains neural networks on-the-fly using the current set of live points as training data, without the need for pre-training. This flexibility enables adaptation to various theoretical models and datasets. We perform simple hyperparameter optimization using genetic algorithms to suggest initial neural network architectures for learning each likelihood function. Once sufficient accuracy is achieved, the neural network replaces the original likelihood function. The implementation integrates with nested sampling algorithms and has been thoroughly evaluated using both simple cosmological dark energy models and diverse observational datasets. Additionally, we explore the potential of genetic algorithms for generating initial live points within nested sampling inference, opening up new avenues for enhancing the efficiency and effectiveness of Bayesian inference methods.


Automatic Navigation Map Generation for Mobile Robots in Urban Environments

Mozzarelli, Luca, Specchia, Simone, Corno, Matteo, Savaresi, Sergio Matteo

arXiv.org Artificial Intelligence

A fundamental prerequisite for safe and efficient navigation of mobile robots is the availability of reliable navigation maps upon which trajectories can be planned. With the increasing industrial interest in mobile robotics, especially in urban environments, the process of generating navigation maps has become of particular interest, being a labor-intensive step of the deployment process. Automating this step is challenging and becomes even more arduous when the perception capabilities are limited by cost considerations. This paper proposes an algorithm to automatically generate navigation maps using a typical navigation-oriented sensor setup: a single top-mounted 3D LiDAR sensor. The proposed method is designed and validated with the urban environment as the main use case: it is shown to be able to produce accurate maps featuring different terrain types, positive obstacles of different heights as well as negative obstacles. The algorithm is applied to data collected in a typical urban environment with a wheeled inverted pendulum robot, showing its robustness against localization, perception and dynamic uncertainties. The generated map is validated against a human-made map.


Photovoltaic power forecasting using quantum machine learning

Sagingalieva, Asel, Komornyik, Stefan, Senokosov, Arsenii, Joshi, Ayush, Sedykh, Alexander, Mansell, Christopher, Tsurkan, Olga, Pinto, Karan, Pflitsch, Markus, Melnikov, Alexey

arXiv.org Artificial Intelligence

Predicting solar panel power output is crucial for advancing the energy transition but is complicated by the variable and non-linear nature of solar energy. This is influenced by numerous meteorological factors, geographical positioning, and photovoltaic cell properties, posing significant challenges to forecasting accuracy and grid stability. Our study introduces a suite of solutions centered around hybrid quantum neural networks designed to tackle these complexities. The first proposed model, the Hybrid Quantum Long Short-Term Memory, outperforms all tested models, with mean absolute and mean squared errors more than 40% lower. The second proposed model, the Hybrid Quantum Sequence-to-Sequence neural network, once trained, predicts photovoltaic power with 16% lower mean absolute error for arbitrary time intervals without the need for prior meteorological data, highlighting its versatility. Moreover, our hybrid models perform better even when trained on limited datasets, underlining their potential utility in data-scarce scenarios. These findings represent a stride towards resolving time series prediction challenges in energy power forecasting through hybrid quantum models, showcasing the transformative potential of quantum machine learning in catalyzing the renewable energy transition.


Towards General Error Diagnosis via Behavioral Testing in Machine Translation

Wu, Junjie, Liu, Lemao, Yeung, Dit-Yan

arXiv.org Artificial Intelligence

Behavioral testing offers a crucial means of diagnosing linguistic errors and assessing capabilities of NLP models. However, applying behavioral testing to machine translation (MT) systems is challenging as it generally requires human efforts to craft references for evaluating the translation quality of such systems on newly generated test cases. Existing works in behavioral testing of MT systems circumvent this by evaluating translation quality without references, but this restricts diagnosis to specific types of errors, such as incorrect translation of single numeric or currency words. In order to diagnose general errors, this paper proposes a new Bilingual Translation Pair Generation based Behavior Testing (BTPGBT) framework for conducting behavioral testing of MT systems. The core idea of BTPGBT is to employ a novel bilingual translation pair generation (BTPG) approach that automates the construction of high-quality test cases and their pseudoreferences. Experimental results on various MT systems demonstrate that BTPGBT could provide comprehensive and accurate behavioral testing results for general error diagnosis, which further leads to several insightful findings. Our code and data are available at https://github.com/wujunjie1998/BTPGBT.


World Models and Predictive Coding for Cognitive and Developmental Robotics: Frontiers and Challenges

Taniguchi, Tadahiro, Murata, Shingo, Suzuki, Masahiro, Ognibene, Dimitri, Lanillos, Pablo, Ugur, Emre, Jamone, Lorenzo, Nakamura, Tomoaki, Ciria, Alejandra, Lara, Bruno, Pezzulo, Giovanni

arXiv.org Artificial Intelligence

Creating autonomous robots that can actively explore the environment, acquire knowledge and learn skills continuously is the ultimate achievement envisioned in cognitive and developmental robotics. Their learning processes should be based on interactions with their physical and social world in the manner of human learning and cognitive development. Based on this context, in this paper, we focus on the two concepts of world models and predictive coding. Recently, world models have attracted renewed attention as a topic of considerable interest in artificial intelligence. Cognitive systems learn world models to better predict future sensory observations and optimize their policies, i.e., controllers. Alternatively, in neuroscience, predictive coding proposes that the brain continuously predicts its inputs and adapts to model its own dynamics and control behavior in its environment. Both ideas may be considered as underpinning the cognitive development of robots and humans capable of continual or lifelong learning. Although many studies have been conducted on predictive coding in cognitive robotics and neurorobotics, the relationship between world model-based approaches in AI and predictive coding in robotics has rarely been discussed. Therefore, in this paper, we clarify the definitions, relationships, and status of current research on these topics, as well as missing pieces of world models and predictive coding in conjunction with crucially related concepts such as the free-energy principle and active inference in the context of cognitive and developmental robotics. Furthermore, we outline the frontiers and challenges involved in world models and predictive coding toward the further integration of AI and robotics, as well as the creation of robots with real cognitive and developmental capabilities in the future.


The Analysis of Synonymy and Antonymy in Discourse Relations: An Interpretable Modeling Approach

Reig-Alamillo, A., Torres-Moreno, D., Morales-González, E., Toledo-Acosta, M., Taroni, A., Hermosillo-Valadez, J.

arXiv.org Artificial Intelligence

The idea that discourse relations are construed through explicit content and shared, or implicit, knowledge between producer and interpreter is ubiquitous in discourse research and linguistics. However, the actual contribution of the lexical semantics of arguments is unclear. We propose a computational approach to the analysis of contrast and concession relations in the PDTB corpus. Our work sheds light on the extent to which lexical semantics contributes to signaling explicit and implicit discourse relations and clarifies the contribution of different parts of speech in both. This study contributes to bridging the gap between corpus linguistics and computational linguistics by proposing transparent and explainable models of discourse relations based on the synonymy and antonymy of their arguments.