Collaborating Authors: Valais


Geological Inference from Textual Data using Word Embeddings

Linphrachaya, Nanmanas, Gómez-Méndez, Irving, Siripatana, Adil

arXiv.org Artificial Intelligence

This research explores the use of Natural Language Processing (NLP) techniques to locate geological resources, with a specific focus on industrial minerals. By using word embeddings trained with the GloVe model, we extract semantic relationships between target keywords and a corpus of geological texts. The text is filtered to retain only words with geographical significance, such as city names, which are then ranked by their cosine similarity to the target keyword. Dimensionality reduction techniques, including Principal Component Analysis (PCA), Autoencoder, Variational Autoencoder (VAE), and VAE with Long Short-Term Memory (VAE-LSTM), are applied to enhance feature extraction and improve the accuracy of semantic relations. For benchmarking, we calculate the proximity between the ten cities most semantically related to the target keyword and identified mine locations using the haversine formula. The results demonstrate that combining NLP with dimensionality reduction techniques provides meaningful insights into the spatial distribution of natural resources. Although the predicted locations fall in the same region as the known mine sites, the localization accuracy still has room for improvement.
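
As a rough illustration of the ranking and benchmarking steps described above, the sketch below scores city embeddings by cosine similarity to a target keyword and measures the great-circle distance to a mine location with the haversine formula. The variable names (city_vectors, target_vec) are hypothetical placeholders, not the authors' code.

```python
# Minimal sketch, assuming GloVe vectors are available as NumPy arrays.
import numpy as np

def cosine_similarity(a, b):
    # Cosine of the angle between two embedding vectors.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    # Great-circle distance in kilometres between two (lat, lon) points in degrees.
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2.0) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2.0) ** 2)
    return 2.0 * radius_km * np.arcsin(np.sqrt(a))

def top_k_cities(city_vectors, target_vec, k=10):
    # Rank city names by similarity to the target keyword embedding.
    scores = {city: cosine_similarity(vec, target_vec)
              for city, vec in city_vectors.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]
```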


Multi-Class Segmentation of Aortic Branches and Zones in Computed Tomography Angiography: The AortaSeg24 Challenge

Imran, Muhammad, Krebs, Jonathan R., Sivaraman, Vishal Balaji, Zhang, Teng, Kumar, Amarjeet, Ueland, Walker R., Fassler, Michael J., Huang, Jinlong, Sun, Xiao, Wang, Lisheng, Shi, Pengcheng, Rokuss, Maximilian, Baumgartner, Michael, Kirchhof, Yannick, Maier-Hein, Klaus H., Isensee, Fabian, Liu, Shuolin, Han, Bing, Nguyen, Bong Thanh, Shin, Dong-jin, Ji-Woo, Park, Choi, Mathew, Uhm, Kwang-Hyun, Ko, Sung-Jea, Lee, Chanwoong, Chun, Jaehee, Kim, Jin Sung, Zhang, Minghui, Zhang, Hanxiao, You, Xin, Gu, Yun, Pan, Zhaohong, Liu, Xuan, Liang, Xiaokun, Tiefenthaler, Markus, Almar-Munoz, Enrique, Schwab, Matthias, Kotyushev, Mikhail, Epifanov, Rostislav, Wodzinski, Marek, Muller, Henning, Qayyum, Abdul, Mazher, Moona, Niederer, Steven A., Wang, Zhiwei, Yang, Kaixiang, Ren, Jintao, Korreman, Stine Sofia, Gao, Yuchong, Zeng, Hongye, Zheng, Haoyu, Zheng, Rui, Yue, Jinghua, Zhou, Fugen, Liu, Bo, Cosman, Alexander, Liang, Muxuan, Zhao, Chang, Upchurch, Gilbert R. Jr., Ma, Jun, Zhou, Yuyin, Cooper, Michol A., Shao, Wei

arXiv.org Artificial Intelligence

Multi-class segmentation of the aorta in computed tomography angiography (CTA) scans is essential for diagnosing and planning complex endovascular treatments for patients with aortic dissections. However, existing methods reduce aortic segmentation to a binary problem, limiting their ability to measure diameters across different branches and zones. Furthermore, no open-source dataset is currently available to support the development of multi-class aortic segmentation methods. To address this gap, we organized the AortaSeg24 MICCAI Challenge, introducing the first dataset of 100 CTA volumes annotated for 23 clinically relevant aortic branches and zones. This dataset was designed to facilitate both model development and validation. The challenge attracted 121 teams worldwide, with participants leveraging state-of-the-art frameworks such as nnU-Net and exploring novel techniques, including cascaded models, data augmentation strategies, and custom loss functions. We evaluated the submitted algorithms using the Dice Similarity Coefficient (DSC) and Normalized Surface Distance (NSD), highlighting the approaches adopted by the top five performing teams. This paper presents the challenge design, dataset details, evaluation metrics, and an in-depth analysis of the top-performing algorithms. The annotated dataset, evaluation code, and implementations of the leading methods are publicly available to support further research. All resources can be accessed at https://aortaseg24.grand-challenge.org.
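
The challenge's official evaluation code is available at the link above; as a rough illustration of one of the two metrics, the snippet below computes a per-class Dice Similarity Coefficient on integer label volumes. The array names and the background-label convention are assumptions for illustration, not the challenge implementation.

```python
# Minimal sketch of a per-class Dice Similarity Coefficient (DSC).
import numpy as np

def dice_per_class(pred, gt, num_classes=24, eps=1e-8):
    # pred, gt: integer label volumes of the same shape;
    # label 0 is assumed to be background and is skipped.
    scores = {}
    for c in range(1, num_classes):
        p = (pred == c)
        g = (gt == c)
        scores[c] = (2.0 * np.logical_and(p, g).sum() + eps) / (p.sum() + g.sum() + eps)
    return scores
```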


Are large language models superhuman chemists?

Mirza, Adrian, Alampara, Nawaf, Kunchapu, Sreekanth, Emoekabu, Benedict, Krishnan, Aswanth, Wilhelmi, Mara, Okereke, Macjonathan, Eberhardt, Juliane, Elahi, Amir Mohammad, Greiner, Maximilian, Holick, Caroline T., Gupta, Tanya, Asgari, Mehrdad, Glaubitz, Christina, Klepsch, Lea C., Köster, Yannik, Meyer, Jakob, Miret, Santiago, Hoffmann, Tim, Kreth, Fabian Alexander, Ringleb, Michael, Roesner, Nicole, Schubert, Ulrich S., Stafast, Leanne M., Wonanke, Dinga, Pieler, Michael, Schwaller, Philippe, Jablonka, Kevin Maik

arXiv.org Artificial Intelligence

Large language models (LLMs) have gained widespread interest due to their ability to process human language and perform tasks on which they have not been explicitly trained. This is relevant for the chemical sciences, which face the problem of small and diverse datasets that are frequently in the form of text. LLMs have shown promise in addressing these issues and are increasingly being harnessed to predict chemical properties, optimize reactions, and even design and conduct experiments autonomously. However, we still have only a very limited systematic understanding of the chemical reasoning capabilities of LLMs, which would be required to improve models and mitigate potential harms. Here, we introduce "ChemBench," an automated framework designed to rigorously evaluate the chemical knowledge and reasoning abilities of state-of-the-art LLMs against the expertise of human chemists. We curated more than 7,000 question-answer pairs for a wide array of subfields of the chemical sciences, evaluated leading open and closed-source LLMs, and found that the best models outperformed the best human chemists in our study on average. The models, however, struggle with some chemical reasoning tasks that are easy for human experts and provide overconfident, misleading predictions, such as about chemicals' safety profiles. These findings underscore the dual reality that, although LLMs demonstrate remarkable proficiency in chemical tasks, further research is critical to enhancing their safety and utility in chemical sciences. Our findings also indicate a need for adaptations to chemistry curricula and highlight the importance of continuing to develop evaluation frameworks to improve safe and useful LLMs.
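
As a toy sketch of the kind of automated scoring such a benchmark relies on, the snippet below computes exact-match accuracy over multiple-choice answers; ChemBench's actual parsing and scoring pipeline is more involved and is described in the paper.

```python
# Toy sketch of exact-match scoring over curated question-answer pairs.
def exact_match_accuracy(predictions, references):
    # predictions, references: equal-length lists of answer strings (e.g. "A", "B").
    correct = sum(p.strip().upper() == r.strip().upper()
                  for p, r in zip(predictions, references))
    return correct / len(references)
```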


Cross-Modal Learning of Housing Quality in Amsterdam

Levering, Alex, Marcos, Diego, Tuia, Devis

arXiv.org Artificial Intelligence

In this work, we evaluate data sources and models for recognizing housing quality in the city of Amsterdam from ground-level and aerial imagery. For ground-level images we compare Google StreetView (GSV) to Flickr images. Our results show that GSV predicts the most accurate building quality scores, approximately 30% better than using only aerial images. However, we find that through careful filtering and by using the right pre-trained model, Flickr image features combined with aerial image features are able to halve the performance gap to GSV features from 30% to 15%. Our results indicate that there are viable alternatives to GSV for liveability factor prediction, which is encouraging as GSV images are more difficult to acquire and not always available.
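
As a minimal sketch of the cross-modal fusion idea, assuming pre-extracted image features, the snippet below concatenates ground-level and aerial feature vectors and fits a simple regressor on quality scores; the feature sources and the ridge regressor are illustrative choices, not the authors' pipeline.

```python
# Minimal sketch of late fusion of ground-level and aerial image features.
import numpy as np
from sklearn.linear_model import Ridge

def fuse_and_fit(ground_feats, aerial_feats, quality_scores):
    # ground_feats: (n, d1) features from Flickr/GSV images,
    # aerial_feats: (n, d2) features from aerial images,
    # quality_scores: (n,) building quality targets.
    X = np.concatenate([ground_feats, aerial_feats], axis=1)
    return Ridge(alpha=1.0).fit(X, quality_scores)
```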


The curse of language biases in remote sensing VQA: the role of spatial attributes, language diversity, and the need for clear evaluation

Chappuis, Christel, Walt, Eliot, Mendez, Vincent, Lobry, Sylvain, Saux, Bertrand Le, Tuia, Devis

arXiv.org Artificial Intelligence

Remote sensing visual question answering (RSVQA) opens new opportunities for the use of overhead imagery by the general public, by enabling human-machine interaction with natural language. Building on the recent advances in natural language processing and computer vision, the goal of RSVQA is to answer a question formulated in natural language about a remote sensing image. Language understanding is essential to the success of the task, but has not yet been thoroughly examined in RSVQA. In particular, the problem of language biases is often overlooked in the remote sensing community, even though it can affect model robustness and lead to wrong conclusions about a model's performance. Thus, the present work aims at highlighting the problem of language biases in RSVQA with a threefold analysis strategy: visual blind models, adversarial testing, and dataset analysis. This analysis covers both the models and the data. Moreover, we motivate the use of more informative and complementary evaluation metrics sensitive to the issue. The severity of language biases in RSVQA is then exposed through all of these methods, by training models that discard the image data and by manipulating the visual input during inference. Finally, a detailed analysis of the question-answer distribution demonstrates that the root of the problem lies in the data itself. Through this analytical study, we observe that biases in remote sensing are more severe than in standard VQA, likely due to the specifics of existing remote sensing datasets for the task, e.g., geographical similarities and sparsity, as well as simpler vocabularies and question generation strategies. While new, improved, and less-biased datasets appear necessary for the development of the promising field of RSVQA, we demonstrate that more informative, relative evaluation metrics remain much needed to transparently communicate results of future RSVQA methods.
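
As a minimal sketch of the visual blind probe mentioned above, the snippet below trains a classifier on questions alone, with no image input; if such a model approaches the accuracy of a full RSVQA model, language biases in the data are doing most of the work. The bag-of-words encoder and logistic regression are illustrative stand-ins, not the authors' architecture.

```python
# Minimal sketch of a question-only ("visual blind") RSVQA baseline.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

def train_blind_baseline(questions, answers):
    # questions: list of question strings; answers: list of answer labels.
    # No image features are used anywhere in this model.
    model = make_pipeline(CountVectorizer(), LogisticRegression(max_iter=1000))
    model.fit(questions, answers)
    return model
```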


VIDIMU. Multimodal video and IMU kinematic dataset on daily life activities using affordable devices

Martínez-Zarzuela, Mario, González-Alonso, Javier, Antón-Rodríguez, Míriam, Díaz-Pernas, Francisco J., Müller, Henning, Simón-Martínez, Cristina

arXiv.org Artificial Intelligence

Human activity recognition and clinical biomechanics are challenging problems in physical telerehabilitation medicine. However, most publicly available datasets on human body movements cannot be used to study both problems in an out-of-the-lab movement acquisition setting. The objective of the VIDIMU dataset is to pave the way towards affordable patient tracking solutions for remote daily life activities recognition and kinematic analysis. The dataset includes 13 activities registered using a commodity camera and five inertial sensors. The video recordings were acquired from 54 subjects, 16 of whom also had simultaneous recordings from the inertial sensors. The novelty of VIDIMU lies in: i) the clinical relevance of the chosen movements, ii) the combined utilization of affordable video and custom sensors, and iii) the implementation of state-of-the-art tools for multimodal data processing of 3D body pose tracking and motion reconstruction in a musculoskeletal model from inertial data. The validation confirms that a minimally disturbing acquisition protocol, performed under real-life conditions, can provide a comprehensive picture of human joint angles during daily life activities.


Gradient-Based Learning of Discrete Structured Measurement Operators for Signal Recovery

Sauder, Jonathan, Genzel, Martin, Jung, Peter

arXiv.org Artificial Intelligence

Countless signal processing applications include the reconstruction of signals from few indirect linear measurements. The design of effective measurement operators is typically constrained by the underlying hardware and physics, posing a challenging and often even discrete optimization task. While the potential of gradient-based learning via the unrolling of iterative recovery algorithms has been demonstrated, it has remained unclear how to leverage this technique when the set of admissible measurement operators is structured and discrete. We tackle this problem by combining unrolled optimization with Gumbel reparametrizations, which enable the computation of low-variance gradient estimates of categorical random variables. Our approach is formalized by GLODISMO (Gradient-based Learning of DIscrete Structured Measurement Operators). This novel method is easy-to-implement, computationally efficient, and extendable due to its compatibility with automatic differentiation. We empirically demonstrate the performance and flexibility of GLODISMO in several prototypical signal recovery applications, verifying that the learned measurement matrices outperform conventional designs based on randomization as well as discrete optimization baselines.
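
As a generic illustration of the Gumbel reparametrization the method builds on (not the GLODISMO implementation itself), the snippet below draws a relaxed categorical sample: logits over admissible discrete choices are perturbed with Gumbel noise and passed through a temperature-controlled softmax, so gradients can flow through the selection.

```python
# Minimal sketch of a Gumbel-softmax (relaxed categorical) sample in PyTorch.
import torch

def gumbel_softmax_sample(logits, tau=1.0):
    # logits: (..., num_categories) unnormalized scores over discrete choices.
    gumbel_noise = -torch.log(-torch.log(torch.rand_like(logits) + 1e-20) + 1e-20)
    return torch.softmax((logits + gumbel_noise) / tau, dim=-1)
```

At low temperatures tau the sample approaches a one-hot choice; annealing tau during unrolled training is a common way to end up with a discrete measurement operator.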


SELFIES and the future of molecular string representations

Krenn, Mario, Ai, Qianxiang, Barthel, Senja, Carson, Nessa, Frei, Angelo, Frey, Nathan C., Friederich, Pascal, Gaudin, Théophile, Gayle, Alberto Alexander, Jablonka, Kevin Maik, Lameiro, Rafael F., Lemm, Dominik, Lo, Alston, Moosavi, Seyed Mohamad, Nápoles-Duarte, José Manuel, Nigam, AkshatKumar, Pollice, Robert, Rajan, Kohulan, Schatzschneider, Ulrich, Schwaller, Philippe, Skreta, Marta, Smit, Berend, Strieth-Kalthoff, Felix, Sun, Chong, Tom, Gary, von Rudorff, Guido Falk, Wang, Andrew, White, Andrew, Young, Adamo, Yu, Rose, Aspuru-Guzik, Alán

arXiv.org Artificial Intelligence

Artificial intelligence (AI) and machine learning (ML) are expanding in popularity for broad applications to challenging tasks in chemistry and materials science. Examples include the prediction of properties, the discovery of new reaction pathways, or the design of new molecules. The machine needs to read and write fluently in a chemical language for each of these tasks. Strings are a common tool to represent molecular graphs, and the most popular molecular string representation, SMILES, has powered cheminformatics since the late 1980s. However, in the context of AI and ML in chemistry, SMILES has several shortcomings -- most pertinently, most combinations of symbols yield strings with no valid chemical interpretation. To overcome this issue, a new language for molecules was introduced in 2020 that guarantees 100% robustness: SELFIES (SELF-referencIng Embedded Strings). SELFIES has since simplified and enabled numerous new applications in chemistry. In this manuscript, we look to the future and discuss molecular string representations, along with their respective opportunities and challenges. We propose 16 concrete Future Projects for robust molecular representations. These involve the extension toward new chemical domains, exciting questions at the interface of AI and robust languages, and interpretability for both humans and machines. We hope that these proposals will inspire several follow-up works exploiting the full potential of molecular string representations for the future of AI in chemistry and materials science.
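
The robustness guarantee is easy to see in practice with the open-source selfies Python package, which translates between SMILES and SELFIES; every syntactically valid SELFIES string decodes to a valid molecule.

```python
# Round trip between SMILES and SELFIES with the `selfies` package.
import selfies as sf

smiles = "CC(=O)Oc1ccccc1C(=O)O"      # aspirin as a SMILES string
selfies_str = sf.encoder(smiles)       # translate SMILES -> SELFIES
recovered = sf.decoder(selfies_str)    # SELFIES -> SMILES; always a valid molecule
print(selfies_str)
print(recovered)
```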


Reinforcement Learning based Collective Entity Alignment with Adaptive Features

Zeng, Weixin, Zhao, Xiang, Tang, Jiuyang, Lin, Xuemin, Groth, Paul

arXiv.org Artificial Intelligence

Entity alignment (EA) is the task of identifying the entities that refer to the same real-world object but are located in different knowledge graphs (KGs). For entities to be aligned, existing EA solutions treat them separately and generate alignment results as ranked lists of entities on the other side. Nevertheless, this decision-making paradigm fails to take into account the interdependence among entities. Although some recent efforts mitigate this issue by imposing the 1-to-1 constraint on the alignment process, they still cannot adequately model the underlying interdependence and the results tend to be sub-optimal. To fill in this gap, in this work, we delve into the dynamics of the decision-making process, and offer a reinforcement learning (RL) based model to align entities collectively. Under the RL framework, we devise the coherence and exclusiveness constraints to characterize the interdependence and restrict collective alignment. Additionally, to generate more precise inputs to the RL framework, we employ representative features to capture different aspects of the similarity between entities in heterogeneous KGs, which are integrated by an adaptive feature fusion strategy. Our proposal is evaluated on both cross-lingual and mono-lingual EA benchmarks and compared against state-of-the-art solutions. The empirical results verify its effectiveness and superiority.
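
As a minimal sketch of the exclusiveness constraint in isolation, the snippet below enforces a 1-to-1 assignment over a similarity matrix with the Hungarian algorithm, so no two source entities can claim the same target; the paper's RL formulation instead makes these decisions sequentially and also models coherence, so this only illustrates the collective idea.

```python
# Minimal sketch of a 1-to-1 (exclusive) collective alignment from a
# fused similarity matrix, in contrast to independent per-entity ranking.
import numpy as np
from scipy.optimize import linear_sum_assignment

def collective_alignment(similarity):
    # similarity: (n_source, n_target) matrix of fused entity similarities.
    rows, cols = linear_sum_assignment(-similarity)  # maximize total similarity
    return list(zip(rows.tolist(), cols.tolist()))
```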


Artificial Intelligence ordered 3D vertex importance

Vasic, Iva, Vasic, Bata, Nikolic, Zorica

arXiv.org Artificial Intelligence

Ranking the vertices of multidimensional networks is crucial in many areas of research, including selecting decisions and determining their importance. Some decisions are significantly more important than others, and categorizing their weights is also important. This paper defines a new method that uses artificial intelligence to determine decision weights for the importance ranking of three-dimensional network vertices, improving the existing Ordered Statistics Vertex Extraction and Tracking Algorithm (OSVETA) based on quantization index modulation (QIM) and error correction codes. The technique we propose offers significant improvements in the efficiency of determining the importance of network vertices compared with the statistical OSVETA criteria, replacing heuristic methods with the precise predictions of modern neural networks. The new artificial intelligence technique enables a significantly better description of 3D meshes and a better assessment of their topological features. These contributions result in greater precision in identifying stable vertices, significantly reducing the probability that mesh vertices are deleted.