
A Proofs. Proposition 1: The mapping f: R^D → V

Neural Information Processing Systems

See the proof of Proposition 3 below for the form of the Jacobian. […] Theorem 4.7], and so is the product p […]. Intermediate steps above used the following gradient identities. Following this, the first step is the same as in the forward procedure: we solve for […] and recover x by inverting Step 2 of the forward procedure. Since […] will be the same in all dimensions, we can simply pick a dimension in Equation (51). C.1 UCI Data Sets: The main preprocessing we did was to (i) remove the "label" attribute from each data set, and (ii) remove attributes that only ever take on one value.


SPO: Sequential Monte Carlo Policy Optimisation

Neural Information Processing Systems

Leveraging planning during learning and decision-making is central to the long-term development of intelligent agents. Recent works have successfully combined tree-based search methods and self-play learning mechanisms to this end. However, these methods typically face scaling challenges due to the sequential nature of their search. While practical engineering solutions can partly overcome this, they often come at a cost to performance. In this paper, we introduce SPO: Sequential Monte Carlo Policy Optimisation, a model-based reinforcement learning algorithm grounded in the Expectation Maximisation (EM) framework. We show that SPO provides robust policy improvement and efficient scaling properties. Its sample-based search makes it directly applicable to both discrete and continuous action spaces without modification. We demonstrate statistically significant performance improvements over model-free and model-based baselines across both continuous and discrete environments. Furthermore, the parallel nature of SPO's search enables effective utilisation of hardware accelerators, yielding favourable scaling laws.
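For flavour, the sketch below shows the generic Sequential Monte Carlo planning loop this family of methods builds on; it is a rough illustration only, not the authors' implementation. Particles of action sequences are proposed from the current policy, reweighted by exponentiated model returns (the EM-style soft-optimality target), and resampled. Here `policy_sample` and `model_step` are assumed user-supplied stand-ins for the learned policy and dynamics/reward model.

```python
import numpy as np

rng = np.random.default_rng(0)

def smc_policy_search(state, policy_sample, model_step, n_particles=64,
                      horizon=8, temperature=1.0):
    """Generic SMC search sketch: propagate action-sequence particles through
    a model, reweight by exp(return / temperature), and resample, so that the
    surviving weighted first actions approximate an improved policy."""
    states = np.repeat(state[None, :], n_particles, axis=0)
    log_w = np.zeros(n_particles)            # log importance weights
    first_actions = None
    for t in range(horizon):
        actions = policy_sample(states)      # proposal: the current policy
        states, rewards = model_step(states, actions)
        log_w += rewards / temperature       # soft-optimality reweighting
        if t == 0:
            first_actions = actions
        w = np.exp(log_w - log_w.max())
        w /= w.sum()
        # resample when the effective sample size collapses
        if 1.0 / (w ** 2).sum() < n_particles / 2:
            idx = rng.choice(n_particles, size=n_particles, p=w)
            states, first_actions = states[idx], first_actions[idx]
            log_w[:] = 0.0
    w = np.exp(log_w - log_w.max())
    w /= w.sum()
    return first_actions, w                  # weighted actions = policy target
```

Because every particle is advanced in lockstep with batched array operations, this style of search parallelises naturally on accelerators, which is the scaling property the abstract highlights.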




MLFMF: Data Sets for Machine Learning for Mathematical Formalization, Supplementary Material. Matej Petković, Faculty of Mathematics and Physics, University of Ljubljana

Neural Information Processing Systems

This document provides several pieces of meta-information about the MLFMF data set collection, as well as some additional details and results from the experiments. For a detailed description of the preprocessing scripts and the script for running the model, please refer to the README in the repository. Due to space limitations, all the preprocessed data is hosted separately at https://doi.org/10.5281/zenodo.10041075. We obtained the source code of the libraries from their publicly available GitHub repositories; at the time of collection, we retrieved the latest versions of the libraries, which are specified in Table 1.



Solving Word-Sense Disambiguation and Word-Sense Induction with Dictionary Examples

arXiv.org Artificial Intelligence

Many less-resourced languages struggle with a lack of the large, task-specific datasets required for solving relevant tasks with modern transformer-based large language models (LLMs). On the other hand, many linguistic resources, such as dictionaries, are rarely used in this context despite their rich information content. We show how LLMs can be used to extend existing language resources in less-resourced languages for two important tasks: word-sense disambiguation (WSD) and word-sense induction (WSI). We approach the two tasks through the related but much more accessible word-in-context (WiC) task, where, given a pair of sentences and a target word, a classification model is tasked with predicting whether the sense of the word differs between the sentences. We demonstrate that a well-trained model for this task can distinguish between different word senses and can be adapted to solve the WSD and WSI tasks. The advantage of using the WiC task, instead of directly predicting senses, is that it does not require pre-constructed sense inventories with a sufficient number of examples for each sense, which are rarely available in less-resourced languages. We show that sentence pairs for the WiC task can be successfully generated from dictionary examples using LLMs. The resulting prediction models outperform existing models on the WiC, WSD, and WSI tasks. We demonstrate our methodology on Slovene, where a monolingual dictionary is available but word-sense resources are scarce.
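As a rough sketch of the WiC setup described above (not the paper's model), a standard transformer sentence-pair classifier can be queried with two usages of a target word. The checkpoint name and the label mapping below are hypothetical placeholders.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Hypothetical checkpoint; any binary sentence-pair classifier fits this role.
MODEL = "my-org/wic-slovene-classifier"

tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL)
model.eval()

def same_sense(sentence_a: str, sentence_b: str) -> bool:
    """Predict whether the target word keeps the same sense in both sentences."""
    inputs = tokenizer(sentence_a, sentence_b, truncation=True,
                       return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    return bool(logits.argmax(dim=-1).item())  # assumed: label 1 = same sense
```

Under this reduction, WSD amounts to comparing a new usage against the dictionary example sentences for each sense and choosing the sense judged "same" most often, while WSI clusters usages using pairwise WiC predictions as the similarity signal.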


Multispectral to Hyperspectral using Pretrained Foundational model

arXiv.org Artificial Intelligence

Ruben Gonzalez*, Conrad M. Albrecht, and Nassim Ait Ali Braham (Remote Sensing Technology Institute, German Aerospace Center (DLR), Germany); Devyani Lambhate*, Joao Lucas de Sousa Almeida, Paolo Fraccaro, Benedikt Blumenstiel, Thomas Brunschwiler, and Ranjini Bangalore (IBM Research Labs, India, U.K., Zurich, Brazil). February 28, 2025. Abstract: Hyperspectral imaging provides detailed spectral information, offering significant potential for monitoring greenhouse gases (GHGs) like CH4 and NO2. However, its application is constrained by limited spatial coverage and infrequent revisit times. In contrast, multispectral imaging delivers broader spatial and temporal coverage but lacks the spectral granularity required for precise GHG detection. To address these challenges, this study proposes Spectral and Spatial-Spectral transformer models that reconstruct hyperspectral data from multispectral inputs. The models in this paper are pretrained on the EnMAP and EMIT datasets and fine-tuned on spatio-temporally aligned (Sentinel-2, EnMAP) and (HLS-S30, EMIT) image pairs, respectively. Our model has the potential to enhance atmospheric monitoring by combining the strengths of hyperspectral and multispectral imaging systems. 1 Introduction: Satellite images are being used to create detailed maps of Earth's surface.
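To make the band-to-band idea concrete, here is a toy PyTorch sketch; the per-pixel formulation, dimensions, and layer sizes are illustrative assumptions, not the paper's architecture. Each multispectral band becomes a token, a transformer encoder mixes the bands, and a linear head regresses the hyperspectral bands.

```python
import torch
import torch.nn as nn

class SpectralTransformer(nn.Module):
    """Toy band-to-band regressor: embed each multispectral band as a token,
    mix bands with a transformer encoder, and decode hyperspectral bands."""
    def __init__(self, n_ms_bands=13, n_hs_bands=224, d_model=64):
        super().__init__()
        self.embed = nn.Linear(1, d_model)          # per-band scalar -> token
        self.band_pos = nn.Parameter(torch.zeros(n_ms_bands, d_model))
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(n_ms_bands * d_model, n_hs_bands)

    def forward(self, x):                           # x: (batch, n_ms_bands)
        tokens = self.embed(x.unsqueeze(-1)) + self.band_pos
        mixed = self.encoder(tokens)                # (batch, bands, d_model)
        return self.head(mixed.flatten(1))          # (batch, n_hs_bands)

spectra = torch.rand(8, 13)                 # e.g. 13-band Sentinel-2-like pixels
print(SpectralTransformer()(spectra).shape)  # torch.Size([8, 224])
```

A Spatial-Spectral variant would additionally tokenise local image patches so that the model can exploit spatial context, at the cost of a much larger token sequence per prediction.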


Extracting domain-specific terms using contextual word embeddings

arXiv.org Artificial Intelligence

Automated terminology extraction refers to the task of extracting meaningful terms from domain-specific texts. This paper proposes a novel machine learning approach to terminology extraction, which combines features from traditional term extraction systems with novel contextual features derived from contextual word embeddings. Instead of using a predefined list of part-of-speech patterns, we first analyse a new term-annotated corpus, RSDO5, for the Slovenian language and devise a set of rules for term candidate selection, and then generate statistical, linguistic, and context-based features. We use a support-vector machine algorithm to train a classification model, evaluate it on the four domains (biomechanics, linguistics, chemistry, veterinary science) of the RSDO5 corpus, and compare the results with state-of-the-art term extraction approaches for the Slovenian language. Our approach provides significant improvements in F1 score over the previous state of the art, showing that contextual word embeddings are valuable for improving term extraction. 1 Introduction: Automated terminology extraction (ATE) refers to the task of extracting meaningful terms from domain-specific texts. Terms are single-word units (SWUs) or multi-word units (MWUs) of knowledge that are relevant for a particular domain. Since manual identification of terms is costly and time-consuming, ATE approaches can reduce the effort needed to generate relevant domain-specific terms. Recognizing and extracting domain-specific terms, which is useful in various fields such as translation, dictionary creation, and ontology generation, remains a difficult task.
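A minimal sketch of the classification step described above, assuming hand-made feature vectors: the four columns are illustrative stand-ins for the statistical, linguistic, and context-based features, which the real system derives from the corpus and the contextual embeddings.

```python
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Illustrative features per term candidate, e.g. a frequency-based score, a
# termhood score, candidate length in words, and a contextual-embedding score.
X_train = [
    [0.82, 0.40, 2, 0.91],   # a domain term
    [0.10, 0.05, 1, 0.22],   # a general-language word
    [0.75, 0.35, 3, 0.88],
    [0.05, 0.02, 1, 0.15],
]
y_train = [1, 0, 1, 0]       # 1 = term, 0 = non-term

clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
clf.fit(X_train, y_train)
print(clf.predict([[0.70, 0.30, 2, 0.85]]))  # -> [1]
```

Scaling the features before the SVM matters here, since the statistical and embedding-based scores live on very different ranges.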


Make Literature-Based Discovery Great Again through Reproducible Pipelines

arXiv.org Artificial Intelligence

By connecting disparate sources of scientific literature, literature-based discovery (LBD) methods help to uncover new knowledge and generate new research hypotheses that cannot be found from domain-specific documents alone. Our work focuses on bisociative LBD methods that combine bisociative reasoning with LBD techniques. The paper presents LBD through the lens of reproducible science, aiming to ensure the reproducibility of LBD experiments, overcome the inconsistent use of benchmark datasets and methods, foster collaboration, and advance the LBD field toward more robust and impactful scientific discoveries. The main novelty of this study is a collection of Jupyter Notebooks that illustrate the steps of the bisociative LBD process, including data acquisition, text preprocessing, hypothesis formulation, and evaluation. The contributed notebooks implement a selection of traditional LBD approaches, as well as our own ensemble-based, outlier-based, and link prediction-based approaches. The reader can benefit from hands-on experience with LBD through open access to benchmark datasets, code reuse, and a ready-to-run Docker recipe that ensures reproducibility of the selected LBD methods.
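For a hands-on flavour of the link prediction-based style of LBD (a generic sketch, not the notebooks' code), one can score unobserved concept links in a co-occurrence graph; the classic Swanson fish-oil/Raynaud benchmark is used here purely for illustration, with a toy graph and an Adamic-Adar score.

```python
import networkx as nx

# Toy concept co-occurrence graph: in LBD, nodes are terms/concepts mined from
# two disjoint literatures and edges are co-mentions within documents.
G = nx.Graph()
G.add_edges_from([
    ("fish oil", "blood viscosity"),
    ("blood viscosity", "Raynaud's disease"),
    ("fish oil", "platelet aggregation"),
    ("platelet aggregation", "Raynaud's disease"),
    ("fish oil", "vascular reactivity"),
    ("vascular reactivity", "Raynaud's disease"),
])

# Score the unobserved link between the two literatures' endpoints: a high
# Adamic-Adar score over shared intermediate concepts suggests a hypothesis.
for u, v, score in nx.adamic_adar_index(G, [("fish oil", "Raynaud's disease")]):
    print(f"{u} -- {v}: {score:.3f}")
```

Ranking many such candidate pairs by their link prediction scores, and reading off the shared intermediate concepts, is one way to turn the graph into testable research hypotheses.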