
CausalRivers -- Scaling up benchmarking of causal discovery for real-world time-series

arXiv.org Machine Learning

Causal discovery, or identifying causal relationships from observational data, is a notoriously challenging task, and numerous methods have been proposed to tackle it. Despite this, in-the-wild evaluation of these methods is still lacking: works frequently rely on synthetic evaluation data and on sparse real-world examples that hinge on strong theoretical assumptions. Real-world causal structures, however, are often complex, making it hard to decide on a proper causal discovery strategy. To bridge this gap, we introduce CausalRivers, the largest in-the-wild causal discovery benchmarking kit for time-series data to date. CausalRivers features an extensive dataset on river discharge that covers the eastern German territory (666 measurement stations) and the state of Bavaria (494 measurement stations). It spans the years 2019 to 2023 with a 15-minute temporal resolution. Further, we provide additional data from a flood around the Elbe River as an event with a pronounced distributional shift. Leveraging multiple sources of information and time-series meta-data, we constructed two distinct causal ground-truth graphs (Bavaria and eastern Germany). These graphs can be sampled to generate thousands of subgraphs for benchmarking causal discovery across diverse and challenging settings. To demonstrate the utility of CausalRivers, we evaluate several causal discovery approaches through a set of experiments that identify areas for improvement. CausalRivers has the potential to facilitate robust evaluations and comparisons of causal discovery methods. Beyond this primary purpose, we also expect this dataset to be relevant for connected areas of research, such as time-series forecasting and anomaly detection. With this, we hope to encourage the kind of benchmark-driven method development that has advanced many other areas of machine learning.
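
To make the evaluation setting concrete, here is a minimal sketch of how a predicted causal graph can be scored against one of the sampled ground-truth subgraphs, using an edge-wise F1 score (the function and array names are illustrative assumptions, not the benchmark kit's actual API):

```python
import numpy as np

def edge_f1(pred_adj: np.ndarray, true_adj: np.ndarray) -> float:
    """Edge-wise F1 between predicted and ground-truth causal adjacency matrices."""
    pred, true = pred_adj.astype(bool), true_adj.astype(bool)
    tp = np.sum(pred & true)   # correctly recovered edges
    fp = np.sum(pred & ~true)  # spurious edges
    fn = np.sum(~pred & true)  # missed edges
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```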


Graph Neural Networks embedded into Margules model for vapor-liquid equilibria prediction

arXiv.org Artificial Intelligence

Predictive thermodynamic models are crucial for the early stages of product and process design. In this paper, the performance of Graph Neural Networks (GNNs) embedded into a relatively simple excess Gibbs energy model, the extended Margules model, is analyzed for predicting vapor-liquid equilibria (VLE). A comparison against the established UNIFAC-Dortmund model shows that the GNN-based Margules model achieves an overall lower accuracy, although it is more accurate for several types of binary mixtures. Moreover, since group contribution methods such as UNIFAC are limited by the feasibility of molecular fragmentation and the availability of parameters, the GNN-based Margules model offers an alternative for VLE estimation. The findings establish a baseline for the predictive accuracy that simple excess Gibbs energy models combined with GNNs trained solely on infinite dilution data can achieve.
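
For context, here is a hedged sketch of how a two-parameter Margules model turns two interaction parameters into a VLE prediction via modified Raoult's law; the paper's extended variant and its GNN parameterization may differ, and all names here are illustrative:

```python
import numpy as np

# Two-parameter Margules: A12 and A21 equal the log activity coefficients
# at infinite dilution, which is why models trained solely on infinite
# dilution data can fix them.
def margules_bubble_point(x1, A12, A21, psat1, psat2):
    x2 = 1.0 - x1
    ln_g1 = x2**2 * (A12 + 2.0 * (A21 - A12) * x1)
    ln_g2 = x1**2 * (A21 + 2.0 * (A12 - A21) * x2)
    g1, g2 = np.exp(ln_g1), np.exp(ln_g2)
    P = x1 * g1 * psat1 + x2 * g2 * psat2  # modified Raoult's law: y_i P = x_i g_i Psat_i
    y1 = x1 * g1 * psat1 / P               # vapor-phase mole fraction
    return P, y1
```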


An analysis of optimization problems involving ReLU neural networks

arXiv.org Artificial Intelligence

Solving mixed-integer optimization problems with embedded neural networks that use ReLU activation functions is challenging. Big-M coefficients that arise when relaxing the binary decisions related to these functions grow exponentially with the number of layers. We survey and propose different approaches to analyze and improve the run-time behavior of mixed-integer programming solvers in this context. Among them are clipped variants and regularization techniques applied during training, as well as optimization-based bound tightening and a novel scaling for given ReLU networks. We numerically compare these approaches on three benchmark problems from the literature, using the number of linear regions, the percentage of stable neurons, and overall computational effort as indicators. As a major takeaway, we observe and quantify a trade-off between the often desired redundancy of neural network models and the computational cost of solving the related optimization problems.
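
As a sketch of the standard encoding this line of work builds on (the paper's exact formulation may differ), a ReLU neuron $y = \max(0, a)$ with pre-activation $a = w^\top x + b$ and known bounds $\ell \le a \le u$ is modeled with one binary variable $z$:

$$y \ge a, \quad y \ge 0, \quad y \le a - \ell\,(1 - z), \quad y \le u\,z, \quad z \in \{0, 1\}.$$

Here $\ell$ and $u$ play the role of the big-M coefficients: the tighter these bounds (e.g. via optimization-based bound tightening), the stronger the linear relaxation, which is why bound quality dominates solver run time.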


Generative Discrimination: What Happens When Generative AI Exhibits Bias, and What Can Be Done About It

arXiv.org Artificial Intelligence

As generative Artificial Intelligence (genAI) technologies proliferate across sectors, they offer significant benefits but also risk exacerbating discrimination. This chapter explores how genAI intersects with non-discrimination laws, identifying shortcomings and suggesting improvements. It highlights two main types of discriminatory outputs: (i) demeaning and abusive content and (ii) subtler biases due to inadequate representation of protected groups, which may not be overtly discriminatory in individual cases but have cumulative discriminatory effects. For example, genAI systems may predominantly depict white men when asked for images of people in important jobs. This chapter examines these issues, categorizing problematic outputs into three legal categories: discriminatory content; harassment; and legally hard cases like unbalanced content, harmful stereotypes or misclassification. It argues for holding genAI providers and deployers liable for discriminatory outputs and highlights the inadequacy of traditional legal frameworks to address genAI-specific issues. The chapter suggests updating EU laws, including the AI Act, to mitigate biases in training and input data, mandating testing and auditing, and evolving legislation to enforce standards for bias mitigation and inclusivity as technology advances.


Learning reduced-order Quadratic-Linear models in Process Engineering using Operator Inference

arXiv.org Artificial Intelligence

In this work, we address the challenge of efficiently modeling dynamical systems in process engineering. We use reduced-order model learning, specifically operator inference, a non-intrusive, data-driven method for learning dynamical systems from time-domain data. To demonstrate its potential, we apply it to carbon dioxide methanation, an important reaction within the Power-to-X framework. The numerical results show that the reduced-order models constructed with operator inference provide a reduced yet accurate surrogate solution. This represents an important milestone towards the implementation of fast and reliable digital-twin architectures.
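
A minimal sketch of the operator inference regression for a quadratic-linear reduced model of the form dx/dt = A x + H (x kron x) + B u, assuming snapshot data X, time derivatives Xdot, and inputs U are available (names and shapes are illustrative assumptions, not the study's actual code):

```python
import numpy as np

def operator_inference(X, Xdot, U, r):
    """X: (n, k) snapshots, Xdot: (n, k) derivatives, U: (m, k) inputs, r: reduced dimension."""
    Phi, _, _ = np.linalg.svd(X, full_matrices=False)       # POD basis from snapshots
    Phi_r = Phi[:, :r]
    Xh, Xdh = Phi_r.T @ X, Phi_r.T @ Xdot                   # project onto the reduced space
    Q = np.einsum("ik,jk->ijk", Xh, Xh).reshape(r * r, -1)  # quadratic (Kronecker) terms
    D = np.vstack([Xh, Q, U])                               # regression data matrix
    O = Xdh @ np.linalg.pinv(D)                             # least-squares fit of [A | H | B]
    A, H, B = O[:, :r], O[:, r:r + r * r], O[:, r + r * r:]
    return Phi_r, A, H, B
```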


Weakly-supervised segmentation using inherently-explainable classification models and their application to brain tumour classification

arXiv.org Artificial Intelligence

Deep learning models have shown their potential for several applications. However, most of these models are opaque and difficult to trust due to their complex reasoning, commonly known as the black-box problem. Some fields, such as medicine, require a high degree of transparency to accept and adopt such technologies. Consequently, creating explainable/interpretable models, or applying post-hoc methods to classifiers, is required to build trust in deep learning models. Moreover, deep learning methods can be used for segmentation tasks, which typically require hard-to-obtain, time-consuming, manually-annotated segmentation labels for training. This paper introduces three inherently-explainable classifiers that tackle both of these problems as one. The localisation heatmaps provided by the networks, which represent the models' focus areas and are used in classification decision-making, can be interpreted directly, without requiring any post-hoc methods to derive information for model explanation. The models are trained using the input image and only the classification labels as ground truth in a supervised fashion, without any information about the location of the region of interest (i.e. the segmentation labels), making the segmentation training of the models weakly supervised through classification labels. The final segmentation is obtained by thresholding these heatmaps. The models were employed for the task of multi-class brain tumour classification using two different datasets, achieving the best F1-score of 0.93 for the supervised classification task while securing a median Dice score of 0.67±0.08 for the weakly-supervised segmentation task. Furthermore, the accuracy obtained on a subset of tumour-only images outperformed state-of-the-art binary glioma tumour grading classifiers, with the best model achieving 98.7% accuracy.
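
A minimal sketch of the thresholding step described above, turning a localisation heatmap into a binary segmentation and scoring it with the Dice coefficient (the threshold value and function names are illustrative assumptions, not the paper's reported settings):

```python
import numpy as np

def heatmap_to_mask(heatmap: np.ndarray, threshold: float = 0.5) -> np.ndarray:
    """Normalise the heatmap to [0, 1] and threshold it into a binary mask."""
    h = (heatmap - heatmap.min()) / (heatmap.max() - heatmap.min() + 1e-8)
    return h >= threshold

def dice(pred: np.ndarray, target: np.ndarray) -> float:
    """Dice overlap between two binary masks."""
    inter = np.sum(pred & target)
    return 2.0 * inter / (pred.sum() + target.sum() + 1e-8)
```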


PULASki: Learning inter-rater variability using statistical distances to improve probabilistic segmentation

arXiv.org Artificial Intelligence

In the domain of medical imaging, many supervised learning-based methods for segmentation face several challenges, such as high variability in annotations from multiple experts, paucity of labelled data, and class-imbalanced datasets. These issues may result in segmentations that lack the requisite precision for clinical analysis and can be misleadingly overconfident without associated uncertainty quantification. We propose PULASki, a method for biomedical image segmentation that accurately captures variability in expert annotations, even in small datasets. Our approach makes use of an improved loss function based on statistical distances in a conditional variational autoencoder structure (Probabilistic UNet), which improves learning of the conditional decoder compared to the standard cross-entropy, particularly in class-imbalanced problems. We analyse our method on two structurally different segmentation tasks (intracranial vessels and multiple sclerosis (MS) lesions) and compare our results to four well-established baselines in terms of quantitative metrics and qualitative output. Empirical results demonstrate that the PULASki method outperforms all baselines at the 5% significance level. The generated segmentations are shown to be much more anatomically plausible than in the 2D case, particularly for the vessel task. Our method can also be applied to a wide range of multi-label segmentation tasks and is useful for downstream applications such as hemodynamic modelling (computational fluid dynamics and data assimilation), clinical decision making, and treatment planning.
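
As an illustration of comparing a set of predicted segmentations against a set of expert annotations with a statistical distance, here is a simple (biased) energy-distance estimator; this is a hypothetical stand-in, not the paper's actual loss inside the Probabilistic UNet:

```python
import numpy as np

def energy_distance(preds: np.ndarray, raters: np.ndarray) -> float:
    """preds, raters: (num_samples, num_pixels) flattened soft masks (illustrative shapes)."""
    d = lambda A, B: np.mean(np.linalg.norm(A[:, None, :] - B[None, :, :], axis=-1))
    return 2.0 * d(preds, raters) - d(preds, preds) - d(raters, raters)
```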


MICDIR: Multi-scale Inverse-consistent Deformable Image Registration using UNetMSS with Self-Constructing Graph Latent

arXiv.org Artificial Intelligence

Image registration is the process of bringing different images into a common coordinate system, a technique widely used in various applications of computer vision, such as remote sensing, image retrieval, and, most commonly, medical imaging. Deep learning-based techniques have been applied successfully to tackle various complex medical image processing problems, including medical image registration, and several registration techniques using deep learning have been proposed over the years. Deformable image registration techniques such as VoxelMorph have been successful in capturing finer changes and providing smoother deformations. However, VoxelMorph, as well as ICNet and FIRE, do not explicitly encode global dependencies (i.e. the overall anatomical view of the supplied image) and therefore cannot track large deformations. To tackle these problems, this paper extends the VoxelMorph approach in three ways. To improve performance for small as well as large deformations, supervision of the model at different resolutions has been integrated using a multi-scale UNet. To help the network learn and encode the minute structural correlations of the given image pairs, a self-constructing graph network (SCGNet) has been used as the latent of the multi-scale UNet, which can improve the learning process of the model and help it generalise better. Finally, to make the deformations inverse-consistent, a cycle-consistency loss has been employed. On the task of registering brain MRIs, the proposed method achieved significant improvements over ANTs and VoxelMorph, obtaining a Dice score of 0.8013 ± 0.0243 for intramodal and 0.6211 ± 0.0309 for intermodal registration, while VoxelMorph achieved 0.7747 ± 0.0260 and 0.6071 ± 0.0510, respectively.
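
A minimal sketch of the cycle-consistency idea: composing the forward and backward deformations should recover the identity map, so the composed displacement is penalised (the 2D setting, shapes, and interpolation order are illustrative assumptions; the paper registers 3D MRIs):

```python
import numpy as np
from scipy.ndimage import map_coordinates

def cycle_consistency_loss(u_ab: np.ndarray, u_ba: np.ndarray) -> float:
    """u_ab, u_ba: (2, H, W) displacement fields for the two registration directions."""
    grid = np.stack(np.meshgrid(*[np.arange(s) for s in u_ba.shape[1:]], indexing="ij"))
    warped = grid + u_ba  # positions x + u_ba(x)
    # Sample u_ab at the warped positions, channel by channel.
    u_ab_w = np.stack([map_coordinates(c, warped, order=1, mode="nearest") for c in u_ab])
    comp = u_ba + u_ab_w  # composed displacement; zero iff the maps invert each other
    return float(np.mean(comp ** 2))
```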


Liver Segmentation in Time-resolved C-arm CT Volumes Reconstructed from Dynamic Perfusion Scans using Time Separation Technique

arXiv.org Artificial Intelligence

Perfusion imaging is a valuable tool for the diagnosis and treatment planning of liver tumours. The time separation technique (TST) has been used successfully for modelling C-arm cone-beam computed tomography (CBCT) perfusion data. The reconstruction can be accompanied by a segmentation of the liver, both for better visualisation and for generating comprehensive perfusion maps. The recently introduced Turbolift learning has been seen to perform well on TST reconstructions, but has not been explored for the time-resolved volumes (TRVs) estimated from TST reconstructions. Segmenting the TRVs can be useful for tracking the movement of the liver over time. This research explores this possibility by training the multi-scale attention UNet of Turbolift learning at its third stage on the TRVs, and shows the robustness of Turbolift learning, which works efficiently even with the TRVs, resulting in a Dice score of 0.864 ± 0.004.
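
Assuming TST's usual factorization, in which each voxel's time-attenuation curve is a linear combination of a few temporal basis functions with voxel-wise weights, the time-resolved volumes fed to the segmentation network can be assembled as below (a sketch under that assumption; names and shapes are illustrative, not the cited implementation):

```python
import numpy as np

def time_resolved_volumes(W: np.ndarray, Psi: np.ndarray) -> np.ndarray:
    """W: (K, X, Y, Z) voxel-wise weights, Psi: (K, T) temporal bases -> (T, X, Y, Z) TRVs."""
    return np.einsum("kxyz,kt->txyz", W, Psi)
```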


Indexed Minimum Empirical Divergence for Unimodal Bandits

arXiv.org Artificial Intelligence

We consider a multi-armed bandit problem specified by a set of one-dimensional exponential-family distributions endowed with a unimodal structure. We introduce IMED-UB, an algorithm that optimally exploits this unimodal structure by adapting the Indexed Minimum Empirical Divergence (IMED) algorithm introduced by Honda and Takemura [2015] to this setting. Owing to our proof technique, we are able to provide a concise finite-time analysis of the IMED-UB algorithm. Numerical experiments show that IMED-UB competes with state-of-the-art algorithms.
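
A minimal sketch of the arm-selection rule for Bernoulli rewards, assuming the standard IMED index $N_a \, \mathrm{KL}(\hat{\mu}_a, \hat{\mu}^*) + \log N_a$ restricted to the empirical leader and its neighbours in the unimodal structure (the details of IMED-UB may differ; every arm is assumed to have been pulled at least once):

```python
import numpy as np

def bernoulli_kl(p: float, q: float, eps: float = 1e-12) -> float:
    """KL divergence between Bernoulli(p) and Bernoulli(q)."""
    p, q = np.clip(p, eps, 1 - eps), np.clip(q, eps, 1 - eps)
    return p * np.log(p / q) + (1 - p) * np.log((1 - p) / (1 - q))

def imed_ub_choice(means: np.ndarray, counts: np.ndarray) -> int:
    """Pick the leader or one of its neighbours, whichever has the minimal IMED index."""
    leader = int(np.argmax(means))
    candidates = [a for a in (leader - 1, leader, leader + 1) if 0 <= a < len(means)]
    index = [counts[a] * bernoulli_kl(means[a], means[leader]) + np.log(counts[a])
             for a in candidates]
    return candidates[int(np.argmin(index))]
```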