Brune, Christoph
PDE-DKL: PDE-constrained deep kernel learning in high dimensionality
Yan, Weihao, Brune, Christoph, Guo, Mengwu
Many physics-informed machine learning methods for PDE-based problems rely on Gaussian processes (GPs) or neural networks (NNs). However, both face limitations when data are scarce and the dimensionality is high. Although GPs are known for their robust uncertainty quantification in low-dimensional settings, their computational complexity becomes prohibitive as the dimensionality increases. In contrast, while conventional NNs can accommodate high-dimensional input, they often require extensive training data and do not offer uncertainty quantification. To address these challenges, we propose a PDE-constrained Deep Kernel Learning (PDE-DKL) framework that combines DL and GPs under explicit PDE constraints. Specifically, NNs learn a low-dimensional latent representation of the high-dimensional PDE problem, reducing the complexity of the problem. GPs then perform kernel regression subject to the governing PDEs, ensuring accurate solutions and principled uncertainty quantification, even when available data are limited. This synergy unifies the strengths of both NNs and GPs, yielding high accuracy, robust uncertainty estimates, and computational efficiency for high-dimensional PDEs. Numerical experiments demonstrate that PDE-DKL achieves high accuracy with reduced data requirements. They highlight its potential as a practical, reliable, and scalable solver for complex PDE-based applications in science and engineering.
Deep Networks are Reproducing Kernel Chains
Heeringa, Tjeerd Jan, Spek, Len, Brune, Christoph
Identifying an appropriate function space for deep neural networks remains a key open question. While shallow neural networks are naturally associated with Reproducing Kernel Banach Spaces (RKBS), deep networks present unique challenges. In this work, we extend RKBS to chain RKBS (cRKBS), a new framework that composes kernels rather than functions, preserving the desirable properties of RKBS. We prove that any deep neural network function is a neural cRKBS function, and conversely, any neural cRKBS function defined on a finite dataset corresponds to a deep neural network. This approach provides a sparse solution to the empirical risk minimization problem, requiring no more than $N$ neurons per layer, where $N$ is the number of data points.
Deep vectorised operators for pulsatile hemodynamics estimation in coronary arteries from a steady-state prior
Suk, Julian, Nannini, Guido, Rygiel, Patryk, Brune, Christoph, Pontone, Gianluca, Redaelli, Alberto, Wolterink, Jelmer M.
Cardiovascular hemodynamic fields provide valuable medical decision markers for coronary artery disease. Computational fluid dynamics (CFD) is the gold standard for accurate, non-invasive evaluation of these quantities in vivo. In this work, we propose a time-efficient surrogate model, powered by machine learning, for the estimation of pulsatile hemodynamics based on steady-state priors. We introduce deep vectorised operators, a modelling framework for discretisation independent learning on infinite-dimensional function spaces. The underlying neural architecture is a neural field conditioned on hemodynamic boundary conditions. Importantly, we show how relaxing the requirement of point-wise action to permutation-equivariance leads to a family of models that can be parametrised by message passing and self-attention layers. We evaluate our approach on a dataset of 74 stenotic coronary arteries extracted from coronary computed tomography angiography (CCTA) with patient-specific pulsatile CFD simulations as ground truth. We show that our model produces accurate estimates of the pulsatile velocity and pressure while being agnostic to re-sampling of the source domain (discretisation independence). This shows that deep vectorised operators are a powerful modelling tool for cardiovascular hemodynamics estimation in coronary arteries and beyond.
Sparsifying dimensionality reduction of PDE solution data with Bregman learning
Heeringa, Tjeerd Jan, Brune, Christoph, Guo, Mengwu
Classical model reduction techniques project the governing equations onto a linear subspace of the original state space. More recent data-driven techniques use neural networks to enable nonlinear projections. Whilst those often enable stronger compression, they may have redundant parameters and lead to suboptimal latent dimensionality. To overcome these, we propose a multistep algorithm that induces sparsity in the encoder-decoder networks for effective reduction in the number of parameters and additional compression of the latent space. This algorithm starts with sparsely initialized a network and training it using linearized Bregman iterations. These iterations have been very successful in computer vision and compressed sensing tasks, but have not yet been used for reduced-order modelling. After the training, we further compress the latent space dimensionality by using a form of proper orthogonal decomposition. Last, we use a bias propagation technique to change the induced sparsity into an effective reduction of parameters. We apply this algorithm to three representative PDE models: 1D diffusion, 1D advection, and 2D reaction-diffusion. Compared to conventional training methods like Adam, the proposed method achieves similar accuracy with 30% less parameters and a significantly smaller latent space.
Learning a Sparse Representation of Barron Functions with the Inverse Scale Space Flow
Heeringa, Tjeerd Jan, Roith, Tim, Brune, Christoph, Burger, Martin
This paper presents a method for finding a sparse representation of Barron functions. Specifically, given an $L^2$ function $f$, the inverse scale space flow is used to find a sparse measure $\mu$ minimising the $L^2$ loss between the Barron function associated to the measure $\mu$ and the function $f$. The convergence properties of this method are analysed in an ideal setting and in the cases of measurement noise and sampling bias. In an ideal setting the objective decreases strictly monotone in time to a minimizer with $\mathcal{O}(1/t)$, and in the case of measurement noise or sampling bias the optimum is achieved up to a multiplicative or additive constant. This convergence is preserved on discretization of the parameter space, and the minimizers on increasingly fine discretizations converge to the optimum on the full parameter space.
SIRE: scale-invariant, rotation-equivariant estimation of artery orientations using graph neural networks
Alblas, Dieuwertje, Suk, Julian, Brune, Christoph, Yeung, Kak Khee, Wolterink, Jelmer M.
Blood vessel orientation as visualized in 3D medical images is an important descriptor of its geometry that can be used for centerline extraction and subsequent segmentation and visualization. Arteries appear at many scales and levels of tortuosity, and determining their exact orientation is challenging. Recent works have used 3D convolutional neural networks (CNNs) for this purpose, but CNNs are sensitive to varying vessel sizes and orientations. We present SIRE: a scale-invariant, rotation-equivariant estimator for local vessel orientation. SIRE is modular and can generalise due to symmetry preservation. SIRE consists of a gauge equivariant mesh CNN (GEM-CNN) operating on multiple nested spherical meshes with different sizes in parallel. The features on each mesh are a projection of image intensities within the corresponding sphere. These features are intrinsic to the sphere and, in combination with the GEM-CNN, lead to SO(3)-equivariance. Approximate scale invariance is achieved by weight sharing and use of a symmetric maximum function to combine multi-scale predictions. Hence, SIRE can be trained with arbitrarily oriented vessels with varying radii to generalise to vessels with a wide range of calibres and tortuosity. We demonstrate the efficacy of SIRE using three datasets containing vessels of varying scales: the vascular model repository (VMR), the ASOCA coronary artery set, and a set of abdominal aortic aneurysms (AAAs). We embed SIRE in a centerline tracker which accurately tracks AAAs, regardless of the data SIRE is trained with. Moreover, SIRE can be used to track coronary arteries, even when trained only with AAAs. In conclusion, by incorporating SO(3) and scale symmetries, SIRE can determine the orientations of vessels outside of the training domain, forming a robust and data-efficient solution to geometric analysis of blood vessels in 3D medical images.
What do neural networks learn in image classification? A frequency shortcut perspective
Wang, Shunxin, Veldhuis, Raymond, Brune, Christoph, Strisciuglio, Nicola
Frequency analysis is useful for understanding the mechanisms of representation learning in neural networks (NNs). Most research in this area focuses on the learning dynamics of NNs for regression tasks, while little for classification. This study empirically investigates the latter and expands the understanding of frequency shortcuts. First, we perform experiments on synthetic datasets, designed to have a bias in different frequency bands. Our results demonstrate that NNs tend to find simple solutions for classification, and what they learn first during training depends on the most distinctive frequency characteristics, which can be either low- or high-frequencies. Second, we confirm this phenomenon on natural images. We propose a metric to measure class-wise frequency characteristics and a method to identify frequency shortcuts. The results show that frequency shortcuts can be texture-based or shape-based, depending on what best simplifies the objective. Third, we validate the transferability of frequency shortcuts on out-of-distribution (OOD) test sets. Our results suggest that frequency shortcuts can be transferred across datasets and cannot be fully avoided by larger model capacity and data augmentation. We recommend that future research should focus on effective training schemes mitigating frequency shortcut learning.
SE(3) symmetry lets graph neural networks learn arterial velocity estimation from small datasets
Suk, Julian, Brune, Christoph, Wolterink, Jelmer M.
Hemodynamic velocity fields in coronary arteries could be the basis of valuable biomarkers for diagnosis, prognosis and treatment planning in cardiovascular disease. Velocity fields are typically obtained from patient-specific 3D artery models via computational fluid dynamics (CFD). However, CFD simulation requires meticulous setup by experts and is time-intensive, which hinders large-scale acceptance in clinical practice. To address this, we propose graph neural networks (GNN) as an efficient black-box surrogate method to estimate 3D velocity fields mapped to the vertices of tetrahedral meshes of the artery lumen. We train these GNNs on synthetic artery models and CFD-based ground truth velocity fields. Once the GNN is trained, velocity estimates in a new and unseen artery can be obtained with 36-fold speed-up compared to CFD. We demonstrate how to construct an SE(3)-equivariant GNN that is independent of the spatial orientation of the input mesh and show how this reduces the necessary amount of training data compared to a baseline neural network.
Trajectory Generation, Control, and Safety with Denoising Diffusion Probabilistic Models
Botteghi, Nicolò, Califano, Federico, Poel, Mannes, Brune, Christoph
The Control barrier functions (CBFs) (Ames et al., 2017; 2019) technology of control barrier functions (CBFs), represent a formal framework aiming to achieve safety as encoding desired safety constraints, is used in a hard constraint in an optimization problem in which the combination with DDPMs to plan actions by iteratively cost function encodes information on the nominal task to denoising trajectories through a CBFbased be executed. In particular CBF-based safety constraints are guided sampling procedure. At the same represented by forward invariance of so-called safe sets, i.e. time, the generated trajectories are also guided to subsets of the state space which the controlled system should maximize a future cumulative reward representing not leave during the task execution. We stress that within a specific task to be optimally executed. The this context, safety becomes a mathematically rigorous system proposed scheme can be seen as an offline and theoretic property and, even if unable to represent any model-based reinforcement learning algorithm resembling possible safety hazard, it is very useful to design safety constraints, in its functionalities a model-predictive e.g.
Embeddings between Barron spaces with higher order activation functions
Heeringa, Tjeerd Jan, Spek, Len, Schwenninger, Felix, Brune, Christoph
The approximation properties of infinitely wide shallow neural networks heavily depend on the choice of the activation function. To understand this influence, we study embeddings between Barron spaces with different activation functions. These embeddings are proven by providing push-forward maps on the measures $\mu$ used to represent functions $f$. An activation function of particular interest is the rectified power unit ($\operatorname{RePU}$) given by $\operatorname{RePU}_s(x)=\max(0,x)^s$. For many commonly used activation functions, the well-known Taylor remainder theorem can be used to construct a push-forward map, which allows us to prove the embedding of the associated Barron space into a Barron space with a $\operatorname{RePU}$ as activation function. Moreover, the Barron spaces associated with the $\operatorname{RePU}_s$ have a hierarchical structure similar to the Sobolev spaces $H^m$.