Delta-learned force fields for nonbonded interactions: Addressing the strength mismatch between covalent-nonbonded interaction for global models

Cázares-Trejo, Leonardo, Loreto-Silva, Marco, Sauceda, Huziel E.

arXiv.org Artificial Intelligence

Noncovalent interactions (vdW dispersion, hydrogen/halogen bonding, ion-$π$, and $π$-stacking) govern structure, dynamics, and emergent phenomena in materials and molecular systems, yet accurately learning them alongside covalent forces remains a core challenge for machine-learned force fields (MLFFs). This challenge is acute for global models that use Coulomb-matrix (CM) descriptors compared under Euclidean/Frobenius metrics in multifragment settings. We show that the mismatch between predominantly covalent force labels and the CM's overrepresentation of intermolecular features biases single-model training and degrades force-field fidelity. To address this, we introduce $Δ$-sGDML, a scale-aware formulation within the sGDML framework that explicitly decouples intra- and intermolecular physics by training fragment-specific models alongside a dedicated binding model, then composing them at inference. Across benzene dimers, host-guest complexes (C$_{60}$@buckycatcher, NO$_3^-$@i-corona[6]arene), benzene-water, and benzene-Na$^+$, $Δ$-sGDML delivers consistent gains over a single global model, with fragment-resolved force-error reductions of up to 75%, without loss of energy accuracy. Molecular-dynamics simulations further confirm that the $Δ$-model yields a reliable force field for C$_{60}$@buckycatcher, producing stable trajectories across a wide range of temperatures (10-400 K), unlike the single global model, which loses stability above $\sim$200 K. The method offers a practical route to homogenize per-fragment errors and recover reliable noncovalent physics in global MLFFs.
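
The composition step described in the abstract (fragment-specific models plus a dedicated binding model, combined at inference) can be pictured with a minimal sketch. It assumes the combination is a plain sum of forces, which the abstract does not spell out. The function `compose_delta_forces` and the two toy force functions are hypothetical stand-ins, not sGDML predictors or part of any sGDML API.

```python
import numpy as np

def compose_delta_forces(positions, fragments, fragment_models, binding_model):
    """Sum per-fragment forces and a binding correction (additive form is an assumption)."""
    forces = np.zeros_like(positions, dtype=float)
    for idx, model in zip(fragments, fragment_models):
        # Each fragment model sees only its own atoms (intramolecular physics).
        forces[idx] += model(positions[idx])
    # The binding model sees the whole complex and supplies the remaining forces.
    forces += binding_model(positions)
    return forces

# Toy stand-ins for trained models (hypothetical, for illustration only).
def spring_to_centroid(r):
    return -(r - r.mean(axis=0))

def weak_global_pull(r):
    return -0.01 * r

positions = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0],
                      [0.0, 3.0, 0.0], [1.0, 3.0, 0.0]])
fragments = [np.array([0, 1]), np.array([2, 3])]  # atom indices of each fragment
forces = compose_delta_forces(
    positions, fragments,
    [spring_to_centroid, spring_to_centroid],
    weak_global_pull,
)
```

Keeping the fragment models blind to the other fragment is what separates the two force scales in this sketch: the strong intramolecular terms never compete with the weak intermolecular ones inside a single regression target.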


DFNN: A Deep Fréchet Neural Network Framework for Learning Metric-Space-Valued Responses

Kim, Kyum, Chen, Yaqing, Dubey, Paromita

arXiv.org Machine Learning

Regression with non-Euclidean responses (e.g., probability distributions, networks, symmetric positive-definite matrices, and compositions) has become increasingly important in modern applications. In this paper, we propose deep Fréchet neural networks (DFNNs), an end-to-end deep learning framework for predicting non-Euclidean responses, which are considered as random objects in a metric space, from Euclidean predictors. Our method applies the representation-learning power of deep neural networks (DNNs) to the task of approximating conditional Fréchet means of the response given the predictors, the metric-space analogue of conditional expectations, by minimizing a Fréchet risk. The framework is highly flexible, accommodating diverse metrics and high-dimensional predictors. We establish a universal approximation theorem for DFNNs, advancing the state of the art in neural network approximation theory to general metric-space-valued responses without making model assumptions or relying on local smoothing. Empirical studies on synthetic distributional and network-valued responses, as well as a real-world application to predicting employment occupational compositions, demonstrate that DFNNs consistently outperform existing methods.
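
As a rough illustration of what minimizing a Fréchet risk means, the sketch below evaluates an empirical risk as the mean squared metric distance between predictions and responses. The choice of metric is an assumption made for this example: compositions are mapped to the unit sphere by a square-root transform and compared by geodesic distance. The names `sphere_distance` and `frechet_risk` are hypothetical and not taken from the paper.

```python
import numpy as np

def sphere_distance(p, q):
    """Geodesic distance between two compositions after a square-root transform."""
    inner = np.clip(np.sum(np.sqrt(p) * np.sqrt(q)), -1.0, 1.0)
    return float(np.arccos(inner))

def frechet_risk(predictions, responses, metric):
    """Empirical Fréchet risk: average of squared metric distances."""
    return float(np.mean([metric(p, y) ** 2 for p, y in zip(predictions, responses)]))

# Two observed compositions (each row sums to one).
responses = np.array([[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]])
perfect = responses.copy()                      # predictions equal to the responses
uniform = np.full_like(responses, 1.0 / 3.0)    # uninformative predictions

risk_perfect = frechet_risk(perfect, responses, sphere_distance)
risk_uniform = frechet_risk(uniform, responses, sphere_distance)
```

Swapping `sphere_distance` for another metric changes the geometry without touching `frechet_risk`, which is the sense in which such a loss accommodates diverse metrics.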


End-to-End Deep Learning for Predicting Metric Space-Valued Outputs

Zhou, Yidong, Iao, Su I, Müller, Hans-Georg

arXiv.org Machine Learning

Many modern applications involve predicting structured, non-Euclidean outputs such as probability distributions, networks, and symmetric positive-definite matrices. These outputs are naturally modeled as elements of general metric spaces, where classical regression techniques that rely on vector space structure no longer apply. We introduce E2M (End-to-End Metric regression), a deep learning framework for predicting metric space-valued outputs. E2M performs prediction via a weighted Fréchet mean over training outputs, where the weights are learned by a neural network conditioned on the input. This construction provides a principled mechanism for geometry-aware prediction that avoids surrogate embeddings and restrictive parametric assumptions, while fully preserving the intrinsic geometry of the output space. We establish theoretical guarantees, including a universal approximation theorem that characterizes the expressive capacity of the model and a convergence analysis of the entropy-regularized training objective. Through extensive simulations involving probability distributions, networks, and symmetric positive-definite matrices, we show that E2M consistently achieves state-of-the-art performance, with its advantages becoming more pronounced at larger sample sizes. Applications to human mortality distributions and New York City taxi networks further demonstrate the flexibility and practical utility of the framework.
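
A weighted Fréchet mean over training outputs has a closed form in one well-known case, which makes for a compact sketch: for one-dimensional distributions under the 2-Wasserstein metric, it is the weighted average of the quantile functions. The example below assumes that setting. The softmax weights stand in for the output of a network and are an assumption of this sketch, as are the names `softmax` and `weighted_frechet_mean_w2`.

```python
import numpy as np

def softmax(scores):
    """Numerically stable softmax producing nonnegative weights that sum to one."""
    shifted = scores - np.max(scores)
    e = np.exp(shifted)
    return e / e.sum()

def weighted_frechet_mean_w2(quantiles, weights):
    """Weighted Fréchet mean of 1-D distributions under the 2-Wasserstein metric.

    quantiles: (n_train, n_grid) quantile functions evaluated on a shared grid.
    weights:   (n_train,) nonnegative weights summing to one.
    """
    return weights @ quantiles

grid = np.linspace(0.05, 0.95, 19)
base = np.log(grid / (1.0 - grid))               # logistic quantile function
train_quantiles = np.stack([base, base + 2.0])   # two training outputs, shifted by 2
scores = np.array([0.0, 0.0])                    # hypothetical network scores for one input
weights = softmax(scores)
pred = weighted_frechet_mean_w2(train_quantiles, weights)
```

With equal weights the prediction is the same distribution shifted by 1, halfway between the two training outputs, and it remains a valid (nondecreasing) quantile function.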


Wohlhart's Three-Loop Mechanism: An Overconstrained and Shaky Linkage

Mueller, Andreas

arXiv.org Artificial Intelligence

This paper revisits a three-loop spatial linkage that was proposed in an ARK 2004 paper by Karl Wohlhart (as an extension of a two-loop linkage proposed by Eddie Baker in 1980) and later analyzed in an ARK 2006 paper by Diez-Martinez et al. A local analysis shows that this linkage has a finite degree of freedom (DOF) of 3 (and is thus overconstrained), while in its reference configuration the differential DOF is 5. It is shown that its configuration space is locally a smooth manifold so that the reference configuration is not a c-space singularity. It is shown that the differential DOF is locally constant, which makes this linkage shaky (so that the reference configuration is not a singularity). The higher-order local analysis is facilitated by the computation of the kinematic tangent cone as well as a local approximation of the c-space.


Fast and Accurate Explanations of Distance-Based Classifiers by Uncovering Latent Explanatory Structures

Bley, Florian, Kauffmann, Jacob, Krug, Simon León, Müller, Klaus-Robert, Montavon, Grégoire

arXiv.org Machine Learning

Distance-based classifiers, such as k-nearest neighbors and support vector machines, continue to be workhorses of machine learning, widely used in science and industry. In practice, to derive insights from these models, it is also important to ensure that their predictions are explainable. While the field of Explainable AI has supplied methods that are in principle applicable to any model, it has also emphasized the usefulness of latent structures (e.g., the sequence of layers in a neural network) to produce explanations. In this paper, we contribute by uncovering a hidden neural network structure in distance-based classifiers (consisting of linear detection units combined with nonlinear pooling layers) upon which Explainable AI techniques such as layer-wise relevance propagation (LRP) become applicable. Through quantitative evaluations, we demonstrate the advantage of our novel explanation approach over several baselines. We also show the overall usefulness of explaining distance-based models through two practical use cases.
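
One way to see how linear units and pooling can hide inside a distance-based classifier is the 1-nearest-neighbor case, sketched below. It relies on the identity that a difference of two squared Euclidean distances is linear in the input. This is an illustration under that assumption, not necessarily the construction used in the paper, and both function names are hypothetical.

```python
import numpy as np

def one_nn_margin(x, pos, neg):
    """Direct form: squared distance to nearest negative minus nearest positive."""
    d_pos = np.sum((pos - x) ** 2, axis=1)
    d_neg = np.sum((neg - x) ** 2, axis=1)
    return d_neg.min() - d_pos.min()

def one_nn_margin_layered(x, pos, neg):
    """Same value computed as linear units followed by max- and min-pooling."""
    # Linear unit (i, j): ||x - neg_j||^2 - ||x - pos_i||^2 = w_ij . x + b_ij
    w = 2.0 * (pos[:, None, :] - neg[None, :, :])            # (n_pos, n_neg, d)
    b = np.sum(neg ** 2, axis=1)[None, :] - np.sum(pos ** 2, axis=1)[:, None]
    z = w @ x + b                                            # (n_pos, n_neg)
    # Max-pool over positive prototypes, then min-pool over negative ones.
    return z.max(axis=0).min()

rng = np.random.default_rng(0)
pos = rng.normal(size=(4, 3))          # positive-class training points
neg = rng.normal(size=(5, 3)) + 1.0    # negative-class training points
x = rng.normal(size=3)                 # query point

direct = one_nn_margin(x, pos, neg)
layered = one_nn_margin_layered(x, pos, neg)
```

Both functions return the same margin. The layered form exposes a linear layer that propagation-based explanation methods can work with.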


ULLER: A Unified Language for Learning and Reasoning

van Krieken, Emile, Badreddine, Samy, Manhaeve, Robin, Giunchiglia, Eleonora

arXiv.org Artificial Intelligence

The field of neuro-symbolic artificial intelligence (NeSy), which combines learning and reasoning, has recently experienced significant growth. There is now a wide variety of NeSy frameworks, each with its own specific language for expressing background knowledge and how to relate it to neural networks. This heterogeneity hinders accessibility for newcomers and makes comparing different NeSy frameworks challenging. We propose a unified language for NeSy, which we call ULLER, a Unified Language for LEarning and Reasoning. ULLER encompasses a wide variety of settings, while ensuring that knowledge described in it can be used in existing NeSy systems. ULLER has a neuro-symbolic first-order syntax for which we provide example semantics including classical, fuzzy, and probabilistic logics. We believe ULLER is a first step towards making NeSy research more accessible and comparable, paving the way for libraries that streamline training and evaluation across a multitude of semantics, knowledge bases, and NeSy systems.


Robust Spatial Filtering with Beta Divergence

Samek, Wojciech

Neural Information Processing Systems

The efficiency of Brain-Computer Interfaces (BCI) largely depends upon a reliable extraction of informative features from the high-dimensional EEG signal. A crucial step in this protocol is the computation of spatial filters. The Common Spatial Patterns (CSP) algorithm computes filters that maximize the difference in band power between two conditions; it is thus tailored to extract the relevant information in motor imagery experiments. However, CSP is highly sensitive to artifacts in the EEG data, i.e., a few outliers may alter the estimate drastically and decrease classification performance. Inspired by concepts from the field of information geometry, we propose a novel approach for robustifying CSP. More precisely, we formulate CSP as a divergence maximization problem and utilize the property of a particular type of divergence, namely beta divergence, for robustifying the estimation of spatial filters in the presence of artifacts in the data. We demonstrate the usefulness of our method on toy data and on EEG recordings from 80 subjects.
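
For orientation, the baseline that the abstract sets out to robustify can be sketched in a few lines. The code below is standard (non-robust) CSP, computed by whitening the summed class covariances and then diagonalizing one class covariance in the whitened space. It does not implement the beta-divergence formulation. The synthetic data and the name `csp_filters` are assumptions made for this illustration.

```python
import numpy as np

def csp_filters(cov_a, cov_b):
    """Standard CSP: whiten the composite covariance, then diagonalize class A."""
    evals, evecs = np.linalg.eigh(cov_a + cov_b)
    whitening = np.diag(evals ** -0.5) @ evecs.T
    lam, rot = np.linalg.eigh(whitening @ cov_a @ whitening.T)
    order = np.argsort(lam)[::-1]            # largest class-A variance share first
    filters = rot[:, order].T @ whitening    # each row is one spatial filter
    return filters, lam[order]

# Synthetic two-condition data: channel 0 is strong in A, channel 3 in B.
rng = np.random.default_rng(0)
x_a = rng.normal(size=(500, 4)) * np.array([3.0, 1.0, 1.0, 1.0])
x_b = rng.normal(size=(500, 4)) * np.array([1.0, 1.0, 1.0, 3.0])
cov_a = np.cov(x_a, rowvar=False)
cov_b = np.cov(x_b, rowvar=False)

filters, lam = csp_filters(cov_a, cov_b)
```

Each value in `lam` is the share of filtered variance that belongs to condition A, so the first and last filters are the most discriminative. Because the covariances are plain sample estimates, a handful of extreme trials can shift them, which is the sensitivity the abstract addresses.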


Towards best practice in explaining neural network decisions with LRP

Kohlbrenner, Maximilian, Bauer, Alexander, Nakajima, Shinichi, Binder, Alexander, Samek, Wojciech, Lapuschkin, Sebastian

arXiv.org Machine Learning

Within the last decade, neural network based predictors have demonstrated impressive (and at times super-human) capabilities. This performance is often paid for with an opaque prediction process and thus has sparked numerous contributions in the novel field of explainable artificial intelligence (XAI). In this paper, we focus on a popular and widely used method of XAI, the Layer-wise Relevance Propagation (LRP). Since its initial proposition, LRP has evolved as a method, and a best practice for applying the method has tacitly emerged, based on humanly observed evidence. We investigate, and for the first time quantify, the effect of this current best practice on feedforward neural networks in a visual object detection setting. The results verify that the current, layer-dependent approach to LRP applied in recent literature better represents the model's reasoning, and at the same time increases the object localization and class discriminativity of LRP.
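
To make the idea of a per-layer propagation rule concrete, here is a minimal sketch of one commonly used rule, LRP-epsilon, applied to a single bias-free dense layer. Which rule is assigned to which layer in the best practice studied by the paper is not stated in the abstract, so this shows only one building block. The function name `lrp_epsilon_dense` and the toy numbers are assumptions.

```python
import numpy as np

def lrp_epsilon_dense(activations, weights, relevance_out, eps=1e-6):
    """LRP-epsilon for one bias-free dense layer with z = activations @ weights."""
    z = activations @ weights
    # Stabilizer pushes z away from zero without flipping its sign.
    stabilizer = eps * np.where(z >= 0, 1.0, -1.0)
    s = relevance_out / (z + stabilizer)
    # Redistribute each output's relevance in proportion to a_j * w_jk.
    return activations * (weights @ s)

activations = np.array([1.0, 2.0, 0.5])            # layer inputs
weights = np.array([[0.5, -1.0],
                    [0.25, 0.5],
                    [1.0, 1.0]])                   # shape (n_in, n_out)
relevance_out = np.array([1.0, 0.5])               # relevance arriving from above

relevance_in = lrp_epsilon_dense(activations, weights, relevance_out)
```

With a small `eps` and no bias, the total relevance is approximately conserved from output to input. A layer-dependent scheme would call a different rule of this kind at different depths of the network.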


Towards Explainable Artificial Intelligence

Samek, Wojciech, Müller, Klaus-Robert

arXiv.org Artificial Intelligence

In recent years, machine learning (ML) has become a key enabling technology for the sciences and industry. Especially through improvements in methodology, the availability of large databases and increased computational power, today's ML algorithms are able to achieve excellent performance (at times even exceeding the human level) on an increasing number of complex tasks. Deep learning models are at the forefront of this development. However, due to their nested non-linear structure, these powerful models have been generally considered "black boxes", not providing any information about what exactly makes them arrive at their predictions. Since in many applications, e.g., in the medical domain, such lack of transparency may not be acceptable, the development of methods for visualizing, explaining and interpreting deep learning models has recently attracted increasing attention. This introductory paper presents recent developments and applications in this field and makes a plea for a wider use of explainable learning algorithms in practice.