Goto

Collaborating Authors

 heat flow


On the Robustness of Graph Neural Diffusion to Topology Perturbations

Neural Information Processing Systems

Neural diffusion on graphs is a novel class of graph neural networks that has attracted increasing attention recently. The capability of graph neural partial differential equations (PDEs) in addressing common hurdles of graph neural networks (GNNs), such as the problems of over-smoothing and bottlenecks, has been investigated but not their robustness to adversarial attacks. In this work, we explore the robustness properties of graph neural PDEs. We empirically demonstrate that graph neural PDEs are intrinsically more robust against topology perturbation as compared to other GNNs. We provide insights into this phenomenon by exploiting the stability of the heat semigroup under graph topology perturbations. We discuss various graph diffusion operators and relate them to existing graph neural PDEs. Furthermore, we propose a general graph neural PDE framework based on which a new class of robust GNNs can be defined. We verify that the new model achieves comparable state-of-the-art performance on several benchmark datasets.






Learning under Latent Group Sparsity via Diffusion on Networks

arXiv.org Machine Learning

Group or cluster structure on explanatory variables in machine learning problems is a very general phenomenon, which has attracted broad interest from practitioners and theoreticians alike. In this work we contribute an approach to sparse learning under such group structure, that does not require prior information on the group identities. Our paradigm is motivated by the Laplacian geometry of an underlying network with a related community structure, and proceeds by directly incorporating this into a penalty that is effectively computed via a heat-flow-based local network dynamics. The proposed penalty interpolates between the lasso and the group lasso penalties, the runtime of the heat-flow dynamics being the interpolating parameter. As such it can automatically default to lasso when the group structure reflected in the Laplacian is weak. In fact, we demonstrate a data-driven procedure to construct such a network based on the available data. Notably, we dispense with computationally intensive pre-processing involving clustering of variables, spectral or otherwise. Our technique is underpinned by rigorous theorems that guarantee its effective performance and provide bounds on its sample complexity. In particular, in a wide range of settings, it provably suffices to run the diffusion for time that is only logarithmic in the problem dimensions. We explore in detail the interfaces of our approach with key statistical physics models in network science, such as the Gaussian Free Field and the Stochastic Block Model. Our work raises the possibility of applying similar diffusion-based techniques to classical learning tasks, exploiting the interplay between geometric, dynamical and stochastic structures underlying the data.


Integrating Symbolic Neural Networks with Building Physics: A Study and Proposal

arXiv.org Artificial Intelligence

Symbolic neural networks, such as Kolmogorov-Arnold Networks (KAN), offer a promising approach for integrating prior knowledge with data-driven methods, making them valuable for addressing inverse problems in scientific and engineering domains. This study explores the application of KAN in building physics, focusing on predictive modeling, knowledge discovery, and continuous learning. Through four case studies, we demonstrate KAN's ability to rediscover fundamental equations, approximate complex formulas, and capture time-dependent dynamics in heat transfer. While there are challenges in extrapolation and interpretability, we highlight KAN's potential to combine advanced modeling methods for knowledge augmentation, which benefits energy efficiency, system optimization, and sustainability assessments beyond the personal knowledge constraints of the modelers. Additionally, we propose a model selection decision tree to guide practitioners in appropriate applications for building physics.


Iterated Schr\"odinger bridge approximation to Wasserstein Gradient Flows

arXiv.org Machine Learning

We introduce a novel discretization scheme for Wasserstein gradient flows that involves successively computing Schr\"{o}dinger bridges with the same marginals. This is different from both the forward/geodesic approximation and the backward/Jordan-Kinderlehrer-Otto (JKO) approximations. The proposed scheme has two advantages: one, it avoids the use of the score function, and, two, it is amenable to particle-based approximations using the Sinkhorn algorithm. Our proof hinges upon showing that relative entropy between the Schr\"{o}dinger bridge with the same marginals at temperature $\epsilon$ and the joint distribution of a stationary Langevin diffusion at times zero and $\epsilon$ is of the order $o(\epsilon^2)$ with an explicit dependence given by Fisher information. Owing to this inequality, we can show, using a triangular approximation argument, that the interpolated iterated application of the Schr\"{o}dinger bridge approximation converge to the Wasserstein gradient flow, for a class of gradient flows, including the heat flow. The results also provide a probabilistic and rigorous framework for the convergence of the self-attention mechanisms in transformer networks to the solutions of heat flows, first observed in the inspiring work SABP22 in machine learning research.


A Separation in Heavy-Tailed Sampling: Gaussian vs. Stable Oracles for Proximal Samplers

arXiv.org Machine Learning

We study the complexity of heavy-tailed sampling and present a separation result in terms of obtaining high-accuracy versus low-accuracy guarantees i.e., samplers that require only $O(\log(1/\varepsilon))$ versus $\Omega(\text{poly}(1/\varepsilon))$ iterations to output a sample which is $\varepsilon$-close to the target in $\chi^2$-divergence. Our results are presented for proximal samplers that are based on Gaussian versus stable oracles. We show that proximal samplers based on the Gaussian oracle have a fundamental barrier in that they necessarily achieve only low-accuracy guarantees when sampling from a class of heavy-tailed targets. In contrast, proximal samplers based on the stable oracle exhibit high-accuracy guarantees, thereby overcoming the aforementioned limitation. We also prove lower bounds for samplers under the stable oracle and show that our upper bounds cannot be fundamentally improved.


Explainable AI for engineering design: A unified approach of systems engineering and component-based deep learning

arXiv.org Artificial Intelligence

Data-driven models created by machine learning gain in importance in all fields of design and engineering. They have high potential to assist decision-makers in creating novel artefacts with better performance and sustainability. However, limited generalization and the black-box nature of these models lead to limited explainability and reusability. To overcome this situation, we propose a component-based approach to create partial component models by machine learning (ML). This component-based approach aligns deep learning with systems engineering (SE). For the domain of energy efficient building design, we first demonstrate better generalization of the component-based method by analyzing prediction accuracy outside the training data. We observe a much higher accuracy (R2 = 0.94) compared to conventional monolithic methods (R2 = 0.71). Second, we illustrate explainability by exemplary demonstrating how sensitivity information from SE and rules from low-depth decision trees serve engineering. Third, we evaluate explainability by qualitative and quantitative methods demonstrating the matching of preliminary knowledge and data-driven derived strategies and show correctness of activations at component interfaces compared to white-box simulation results (envelope components: R2 = 0.92..0.99; zones: R2 = 0.78..0.93). The key for component-based explainability is that activations at interfaces between the components are interpretable engineering quantities. The large range of possible configurations in composing components allows the examination of novel unseen design cases with understandable data-driven models. The matching of parameter ranges of components by similar probability distribution produces reusable, well-generalizing, and trustworthy models. The approach adapts the model structure to engineering methods of systems engineering and to domain knowledge.