Goto

Collaborating Authors

 rbf


Adaptive Bayesian Data-Driven Design of Reliable Solder Joints for Micro-electronic Devices

Guo, Leo, Inamdar, Adwait, van Driel, Willem D., Zhang, GuoQi

arXiv.org Machine Learning

Solder joint reliability related to failures due to thermomechanical loading is a critically important yet physically complex engineering problem. As a result, simulated behavior is oftentimes computationally expensive. In an increasingly data-driven world, the usage of efficient data-driven design schemes is a popular choice. Among them, Bayesian optimization (BO) with Gaussian process regression is one of the most important representatives. The authors argue that computational savings can be obtained from exploiting thorough surrogate modeling and selecting a design candidate based on multiple acquisition functions. This is feasible due to the relatively low computational cost, compared to the expensive simulation objective. This paper addresses the shortcomings in the adjacent literature by providing and implementing a novel heuristic framework to perform BO with adaptive hyperparameters across the various optimization iterations. Adaptive BO is subsequently compared to regular BO when faced with synthetic objective minimization problems. The results show the efficiency of adaptive BO when compared any worst-performing regular Bayesian schemes. As an engineering use case, the solder joint reliability problem is tackled by minimizing the accumulated non-linear creep strain under a cyclic thermal load. Results show that adaptive BO outperforms regular BO by 3% on average at any given computational budget threshold, critically saving half of the computational expense budget. This practical result underlines the methodological potential of the adaptive Bayesian data-driven methodology to achieve better results and cut optimization-related expenses. Lastly, in order to promote the reproducibility of the results, the data-driven implementations are made available on an open-source basis.


Machine learning surrogate models of many-body dispersion interactions in polymer melts

Shen, Zhaoxiang, Sosa, Raúl I., Lengiewicz, Jakub, Tkatchenko, Alexandre, Bordas, Stéphane P. A.

arXiv.org Artificial Intelligence

Accurate prediction of many-body dispersion (MBD) interactions is essential for understanding the van der Waals forces that govern the behavior of many complex molecular systems. However, the high computational cost of MBD calculations limits their direct application in large-scale simulations. In this work, we introduce a machine learning surrogate model specifically designed to predict MBD forces in polymer melts, a system that demands accurate MBD description and offers structural advantages for machine learning approaches. Our model is based on a trimmed SchNet architecture that selectively retains the most relevant atomic connections and incorporates trainable radial basis functions for geometric encoding. We validate our surrogate model on datasets from polyethylene, polypropylene, and polyvinyl chloride melts, demonstrating high predictive accuracy and robust generalization across diverse polymer systems. In addition, the model captures key physical features, such as the characteristic decay behavior of MBD interactions, providing valuable insights for optimizing cutoff strategies. Characterized by high computational efficiency, our surrogate model enables practical incorporation of MBD effects into large-scale molecular simulations.


On the similarity of bandwidth-tuned quantum kernels and classical kernels

Ablan, Roberto Flórez, Roth, Marco, Schnabel, Jan

arXiv.org Artificial Intelligence

Quantum kernels (QK) are widely used in quantum machine learning applications; yet, their potential to surpass classical machine learning methods on classical datasets remains uncertain. This limitation can be attributed to the exponential concentration phenomenon, which can impair both trainability and generalization. A common strategy to alleviate this is bandwidth tuning, which involves rescaling data points in the quantum model to improve generalization. In this work, we numerically demonstrate that optimal bandwidth tuning results in QKs that closely resemble radial basis function (RBF) kernels, leading to a lack of quantum advantage over classical methods. Moreover, we reveal that the size of optimal bandwidth tuning parameters further simplifies QKs, causing them to behave like polynomial kernels, corresponding to a low-order Taylor approximation of a RBF kernel. We thoroughly investigate this for fidelity quantum kernels and projected quantum kernels using various data encoding circuits across several classification datasets. We provide numerical evidence and derive a simple analytical model that elucidates how bandwidth tuning influences key quantities in classification tasks. Overall, our findings shed light on the mechanisms that render QK methods classically simulatable.


CVKAN: Complex-Valued Kolmogorov-Arnold Networks

Wolff, Matthias, Eilers, Florian, Jiang, Xiaoyi

arXiv.org Artificial Intelligence

In this work we propose CKAN, a complex-valued KAN, to join the intrinsic interpretability of KANs and the advantages of Complex-Valued Neural Networks (CVNNs). We show how to transfer a KAN and the necessary associated mechanisms into the complex domain. To confirm that CKAN meets expectations we conduct experiments on symbolic complex-valued function fitting and physically meaningful formulae as well as on a more realistic dataset from knot theory. Our proposed CKAN is more stable and performs on par or better than real-valued KANs while requiring less parameters and a shallower network architecture, making it more explainable.


Multiscale scattered data analysis in samplet coordinates

Avesani, Sara, Kempf, Rüdiger, Multerer, Michael, Wendland, Holger

arXiv.org Artificial Intelligence

We study multiscale scattered data interpolation schemes for globally supported radial basis functions, with a focus on the Mat\'ern class. The multiscale approximation is constructed through a sequence of residual corrections, where radial basis functions with different lengthscale parameters are employed to capture varying levels of detail. To apply this approach to large data sets, we suggest to represent the resulting generalized Vandermonde matrices in samplet coordinates. Samplets are localized, discrete signed measures exhibiting vanishing moments and allow for the sparse approximation of generalized Vandermonde matrices issuing from a vast class of radial basis functions. Given a quasi-uniform set of $N$ data sites, and local approximation spaces with geometrically decreasing dimension, the full multiscale system can be assembled with cost $\mathcal{O}(N \log N)$. We prove that the condition numbers of the linear systems at each level remain bounded independent of the particular level, allowing us to use an iterative solver with a bounded number of iterations for the numerical solution. Hence, the overall cost of the proposed approach is $\mathcal{O}(N \log N)$. The theoretical findings are accompanied by extensive numerical studies in two and three spatial dimensions.


Kolmogorov-Arnold Networks are Radial Basis Function Networks

Li, Ziyao

arXiv.org Artificial Intelligence

This short paper is a fast proof-of-concept that the 3-order B-splines used in Kolmogorov-Arnold Networks (KANs) can be well approximated by Gaussian radial basis functions. Doing so leads to FastKAN, a much faster implementation of KAN which is also a radial basis function (RBF) network.


RBF-PINN: Non-Fourier Positional Embedding in Physics-Informed Neural Networks

Zeng, Chengxi, Burghardt, Tilo, Gambaruto, Alberto M

arXiv.org Artificial Intelligence

While many recent Physics-Informed Neural Networks (PINNs) variants have had considerable success in solving Partial Differential Equations, the empirical benefits of feature mapping drawn from the broader Neural Representations research have been largely overlooked. We highlight the limitations of widely used Fourier-based feature mapping in certain situations and suggest the use of the conditionally positive definite Radial Basis Function. The empirical findings demonstrate the effectiveness of our approach across a variety of forward and inverse problem cases. Our method can be seamlessly integrated into coordinate-based input neural networks and contribute to the wider field of PINNs research.


Training dynamics in Physics-Informed Neural Networks with feature mapping

Zeng, Chengxi, Burghardt, Tilo, Gambaruto, Alberto M

arXiv.org Artificial Intelligence

Physics-Informed Neural Networks (PINNs) have emerged as an iconic machine learning approach for solving Partial Differential Equations (PDEs). Although its variants have achieved significant progress, the empirical success of utilising feature mapping from the wider Implicit Neural Representations studies has been substantially neglected. We investigate the training dynamics of PINNs with a feature mapping layer via the limiting Conjugate Kernel and Neural Tangent Kernel, which sheds light on the convergence and generalisation of the model. We also show the inadequacy of commonly used Fourier-based feature mapping in some scenarios and propose the conditional positive definite Radial Basis Function as a better alternative. The empirical results reveal the efficacy of our method in diverse forward and inverse problem sets. This simple technique can be easily implemented in coordinate input networks and benefits the broad PINNs research.


A Deep Q-Network Based on Radial Basis Functions for Multi-Echelon Inventory Management

Cheng, Liqiang, Luo, Jun, Fan, Weiwei, Zhang, Yidong, Li, Yuan

arXiv.org Artificial Intelligence

This paper addresses a multi-echelon inventory management problem with a complex network topology where deriving optimal ordering decisions is difficult. Deep reinforcement learning (DRL) has recently shown potential in solving such problems, while designing the neural networks in DRL remains a challenge. In order to address this, a DRL model is developed whose Q-network is based on radial basis functions. The approach can be more easily constructed compared to classic DRL models based on neural networks, thus alleviating the computational burden of hyperparameter tuning. Through a series of simulation experiments, the superior performance of this approach is demonstrated compared to the simple base-stock policy, producing a better policy in the multi-echelon system and competitive performance in the serial system where the base-stock policy is optimal. In addition, the approach outperforms current DRL approaches.


A novel RNA pseudouridine site prediction model using Utility Kernel and data-driven parameters

Patil, Sourabh, Mathur, Archana, Aduri, Raviprasad, Saha, Snehanshu

arXiv.org Artificial Intelligence

RNA protein Interactions (RPIs) play an important role in biological systems. Recently, we have enumerated the RPIs at the residue level and have elucidated the minimum structural unit (MSU) in these interactions to be a stretch of five residues (Nucleotides/amino acids). Pseudouridine is the most frequent modification in RNA. The conversion of uridine to pseudouridine involves interactions between pseudouridine synthase and RNA. The existing models to predict the pseudouridine sites in a given RNA sequence mainly depend on user-defined features such as mono and dinucleotide composition/propensities of RNA sequences. Predicting pseudouridine sites is a non-linear classification problem with limited data points. Deep Learning models are efficient discriminators when the data set size is reasonably large and fail when there is a paucity of data ($<1000$ samples). To mitigate this problem, we propose a Support Vector Machine (SVM) Kernel based on utility theory from Economics, and using data-driven parameters (i.e. MSU) as features. For this purpose, we have used position-specific tri/quad/pentanucleotide composition/propensity (PSPC/PSPP) besides nucleotide and dineculeotide composition as features. SVMs are known to work well in small data regimes and kernels in SVM are designed to classify non-linear data. The proposed model outperforms the existing state-of-the-art models significantly (10%-15% on average).