Collaborating Authors

 Howard, Amanda A.


SPIKANs: Separable Physics-Informed Kolmogorov-Arnold Networks

arXiv.org Artificial Intelligence

Physics-Informed Neural Networks (PINNs) have emerged as a promising method for solving partial differential equations (PDEs) in scientific computing. While PINNs typically use multilayer perceptrons (MLPs) as their underlying architecture, recent advancements have explored alternative neural network structures. One such innovation is the Kolmogorov-Arnold Network (KAN), which has demonstrated benefits over traditional MLPs, including faster neural scaling and better interpretability. The application of KANs to physics-informed learning has led to the development of Physics-Informed KANs (PIKANs), enabling the use of KANs to solve PDEs. However, despite their advantages, KANs often suffer from slower training speeds, particularly in higher-dimensional problems where the number of collocation points grows exponentially with the dimensionality of the system. To address this challenge, we introduce Separable Physics-Informed Kolmogorov-Arnold Networks (SPIKANs). This novel architecture applies the principle of separation of variables to PIKANs, decomposing the problem so that each dimension is handled by an individual KAN. This approach drastically reduces the computational complexity of training without sacrificing accuracy, facilitating the application of KANs to higher-dimensional PDEs. Through a series of benchmark problems, we demonstrate the effectiveness of SPIKANs, showcasing their superior scalability and performance compared to PIKANs and highlighting their potential for solving complex, high-dimensional PDEs in scientific computing.
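
As a concrete illustration of the separation-of-variables idea, the sketch below shows how per-dimension networks with a shared rank are combined by an outer product, so that d one-dimensional forward passes over n points replace a pass over an n^d collocation grid. All names are hypothetical, and each 1-D KAN is stubbed with a random sinusoidal feature map; this is a sketch of the structure, not the SPIKAN implementation.

```python
# Minimal sketch of the separable evaluation behind SPIKANs in 2-D.
import numpy as np

rank = 8     # number of separable modes r (assumed hyperparameter)
n_pts = 64   # collocation points per dimension

def make_1d_net(seed):
    """Stub for a 1-D KAN: maps n points to an (n, rank) feature matrix."""
    rng = np.random.default_rng(seed)
    freqs = rng.uniform(0.5, 3.0, size=rank)
    return lambda x: np.sin(np.outer(x, freqs))   # shape (n, rank)

f_x = make_1d_net(0)   # network handling the x-dimension
g_y = make_1d_net(1)   # network handling the y-dimension

x = np.linspace(0.0, 1.0, n_pts)
y = np.linspace(0.0, 1.0, n_pts)

# u(x_i, y_j) = sum_r f_r(x_i) * g_r(y_j): two 1-D forward passes of
# total cost O(d * n) stand in for evaluating the full n^d tensor grid.
U = np.einsum('ir,jr->ij', f_x(x), g_y(y))   # shape (n_pts, n_pts)
print(U.shape)
```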


Multifidelity Kolmogorov-Arnold Networks

arXiv.org Artificial Intelligence

In recent years, scientific machine learning (SciML) has emerged as a paradigm for modeling physical systems [1, 2, 3]. Typically built on multilayer perceptrons (MLPs), SciML has shown great success in modeling a wide range of applications; however, data-informed training struggles when high-quality data is not available. Kolmogorov-Arnold networks (KANs) have recently been developed as an alternative to MLPs [4, 5]. KANs take the Kolmogorov-Arnold theorem as inspiration and can offer advantages over MLPs in some cases, such as for discovering interpretable models. However, KANs have been shown to struggle to reach the accuracy of MLPs, particularly without modifications [6, 7, 8, 9]. In the short time since the publication of [4], many variations of KANs have been developed, including physics-informed KANs (PIKANs) [9], KAN-informed neural networks (KINNs) [10], temporal KANs [11], wavelet KANs [12], graph KANs [13, 14, 15], Chebyshev KANs (cKANs) [16], convolutional KANs [17], ReLU-KANs [18], Higher-order-ReLU-KANs (HRKANs) [19], fractional KANs [20], finite basis KANs [21], deep operator KANs [22], and others.
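
Multifidelity network constructions in this line of work, following [4], generally build a high-fidelity prediction from a linear correlation term plus a nonlinear correction that sees both the input and the low-fidelity output. The sketch below illustrates only that generic composition with stubbed subnetworks; the blending weight `alpha` and all function names are assumptions for the sketch, not the paper's code.

```python
# Hedged sketch of a generic multifidelity composition.
import numpy as np

def u_low(x):
    """Stub low-fidelity model (e.g., trained on abundant cheap data)."""
    return np.sin(2.0 * np.pi * x)

def f_linear(x, ul):
    """Stub subnetwork capturing the linear cross-correlation."""
    return 0.9 * ul

def f_nonlinear(x, ul):
    """Stub subnetwork capturing the nonlinear correction."""
    return 0.1 * np.tanh(x * ul)

def u_high(x, alpha=0.5):
    """High-fidelity prediction: blend of linear and nonlinear terms."""
    ul = u_low(x)
    return alpha * f_linear(x, ul) + (1.0 - alpha) * f_nonlinear(x, ul)

x = np.linspace(0.0, 1.0, 5)
print(u_high(x))
```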


Finite basis Kolmogorov-Arnold networks: domain decomposition for data-driven and physics-informed problems

arXiv.org Artificial Intelligence

Kolmogorov-Arnold networks (KANs) have recently attracted attention as an alternative to multilayer perceptrons (MLPs) for scientific machine learning. However, KANs can be expensive to train, even for relatively small networks. Inspired by finite basis physics-informed neural networks (FBPINNs), in this work we develop a domain decomposition method for KANs that allows several small KANs to be trained in parallel to give accurate solutions for multiscale problems. We show that finite basis KANs (FBKANs) can provide accurate results with noisy data and for physics-informed training.
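
A minimal sketch of the finite-basis construction follows, assuming overlapping subdomain windows that form a partition of unity, with stub callables in place of trained per-subdomain KANs. Window shape and layout are illustrative choices, not the FBKAN reference implementation.

```python
# Partition-of-unity domain decomposition: u(x) = sum_j w_j(x) * net_j(x).
import numpy as np

centers = np.array([0.25, 0.5, 0.75])   # subdomain centers (assumed)
width = 0.35                             # controls subdomain overlap

def windows(x):
    """Compactly supported bump per subdomain, normalized to sum to 1."""
    w = np.maximum(0.0, 1.0 - ((x[:, None] - centers) / width) ** 2)
    return w / w.sum(axis=1, keepdims=True)

# Stub subnetworks standing in for small per-subdomain KANs.
nets = [lambda x, k=k: np.sin((k + 1) * np.pi * x) for k in range(3)]

x = np.linspace(0.15, 0.85, 8)           # points covered by the windows
W = windows(x)                                     # (n, 3)
u = sum(W[:, j] * nets[j](x) for j in range(3))    # global solution
print(u)
```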


Self-adaptive weights based on balanced residual decay rate for physics-informed neural networks and deep operator networks

arXiv.org Machine Learning

Physics-informed deep learning has emerged as a promising alternative for solving partial differential equations. However, for complex problems, training these networks can still be challenging, often resulting in unsatisfactory accuracy and efficiency. In this work, we demonstrate that the failure of plain physics-informed neural networks arises from the significant discrepancy in the convergence speed of residuals at different training points, where the slowest convergence speed dominates the overall solution convergence. Based on these observations, we propose a point-wise adaptive weighting method that balances the residual decay rate across different training points. The performance of our proposed adaptive weighting method is compared with current state-of-the-art adaptive weighting methods on benchmark problems for both physics-informed neural networks and physics-informed deep operator networks. Through extensive numerical results, we demonstrate that our proposed approach of balanced residual decay rates offers several advantages, including bounded weights, high prediction accuracy, fast convergence speed, low training uncertainty, low computational cost, and ease of hyperparameter tuning.
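
The abstract does not give the exact update rule, so the sketch below is only an illustrative stand-in for the general mechanism: each collocation point tracks a running average of its residual, points whose residuals decay slowly are upweighted, and normalization keeps the weights bounded. All constants and the fake training dynamics are assumptions.

```python
# Illustrative (not the paper's exact) point-wise adaptive weighting.
import numpy as np

rng = np.random.default_rng(0)
n_pts = 6
residual = rng.uniform(0.5, 1.5, n_pts)   # |PDE residual| per point
ema = residual.copy()                     # running average of residuals
weights = np.ones(n_pts)

for step in range(100):
    # Fake training dynamics: each point's residual decays at its own rate.
    rates = np.linspace(0.90, 0.99, n_pts)
    residual = residual * rates
    decay = residual / (ema + 1e-12)      # ratio near 1 => slow decay
    ema = 0.9 * ema + 0.1 * residual      # update the running average
    # Upweight slow-decaying points; normalizing keeps weights bounded.
    weights = decay / decay.mean()

print(weights)   # largest weight sits on the slowest-decaying point
```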


Multifidelity domain decomposition-based physics-informed neural networks for time-dependent problems

arXiv.org Artificial Intelligence

Multiscale problems are challenging for neural network-based discretizations of differential equations, such as physics-informed neural networks (PINNs). This can be (partly) attributed to the so-called spectral bias of neural networks. To improve the performance of PINNs for time-dependent problems, a combination of multifidelity stacking PINNs and domain decomposition-based finite basis PINNs is employed. In particular, to learn the high-fidelity part of the multifidelity model, a domain decomposition in time is employed. The performance is investigated for a pendulum and a two-frequency problem as well as the Allen-Cahn equation. It can be observed that the domain decomposition approach clearly improves on the plain PINN and stacking PINN approaches.
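
A schematic of the combination described above, with training stubbed out: a low-fidelity network is corrected by a high-fidelity stage trained window-by-window on a decomposition of the time domain. Window layout and all names are illustrative assumptions, not the paper's setup.

```python
# Schematic: multifidelity correction trained on time windows.
import numpy as np

t = np.linspace(0.0, 1.0, 200)

def low_fidelity(t):
    """Stub for the previously trained (stacked) low-fidelity PINN."""
    return np.sin(2.0 * np.pi * t)

def train_window_net(t_win, lf_win):
    """Stub: would train one small finite-basis PINN on a time window."""
    return lambda s: 0.05 * np.sin(20.0 * np.pi * s)   # learned correction

# Decompose [0, 1] into time windows; train one correction net per window.
edges = np.linspace(0.0, 1.0, 4)                       # 3 windows
nets = []
for a, b in zip(edges[:-1], edges[1:]):
    mask = (t >= a) & (t < b)
    nets.append(((a, b), train_window_net(t[mask], low_fidelity(t[mask]))))

def high_fidelity(t):
    """Low-fidelity prediction plus the window-local correction."""
    u = low_fidelity(t)
    for (a, b), net in nets:
        mask = (t >= a) & (t < b)
        u = u + np.where(mask, net(t), 0.0)
    return u

print(high_fidelity(t)[:5])
```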


Multifidelity Deep Operator Networks For Data-Driven and Physics-Informed Problems

arXiv.org Artificial Intelligence

In general, low-fidelity data is easier to obtain in greater quantities, but it may be too inaccurate or not dense enough to train a machine learning model accurately. High-fidelity data is costly to obtain, so there may not be sufficient data to use in training; however, it is more accurate. A small amount of high-fidelity data, such as from measurements, combined with abundant low-fidelity data can improve predictions; this has motivated geophysicists to develop cokriging [1], which is based on Gaussian process regression at two different fidelity levels and exploits correlations (albeit only linear ones) between the levels. An example of cokriging for obtaining the sea surface temperature (as well as the associated uncertainty) is presented in [2], where satellite images are used as low-fidelity data and in situ measurements are used as high-fidelity data. To exploit nonlinear correlations between different levels of fidelity, a probabilistic framework based on Gaussian process regression and a nonlinear autoregressive scheme was proposed in [3], which can learn complex, nonlinear, space-dependent cross-correlations between multifidelity models. However, this approach has a high computational cost for large data sets, and to this end, the subsequent work in [4] was based on neural networks and provided the first method for multifidelity training of deep neural networks.
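
Since the multifidelity operator networks in this line of work build on the DeepONet branch-trunk structure, a minimal sketch of that structure may help. Both subnetworks are stubbed with random feature maps, and the sensor count and latent dimension are assumptions; a multifidelity variant would train one such operator per fidelity level and learn their correlation.

```python
# Minimal DeepONet sketch: G(u)(y) ~ <branch(u), trunk(y)>.
import numpy as np

p = 16                          # latent dimension shared by branch/trunk
rng = np.random.default_rng(0)
Wb = rng.normal(size=(32, p))   # stub branch weights (32 sensors -> p)
Wt = rng.normal(size=(1, p))    # stub trunk weights (1-D location -> p)

def branch(u_sensors):
    """Encode the input function sampled at 32 sensor points."""
    return np.tanh(u_sensors @ Wb)            # (p,)

def trunk(y):
    """Encode evaluation locations y, shape (n, 1)."""
    return np.tanh(y @ Wt)                    # (n, p)

x_sensors = np.linspace(0.0, 1.0, 32)
u_sensors = np.sin(np.pi * x_sensors)         # a sample input function
y = np.linspace(0.0, 1.0, 10)[:, None]

G_u_y = trunk(y) @ branch(u_sensors)          # operator output at y
print(G_u_y.shape)                            # (10,)
```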


A Hybrid Deep Neural Operator/Finite Element Method for Ice-Sheet Modeling

arXiv.org Artificial Intelligence

One of the most challenging and consequential problems in climate modeling is to provide probabilistic projections of sea level rise. A large part of the uncertainty in sea level projections is due to uncertainty in ice sheet dynamics. At the moment, accurate quantification of this uncertainty is hindered by the cost of ice sheet computational models. In this work, we develop a hybrid approach to approximate existing ice sheet computational models at a fraction of their cost. Our approach consists of replacing the finite element model for the momentum equations for the ice velocity, the most expensive part of an ice sheet model, with a Deep Operator Network, while retaining a classic finite element discretization for the evolution of the ice thickness. We show that the resulting hybrid model is very accurate and an order of magnitude faster than the traditional finite element model. Further, a distinctive feature of the proposed model, compared to other neural network approaches, is that it can handle high-dimensional parameter spaces (parameter fields) such as the basal friction at the bed of the glacier, and can therefore be used for generating samples for uncertainty quantification. We study the impact of hyperparameters, the number of unknowns, and the correlation length of the parameter distribution on the training and accuracy of the Deep Operator Network on a synthetic ice sheet model. We then target the evolution of the Humboldt glacier in Greenland and show that our hybrid model can provide accurate statistics of the glacier mass loss and can be effectively used to accelerate the quantification of uncertainty.
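
A schematic of the hybrid time-stepping loop described above, with every operator stubbed: a surrogate stands in for the expensive momentum solve, and a simple explicit 1-D update stands in for the finite element thickness evolution. Nothing here reflects the actual ice-sheet discretization or the trained surrogate; it only shows how the two pieces alternate within a step.

```python
# Schematic hybrid loop: surrogate momentum solve + explicit thickness update.
import numpy as np

n = 100                              # 1-D grid for illustration
dx, dt = 1.0 / n, 1e-3
H = np.ones(n) + 0.1 * np.sin(np.linspace(0, 2 * np.pi, n))  # ice thickness
friction = 0.5 * np.ones(n)          # basal friction parameter field

def deeponet_velocity(H, friction):
    """Stub surrogate: would map (thickness, friction) to ice velocity."""
    return 0.2 * H / (1.0 + friction)

for step in range(10):
    v = deeponet_velocity(H, friction)         # surrogate momentum solve
    flux = H * v
    # Explicit conservative update dH/dt = -d(Hv)/dx (upwind stand-in
    # for the finite element thickness discretization).
    H[1:] -= dt * (flux[1:] - flux[:-1]) / dx

print(H.mean())
```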