Gradient Alignment in Physics-informed Neural Networks: A Second-Order Optimization Perspective
Wang, Sifan, Bhartari, Ananyae Kumar, Li, Bowen, Perdikaris, Paris
Multi-task learning through composite loss functions has become a cornerstone of modern deep learning, from computer vision to scientific computing. However, when different loss terms compete for model capacity, they can generate conflicting gradients that impede optimization and degrade performance. While this fundamental challenge is well known in the multi-task learning literature [1-3], several questions remain open, particularly in settings where objectives are tightly coupled through complex physical constraints. In this work, we examine gradient conflicts through the lens of physics-informed neural networks (PINNs), where the challenge manifests acutely due to the inherent coupling between physical constraints and data-fitting objectives. Our key insight is that while first-order optimization methods struggle with competing objectives, appropriate preconditioning can naturally align gradients and enable efficient optimization. While our findings on gradient alignment and second-order preconditioning have broad implications for multi-task learning, here we focus on PINNs because they provide an ideal testbed: their physically constrained objectives are mathematically precise, their solutions can be rigorously verified, and their performance bottlenecks are well documented. Through theoretical analysis and extensive experiments on challenging partial differential equations (PDEs), we demonstrate breakthrough results on problems ranging from basic wave propagation to turbulent flows. To motivate our approach, consider training a PINN to solve the Navier-Stokes equations: the model must simultaneously satisfy boundary conditions, conservation laws, and empirical measurements, objectives that often push a neural network's parameters in opposing directions.
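As a rough illustration of the gradient-conflict diagnostic discussed above, the following Python sketch measures the cosine similarity between the parameter gradients of two loss terms in a toy composite objective; the model, loss terms, and data are placeholder assumptions, not the paper's setup.

```python
# Minimal sketch: check whether two loss terms of a composite objective
# produce conflicting (negatively aligned) parameter gradients.
import torch

model = torch.nn.Sequential(torch.nn.Linear(2, 64), torch.nn.Tanh(), torch.nn.Linear(64, 1))
x = torch.rand(128, 2, requires_grad=True)

def pde_residual_loss(model, x):
    # placeholder physics term: penalize squared first derivatives of the output
    u = model(x)
    grads = torch.autograd.grad(u.sum(), x, create_graph=True)[0]
    return (grads ** 2).mean()

def data_fit_loss(model, x):
    # placeholder data-fitting term on the same collocation points
    return (model(x) ** 2).mean()

def flat_grad(loss):
    g = torch.autograd.grad(loss, list(model.parameters()), retain_graph=True, allow_unused=True)
    return torch.cat([torch.zeros_like(p).flatten() if gi is None else gi.flatten()
                      for gi, p in zip(g, model.parameters())])

g_pde = flat_grad(pde_residual_loss(model, x))
g_data = flat_grad(data_fit_loss(model, x))

# Cosine similarity < 0 indicates the two objectives push the parameters in opposing directions.
cos = torch.nn.functional.cosine_similarity(g_pde, g_data, dim=0)
print(f"gradient alignment (cosine): {cos.item():+.3f}")
```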
On conditional diffusion models for PDE simulations
Shysheya, Aliaksandra, Diaconu, Cristiana, Bergamin, Federico, Perdikaris, Paris, Hernández-Lobato, José Miguel, Turner, Richard E., Mathieu, Emile
Modelling partial differential equations (PDEs) is of crucial importance in science and engineering, and it includes tasks ranging from forecasting to inverse problems such as data assimilation. However, most previous numerical and machine learning approaches that target forecasting cannot be applied out-of-the-box to data assimilation. Recently, diffusion models have emerged as a powerful tool for conditional generation, being able to flexibly incorporate observations without retraining. In this work, we perform a comparative study of score-based diffusion models for forecasting and assimilation of sparse observations. In particular, we focus on diffusion models that are either trained in a conditional manner or conditioned after unconditional training. We address the shortcomings of existing models by proposing 1) an autoregressive sampling approach that significantly improves performance in forecasting, 2) a new training strategy for conditional score-based models that achieves stable performance over a range of history lengths, and 3) a hybrid model which employs flexible pre-training conditioning on initial conditions and flexible post-training conditioning to handle data assimilation. We empirically show that these modifications are crucial for successfully tackling the combination of forecasting and data assimilation, a task commonly encountered in real-world scenarios.
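To illustrate the autoregressive sampling idea at a high level, here is a minimal Python sketch in which each new PDE state is drawn conditioned only on a fixed-length window of previously generated states; the sampler below is a trivial placeholder standing in for a conditional score-based model.

```python
# Minimal sketch of autoregressive rollout with a conditional sampler (placeholder).
import numpy as np

def sample_next_state(history):
    """Placeholder for a conditional sampler (e.g. a score-based diffusion model)
    that draws the next PDE state given a window of past states."""
    return history[-1] + 0.01 * np.random.randn(*history[-1].shape)

def autoregressive_rollout(initial_history, n_steps, window=2):
    states = list(initial_history)
    for _ in range(n_steps):
        context = states[-window:]          # condition only on a fixed-length history window
        states.append(sample_next_state(context))
    return np.stack(states)

# usage with a toy 1D state of 64 grid points and a 2-frame initial history
traj = autoregressive_rollout(np.zeros((2, 64)), n_steps=10)
print(traj.shape)  # (12, 64)
```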
Score Neural Operator: A Generative Model for Learning and Generalizing Across Multiple Probability Distributions
Liao, Xinyu, Qin, Aoyang, Seidman, Jacob, Wang, Junqi, Wang, Wei, Perdikaris, Paris
Most existing generative models are limited to learning a single probability distribution from the training data and cannot generalize to novel distributions for unseen data. An architecture that can generate samples from both trained datasets and unseen probability distributions would mark a significant breakthrough. Recently, score-based generative models have gained considerable attention for their comprehensive mode coverage and high-quality image synthesis, as they effectively learn an operator that maps a probability distribution to its corresponding score function. In this work, we introduce the \emph{Score Neural Operator}, which learns the mapping from multiple probability distributions to their score functions within a unified framework. We employ latent space techniques to facilitate the training of score matching, which tends to over-fit in the original image pixel space, thereby enhancing sample generation quality. Our trained Score Neural Operator demonstrates the ability to predict score functions of probability measures beyond the training space and exhibits strong generalization performance in both 2-dimensional Gaussian Mixture Models and 1024-dimensional MNIST double-digit datasets. Importantly, our approach offers significant potential for few-shot learning applications, where a single image from a new distribution can be leveraged to generate multiple distinct images from that distribution.
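For intuition on the latent-space score matching mentioned above, the following sketch trains a small network with a standard denoising score matching loss on placeholder latents; the architecture, noise level, and data are illustrative assumptions rather than the paper's implementation.

```python
# Minimal sketch of denoising score matching on latent vectors (placeholder setup).
import torch

score_net = torch.nn.Sequential(torch.nn.Linear(17, 128), torch.nn.SiLU(), torch.nn.Linear(128, 16))

def dsm_loss(z, sigma):
    noise = torch.randn_like(z)
    z_noisy = z + sigma * noise
    # condition the network on the noise level by appending sigma as an extra feature
    inp = torch.cat([z_noisy, sigma * torch.ones(z.shape[0], 1)], dim=1)
    target = -noise / sigma      # score of the Gaussian perturbation kernel at z_noisy
    return ((score_net(inp) - target) ** 2).mean()

z = torch.randn(32, 16)          # stand-in for latents from a pretrained autoencoder
loss = dsm_loss(z, torch.tensor(0.1))
loss.backward()
```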
Deep Learning Alternatives of the Kolmogorov Superposition Theorem
Guilhoto, Leonardo Ferreira, Perdikaris, Paris
This paper explores alternative formulations of the Kolmogorov Superposition Theorem (KST) as a foundation for neural network design. The original KST formulation, while mathematically elegant, presents practical challenges due to its limited insight into the structure of inner and outer functions and the large number of unknown variables it introduces. Kolmogorov-Arnold Networks (KANs) leverage KST for function approximation, but they have faced scrutiny due to mixed results compared to traditional multilayer perceptrons (MLPs) and practical limitations imposed by the original KST formulation. To address these issues, we introduce ActNet, a scalable deep learning model that builds on the KST and overcomes many of the drawbacks of Kolmogorov's original formulation. We evaluate ActNet in the context of Physics-Informed Neural Networks (PINNs), a framework well-suited for leveraging KST's strengths in low-dimensional function approximation, particularly for simulating partial differential equations (PDEs). In this challenging setting, where models must learn latent functions without direct measurements, ActNet consistently outperforms KANs across multiple benchmarks and is competitive against the current best MLP-based approaches. These results present ActNet as a promising new direction for KST-based deep learning applications, particularly in scientific computing and PDE simulation tasks.
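For reference, the classical KST representation that ActNet and KANs build upon states that any continuous function $f$ on the $n$-dimensional unit cube can be written exactly as
$$f(x_1, \dots, x_n) \;=\; \sum_{q=0}^{2n} \Phi_q\!\left( \sum_{p=1}^{n} \varphi_{q,p}(x_p) \right),$$
for continuous univariate outer functions $\Phi_q$ and inner functions $\varphi_{q,p}$; the $2n+1$ outer and $n(2n+1)$ inner unknown univariate functions are the source of the practical difficulties noted above.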
Disk2Planet: A Robust and Automated Machine Learning Tool for Parameter Inference in Disk-Planet Systems
Mao, Shunyuan, Dong, Ruobing, Yi, Kwang Moo, Lu, Lu, Wang, Sifan, Perdikaris, Paris
We introduce Disk2Planet, a machine learning-based tool to infer key parameters in disk-planet systems from observed protoplanetary disk structures. Disk2Planet takes as input the disk structures in the form of two-dimensional density and velocity maps, and outputs disk and planet properties, that is, the Shakura--Sunyaev viscosity, the disk aspect ratio, the planet--star mass ratio, and the planet's radius and azimuth. We integrate the Covariance Matrix Adaptation Evolution Strategy (CMA--ES), an evolutionary algorithm tailored for complex optimization problems, and the Protoplanetary Disk Operator Network (PPDONet), a neural network designed to predict solutions of disk--planet interactions. Our tool is fully automated and can retrieve parameters in one system in three minutes on an Nvidia A100 graphics processing unit. We empirically demonstrate that our tool achieves percent-level or higher accuracy, and is able to handle missing data and unknown levels of noise.
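The inference loop described above can be pictured with the following minimal Python sketch, which combines the ask/tell interface of the `cma` package with a placeholder forward model standing in for PPDONet; the parameter count, initial guess, and objective are illustrative assumptions, not the tool's actual configuration.

```python
# Minimal sketch: CMA-ES searching disk/planet parameters against a surrogate model.
import numpy as np
import cma  # pip install cma

observed_maps = np.random.rand(3, 128, 128)       # stand-in for observed density/velocity maps

def surrogate_disk_model(params):
    """Placeholder forward model mapping (viscosity, aspect ratio, mass ratio,
    planet radius, planet azimuth) to synthetic disk maps."""
    rng = np.random.default_rng(int(1e6 * np.abs(params).sum()) % 2**32)
    return rng.random((3, 128, 128))

def objective(params):
    # mismatch between observed and predicted maps (mean squared error)
    return float(np.mean((surrogate_disk_model(np.asarray(params)) - observed_maps) ** 2))

# five free parameters; initial guess, step size, and iteration budget are illustrative
es = cma.CMAEvolutionStrategy(5 * [0.5], 0.2, {'maxiter': 30})
while not es.stop():
    candidates = es.ask()
    es.tell(candidates, [objective(c) for c in candidates])
es.result_pretty()
```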
Micrometer: Micromechanics Transformer for Predicting Mechanical Responses of Heterogeneous Materials
Wang, Sifan, Liu, Tong-Rui, Sankaran, Shyam, Perdikaris, Paris
Heterogeneous materials, crucial in various engineering applications, exhibit complex multiscale behavior, which challenges the effectiveness of traditional computational methods. In this work, we introduce the Micromechanics Transformer ({\em Micrometer}), an artificial intelligence (AI) framework for predicting the mechanical response of heterogeneous materials, bridging the gap between advanced data-driven methods and complex solid mechanics problems. Trained on a large-scale high-resolution dataset of 2D fiber-reinforced composites, Micrometer achieves state-of-the-art performance in predicting microscale strain fields across a wide range of microstructures and material properties under arbitrary loading conditions. We demonstrate the accuracy and computational efficiency of Micrometer through applications in computational homogenization and multiscale modeling, where Micrometer achieves 1\% error in predicting macroscale stress fields while reducing computational time by up to two orders of magnitude compared to conventional numerical solvers. We further showcase the adaptability of the proposed model through transfer learning experiments on new materials with limited data, highlighting its potential to tackle diverse scenarios in mechanical analysis of solid materials. Our work represents a significant step towards AI-driven innovation in computational solid mechanics, addressing the limitations of traditional numerical methods and paving the way for more efficient simulations of heterogeneous materials across various industrial applications.
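As a point of reference for the homogenization step mentioned above, the effective macroscale stress is the volume average of the microscale stress field over the representative volume element; a minimal sketch with a placeholder predicted field:

```python
# Minimal sketch of the homogenization step: macroscale stress as the volume
# average of a (here randomly generated, placeholder) microscale stress field.
import numpy as np

micro_stress = np.random.rand(256, 256, 3)   # placeholder sigma_xx, sigma_yy, sigma_xy on a 2D grid

# area average over the microstructure domain gives the effective macroscale stress
macro_stress = micro_stress.mean(axis=(0, 1))
print("effective stress components:", macro_stress)
```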
Aurora: A Foundation Model of the Atmosphere
Bodnar, Cristian, Bruinsma, Wessel P., Lucic, Ana, Stanley, Megan, Brandstetter, Johannes, Garvan, Patrick, Riechert, Maik, Weyn, Jonathan, Dong, Haiyu, Vaughan, Anna, Gupta, Jayesh K., Tambiratnam, Kit, Archibald, Alex, Heider, Elizabeth, Welling, Max, Turner, Richard E., Perdikaris, Paris
Deep learning foundation models are revolutionizing many facets of science by leveraging vast amounts of data to learn general-purpose representations that can be adapted to tackle diverse downstream tasks. Foundation models hold the promise to also transform our ability to model our planet and its subsystems by exploiting the vast expanse of Earth system data. Here we introduce Aurora, a large-scale foundation model of the atmosphere trained on over a million hours of diverse weather and climate data. Aurora leverages the strengths of the foundation modelling approach to produce operational forecasts for a wide variety of atmospheric prediction problems, including those with limited training data, heterogeneous variables, and extreme events. In under a minute, Aurora produces 5-day global air pollution predictions and 10-day high-resolution weather forecasts that outperform state-of-the-art classical simulation tools and the best specialized deep learning models. Taken together, these results indicate that foundation models can transform environmental forecasting.
Bridging Operator Learning and Conditioned Neural Fields: A Unifying Perspective
Wang, Sifan, Seidman, Jacob H, Sankaran, Shyam, Wang, Hanwen, Pappas, George J., Perdikaris, Paris
Operator learning is an emerging area of machine learning which aims to learn mappings between infinite dimensional function spaces. Here we uncover a connection between operator learning architectures and conditioned neural fields from computer vision, providing a unified perspective for examining differences between popular operator learning models. We find that many commonly used operator learning models can be viewed as neural fields with conditioning mechanisms restricted to point-wise and/or global information. Motivated by this, we propose the Continuous Vision Transformer (CViT), a novel neural operator architecture that employs a vision transformer encoder and uses cross-attention to modulate a base field constructed with a trainable grid-based positional encoding of query coordinates. Despite its simplicity, CViT achieves state-of-the-art results across challenging benchmarks in climate modeling and fluid dynamics. Our contributions can be viewed as a first step towards adapting advanced computer vision architectures for building more flexible and accurate machine learning models in physical sciences.
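A rough Python sketch of the conditioning mechanism described above (illustrative only, not the CViT implementation): query coordinates are embedded by interpolating a trainable feature grid and then attend, through cross-attention, to tokens from a transformer-style encoder of the input function.

```python
# Minimal sketch: grid-based positional encoding of query coordinates + cross-attention.
import torch
import torch.nn as nn

d_model, n_grid = 64, 32
latent_grid = nn.Parameter(torch.randn(n_grid, d_model))   # trainable 1D feature grid

def encode_query(coords):
    """Linearly interpolate the trainable grid at continuous query coordinates in [0, 1]."""
    idx = coords.clamp(0, 1) * (n_grid - 1)
    lo = idx.floor().long()
    frac = (idx - idx.floor()).unsqueeze(-1)
    hi = (lo + 1).clamp(max=n_grid - 1)
    return (1 - frac) * latent_grid[lo] + frac * latent_grid[hi]

cross_attn = nn.MultiheadAttention(d_model, num_heads=4, batch_first=True)
head = nn.Linear(d_model, 1)

tokens = torch.randn(8, 196, d_model)   # stand-in for ViT-encoded input function tokens
coords = torch.rand(8, 50)              # batch of 8, 50 query coordinates each

q = encode_query(coords)                # (8, 50, d_model)
out, _ = cross_attn(query=q, key=tokens, value=tokens)
u = head(out)                           # predicted field values at the query points
print(u.shape)                          # torch.Size([8, 50, 1])
```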
Composite Bayesian Optimization In Function Spaces Using NEON -- Neural Epistemic Operator Networks
Guilhoto, Leonardo Ferreira, Perdikaris, Paris
High-dimensional problems are prominent across all corners of science and industrial applications. Within this realm, optimizing black-box functions and operators can be computationally expensive and require large amounts of hard-to-obtain data for training surrogate models. Uncertainty quantification becomes a key element in this setting, as the ability to quantify what a surrogate model does not know offers a guiding principle for new data acquisition. However, existing methods for surrogate modeling with built-in uncertainty quantification, such as Gaussian Processes (GPs) [1], have demonstrated difficulty in modeling high-dimensional problems. While other methods such as Bayesian neural networks (BNNs) [2] and deep ensembles [3] are able to mitigate this issue, their computational cost can still be prohibitive for some applications. This problem becomes more prominent in Operator Learning, where either the inputs or the outputs of a model are functions residing in infinite-dimensional function spaces. The field of Operator Learning has seen many advances in recent years [4, 5, 6, 7, 8, 9], with applications across many domains in the natural sciences and engineering, but so far its integration with uncertainty quantification has been limited [10, 11]. In addition to safety-critical applications of deep learning such as medicine [12, 13] and autonomous driving [14], uncertainty estimates are also important for decision making when collecting new data in the physical sciences. Total uncertainty is often made up of two distinct parts: epistemic and aleatoric uncertainty.
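For context, a standard textbook way to separate these two contributions for an ensemble of probabilistic predictors (not specific to NEON) follows from the law of total variance:
$$\underbrace{\operatorname{Var}[y \mid x]}_{\text{total}} \;=\; \underbrace{\mathbb{E}_{\theta}\!\big[\sigma^2_{\theta}(x)\big]}_{\text{aleatoric}} \;+\; \underbrace{\operatorname{Var}_{\theta}\!\big[\mu_{\theta}(x)\big]}_{\text{epistemic}},$$
where $\theta$ ranges over ensemble members or posterior samples and $\mu_\theta(x)$, $\sigma^2_\theta(x)$ denote the predictive mean and variance of a single member.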
PirateNets: Physics-informed Deep Learning with Residual Adaptive Networks
Wang, Sifan, Li, Bowen, Chen, Yuhan, Perdikaris, Paris
While physics-informed neural networks (PINNs) have become a popular deep learning framework for tackling forward and inverse problems governed by partial differential equations (PDEs), their performance is known to degrade when larger and deeper neural network architectures are employed. Our study identifies that the root of this counter-intuitive behavior lies in the use of multi-layer perceptron (MLP) architectures with unsuitable initialization schemes, which result in poor trainability of the network derivatives and ultimately lead to an unstable minimization of the PDE residual loss. To address this, we introduce Physics-informed Residual Adaptive Networks (PirateNets), a novel architecture that is designed to facilitate stable and efficient training of deep PINN models. PirateNets leverage a novel adaptive residual connection, which allows the networks to be initialized as shallow networks that progressively deepen during training. We also show that the proposed initialization scheme allows us to encode appropriate inductive biases corresponding to a given PDE system into the network architecture. We provide comprehensive empirical evidence showing that PirateNets are easier to optimize and can gain accuracy from considerably increased depth, ultimately achieving state-of-the-art results across various benchmarks. All code and data accompanying this manuscript will be made publicly available at \url{https://github.com/PredictiveIntelligenceLab/jaxpi}.
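The adaptive residual idea can be sketched, under our own simplifying assumptions rather than the exact PirateNets formulation, as a residual block whose trainable gate is initialized to zero so that every block acts as the identity at initialization and the network effectively starts shallow, deepening as the gate is learned:

```python
# Minimal sketch of a gated residual block that is the identity at initialization.
import torch
import torch.nn as nn

class AdaptiveResidualBlock(nn.Module):
    def __init__(self, width):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(width, width), nn.Tanh(), nn.Linear(width, width))
        self.alpha = nn.Parameter(torch.zeros(1))   # gate starts at 0 -> identity mapping

    def forward(self, x):
        return self.alpha * self.body(x) + (1.0 - self.alpha) * x

block = AdaptiveResidualBlock(64)
x = torch.randn(16, 64)
assert torch.allclose(block(x), x)                  # identity at initialization
```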