AITopics | Chattopadhyay, Ashesh

Collaborating Authors

Chattopadhyay, Ashesh

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Explaining the physics of transfer learning a data-driven subgrid-scale closure to a different turbulent flow

Subel, Adam, Guan, Yifei, Chattopadhyay, Ashesh, Hassanzadeh, Pedram

arXiv.org Artificial IntelligenceJun-7-2022

Transfer learning (TL) is becoming a powerful tool in scientific applications of neural networks (NNs), such as weather/climate prediction and turbulence modeling. TL enables out-of-distribution generalization (e.g., extrapolation in parameters) and effective blending of disparate training sets (e.g., simulations and observations). In TL, selected layers of a NN, already trained for a base system, are re-trained using a small dataset from a target system. For effective TL, we need to know 1) what are the best layers to re-train? and 2) what physics are learned during TL? Here, we present novel analyses and a new framework to address (1)-(2) for a broad range of multi-scale, nonlinear systems. Our approach combines spectral analyses of the systems' data with spectral analyses of convolutional NN's activations and kernels, explaining the inner-workings of TL in terms of the system's nonlinear physics. Using subgrid-scale modeling of several setups of 2D turbulence as test cases, we show that the learned kernels are combinations of low-, band-, and high-pass filters, and that TL learns new filters whose nature is consistent with the spectral differences of base and target systems. We also find the shallowest layers are the best to re-train in these cases, which is against the common wisdom guiding TL in machine learning literature. Our framework identifies the best layer(s) to re-train beforehand, based on physics and NN theory. Together, these analyses explain the physics learned in TL and provide a framework to guide TL for wide-ranging applications in science and engineering, such as climate change modeling.

artificial intelligence, data-driven subgrid-scale closure, machine learning, (2 more...)

arXiv.org Artificial Intelligence

doi: 10.1093/pnasnexus/pgad015

2206.03198

Genre: Research Report (0.40)

Industry:

Education > Curriculum > Subject-Specific Education (0.53)
Energy > Oil & Gas > Upstream (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.60)

Add feedback

Closed-form discovery of structural errors in models of chaotic systems by integrating Bayesian sparse regression and data assimilation

Mojgani, Rambod, Chattopadhyay, Ashesh, Hassanzadeh, Pedram

arXiv.org Machine LearningOct-1-2021

Models used for many important engineering and natural systems are imperfect. The discrepancy between the mathematical representations of a true physical system and its imperfect model is called the model error. These model errors can lead to substantial difference between the numerical solutions of the model and the observations of the system, particularly in those involving nonlinear, multi-scale phenomena. Thus, there is substantial interest in reducing model errors, particularly through understanding their physics and sources and leveraging the rapid growth of observational data. Here we introduce a framework named MEDIDA: Model Error Discovery with Interpretability and Data Assimilation. MEDIDA only requires a working numerical solver of the model and a small number of noise-free or noisy sporadic observations of the system. In MEDIDA, first the model error is estimated from differences between the observed states and model-predicted states (the latter are obtained from a number of one-time-step numerical integrations from the previous observed states). If observations are noisy, a data assimilation (DA) technique such as ensemble Kalman filter (EnKF) is first used to provide a noise-free analysis state of the system, which is then used in estimating the model error. Finally, an equation-discovery technique, such as the relevance vector machine (RVM), a sparsity-promoting Bayesian method, is used to identify an interpretable, parsimonious, closed-form representation of the model error. Using the chaotic Kuramoto-Sivashinsky (KS) system as the test case, we demonstrate the excellent performance of MEDIDA in discovering different types of structural/parametric model errors, representing different types of missing physics, using noise-free and noisy observations.

artificial intelligence, bayesian inference, machine learning, (19 more...)

arXiv.org Machine Learning

2110.00546

Country: North America > United States (0.28)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

Add feedback

Towards physically consistent data-driven weather forecasting: Integrating data assimilation with equivariance-preserving deep spatial transformers

Chattopadhyay, Ashesh, Mustafa, Mustafa, Hassanzadeh, Pedram, Bach, Eviatar, Kashinath, Karthik

arXiv.org Artificial IntelligenceMar-16-2021

There is growing interest in data-driven weather prediction (DDWP), for example using convolutional neural networks such as U-NETs that are trained on data from models or reanalysis. Here, we propose 3 components to integrate with commonly used DDWP models in order to improve their physical consistency and forecast accuracy. These components are 1) a deep spatial transformer added to the latent space of the U-NETs to preserve a property called equivariance, which is related to correctly capturing rotations and scalings of features in spatio-temporal data, 2) a data-assimilation (DA) algorithm to ingest noisy observations and improve the initial conditions for next forecasts, and 3) a multi-time-step algorithm, which combines forecasts from DDWP models with different time steps through DA, improving the accuracy of forecasts at short intervals. To show the benefit/feasibility of each component, we use geopotential height at 500~hPa (Z500) from ERA5 reanalysis and examine the short-term forecast accuracy of specific setups of the DDWP framework. Results show that the equivariance-preserving networks (U-STNs) clearly outperform the U-NETs, for example improving the forecast skill by $45\%$. Using a sigma-point ensemble Kalman (SPEnKF) algorithm for DA and U-STN as the forward model, we show that stable, accurate DA cycles are achieved even with high observation noise. The DDWP+DA framework substantially benefits from large ($O(1000)$) ensembles that are inexpensively generated with the data-driven forward model in each DA cycle. The multi-time-step DDWP+DA framework also shows promises, e.g., it reduces the average error by factors of 2-3.

deep learning, forecast, neural network, (17 more...)

arXiv.org Artificial Intelligence

2103.0936

Country:

Asia (0.67)
North America > United States > Maryland (0.14)

Genre: Research Report > New Finding (0.48)

Industry:

Energy (1.00)
Government > Regional Government (0.93)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Data-driven prediction of a multi-scale Lorenz 96 chaotic system using a hierarchy of deep learning methods: Reservoir computing, ANN, and RNN-LSTM

Chattopadhyay, Ashesh, Hassanzadeh, Pedram, Palem, Krishna, Subramanian, Devika

arXiv.org Machine LearningJun-30-2019

In this paper, the performance of three deep learning methods for predicting short-term evolution and reproducing the long-term statistics of a multi-scale spatio-temporal Lorenz 96 system is examined. The methods are: echo state network (a type of reservoir computing, RC-ESN), deep feed-forward artificial neural network (ANN), and recurrent neural network with long short-term memory (RNN-LSTM). This Lorenz system has three tiers of nonlinearly interacting variables representing slow/large-scale ($X$), intermediate ($Y$), and fast/small-scale ($Z$) processes. For training or testing, only $X$ is available; $Y$ and $Z$ are never known/used. It is shown that RC-ESN substantially outperforms ANN and RNN-LSTM for short-term prediction, e.g., accurately forecasting the chaotic trajectories for hundreds of numerical solver's time steps, equivalent to several Lyapunov timescales. RNN-LSTM and ANN show some prediction skills as well; RNN-LSTM bests ANN. Furthermore, even after losing the trajectory, data predicted by RC-ESN and RNN-LSTM have probability density functions (PDFs) that closely match the true PDF, even at the tails. PDF of the ANN data deviates from the true PDF. Implications, caveats, and applications to data-driven and inexact, data-assisted surrogate modeling of complex dynamical systems such as weather/climate are discussed.

deep learning, neural network, prediction, (17 more...)

arXiv.org Machine Learning

1906.08829

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback