training region
On Global Applicability and Location Transferability of Generative Deep Learning Models for Precipitation Downscaling
Harder, Paula, Lessig, Christian, Chantry, Matthew, Pelletier, Francis, Rolnick, David
Deep learning offers promising capabilities for the statistical downscaling of climate and weather forecasts, with generative approaches showing particular success in capturing fine-scale precipitation patterns. However, most existing models are region-specific, and their ability to generalize to unseen geographic areas remains largely unexplored. In this study, we evaluate the generalization performance of generative downscaling models across diverse regions. Using a global framework, we employ ERA5 reanalysis data as predictors and IMERG precipitation estimates at $0.1^\circ$ resolution as targets. A hierarchical location-based data split enables a systematic assessment of model performance across 15 regions around the world.
Exploring the Generalizability of Geomagnetic Navigation: A Deep Reinforcement Learning approach with Policy Distillation
Bai, Wenqi, Zhang, Shiliang, Zhang, Xiaohui, Ma, Xuehui, Yang, Songnan, Li, Yushuai, Huang, Tingwen
The advancement in autonomous vehicles has empowered navigation and exploration in unknown environments. Geomagnetic navigation for autonomous vehicles has drawn increasing attention with its independence from GPS or inertial navigation devices. While geomagnetic navigation approaches have been extensively investigated, the generalizability of learned geomagnetic navigation strategies remains unexplored. The performance of a learned strategy can degrade outside the source domain where it was learned, due to a lack of knowledge about the geomagnetic characteristics in newly entered areas. This paper explores the generalization of learned geomagnetic navigation strategies via deep reinforcement learning (DRL). In particular, we employ DRL agents to learn multiple teacher models from distributed domains that represent dispersed navigation strategies, and amalgamate the teacher models for generalizability across navigation areas. We design a reward shaping mechanism for training the teacher models that integrates both potential-based and intrinsically motivated rewards. The designed reward shaping can enhance the exploration efficiency of the DRL agent and improve the representation of the teacher models. Building on the trained teacher models, we employ multi-teacher policy distillation to merge the policies learned by individual teachers, leading to a navigation strategy with generalizability across navigation domains. We conduct numerical simulations, and the results demonstrate an effective transfer of the learned DRL model from a source domain to new navigation areas. Compared to existing evolutionary-based geomagnetic navigation methods, our approach provides superior performance in terms of navigation length, duration, heading deviation, and success rate in cross-domain navigation. Geomagnetic navigation leverages the ubiquitous Earth magnetic field signals for navigation [1], [2], without dependence on dedicated devices along the navigation route [3]-[5].
Geomagnetic navigation can thus secure the navigation mission, e.g., in remote areas or underwater environments where GPS or pre-deployed navigation devices are unavailable [6].
Extrapolative Controlled Sequence Generation via Iterative Refinement
Padmakumar, Vishakh, Pang, Richard Yuanzhe, He, He, Parikh, Ankur P.
We study the problem of extrapolative controlled generation, i.e., generating sequences with attribute values beyond the range seen in training. This task is of significant importance in automated design, especially drug discovery, where the goal is to design novel proteins that are \textit{better} (e.g., more stable) than existing sequences. Thus, by definition, the target sequences and their attribute values are out of the training distribution, posing challenges to existing methods that aim to directly generate the target sequence. Instead, in this work, we propose Iterative Controlled Extrapolation (ICE) which iteratively makes local edits to a sequence to enable extrapolation. We train the model on synthetically generated sequence pairs that demonstrate small improvement in the attribute value. Results on one natural language task (sentiment analysis) and two protein engineering tasks (ACE2 stability and AAV fitness) show that ICE considerably outperforms state-of-the-art approaches despite its simplicity. Our code and models are available at: https://github.com/vishakhpk/iter-extrapolation.
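The idea of reaching out-of-range attribute values through repeated local edits can be illustrated with a toy loop. This is only a minimal sketch under stated assumptions, not the ICE model itself: `iterative_refine`, `flip_first_zero`, and the use of `sum` as the attribute scorer are hypothetical stand-ins for the learned editor and scorer described in the abstract.

```python
from typing import Callable, List


def iterative_refine(
    seq: List[int],
    edit: Callable[[List[int]], List[int]],
    score: Callable[[List[int]], float],
    steps: int,
) -> List[int]:
    """Apply a local edit repeatedly, keeping a candidate only if it scores higher."""
    current = list(seq)
    for _ in range(steps):
        candidate = edit(current)
        if score(candidate) > score(current):
            current = candidate
    return current


def flip_first_zero(seq: List[int]) -> List[int]:
    """Hypothetical local edit: turn the first 0 into a 1 (a small improvement)."""
    out = list(seq)
    if 0 in out:
        out[out.index(0)] = 1
    return out


# Toy attribute: the number of ones in the sequence.
refined = iterative_refine([0, 0, 0], flip_first_zero, sum, steps=5)
```

Each step changes the attribute only slightly, yet the accumulated edits can move the sequence well beyond the attribute value of the starting point.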
Learning-based Design of Luenberger Observers for Autonomous Nonlinear Systems
Niazi, Muhammad Umar B., Cao, John, Sun, Xudong, Das, Amritam, Johansson, Karl Henrik
Designing Luenberger observers for nonlinear systems involves the challenging task of transforming the state to an alternate coordinate system, possibly of higher dimension, where the system is asymptotically stable and linear up to output injection. The observer then estimates the system's state in the original coordinates by inverting the transformation map. However, finding a suitable injective transformation whose inverse can be derived remains a primary challenge for general nonlinear systems. We propose a novel approach that uses supervised physics-informed neural networks to approximate both the transformation and its inverse. Our method generalizes better than contemporary methods and is robust to both the neural network's approximation errors and system uncertainties.
Characterizing 4-string contact interaction using machine learning
Erbin, Harold, Fırat, Atakan Hilmi
The geometry of the 4-string contact interaction of closed string field theory is characterized using machine learning. We obtain Strebel quadratic differentials on 4-punctured spheres as a neural network by performing unsupervised learning with a custom-built loss function. This allows us to solve for local coordinates and compute their associated mapping radii numerically. We also train a neural network that distinguishes the vertex region from the Feynman region. As a check, the 4-tachyon contact term in the tachyon potential is computed, and good agreement with the results in the literature is observed. We argue that our algorithm is manifestly independent of the number of punctures and that scaling it to characterize the geometry of the $n$-string contact interaction is feasible.
Interpretable Polynomial Neural Ordinary Differential Equations
Neural networks have the ability to serve as universal function approximators, but they are not interpretable and don't generalize well outside of their training region. Both of these issues are problematic when trying to apply standard neural ordinary differential equations (neural ODEs) to dynamical systems. We introduce the polynomial neural ODE, which is a deep polynomial neural network inside of the neural ODE framework. We demonstrate the capability of polynomial neural ODEs to predict outside of the training region, as well as perform direct symbolic regression without additional tools such as SINDy.
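Why a polynomial right-hand side can predict outside the training region can be seen in a small example. The sketch below is an illustration only: it swaps the polynomial neural network for a plain least-squares polynomial fit (`numpy.polyfit`), and the logistic system dx/dt = x(1 - x) and the sampling interval are assumptions chosen for the demonstration.

```python
import numpy as np

# Assumed example system: logistic growth, dx/dt = x * (1 - x) = x - x**2.
def true_rhs(x):
    return x * (1.0 - x)

# "Training region": states sampled only in [0, 0.5].
x_train = np.linspace(0.0, 0.5, 50)
dxdt_train = true_rhs(x_train)

# Fit a degree-2 polynomial to the derivative data (highest power first).
coeffs = np.polyfit(x_train, dxdt_train, deg=2)

# Evaluate far outside the training region.
x_test = 2.0
predicted = np.polyval(coeffs, x_test)
expected = true_rhs(x_test)
```

Because the fitted model has the same functional form as the true dynamics, the recovered coefficients can be read off directly as a symbolic expression, and the prediction remains close to the true value well beyond the sampled interval.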
Probabilistic Deep Learning for Real-Time Large Deformation Simulations
Deshpande, Saurabh, Lengiewicz, Jakub, Bordas, Stéphane P. A.
For many novel applications, such as patient-specific computer-aided surgery, conventional solution techniques for the underlying nonlinear problems are usually too computationally expensive and provide no information about how certain we can be about their predictions. In the present work, we propose a highly efficient deep-learning surrogate framework that is able to accurately predict the response of bodies undergoing large deformations in real time. The surrogate model has a convolutional neural network architecture, called U-Net, which is trained with force-displacement data obtained with the finite element method. We propose deterministic and probabilistic versions of the framework. The probabilistic framework utilizes the Variational Bayes Inference approach and is able to capture all the uncertainties present in the data as well as in the deep-learning model. Based on several benchmark examples, we show the predictive capabilities of the framework and discuss its possible limitations.
Two Shifts for Crop Mapping: Leveraging Aggregate Crop Statistics to Improve Satellite-based Maps in New Regions
Kluger, Dan M., Wang, Sherrie, Lobell, David B.
Crop type mapping at the field level is critical for a variety of applications in agricultural monitoring, and satellite imagery is becoming an increasingly abundant and useful raw input from which to create crop type maps. Still, in many regions crop type mapping with satellite data remains constrained by a scarcity of field-level crop labels for training supervised classification models. When training data is not available in one region, classifiers trained in similar regions can be transferred, but shifts in the distribution of crop types as well as transformations of the features between regions lead to reduced classification accuracy. We present a methodology that uses aggregate-level crop statistics to correct the classifier by accounting for these two types of shifts. To adjust for shifts in the crop type composition we present a scheme for properly reweighting the posterior probabilities of each class that are output by the classifier. To adjust for shifts in features we propose a method to estimate and remove linear shifts in the mean feature vector. We demonstrate that this methodology leads to substantial improvements in overall classification accuracy when using Linear Discriminant Analysis (LDA) to map crop types in Occitanie, France and in Western Province, Kenya. When using LDA as our base classifier, we found that in France our methodology led to percent reductions in misclassifications ranging from 2.8% to 42.2% (mean = 21.9%) over eleven different training departments, and in Kenya the percent reductions in misclassification were 6.6%, 28.4%, and 42.7% for three training regions. While our methodology was statistically motivated by the LDA classifier, it can be applied to any type of classifier. As an example, we demonstrate its successful application to improve a Random Forest classifier.
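The adjustment for a shift in crop type composition can be sketched with the standard prior-shift correction, in which each posterior is scaled by the ratio of target to source class proportions and then renormalized. This is a minimal sketch and not necessarily the authors' exact scheme; the function name `reweight_posteriors` and the example proportions are assumptions.

```python
import numpy as np


def reweight_posteriors(posteriors, source_priors, target_priors):
    """Rescale class posteriors by target/source prior ratios, then renormalize rows."""
    posteriors = np.asarray(posteriors, dtype=float)
    ratio = np.asarray(target_priors, dtype=float) / np.asarray(source_priors, dtype=float)
    adjusted = posteriors * ratio
    return adjusted / adjusted.sum(axis=1, keepdims=True)


# Hypothetical two-crop example: the classifier was trained where both crops
# were equally common, but aggregate statistics say the new region is 80/20.
adjusted = reweight_posteriors(
    posteriors=[[0.5, 0.5]],
    source_priors=[0.5, 0.5],
    target_priors=[0.8, 0.2],
)
```

In this setting the target proportions would come from the aggregate-level crop statistics, so no field-level labels from the new region are needed for this step.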
A Markov Reward Process-Based Approach to Spatial Interpolation
The interpolation of spatial data can be of tremendous value in various applications, such as forecasting weather from only a few measurements of meteorological or remote sensing data. Existing methods for spatial interpolation, such as variants of kriging and spatial autoregressive models, tend to suffer from at least one of the following limitations: (a) the assumption of stationarity, (b) the assumption of isotropy, and (c) the trade-off between modelling local or global spatial interaction. Addressing these issues in this work, we propose the use of Markov reward processes (MRPs) as a spatial interpolation method, and we introduce three variants thereof: (i) a basic static discount MRP (SD-MRP), (ii) an accurate but mostly theoretical optimised MRP (O-MRP), and (iii) a transferable weight prediction MRP (WP-MRP). All variants of MRP interpolation operate locally, while also implicitly accounting for global spatial relationships in the entire system through recursion. Additionally, O-MRP and WP-MRP no longer assume stationarity and are robust to anisotropy. We evaluated our proposed methods by comparing the mean absolute errors of their interpolated grid cells to those of 7 common baselines, selected from models based on spatial autocorrelation, (spatial) regression, and deep learning. We performed detailed evaluations on two publicly available datasets (local GDP values, and COVID-19 patient trajectory data). The results from these experiments clearly show the competitive advantage of MRP interpolation, which achieved significantly lower errors than the existing methods in 23 out of 40 experimental conditions, or 35 out of 40 when including O-MRP.
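The notion of operating locally while accounting for global relationships through recursion can be illustrated loosely as follows. This sketch is an assumption-heavy simplification and not the authors' SD-MRP formulation: it assumes a regular grid with missing cells marked as NaN, a four-cell neighbourhood, and an update that sets each unknown cell to a fixed discount `gamma` times the mean of its neighbours. The name `discounted_neighbour_interpolate` is hypothetical.

```python
import numpy as np


def discounted_neighbour_interpolate(grid, gamma=0.9, iterations=100):
    """Fill NaN cells by repeatedly applying a discounted mean of 4-neighbours."""
    grid = np.asarray(grid, dtype=float)
    known = ~np.isnan(grid)
    values = np.where(known, grid, 0.0)  # unknown cells start at zero
    rows, cols = grid.shape
    for _ in range(iterations):
        updated = values.copy()
        for r in range(rows):
            for c in range(cols):
                if known[r, c]:
                    continue  # observed cells stay fixed
                neighbours = [
                    values[rr, cc]
                    for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                    if 0 <= rr < rows and 0 <= cc < cols
                ]
                if neighbours:
                    updated[r, c] = gamma * np.mean(neighbours)
        values = updated
    return values


# Toy example: one missing cell between two observed cells, no discounting.
filled = discounted_neighbour_interpolate([[0.0, np.nan, 2.0]], gamma=1.0)
```

Although each update looks only at adjacent cells, repeating it lets information from distant observed cells propagate across the grid.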
Model-data-driven constitutive responses: application to a multiscale computational framework
Fuhg, Jan Niklas, Boehm, Christoph, Bouklas, Nikolaos, Fau, Amelie, Wriggers, Peter, Marino, Michele
Computational multiscale methods for analyzing and deriving constitutive responses have been used as a tool in engineering problems because of their ability to combine information at different length scales. However, their application in a nonlinear framework can be limited by high computational costs, numerical difficulties, and/or inaccuracies. In this paper, a hybrid methodology is presented which combines classical constitutive laws (model-based), a data-driven correction component, and computational multiscale approaches. A model-based material representation is locally improved with data from lower scales obtained by means of a nonlinear numerical homogenization procedure leading to a model-data-driven approach. Therefore, macroscale simulations explicitly incorporate the true microscale response, maintaining the same level of accuracy that would be obtained with online micro-macro simulations but with a computational cost comparable to classical model-driven approaches. In the proposed approach, both model and data play a fundamental role allowing for the synergistic integration between a physics-based response and a machine learning black-box. Numerical applications are implemented in two dimensions for different tests investigating both material and structural responses in large deformation.