random initial condition
Robustness Test for AI Forecasting of Hurricane Florence Using FourCastNetv2 and Random Perturbations of the Initial Condition
Lizerbram, Adam, Stevenson, Shane, Khadir, Iman, Tu, Matthew, Shen, Samuel S. P.
Understanding the robustness of a weather forecasting model with respect to input noise or different uncertainties is important in assessing its output reliability, particularly for extreme weather events like hurricanes. In this paper, we test sensitivity and robustness of an artificial intelligence (AI) weather forecasting model: NVIDIAs FourCastNetv2 (FCNv2). We conduct two experiments designed to assess model output under different levels of injected noise in the models initial condition. First, we perturb the initial condition of Hurricane Florence from the European Centre for Medium-Range Weather Forecasts (ECMWF) Reanalysis v5 (ERA5) dataset (September 13-16, 2018) with varying amounts of Gaussian noise and examine the impact on predicted trajectories and forecasted storm intensity. Second, we start FCNv2 with fully random initial conditions and observe how the model responds to nonsensical inputs. Our results indicate that FCNv2 accurately preserves hurricane features under low to moderate noise injection. Even under high levels of noise, the model maintains the general storm trajectory and structure, although positional accuracy begins to degrade. FCNv2 consistently underestimates storm intensity and persistence across all levels of injected noise. With full random initial conditions, the model generates smooth and cohesive forecasts after a few timesteps, implying the models tendency towards stable, smoothed outputs. Our approach is simple and portable to other data-driven AI weather forecasting models.
- North America > United States > California > San Diego County > San Diego (0.05)
- Atlantic Ocean (0.04)
- Asia > India > NCT > New Delhi (0.05)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Asia > India > NCT > New Delhi (0.05)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Unravelling the Performance of Physics-informed Graph Neural Networks for Dynamical Systems
Thangamuthu, Abishek, Kumar, Gunjan, Bishnoi, Suresh, Bhattoo, Ravinder, Krishnan, N M Anoop, Ranu, Sayan
Recently, graph neural networks have been gaining a lot of attention to simulate dynamical systems due to their inductive nature leading to zero-shot generalizability. Similarly, physics-informed inductive biases in deep-learning frameworks have been shown to give superior performance in learning the dynamics of physical systems. There is a growing volume of literature that attempts to combine these two approaches. Here, we evaluate the performance of thirteen different graph neural networks, namely, Hamiltonian and Lagrangian graph neural networks, graph neural ODE, and their variants with explicit constraints and different architectures. We briefly explain the theoretical formulation highlighting the similarities and differences in the inductive biases and graph architecture of these systems. We evaluate these models on spring, pendulum, gravitational, and 3D deformable solid systems to compare the performance in terms of rollout error, conserved quantities such as energy and momentum, and generalizability to unseen system sizes. Our study demonstrates that GNNs with additional inductive biases, such as explicit constraints and decoupling of kinetic and potential energies, exhibit significantly enhanced performance. Further, all the physics-informed GNNs exhibit zero-shot generalizability to system sizes an order of magnitude larger than the training system, thus providing a promising route to simulate large-scale realistic systems.
- Asia > India > NCT > New Delhi (0.05)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Large-N dynamics of the spiked tensor model with random initial conditions
Non-convex multidimensional optimization and the related problem of finding the global minimum in rough landscapes are crucial challenges of modern science. Such problems were extensively studied in the context of the spin-glass systems [1], and found applications in biology [2], finance [3], and data science [4]. Here, we focus on a task motivated by data science and consider the model of the signal recovering from a noisy high-dimensional data tensor - the spiked tensor model (tensor PCA) [5, 6, 7].
- Africa > Middle East > Tunisia > Ben Arous Governorate > Ben Arous (0.05)
- North America > Canada > Alberta > Census Division No. 12 > Lac La Biche County (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- (2 more...)
Exact solutions to the nonlinear dynamics of learning in deep linear neural networks
Saxe, Andrew M., McClelland, James L., Ganguli, Surya
Despite the widespread practical success of deep learning methods, our theoretical understanding of the dynamics of learning in deep neural networks remains quite sparse. We attempt to bridge the gap between the theory and practice of deep learning by systematically analyzing learning dynamics for the restricted case of deep linear neural networks. Despite the linearity of their input-output map, such networks have nonlinear gradient descent dynamics on weights that change with the addition of each new hidden layer. We show that deep linear networks exhibit nonlinear learning phenomena similar to those seen in simulations of nonlinear networks, including long plateaus followed by rapid transitions to lower error solutions, and faster convergence from greedy unsupervised pretraining initial conditions than from random initial conditions. We provide an analytical description of these phenomena by finding new exact solutions to the nonlinear dynamics of deep learning. Our theoretical analysis also reveals the surprising finding that as the depth of a network approaches infinity, learning speed can nevertheless remain finite: for a special class of initial conditions on the weights, very deep networks incur only a finite, depth independent, delay in learning speed relative to shallow networks. We show that, under certain conditions on the training data, unsupervised pretraining can find this special class of initial conditions, while scaled random Gaussian initializations cannot. We further exhibit a new class of random orthogonal initial conditions on weights that, like unsupervised pre-training, enjoys depth independent learning times. We further show that these initial conditions also lead to faithful propagation of gradients even in deep nonlinear networks, as long as they operate in a special regime known as the edge of chaos.
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- North America > United States > California > Santa Clara County > Stanford (0.04)
- North America > Canada > Quebec > Montreal (0.04)
- (2 more...)