Energy
Learning optimal environments using projected stochastic gradient ascent
Bolland, Adrien, Boukas, Ioannis, Cornet, François, Berger, Mathias, Ernst, Damien
In this work, we generalize the direct policy search algorithms to an algorithm we call Direct Environment Search with (projected stochastic) Gradient Ascent (DESGA). The latter can be used to jointly learn a reinforcement learning (RL) environment and a policy with maximal expected return over a joint hypothesis space of environments and policies. We illustrate the performance of DESGA on two benchmarks. First, we consider a parametrized space of Mass-Spring-Damper (MSD) environments. Then, we use our algorithm for optimizing the size of the components and the operation of a small-scale and autonomous energy system, i.e. a solar off-grid microgrid, composed of photovoltaic panels, batteries, etc.
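The idea of ascending a noisy gradient jointly over environment and policy parameters, while projecting the environment parameters back onto a feasible set, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the toy return J(psi, theta) = -(theta - psi)^2 - (psi - 2)^2, the box constraint 0 <= psi <= 1, the Gaussian gradient noise, and the function names.

```python
import random


def project(value, low, high):
    """Euclidean projection of a scalar onto the interval [low, high]."""
    return max(low, min(high, value))


def noisy_gradient(psi, theta, rng, noise=0.1):
    """Noisy gradient of the toy return J = -(theta - psi)^2 - (psi - 2)^2.

    psi plays the role of an environment parameter, theta of a policy
    parameter. The Gaussian noise mimics a stochastic gradient estimate.
    """
    d_psi = 2.0 * (theta - psi) - 2.0 * (psi - 2.0) + rng.gauss(0.0, noise)
    d_theta = -2.0 * (theta - psi) + rng.gauss(0.0, noise)
    return d_psi, d_theta


def projected_sga(steps=2000, learning_rate=0.05, seed=0):
    """Projected stochastic gradient ascent over (psi, theta)."""
    rng = random.Random(seed)
    psi, theta = 0.0, 0.0
    for _ in range(steps):
        g_psi, g_theta = noisy_gradient(psi, theta, rng)
        # Ascent step on both parameters, then projection of the
        # environment parameter onto its feasible interval [0, 1].
        psi = project(psi + learning_rate * g_psi, 0.0, 1.0)
        theta = theta + learning_rate * g_theta
    return psi, theta


psi_final, theta_final = projected_sga()
print(psi_final, theta_final)
```

In this toy problem the unconstrained optimum lies outside the feasible interval, so the projection keeps psi at the boundary value 1 and theta settles close to it.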
Careful analysis of XRD patterns with Attention
Kano, Koichi, Segi, Takashi, Ozono, Hiroshi
The important peaks related to the physical properties of a lithium-ion rechargeable battery were extracted from the measured X-ray diffraction spectrum by a convolutional neural network based on the Attention mechanism. Among the deep features, the lattice constant of the cathodic active material was selected as a cell voltage predictor, and the crystallographic behavior of the active anodic and cathodic materials revealed the rate property during the charge-discharge states. The machine learning model automatically selected the significant peaks from the experimental spectrum. Applying the Attention mechanism with appropriate objective variables in multi-task trained models, one can selectively visualize the correlations between interesting physical properties. As the deep features are automatically defined, this approach can adapt to the conditions of various physical experiments.
Fully probabilistic quasar continua predictions near Lyman-$\alpha$ with conditional neural spline flows
Reiman, David M., Tamanas, John, Prochaska, J. Xavier, Ďurovčíková, Dominika
Measurement of the red damping wing of neutral hydrogen in quasar spectra provides a probe of the epoch of reionization in the early Universe. Such quantification requires precise and unbiased estimates of the intrinsic continua near Lyman-$\alpha$ (Ly$\alpha$), a challenging task given the highly variable Ly$\alpha$ emission profiles of quasars. Here, we introduce a fully probabilistic approach to intrinsic continua prediction. We frame the problem as a conditional density estimation task and explicitly model the distribution over plausible blue-side continua ($1190\ \text{Å} \leq \lambda_{\text{rest}} < 1290\ \text{Å}$) conditional on the red-side spectrum ($1290\ \text{Å} \leq \lambda_{\text{rest}} < 2900\ \text{Å}$) using normalizing flows. Our approach achieves state-of-the-art precision and accuracy, allows for sampling one thousand plausible continua in less than a tenth of a second, and can natively provide confidence intervals on the blue-side continua via Monte Carlo sampling. We measure the damping wing effect in two $z>7$ quasars and estimate the volume-averaged neutral fraction of hydrogen from each, finding $\bar{x}_\text{HI}=0.304 \pm 0.042$ for ULAS J1120+0641 ($z=7.09$) and $\bar{x}_\text{HI}=0.384 \pm 0.133$ for ULAS J1342+0928 ($z=7.54$).
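As a rough illustration of how per-pixel confidence intervals can be obtained from Monte Carlo samples, the sketch below draws many candidate continua and takes empirical percentiles at each pixel. The sampler is a hypothetical stand-in (independent Gaussian noise around fixed means) and not the conditional normalizing flow of the paper; the names sample_continua and confidence_band and the three-pixel layout are likewise assumptions.

```python
import random


def sample_continua(red_side, n_samples, rng):
    """Hypothetical stand-in for a trained conditional sampler.

    Returns n_samples toy blue-side continua with three pixels each. A real
    model would condition on the red-side spectrum in a learned way; here
    the pixel means are simple multiples of the mean red-side flux.
    """
    base = sum(red_side) / len(red_side)
    means = [base * scale for scale in (1.0, 1.2, 1.5)]
    return [[rng.gauss(m, 0.1) for m in means] for _ in range(n_samples)]


def percentile(sorted_values, q):
    """Empirical quantile q (0..1) of an already sorted list."""
    index = round(q * (len(sorted_values) - 1))
    return sorted_values[index]


def confidence_band(samples, lower=0.16, upper=0.84):
    """Per-pixel (lower, median, upper) percentiles across the samples."""
    bands = []
    for pixel_values in zip(*samples):
        ordered = sorted(pixel_values)
        bands.append((
            percentile(ordered, lower),
            percentile(ordered, 0.5),
            percentile(ordered, upper),
        ))
    return bands


rng = random.Random(0)
samples = sample_continua([1.0, 1.0, 1.0], 1000, rng)
bands = confidence_band(samples)
for low, median, high in bands:
    print(f"{low:.3f} {median:.3f} {high:.3f}")
```

The 16th and 84th percentiles are used here as an approximate one-sigma band; other levels can be passed through the lower and upper arguments.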
The best gifts for your dad, the outdoorsman
As summer quickly approaches, some dads are itching to get outside. Even if the number of places we can go has been reduced due to the pandemic, many will spend hours in their backyards tinkering with home projects, training for a nonexistent triathlon and grilling every chance they get. As Father's Day approaches, here are the best gifts for all the DIY-, camping-, grilling- and sport-loving dads in our lives. A good head lamp is an easy way to upgrade Dad's camping kit. We've recommended BioLite head lamps in the past, and the new HeadLamp 200 is a winner too, not to mention quite affordable. This model's USB rechargeable battery makes it more convenient than traditional head lamps because your dad won't have to worry about having a few AAA batteries on hand: Just plug it in and charge it up.
The future of artificial intelligence -- Neuromorphic computing
Everyone in the field of Artificial Intelligence knows what neural networks are. And most practitioners know the huge processing power and energy consumption needed to train pretty much any noteworthy neural network. That is to say, for the field to develop further, a new type of hardware is needed. Some experts consider that the quantum computer is that hardware. But even though it holds great promise, quantum computing is a technology that will take many decades to develop.
Deep Learning of Dynamic Subsurface Flow via Theory-guided Generative Adversarial Network
Generative adversarial networks (GANs) have been shown to be useful in various applications, such as image recognition, text processing and scientific computing, due to their strong ability to learn complex data distributions. In this study, a theory-guided generative adversarial network (TgGAN) is proposed to solve dynamic partial differential equations (PDEs). Different from standard GANs, the training term is no longer the true data and the generated data, but rather their residuals. In addition, theories such as governing equations, other physical constraints and engineering controls are encoded into the loss function of the generator to ensure that the prediction not only honors the training data, but also obeys these theories. TgGAN is proposed for dynamic subsurface flow with heterogeneous model parameters, and the data at each time step are treated as a two-dimensional image. In this study, several numerical cases are introduced to test the performance of the TgGAN. Predicting the future response, label-free learning and learning from noisy data can be realized easily by the TgGAN model. The effects of the number of training data and the collocation points are also discussed. In order to improve the efficiency of TgGAN, the transfer learning algorithm is also employed. Numerical results demonstrate that the TgGAN model is robust and reliable for deep learning of dynamic PDEs.
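To make the notion of encoding a governing equation into a loss function concrete, here is a minimal sketch of a loss that adds a PDE-residual penalty, evaluated at collocation points, to an ordinary data misfit. It is not the TgGAN generator loss: the 1D diffusion equation u_t = D u_xx is used as a simplified stand-in for the subsurface flow equations, derivatives are taken by finite differences instead of automatic differentiation, and all names and values are assumptions.

```python
import math


def pde_residual(u, x, t, diffusivity, h=1e-3):
    """Residual u_t - D * u_xx of the 1D diffusion equation at (x, t).

    Derivatives are approximated with central finite differences.
    """
    u_t = (u(x, t + h) - u(x, t - h)) / (2.0 * h)
    u_xx = (u(x + h, t) - 2.0 * u(x, t) + u(x - h, t)) / (h * h)
    return u_t - diffusivity * u_xx


def theory_guided_loss(u, data, collocation, diffusivity, weight=1.0):
    """Mean squared data misfit plus weighted mean squared PDE residual."""
    data_term = sum((u(x, t) - y) ** 2 for x, t, y in data) / len(data)
    pde_term = sum(
        pde_residual(u, x, t, diffusivity) ** 2 for x, t in collocation
    ) / len(collocation)
    return data_term + weight * pde_term


D = 0.1


def exact(x, t):
    """Analytical solution of u_t = D * u_xx for a sine initial profile."""
    return math.exp(-D * math.pi ** 2 * t) * math.sin(math.pi * x)


def frozen(x, t):
    """Candidate that fits the data at t = 0 but ignores the time decay."""
    return math.sin(math.pi * x)


# Labeled data only at t = 0; collocation points cover later times.
data = [(x / 10.0, 0.0, exact(x / 10.0, 0.0)) for x in range(1, 10)]
collocation = [(x / 10.0, t / 10.0) for x in range(1, 10) for t in range(1, 6)]

print(theory_guided_loss(exact, data, collocation, D))
print(theory_guided_loss(frozen, data, collocation, D))
```

Both candidates match the labeled data exactly, so only the residual term separates them, which is the role the collocation points play in this kind of loss.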
ADAHESSIAN: An Adaptive Second Order Optimizer for Machine Learning
Yao, Zhewei, Gholami, Amir, Shen, Sheng, Keutzer, Kurt, Mahoney, Michael W.
We introduce AdaHessian, a second order stochastic optimization algorithm which dynamically incorporates the curvature of the loss function via ADAptive estimates of the Hessian. Second order algorithms are among the most powerful optimization algorithms with superior convergence properties as compared to first order methods such as SGD and ADAM. The main disadvantage of traditional second order methods is their heavier per-iteration computation and poor accuracy as compared to first order methods. To address these, we incorporate several novel approaches in AdaHessian, including: (i) a new variance reduction estimate of the Hessian diagonal with low computational overhead; (ii) a root-mean-square exponential moving average to smooth out variations of the Hessian diagonal across different iterations; and (iii) a block diagonal averaging to reduce the variance of Hessian diagonal elements. We show that AdaHessian achieves new state-of-the-art results by a large margin as compared to other adaptive optimization methods, including variants of ADAM. In particular, we perform extensive tests on CV, NLP, and recommendation system tasks and find that AdaHessian: (i) achieves 1.80\%/1.45\% higher accuracy on ResNets20/32 on Cifar10, and 5.55\% higher accuracy on ImageNet as compared to ADAM; (ii) outperforms ADAMW for transformers by 0.27/0.33 BLEU score on IWSLT14/WMT14 and 1.8/1.0 PPL on PTB/Wikitext-103; and (iii) achieves 0.032\% better score than AdaGrad for DLRM on the Criteo Ad Kaggle dataset. Importantly, we show that the cost per iteration of AdaHessian is comparable to first-order methods, and that it exhibits robustness towards its hyperparameters. The code for AdaHessian is open-sourced and publicly available.
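One standard way to estimate a Hessian diagonal using only Hessian-vector products is Hutchinson's randomized estimator, diag(H) ≈ E[z ⊙ (Hz)] with random sign vectors z. The sketch below illustrates that generic estimator on a small fixed matrix; the abstract does not spell out the details of AdaHessian's own estimate, so this should not be read as the authors' implementation. The matrix HESSIAN and the function names are assumptions.

```python
import random

# Small symmetric matrix standing in for the Hessian of a loss function.
HESSIAN = [
    [2.0, 0.5, 0.0],
    [0.5, 3.0, 0.1],
    [0.0, 0.1, 1.0],
]


def hessian_vector_product(vector):
    """Return H @ vector. In practice this would come from autodiff."""
    return [sum(h * v for h, v in zip(row, vector)) for row in HESSIAN]


def hutchinson_diagonal(hvp, dim, n_samples, rng):
    """Estimate diag(H) as the average of z * (H z) over random sign vectors."""
    estimate = [0.0] * dim
    for _ in range(n_samples):
        # Rademacher vector: each entry is +1 or -1 with equal probability.
        z = [rng.choice((-1.0, 1.0)) for _ in range(dim)]
        hz = hvp(z)
        for i in range(dim):
            estimate[i] += z[i] * hz[i] / n_samples
    return estimate


diag_estimate = hutchinson_diagonal(
    hessian_vector_product, 3, 4000, random.Random(0)
)
print(diag_estimate)
```

The estimator never forms the full matrix: each sample costs one Hessian-vector product, and the off-diagonal contributions average out because the sign entries are independent.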
Analog ensemble data assimilation and a method for constructing analogs with variational autoencoders
It is proposed to use analogs of the forecast mean to generate an ensemble of perturbations for use in ensemble optimal interpolation (EnOI) or ensemble variational (EnVar) methods. A new method of constructing analogs using variational autoencoders (VAEs; a machine learning method) is proposed. The resulting analog methods using analogs from a catalog (AnEnOI), and using constructed analogs (cAnEnOI), are tested in the context of a multiscale Lorenz-'96 model, with standard EnOI and an ensemble square root filter for comparison. The use of analogs from a modestly-sized catalog is shown to improve the performance of EnOI, with limited marginal improvements resulting from increases in the catalog size. The method using constructed analogs (cAnEnOI) is found to perform as well as a full ensemble square root filter, and to be robust over a wide range of tuning parameters.
Using competency questions to select optimal clustering structures for residential energy consumption patterns
Toussaint, Wiebke, Moodley, Deshendran
During cluster analysis, domain experts and visual analysis are frequently relied on to identify the optimal clustering structure. This process tends to be ad hoc, subjective and difficult to reproduce. This work shows how competency questions can be used to formalise expert knowledge and application requirements for context-specific evaluation of a clustering application in the residential energy consumption sector. While cluster analysis is an established unsupervised machine learning technique, identifying the optimal set of clusters for a specific application requires extensive experimentation and domain knowledge. Cluster compactness and distinctness are two important attributes that characterise a good cluster set (Sarle et al., 1990), and different metrics, such as the Mean Index Adequacy (MIA), Davies-Bouldin Index (DBI) and the Silhouette Index, have been proposed to measure cluster compactness and distinctness.
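As an illustration of how one of the metrics named above combines compactness and distinctness, the following sketch computes the Davies-Bouldin Index in its usual form: for each cluster, the worst-case ratio of summed within-cluster scatter to centroid separation, averaged over clusters (lower is better). The toy points and function names are assumptions and have nothing to do with the paper's energy consumption data.

```python
import math


def centroid(points):
    """Coordinate-wise mean of a list of equal-length tuples."""
    dim = len(points[0])
    return tuple(sum(p[d] for p in points) / len(points) for d in range(dim))


def scatter(points, center):
    """Mean Euclidean distance of the points to their centroid."""
    return sum(math.dist(p, center) for p in points) / len(points)


def davies_bouldin(clusters):
    """Davies-Bouldin Index of a list of clusters (each a list of points)."""
    centers = [centroid(c) for c in clusters]
    scatters = [scatter(c, m) for c, m in zip(clusters, centers)]
    k = len(clusters)
    total = 0.0
    for i in range(k):
        # Worst (largest) similarity ratio between cluster i and any other.
        total += max(
            (scatters[i] + scatters[j]) / math.dist(centers[i], centers[j])
            for j in range(k)
            if j != i
        )
    return total / k


# Two groupings of the same four points: one compact and well separated,
# one that mixes distant points in each cluster.
tight = [[(0.0, 0.0), (0.0, 2.0)], [(10.0, 0.0), (10.0, 2.0)]]
loose = [[(0.0, 0.0), (10.0, 0.0)], [(0.0, 2.0), (10.0, 2.0)]]

print(davies_bouldin(tight))
print(davies_bouldin(loose))
```

The compact grouping scores far lower than the mixed one, which is the behaviour that makes the index usable for ranking candidate clustering structures.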
A nonlocal physics-informed deep learning framework using the peridynamic differential operator
Haghighat, Ehsan, Bekar, Ali Can, Madenci, Erdogan, Juanes, Ruben
The Physics-Informed Neural Network (PINN) framework introduced recently incorporates physics into deep learning, and offers a promising avenue for the solution of partial differential equations (PDEs) as well as identification of the equation parameters. The performance of existing PINN approaches, however, may degrade in the presence of sharp gradients, as a result of the inability of the network to capture the solution behavior globally. We posit that this shortcoming may be remedied by introducing long-range (nonlocal) interactions into the network's input, in addition to the short-range (local) space and time variables. Following this ansatz, here we develop a nonlocal PINN approach using the Peridynamic Differential Operator (PDDO)---a numerical method which incorporates long-range interactions and removes spatial derivatives in the governing equations. Because the PDDO functions can be readily incorporated in the neural network architecture, the nonlocality does not degrade the performance of modern deep-learning algorithms. We apply nonlocal PDDO-PINN to the solution and identification of material parameters in solid mechanics and, specifically, to elastoplastic deformation in a domain subjected to indentation by a rigid punch, for which the mixed displacement--traction boundary condition leads to localized deformation and sharp gradients in the solution. We document the superior behavior of nonlocal PINN with respect to local PINN in both solution accuracy and parameter inference, illustrating its potential for simulation and discovery of partial differential equations whose solution develops sharp gradients.