Stanford AI Detection System Could Predict Earthquakes
A group of researchers has unveiled a new method that uses artificial intelligence (AI) to enhance our ability to read seismic waves and, in doing so, improve our understanding of how earthquakes begin and even how they come to a stop. Published in Nature Communications, the paper details a method that automates earthquake detection while tuning out much of the noise inherent to seismic data. Mostafa Mousavi and a team of researchers use artificial intelligence to focus on millions of tiny, subtle shifts in the Earth's crust. They hope that these tiny movements might act as a Rosetta Stone of sorts for deciphering warning signs for big earthquakes. "By improving our ability to detect and locate these very small earthquakes, we can get a clearer view of how earthquakes interact or spread out along the fault, how they get started, even how they stop," Stanford geophysicist Gregory Beroza, one of the paper's authors, explained in a Stanford University press release.
Take a Chance: Managing the Exploitation-Exploration Dilemma in Customs Fraud Detection via Online Active Learning
Kim, Sundong, Mai, Tung-Duong, Khanh, Thi Nguyen Duc, Han, Sungwon, Park, Sungwon, Singh, Karandeep, Cha, Meeyoung
Continual labeling of training examples is a costly task in supervised learning. Active learning strategies mitigate this cost by identifying unlabeled data that are considered the most useful for training a predictive model. However, sample selection via active learning may lead to an exploitation-exploration dilemma. In online settings, profitable items can be neglected when uncertain items are annotated instead. To illustrate this dilemma, we study a human-in-the-loop customs selection scenario where an AI-based system supports customs officers by providing a set of imports to be inspected. If the inspected items are fraudulent, officers levy extra duties, and these items will be used as additional training data for the next iterations. Inspecting highly suspicious items will inevitably lead to additional customs revenue, yet they may not give any extra knowledge to customs officers. On the other hand, inspecting uncertain items will help customs officers to acquire new knowledge, which will be used as supplementary training resources to update their selection systems. Through years of customs selection simulation, we show that some exploration is needed to cope with the domain shift, and our hybrid strategy of selecting fraudulent and uncertain items will eventually outperform the exploitation strategy.
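The trade-off described above can be sketched in a few lines. This is only an illustration, not the authors' selection system: the function name `hybrid_select`, the `explore_ratio` parameter, and the use of "distance of the score from 0.5" as the uncertainty measure are all assumptions made for this example.

```python
def hybrid_select(fraud_scores, n_inspect, explore_ratio=0.1):
    """Split an inspection budget between exploitation and exploration.

    fraud_scores  : predicted fraud probabilities in [0, 1], one per import
    n_inspect     : total number of items officers can inspect
    explore_ratio : share of the budget spent on uncertain items (assumed knob)

    Returns (exploit, explore), two lists of item indices.
    """
    if not 0.0 <= explore_ratio <= 1.0:
        raise ValueError("explore_ratio must lie in [0, 1]")

    n_inspect = min(n_inspect, len(fraud_scores))
    n_explore = int(round(n_inspect * explore_ratio))
    n_exploit = n_inspect - n_explore

    # Exploitation: the most suspicious items (likely extra revenue).
    by_score = sorted(range(len(fraud_scores)),
                      key=lambda i: fraud_scores[i], reverse=True)
    exploit = by_score[:n_exploit]

    # Exploration: among the rest, the items whose score is closest to 0.5,
    # used here as a simple stand-in for model uncertainty.
    remaining = by_score[n_exploit:]
    by_uncertainty = sorted(remaining, key=lambda i: abs(fraud_scores[i] - 0.5))
    explore = by_uncertainty[:n_explore]

    return exploit, explore


scores = [0.95, 0.10, 0.52, 0.80, 0.45, 0.02]
exploit, explore = hybrid_select(scores, n_inspect=4, explore_ratio=0.5)
print("inspect for revenue:", exploit)
print("inspect to learn:", explore)
```

Setting `explore_ratio=0` recovers a pure exploitation policy, which makes it easy to compare the two regimes in a simulation.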
The DigitalTwin from an Artificial Intelligence Perspective
Niggemann, Oliver, Diedrich, Alexander, Kuehnert, Christian, Pfannstiel, Erik, Schraven, Joshua
Services for Cyber-Physical Systems based on Artificial Intelligence and Machine Learning require a virtual representation of the physical. To reduce modeling efforts and to synchronize results, for each system, a common and unique virtual representation used by all services during the whole system life-cycle is needed, i.e., a DigitalTwin. In this paper such a DigitalTwin, namely the AI reference model AITwin, is defined. This reference model is verified by using a running example from process ...
But two main contradictions remain: First, AI/ML are very heterogeneous, and each AI/ML method comes with a specialized model formalism to capture relevant aspects of the environment and the application domain. Hence, the question is how a DigitalTwin can provide the correct model to each AI/ML method. The second contradiction is that AI/ML requires explicit knowledge, i.e., knowledge processable by an algorithm, since compiled knowledge in the form of simulation libraries, raw data or executables does not help. But most publications refer to this kind of information.
Parameterized Neural Ordinary Differential Equations: Applications to Computational Physics Problems
Such examples include predicting input/output responses, design, and optimization [55]. These ODEs and their solutions often depend on a set of input parameters, and such ODEs are denoted as parameterized ODEs. Examples of such input parameters within the context of fluid dynamics include Reynolds number and Mach number. In many important scenarios, high-fidelity solutions of parameterized ODEs are required to be computed i) for many different input parameter instances (i.e., many-query scenario) or ii) in real time on a new input parameter instance. A single run of a high-fidelity simulation, however, often requires fine spatiotemporal resolutions. Consequently, performing real-time or multiple runs of a high-fidelity simulation can be computationally prohibitive. To mitigate this computational burden, many model-order reduction approaches have been proposed to replace costly high-fidelity simulations. The common goal of these approaches is to build a reduced-dynamical model with lower complexity than that of the high-fidelity model, and to use the reduced model to compute approximate solutions for any new input parameter instance. In general, model-order reduction approaches consist of two components: i) a low-dimensional latent-dynamics model, where the computational complexity is very low, and ii) a (non)linear mapping that constructs high-dimensional approximate states (i.e., solutions) from the low-dimensional states obtained from the latent-dynamics model.
Practical Quasi-Newton Methods for Training Deep Neural Networks
Goldfarb, Donald, Ren, Yi, Bahamou, Achraf
We consider the development of practical stochastic quasi-Newton, and in particular Kronecker-factored block-diagonal BFGS and L-BFGS methods, for training deep neural networks (DNNs). In DNN training, the number of variables and components of the gradient $n$ is often of the order of tens of millions and the Hessian has $n^2$ elements. Consequently, computing and storing a full $n \times n$ BFGS approximation or storing a modest number of (step, change in gradient) vector pairs for use in an L-BFGS implementation is out of the question. In our proposed methods, we approximate the Hessian by a block-diagonal matrix and use the structure of the gradient and Hessian to further approximate these blocks, each of which corresponds to a layer, as the Kronecker product of two much smaller matrices. This is analogous to the approach in KFAC, which computes a Kronecker-factored block-diagonal approximation to the Fisher matrix in a stochastic natural gradient method. Because of the indefinite and highly variable nature of the Hessian in a DNN, we also propose a new damping approach to keep the upper as well as the lower bounds of the BFGS and L-BFGS approximations bounded. In tests on autoencoder feed-forward neural network models with either nine or thirteen layers applied to three datasets, our methods outperformed or performed comparably to KFAC and state-of-the-art first-order stochastic methods.
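To see why a Kronecker-factored block is so much cheaper than a dense one, the sketch below checks the standard identity $(A \otimes B)\,\mathrm{vec}(X) = \mathrm{vec}(B X A^{\top})$ on a small example. It is not the authors' BFGS update; the helper names (`kron`, `vec`, and so on) and the matrices are made up for illustration, and only the Python standard library is used.

```python
def matmul(P, Q):
    """Dense matrix product of two list-of-lists matrices."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]


def transpose(P):
    return [list(row) for row in zip(*P)]


def kron(A, B):
    """Kronecker product: block (i, j) of the result is A[i][j] * B."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]


def vec(X):
    """Stack the columns of X into one flat list (column-major order)."""
    return [X[r][c] for c in range(len(X[0])) for r in range(len(X))]


def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]


# Two small factors standing in for one layer's block (illustrative values).
A = [[2, 1], [1, 3]]                     # 2 x 2
B = [[1, 0, 2], [0, 1, 1], [2, 1, 4]]    # 3 x 3
X = [[1, 4], [2, 5], [3, 6]]             # 3 x 2, e.g. a layer's gradient

# Expensive route: build the 6 x 6 Kronecker product explicitly.
dense = matvec(kron(A, B), vec(X))

# Cheap route: never form the 6 x 6 matrix, use only the small factors.
factored = vec(matmul(matmul(B, X), transpose(A)))

print(dense == factored)
```

For factors of size $m \times m$ and $p \times p$, the factored route stores $m^2 + p^2$ numbers instead of $(mp)^2$, which is the saving the block-diagonal Kronecker structure is meant to exploit.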
Probabilistic learning on manifolds constrained by nonlinear partial differential equations for small datasets
Soize, Christian, Ghanem, Roger
A novel extension of the Probabilistic Learning on Manifolds (PLoM) is presented. It makes it possible to synthesize solutions to a wide range of nonlinear stochastic boundary value problems described by partial differential equations (PDEs) for which a stochastic computational model (SCM) is available and depends on a vector-valued random control parameter. The cost of a single numerical evaluation of this SCM is assumed to be such that only a limited number of points can be computed for constructing the training dataset (small data). Each point of the training dataset is made up of realizations from a vector-valued stochastic process (the stochastic solution) and the associated random control parameter on which it depends. The presented PLoM constrained by PDE allows for generating a large number of learned realizations of the stochastic process and its corresponding random control parameter. These learned realizations are generated so as to minimize the vector-valued random residual of the PDE in the mean-square sense. Appropriate novel methods are developed to solve this challenging problem. Three applications are presented. The first one is a simple uncertain nonlinear dynamical system with a nonstationary stochastic excitation. The second one concerns the 2D nonlinear unsteady Navier-Stokes equations for incompressible flows in which the Reynolds number is the random control parameter. The last one deals with the nonlinear dynamics of a 3D elastic structure with uncertainties. The results obtained make it possible to validate the PLoM constrained by stochastic PDE but also provide further validation of the PLoM without constraint.
Data-driven prediction of multistable systems from sparse measurements
Chu, Bryan, Farazmand, Mohammad
We develop a data-driven method, based on semi-supervised classification, to predict the asymptotic state of multistable systems when only sparse spatial measurements of the system are feasible. Our method predicts the asymptotic behavior of an observed state by quantifying its proximity to the states in a precomputed library of data. To quantify this proximity, we introduce a sparsity-promoting metric-learning (SPML) optimization, which learns a metric directly from the precomputed data. The resulting metric has two important properties: (i) It is compatible with the precomputed library, and (ii) It is computable from sparse measurements. We demonstrate the application of this method on a multistable reaction-diffusion equation which has four asymptotically stable steady states. Classifications based on SPML predict the asymptotic behavior of initial conditions, based on two-point measurements, with over $89\%$ accuracy. The learned optimal metric also determines where these measurements need to be made to ensure accurate predictions.
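The idea of classifying an observed state by its proximity to a precomputed library, using a metric that only needs a few measurement points, can be sketched as follows. This is a toy illustration and not the SPML optimization itself: the weights are supplied by hand rather than learned, and the names `sparse_distance`, `predict_asymptotic_state`, the library entries, and the labels are all hypothetical.

```python
def sparse_distance(x, y, weights):
    """Weighted Euclidean distance that reads only the measured entries.

    weights maps a measurement index to a nonnegative weight; every index
    that is absent has weight zero and is never accessed.
    """
    return sum(w * (x[i] - y[i]) ** 2 for i, w in weights.items()) ** 0.5


def predict_asymptotic_state(observed, library, weights):
    """Return the label of the library state closest to the observation.

    library is a list of (state, label) pairs, where label names the
    asymptotic state that the library state is known to converge to.
    """
    _, label = min(library,
                   key=lambda entry: sparse_distance(observed, entry[0], weights))
    return label


# Hypothetical library of two full states with known long-time behavior.
library = [
    ([0.0, 0.1, 0.9, 1.0], "steady state A"),
    ([1.0, 0.8, 0.2, 0.0], "steady state B"),
]

# Assumed sparse metric: only grid points 0 and 3 carry weight, so a
# two-point measurement is enough to evaluate the distance.
weights = {0: 1.0, 3: 0.5}

# Unmeasured entries can stay unknown because they are never read.
observed = [0.9, None, None, 0.2]
print(predict_asymptotic_state(observed, library, weights))
```

The nonzero entries of `weights` play the role of the measurement locations: a sparsity-promoting procedure would choose them, whereas here they are fixed by assumption.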
Improving seasonal forecast using probabilistic deep learning
Pan, Baoxiang, Anderson, Gemma J., Goncalves, André, Lucas, Donald D., Bonfils, Céline J. W., Lee, Jiwoo
The path toward realizing the potential of seasonal forecasting and its socioeconomic benefits depends heavily on improving general circulation model based dynamical forecasting systems. To improve dynamical seasonal forecast, it is crucial to set up forecast benchmarks, and clarify forecast limitations posed by model initialization errors, formulation deficiencies, and internal climate variability. With huge cost in generating large forecast ensembles, and limited observations for forecast verification, the seasonal forecast benchmarking and diagnosing task proves challenging. In this study, we develop a probabilistic deep neural network model, drawing on a wealth of existing climate simulations to enhance seasonal forecast capability and forecast diagnosis. By leveraging complex physical relationships encoded in climate simulations, our probabilistic forecast model demonstrates favorable deterministic and probabilistic skill compared to state-of-the-art dynamical forecast systems in quasi-global seasonal forecast of precipitation and near-surface temperature. We apply this probabilistic forecast methodology to quantify the impacts of initialization errors and model formulation deficiencies in a dynamical seasonal forecasting system. We introduce the saliency analysis approach to efficiently identify the key predictors that influence seasonal variability. Furthermore, by explicitly modeling uncertainty using variational Bayes, we give a more definitive answer to how the El Niño/Southern Oscillation, the dominant mode of seasonal variability, modulates global seasonal predictability.
Scientific intuition inspired by machine learning generated hypotheses
Friederich, Pascal, Krenn, Mario, Tamblyn, Isaac, Aspuru-Guzik, Alan
Machine learning with application to questions in the physical sciences has become a widely used tool, successfully applied to classification, regression and optimization tasks in many areas. Research focus mostly lies in improving the accuracy of the machine learning models in numerical predictions, while scientific understanding is still almost exclusively generated by human researchers analysing numerical results and drawing conclusions. In this work, we shift the focus to the insights and the knowledge obtained by the machine learning models themselves. In particular, we study how they can be extracted and used to inspire human scientists to increase their intuitions and understanding of natural systems. We apply gradient boosting in decision trees to extract human-interpretable insights from big data sets from chemistry and physics. In chemistry, we not only rediscover widely known rules of thumb but also find new interesting motifs that tell us how to control solubility and energy levels of organic molecules. At the same time, in quantum physics, we gain new understanding of experiments for quantum entanglement. The ability to go beyond numerics and to enter the realm of scientific insight and hypothesis generation opens the door to using machine learning to accelerate the discovery of conceptual understanding in some of the most challenging domains of science.
ExPAN(N)D: Exploring Posits for Efficient Artificial Neural Network Design in FPGA-based Systems
Nambi, Suresh, Ullah, Salim, Lohana, Aditya, Sahoo, Siva Satyendra, Merchant, Farhad, Kumar, Akash
The recent advances in machine learning, in general, and Artificial Neural Networks (ANN), in particular, have made smart embedded systems an attractive option for a larger number of application areas. However, the high computational complexity, memory footprints, and energy requirements of machine learning models hinder their deployment on resource-constrained embedded systems. Most state-of-the-art works have considered this problem by proposing various low bit-width data representation schemes, optimized arithmetic operators' implementations, and different complexity reduction techniques such as network pruning. To further elevate the implementation gains offered by these individual techniques, there is a need to cross-examine and combine these techniques' unique features. This paper presents ExPAN(N)D, a framework to analyze and combine the efficacy of the Posit number representation scheme and the efficiency of fixed-point arithmetic implementations for ANNs. The Posit scheme offers a better dynamic range and higher precision for various applications than the IEEE $754$ single-precision floating-point format. However, due to the dynamic nature of the various fields of the Posit scheme, the corresponding arithmetic circuits have higher critical path delay and resource requirements than the single-precision-based arithmetic units. Towards this end, we propose a novel Posit to fixed-point converter for enabling high-performance and energy-efficient hardware implementations for ANNs with minimal drop in the output accuracy. We also propose a modified Posit-based representation to store the trained parameters of a network. Compared to an $8$-bit fixed-point-based inference accelerator, our proposed implementation offers $\approx46\%$ and $\approx18\%$ reductions in the storage requirements of the parameters and energy consumption of the MAC units, respectively.