Modeling Continuous Stochastic Processes with Dynamic Normalizing Flows

arXiv.org Machine Learning

Normalizing flows transform a simple base distribution into a complex target distribution and have proved to be powerful models for data generation and density estimation. In this work, we propose a novel type of normalizing flow driven by a differential deformation of the continuous-time Wiener process. As a result, we obtain a rich time series model whose observable process inherits many of the appealing properties of its base process, such as efficient computation of likelihoods and marginals. Furthermore, our continuous treatment provides a natural framework for irregular time series with an independent arrival process, including straightforward interpolation. We illustrate the desirable properties of the proposed model on popular stochastic processes and demonstrate its superior flexibility to variational RNN and latent ODE baselines in a series of experiments on synthetic and real-world data.
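The density computation that makes normalizing flows attractive rests on the change-of-variables formula: for an invertible map x = f(z), log p_x(x) = log p_z(f^{-1}(x)) - log |det df/dz|. The sketch below illustrates this for the simplest possible case, a one-dimensional affine map applied to a standard normal base. It is not the dynamic flow proposed in the paper, and the function names and parameters are illustrative assumptions.

```python
import math


def standard_normal_logpdf(z):
    """Log-density of the base distribution N(0, 1)."""
    return -0.5 * (z * z + math.log(2.0 * math.pi))


def affine_flow_logpdf(x, scale, shift):
    """Log-density of x = scale * z + shift with z ~ N(0, 1).

    Hypothetical helper: a one-layer affine flow, chosen only because
    its inverse and Jacobian are available in closed form.
    """
    if scale == 0:
        raise ValueError("scale must be non-zero for the map to be invertible")
    z = (x - shift) / scale            # invert the flow: z = f^{-1}(x)
    log_abs_det = math.log(abs(scale)) # log |dx/dz| for the affine map
    return standard_normal_logpdf(z) - log_abs_det
```

With scale 2 and shift 1 the result should agree with the log-density of a normal distribution with mean 1 and standard deviation 2, which gives a simple sanity check on the formula.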


A Comparative Study of Machine Learning Models for Predicting the State of Reactive Mixing

arXiv.org Machine Learning

Accurate predictions of reactive mixing are critical for many Earth and environmental science problems. To investigate mixing dynamics over time under different scenarios, a high-fidelity, finite-element-based numerical model is built to solve the fast, irreversible bimolecular reaction-diffusion equations and simulate a range of reactive-mixing scenarios. A total of 2,315 simulations are performed using different sets of model input parameters comprising various spatial scales of vortex structures in the velocity field, time-scales associated with velocity oscillations, the perturbation parameter for the vortex-based velocity, anisotropic dispersion contrast, and molecular diffusion. Outputs comprise concentration profiles of the reactants and products. The inputs and outputs of these simulations are concatenated into feature and label matrices, respectively, to train 20 different machine learning (ML) emulators to approximate system behavior. The 20 ML emulators, based on linear methods, Bayesian methods, ensemble learning methods, and a multilayer perceptron (MLP), are compared to assess their performance. The ML emulators are specifically trained to classify the state of mixing and predict three quantities of interest (QoIs) characterizing species production, decay, and degree of mixing. Linear classifiers and regressors fail to reproduce the QoIs; however, ensemble methods (classifiers and regressors) and the MLP accurately classify the state of reactive mixing and predict the QoIs. Among ensemble methods, random forest and decision-tree-based AdaBoost faithfully predict the QoIs. At run time, trained ML emulators are $\approx10^5$ times faster than the high-fidelity numerical simulations. The speed and accuracy of the ensemble and MLP models facilitate uncertainty quantification, which usually requires thousands of model runs, to estimate the uncertainty bounds on the QoIs.


New Project at Jefferson Lab Aims to Use Machine Learning to Improve Up-Time of Particle Accelerators

#artificialintelligence

NEWPORT NEWS, Va., Jan. 30, 2020 – More than 1,600 nuclear physicists worldwide depend on the Continuous Electron Beam Accelerator Facility for their research. Located at the Department of Energy's Thomas Jefferson National Accelerator Facility in Newport News, Va., CEBAF is a DOE User Facility that is scheduled to conduct research for limited periods each year, so it must perform at its best during each scheduled run. But glitches in any one of CEBAF's tens of thousands of components can cause the particle accelerator to temporarily fault and interrupt beam delivery, sometimes by mere seconds but other times by many hours. Now, accelerator scientists are turning to machine learning in hopes that they can more quickly recover CEBAF from faults and one day even prevent them. Anna Shabalina is a Jefferson Lab staff member and principal investigator on the project, which has been funded by the Laboratory Directed Research & Development program for the fiscal year 2020.


Unsupervised Denoising for Satellite Imagery using Wavelet Subband CycleGAN

arXiv.org Machine Learning

Multi-spectral satellite imaging sensors acquire various spectral band images such as red (R), green (G), blue (B), near-infrared (N), etc. Thanks to the unique spectroscopic property of each spectral band with respect to the objects on the ground, multi-spectral satellite imagery can be used for various geological survey applications. Unfortunately, image artifacts from imaging sensor noise often affect the quality of scenes and have negative impacts on the applications of satellite imagery. Recently, deep learning approaches have been extensively explored for the removal of noise in satellite imagery. Most deep learning denoising methods, however, follow a supervised learning scheme, which requires matched pairs of noisy and clean images that are difficult to collect in real situations. In this paper, we propose a novel unsupervised multispectral denoising method for satellite imagery using a wavelet subband cycle-consistent adversarial network (WavCycleGAN). The proposed method is based on an unsupervised learning scheme using adversarial loss and cycle-consistency loss to overcome the lack of paired data. Moreover, in contrast to the standard image domain cycleGAN, we introduce a wavelet subband domain learning scheme for effective denoising without sacrificing high frequency components such as edges and detail information. Experimental results for the removal of vertical stripe and wave noise in satellite imaging sensors demonstrate that the proposed method effectively removes noise and preserves important high frequency features of satellite images.


Stargazing with computers: What machine learning can teach us about the cosmos

#artificialintelligence

Gazing up at the night sky in a rural area, you'll probably see the shining moon surrounded by stars. If you're lucky, you might spot the furthest thing visible with the naked eye--the Andromeda galaxy. When the Department of Energy's (DOE) Legacy Survey of Space and Time (LSST) Camera at the National Science Foundation's Vera Rubin Observatory turns on in 2022, it will take photos of 37 billion galaxies and stars over the course of a decade. The output from this huge telescope will swamp researchers with data. In those 10 years, the LSST Camera will take 2,000 photos for each patch of the Southern Sky it covers.


33 unusual problems that can be solved with data science

#artificialintelligence

Automated translation, including translating one programming language into another one (for instance, SQL to Python - the converse is not possible)
Spell checks, especially for people writing in multiple languages - lots of progress to be made here, including automatically recognizing the language when you type, and to stop trying to correct the same word every single time (some browsers have tried to change Ning to Nong hundreds of times, and I have no idea why after 50 failures they continue to try - I call this machine unlearning)
Detection of earth-like planets - focus on planetary systems with many planets to increase odds of finding inhabitable planets, rather than stars and planets matching our Sun and Earth
Distinguishing between noise and signal on millions of NASA pictures or videos, to identify patterns
Automated piloting (drones, cars without pilots)
Customized, patient-specific medications and diets
Predicting and legally manipulating elections
Predicting oil demand, oil ...


How to Develop an Imbalanced Classification Model to Detect Oil Spills

#artificialintelligence

Many imbalanced classification tasks require a skillful model that predicts a crisp class label, where both classes are equally important. An example of an imbalanced classification problem where a class label is required and both classes are equally important is the detection of oil spills or slicks in satellite images. The detection of a spill requires mobilizing an expensive response, and missing an event is equally expensive, causing damage to the environment. One way to evaluate imbalanced classification models that predict crisp labels is to calculate the separate accuracy on the positive class and the negative class, referred to as sensitivity and specificity. These two measures can then be averaged using the geometric mean, referred to as the G-mean, that is insensitive to the skewed class distribution and correctly reports on the skill of the model on both classes. In this tutorial, you will discover how to develop a model to predict the presence of an oil spill in satellite images and evaluate it using the G-mean metric. In this project, we will use a standard imbalanced machine learning dataset referred to as the "oil spill" dataset, "oil slicks" dataset or simply "oil."
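The G-mean described above can be computed directly from the confusion counts. The following is a minimal sketch in plain Python, not the tutorial's own code; the function name `g_mean` and the toy labels are assumptions made for illustration.

```python
import math


def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity and specificity for binary labels.

    Hypothetical helper: `positive` marks the minority (e.g. spill) class.
    """
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must appear in y_true")
    sensitivity = tp / (tp + fn)  # accuracy on the positive class
    specificity = tn / (tn + fp)  # accuracy on the negative class
    return math.sqrt(sensitivity * specificity)


# Toy data: 2 positives among 10 samples.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
always_negative = [0] * 10
```

On the toy data, a model that always predicts the negative class reaches 80% plain accuracy yet scores a G-mean of zero, because its sensitivity is zero. This is the behavior that makes the metric useful under class imbalance.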


Video shows ultra-fast robot wings that are powered by sunlight

#artificialintelligence

You've heard of robotic bees, but have you heard of robotic butterflies? Chinese researchers have published a study that focuses on their efforts to develop solar-powered wings that imitate the flapping motion of a butterfly. They were able to develop wings that can do this at a rapid rate using light-driven actuators, and a new video shows all of the different ways they can utilize what they've created. The study was published in the journal ACS Applied Materials & Interfaces on January 16th, and a video put out on Wednesday explains how the project came together. When the wing was exposed to the heat of a strong light source, much like the Sun, the polymer layer on the bottom expanded significantly more than the metallic layer on the top, which caused the wing to curl.


Learning Cost Functions for Optimal Transport

arXiv.org Machine Learning

Learning the cost function for optimal transport from an observed transport plan or its samples has been cast as a bi-level optimization problem. In this paper, we derive an unconstrained convex optimization formulation for the problem which can be further augmented by any customizable regularization. This novel framework avoids repeatedly solving a forward optimal transport problem in each iteration, which has been a thorny computational bottleneck for the bi-level optimization approach. To validate the effectiveness of this framework, we develop two numerical algorithms: one is a fast matrix scaling method based on the Sinkhorn-Knopp algorithm for the discrete case, and the other is a supervised learning algorithm that realizes the cost function as a deep neural network in the continuous case. Numerical results demonstrate promising efficiency and accuracy advantages of the proposed algorithms over existing state-of-the-art methods.
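For context, the Sinkhorn-Knopp matrix scaling mentioned above is usually seen in the forward problem, where the cost is known and an entropy-regularized transport plan is sought. The sketch below shows that standard forward iteration only; it is not the cost-learning algorithm of the paper, and the function name, the `epsilon` regularization parameter, and the iteration count are illustrative assumptions.

```python
import math


def sinkhorn_plan(cost, mu, nu, epsilon=0.5, n_iters=500):
    """Entropy-regularized transport plan via Sinkhorn-Knopp scaling.

    cost: n x m list of lists; mu, nu: marginals that each sum to 1.
    Hypothetical helper for illustration, written without NumPy.
    """
    n, m = len(mu), len(nu)
    # Gibbs kernel K_ij = exp(-C_ij / epsilon)
    kernel = [[math.exp(-cost[i][j] / epsilon) for j in range(m)]
              for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale rows and columns to match the marginals.
        u = [mu[i] / sum(kernel[i][j] * v[j] for j in range(m))
             for i in range(n)]
        v = [nu[j] / sum(kernel[i][j] * u[i] for i in range(n))
             for j in range(m)]
    # Plan P = diag(u) K diag(v)
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)]
            for i in range(n)]


plan = sinkhorn_plan([[0.0, 1.0], [1.0, 0.0]], [0.7, 0.3], [0.4, 0.6])
```

The bi-level approach criticized in the abstract would need a solve like this inside every outer iteration, which is the bottleneck the proposed formulation is said to avoid.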


A new hybrid approach for crude oil price forecasting: Evidence from multi-scale data

arXiv.org Machine Learning

With growing research into the factors that influence crude oil price fluctuations, and following the accelerated development of Internet technology, accessible data such as the Google search volume index (GSVI) are increasingly quantified and incorporated into forecasting approaches. In this paper, we apply multi-scale data, including both GSVI data and traditional economic data related to crude oil price, as independent variables and propose a new hybrid approach for monthly crude oil price forecasting. This hybrid approach, based on a divide-and-conquer strategy, consists of the K-means method, kernel principal component analysis (KPCA), and the kernel extreme learning machine (KELM), where K-means is adopted to divide the input data into clusters, KPCA is applied to reduce dimension, and KELM is employed for the final crude oil price forecasting. The empirical results can be analyzed at the data and method levels. At the data level, GSVI data outperform economic data in level forecasting accuracy but underperform them in directional forecasting accuracy because of herd behavior, while the hybrid data combine the advantages of both and obtain the best forecasting performance in both level and directional accuracy. At the method level, the approaches with K-means perform better than those without K-means, which demonstrates that the divide-and-conquer strategy can effectively improve forecasting performance.