 Energy


Recurrent Neural Network-based Model for Accelerated Trajectory Analysis in AIMD Simulations

arXiv.org Artificial Intelligence

The presented work demonstrates the training of recurrent neural networks (RNNs) from distributions of atom coordinates in solid state structures that were obtained using ab initio molecular dynamics (AIMD) simulations. AIMD simulations on solid state structures are treated as a multivariate time-series problem. By relating interactions between atoms over the simulation time to temporal correlations among them, RNNs find patterns in the multivariate time-dependent data, which enable forecasting trajectory paths and potential energy profiles. Two types of RNNs, namely gated recurrent unit and long short-term memory networks, are considered. The model is described and compared against a baseline AIMD simulation on an iridium oxide slab. Findings demonstrate that both networks can potentially be harnessed for accelerated statistical sampling in computational materials research.
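As a rough illustration of what treating a trajectory as a multivariate time series can mean in practice, the sketch below cuts a sequence of coordinate frames into (history window, future frame) pairs of the kind a GRU or LSTM forecaster is typically trained on. The function name `make_windows`, the `history` and `horizon` parameters, and the toy trajectory are assumptions made for this example; they are not taken from the paper.

```python
def make_windows(trajectory, history, horizon=1):
    """Split a trajectory into (input window, target frame) pairs.

    trajectory: list of frames, one per simulation time step; each frame is
    a flat list of 3 * n_atoms coordinates (hypothetical layout).
    history: number of past frames fed to the network.
    horizon: how many steps ahead the target frame lies.
    """
    if history < 1 or horizon < 1:
        raise ValueError("history and horizon must be positive")
    pairs = []
    last_start = len(trajectory) - history - horizon
    for start in range(last_start + 1):
        window = trajectory[start:start + history]
        target = trajectory[start + history + horizon - 1]
        pairs.append((window, target))
    return pairs


# Toy trajectory: 6 time steps, 2 atoms, so 6 coordinates per frame.
toy = [[float(t)] * 6 for t in range(6)]
pairs = make_windows(toy, history=3)
print(len(pairs))  # number of training pairs
```

Each pair can then be fed to any sequence model; the windowing itself is independent of whether a GRU or an LSTM is used.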


Message Scheduling for Performant, Many-Core Belief Propagation

arXiv.org Artificial Intelligence

Belief Propagation (BP) is a message-passing algorithm for approximate inference over Probabilistic Graphical Models (PGMs), with applications in computer vision, error-correcting codes, and protein folding. While general, the convergence and speed of the algorithm have limited its practical use on difficult inference problems. As an algorithm that is highly amenable to parallelization, many-core Graphics Processing Units (GPUs) could significantly improve BP performance. Improving BP through many-core systems is nontrivial: the scheduling of messages in the algorithm strongly affects performance. We present a study of message scheduling for BP on GPUs. We demonstrate that BP exhibits a tradeoff between speed and convergence based on parallelism and show that existing message schedulings are not able to utilize this tradeoff. To this end, we present a novel randomized message scheduling approach, Randomized BP (RnBP), which outperforms existing methods on the GPU. INTRODUCTION: Probabilistic Graphical Models (PGMs) are powerful, general machine learning models that encode distributions over random variables. PGM inference, in which we seek to compute some probabilistic beliefs within the system modeled by the PGM, is in general an intractable problem, leading to dependence on approximate algorithms. Belief Propagation (BP) is a widely employed approximate inference algorithm for PGMs [1].
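To make the idea of message scheduling concrete, here is a minimal sketch of sum-product BP in which each round updates only a randomly chosen subset of messages. It is a generic illustration of randomized scheduling, not the RnBP algorithm from the paper, whose details the abstract does not give. The function names, the update probability `p`, and the three-node chain are all assumptions made for the example.

```python
import random
from itertools import product


def randomized_bp(unary, pairwise, edges, p=0.5, rounds=100, seed=0):
    """Sum-product BP where each message is updated with probability p per round."""
    rng = random.Random(seed)
    n, k = len(unary), len(unary[0])
    neighbors = {i: [] for i in range(n)}
    for a, b in edges:
        neighbors[a].append(b)
        neighbors[b].append(a)
    # One message per directed edge, initialised to uniform.
    msgs = {(i, j): [1.0 / k] * k for a, b in edges for i, j in ((a, b), (b, a))}

    def potential(i, j, xi, xj):
        # Pairwise tables are stored once per undirected edge.
        if (i, j) in pairwise:
            return pairwise[(i, j)][xi][xj]
        return pairwise[(j, i)][xj][xi]

    for _ in range(rounds):
        chosen = [e for e in msgs if rng.random() < p]  # the random schedule
        new = {}
        for i, j in chosen:
            out = []
            for xj in range(k):
                total = 0.0
                for xi in range(k):
                    term = unary[i][xi] * potential(i, j, xi, xj)
                    for h in neighbors[i]:
                        if h != j:
                            term *= msgs[(h, i)][xi]
                    total += term
                out.append(total)
            z = sum(out)
            new[(i, j)] = [v / z for v in out]
        msgs.update(new)  # chosen messages are updated in parallel

    beliefs = []
    for i in range(n):
        b = list(unary[i])
        for h in neighbors[i]:
            b = [b[x] * msgs[(h, i)][x] for x in range(k)]
        z = sum(b)
        beliefs.append([v / z for v in b])
    return beliefs


def exact_marginals(unary, pairwise, edges):
    """Brute-force marginals, feasible only for tiny models."""
    n, k = len(unary), len(unary[0])
    marg = [[0.0] * k for _ in range(n)]
    for x in product(range(k), repeat=n):
        w = 1.0
        for i in range(n):
            w *= unary[i][x[i]]
        for a, b in edges:
            w *= pairwise[(a, b)][x[a]][x[b]]
        for i in range(n):
            marg[i][x[i]] += w
    z = sum(marg[0])
    return [[v / z for v in m] for m in marg]


# Hypothetical 3-node binary chain 0 - 1 - 2.
edges = [(0, 1), (1, 2)]
unary = [[0.7, 0.3], [0.5, 0.5], [0.2, 0.8]]
pairwise = {e: [[2.0, 1.0], [1.0, 2.0]] for e in edges}
beliefs = randomized_bp(unary, pairwise, edges)
exact = exact_marginals(unary, pairwise, edges)
```

On a tree-structured model such as this chain, BP reaches the exact marginals once every message has been refreshed after its inputs, so the random schedule only changes how many rounds that takes. On graphs with cycles, the choice of schedule can affect whether and how fast BP converges, which is the tradeoff the abstract refers to.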


Graph-Partitioning-Based Diffusion Convolution Recurrent Neural Network for Large-Scale Traffic Forecasting

arXiv.org Machine Learning

Traffic forecasting approaches are critical to developing adaptive strategies for mobility. Traffic patterns have complex spatial and temporal dependencies that make accurate forecasting on large highway networks a challenging task. Recently, diffusion convolutional recurrent neural networks (DCRNNs) have achieved state-of-the-art results in traffic forecasting by capturing the spatiotemporal dynamics of the traffic. Despite the promising results, adopting DCRNN for large highway networks remains elusive because of computational and memory bottlenecks. We present an approach to apply DCRNN to a large highway network. We use a graph-partitioning approach to decompose a large highway network into smaller networks and train them simultaneously on a cluster with graphics processing units (GPUs). For the first time, we forecast the traffic of the entire California highway network with 11,160 traffic sensor locations simultaneously. We show that our approach can be trained within 3 hours of wall-clock time using 64 GPUs to forecast speed with high accuracy. Further improvements in the accuracy are attained by including overlapping sensor locations from nearby partitions and finding high-performing hyperparameter configurations for the DCRNN using DeepHyper, a hyperparameter tuning package. We demonstrate that a single DCRNN model can be used to train and forecast the speed and flow simultaneously and that the results preserve fundamental traffic flow dynamics. We expect our approach for modeling a large highway network in short wall-clock time to become a core capability in advanced highway traffic monitoring systems, where forecasts can be used to adjust traffic management strategies proactively given anticipated future conditions.


Exascale Deep Learning for Scientific Inverse Problems

arXiv.org Machine Learning

We introduce novel communication strategies in synchronous distributed Deep Learning consisting of decentralized gradient reduction orchestration and computational graph-aware grouping of gradient tensors. [...] Networks (DNN) models and data sets (Dai et al., 2019), the need for efficient distributed machine learning strategies on massively parallel systems is more significant than [...] On small to moderate-scale systems, with tens to hundreds of GPU/TPU accelerators, these scaling inefficiencies can be difficult to detect and systematically optimize due to system noise and load variability. The scaling inefficiencies of data-parallel implementations are most readily apparent on large-scale systems such as supercomputers with thousands to tens of thousands of accelerators. Extending data-parallelism to the massive scale of supercomputing systems is also motivated by the latter's traditional workload consisting of scientific numerical simulations (Kent & Kotliar, 2018). [...] NVLink interconnect, supporting a (peak) bidirectional bandwidth of 100 GB/s, where each 3 V100 GPUs are grouped in a ring topology with all-to-all connections to a POWER9 CPU.


What Trends Are Shaping AI In Energy This Year? 9 Experts Share Their Insights - Disruptor Daily

#artificialintelligence

What other trends are shaping the future of energy extraction, refinement, and consumption? These industry insiders provided their takes on the #1 trend shaping energy this year, and into the future. "Some areas where we see nascent AI is in predictive maintenance and asset monitoring. There are a few who are beginning to look at utilizing AI to analyze images from drones for surveillance and also for acoustic listening." "Advances in the 'time-series' AI world (as opposed to AI for images or audio) are shaping the energy industry today. These include techniques for time series forecasting, anomaly detection, optimization, etc. Specifically, probabilistic techniques and algorithms are showing significant improvements and becoming the driver of the next wave of optimization and value creation. These techniques augment today's unilateral AI predictions with additional information about the confidence in these predictions. This is not unlike the trend shaping the peer to peer transportation industry."


Artificial Intelligence/Machine Learning are rapidly changing. The materials research community is just beginning to utilize AI and ML in the research process, and it is already clear that this represents a potentially game changing development.

#artificialintelligence

Dr. Benji Maruyama is a Principal Materials Research Engineer in the Air Force Research Laboratory, Materials & Manufacturing Directorate. He is the Leader of the Flexible Materials and Processes Research Team, and leads research on the synthesis and processing science of carbon nanotubes. Dr. Maruyama created and is developing a new research method: Autonomous Research Systems for Materials Development. He is also the point of contact for carbon materials for the Materials and Manufacturing Directorate. His background and interests include carbon nanomaterials, energy storage, field emission, carbon, polymer and metal matrix composites, imaging of complex 3D microstructures and combinatorial experimentation.


20 quantum computing companies making mind-blowing breakthroughs

#artificialintelligence

Consider, for example, that the temperature of most quantum processing chips must be kept as close to absolute zero (roughly -460 degrees Fahrenheit) as possible. Or that some physicists think quantum computing is "the first technology that allows useful tasks to be performed in collaboration between parallel universes." Or that a quantum computer recently "made history go backward." True, it was only a simulation, but still -- brain blowing stuff. Before we get carried away, though, let's consider the foundational basics. Classical computers operate using binary bits, storing data and running processes using ones and zeroes.


Better Language Models and Their Implications

#artificialintelligence

We've trained a large-scale unsupervised language model which generates coherent paragraphs of text, achieves state-of-the-art performance on many language modeling benchmarks, and performs rudimentary reading comprehension, machine translation, question answering, and summarization--all without task-specific training. Our model, called GPT-2 (a successor to GPT), was trained simply to predict the next word in 40GB of Internet text. Due to our concerns about malicious applications of the technology, we are not releasing the trained model. As an experiment in responsible disclosure, we are instead releasing a much smaller model for researchers to experiment with, as well as a technical paper. GPT-2 is a large transformer-based language model with 1.5 billion parameters, trained on a dataset[1] of 8 million web pages. GPT-2 is trained with a simple objective: predict the next word, given all of the previous words within some text. The diversity of the dataset causes this simple goal to contain naturally occurring demonstrations of many tasks across diverse domains. GPT-2 is a direct scale-up of GPT, with more than 10X the parameters and trained on more than 10X the amount of data. GPT-2 displays a broad set of capabilities, including the ability to generate conditional synthetic text samples of unprecedented quality, where we prime the model with an input and have it generate a lengthy continuation. In addition, GPT-2 outperforms other language models trained on specific domains (like Wikipedia, news, or books) without needing to use these domain-specific training datasets. On language tasks like question answering, reading comprehension, summarization, and translation, GPT-2 begins to learn these tasks from the raw text, using no task-specific training data.
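The "predict the next word" objective described above is usually written as a maximum-likelihood problem. As a sketch in standard notation (the symbols $x_t$, $T$, and $\theta$ are chosen here for illustration and do not appear in the post), for a text of tokens $x_1, \dots, x_T$ and model parameters $\theta$, training minimizes the negative log-likelihood

$$
\mathcal{L}(\theta) = -\sum_{t=1}^{T} \log p_\theta\left(x_t \mid x_1, \dots, x_{t-1}\right),
$$

so every position in the text contributes one prediction of the next token given all previous ones.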


AI and Machine Learning for Non Technical People

#artificialintelligence



Verified Uncertainty Calibration

arXiv.org Machine Learning

Applications such as weather forecasting and personalized medicine demand models that output calibrated probability estimates - those representative of the true likelihood of a prediction. Most models are not calibrated out of the box but are recalibrated by post-processing model outputs. We find in this work that popular recalibration methods like Platt scaling and temperature scaling (i) are less calibrated than reported and (ii) have a calibration error that current techniques cannot estimate. An alternative method, histogram binning, has measurable calibration error but is sample inefficient - it requires $O(B/\epsilon^2)$ samples, compared to $O(1/\epsilon^2)$ for scaling methods, where $B$ is the number of distinct probabilities the model can output. To get the best of both worlds, we introduce the scaling-binning calibrator, which first fits a parametric function that acts like a baseline for variance reduction and then bins the function values to actually ensure calibration. This requires only $O(1/\epsilon^2 + B)$ samples. We then show that methods used to estimate calibration error are suboptimal - we prove that an alternative estimator introduced in the meteorological community requires fewer samples - samples proportional to $\sqrt{B}$ instead of $B$. We validate our approach with multiclass calibration experiments on CIFAR-10 and ImageNet, where we obtain a 35% lower calibration error than histogram binning and, unlike scaling methods, guarantees on true calibration.
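The two-stage idea in the abstract (fit a parametric function, then bin its values) can be sketched for the binary case as follows. This is a simplified reading under several assumptions, not the authors' implementation: the parametric function is a Platt-style sigmoid fit by plain gradient descent, the data is split by even and odd index, bins hold equal numbers of points, and each bin outputs the mean of the fitted function values that fall in it. All function names and the toy data are hypothetical.

```python
import math


def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def fit_platt(scores, labels, steps=2000, lr=0.1):
    """Fit g(s) = sigmoid(a * s + b) by gradient descent on the log loss."""
    a, b = 1.0, 0.0
    n = len(scores)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for s, y in zip(scores, labels):
            err = sigmoid(a * s + b) - y
            grad_a += err * s
            grad_b += err
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


def fit_scaling_binning(scores, labels, n_bins):
    """Return a calibrator: scaling on one split, equal-mass binning on the other."""
    a, b = fit_platt(scores[0::2], labels[0::2])           # step 1: scaling
    g = sorted(sigmoid(a * s + b) for s in scores[1::2])   # held-out g values
    if len(g) < n_bins:
        raise ValueError("need at least one point per bin")
    edges, means = [], []
    for i in range(n_bins):                                # step 2: binning
        chunk = g[i * len(g) // n_bins:(i + 1) * len(g) // n_bins]
        edges.append(chunk[-1])                 # upper edge of the bin
        means.append(sum(chunk) / len(chunk))   # step 3: mean of g in the bin
    edges[-1] = 1.0  # the last bin absorbs everything above it

    def calibrate(score):
        value = sigmoid(a * score + b)
        for edge, mean in zip(edges, means):
            if value <= edge:
                return mean
        return means[-1]

    return calibrate


# Toy data: evenly spaced scores, labels follow the sign with a few flips.
scores = [-3 + 6 * i / 39 for i in range(40)]
labels = [1 if s > 0 else 0 for s in scores]
for i in range(0, 40, 7):
    labels[i] = 1 - labels[i]

calibrate = fit_scaling_binning(scores, labels, n_bins=4)
preds = [calibrate(s) for s in scores]
```

The calibrator can output at most `n_bins` distinct probabilities, which is what makes its calibration error measurable, while the smooth fitted function is what reduces the variance compared with averaging labels per bin as in histogram binning.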