Energy
Machine learning is about to transform these industries
TechRepublic's Dan Patterson asks Schneider Electric Chief Digital Officer Herve Coureil about how machine learning will transform industries. The following is an edited transcript of the interview. I mean, it's one thing for us to kind of nerd-out about your particular sector, but there are ancillary industries. What companies, what industries, do you see really taking advantage of machine learning? Herve Coureil: We are working.
CrystalGAN: Learning to Discover Crystallographic Structures with Generative Adversarial Networks
Nouira, Asma, Sokolovska, Nataliya, Crivello, Jean-Claude
Our main motivation is to propose an efficient approach to generate novel multi-element stable chemical compounds that can be used in real-world applications. This task can be formulated as a combinatorial problem, and it takes human experts many hours to construct and evaluate new data. Unsupervised learning methods such as Generative Adversarial Networks (GANs) can be used to produce new data efficiently. Cross-domain Generative Adversarial Networks were reported to achieve exciting results in image processing applications. However, in the domain of materials science, there is a need to synthesize data with higher-order complexity than the observed samples, and the state-of-the-art cross-domain GANs cannot be adapted directly. In this contribution, we propose a novel GAN called CrystalGAN which generates new chemically stable crystallographic structures with increased domain complexity. We introduce an original architecture, we provide the corresponding loss functions, and we show that CrystalGAN generates very reasonable data. We illustrate the efficiency of the proposed method on a real, original problem of novel hydride discovery that can be further used in the development of hydrogen storage materials.
Distribution System Voltage Control under Uncertainties using Tractable Chance Constraints
Li, Pan, Jin, Baihong, Wang, Dai, Zhang, Baosen
Voltage control plays an important role in the operation of electricity distribution networks, especially with high penetration of distributed energy resources. These resources introduce significant and fast-varying uncertainties. In this paper, we focus on reactive power compensation to control voltage in the presence of uncertainties. We adopt a chance constraint approach that accounts for arbitrary correlations between renewable resources at each of the buses. We show how the problem can be solved efficiently using historical samples, analogously to stochastic quasi-gradient methods. We also show that this optimization problem is convex for a wide variety of probability distributions. Compared to conventional per-bus chance constraints, our formulation is more robust to uncertainty and more computationally tractable. We illustrate the results using standard IEEE distribution test feeders. Voltage control is crucial to the stable operation of power distribution systems, where it is used to maintain acceptable voltages at all buses under different operating conditions [1]. To control voltage, reactive power is traditionally regulated through tap-changing transformers and switched capacitors [2]. With recent advances in cyber-infrastructure for communication and control, it is also possible to utilize distributed energy resources (DERs, e.g., electric vehicles [3], PV panels [4], [5]) to provide voltage regulation.
Microsoft to tackle AI skills shortage with two new training programs ZDNet
Microsoft has revealed two new training programs to tackle the shortage of AI-related skills in business and academia. The first of the two programs, Microsoft AI Academy, will run face-to-face and online training sessions for business and public-sector leaders, IT professionals, developers, and startups. "The academy will be helping to develop practical AI skills, learning, and certification for customers and partners," said Cindy Rose, Microsoft UK CEO, speaking at the Future Decoded event in London today. Rose added that Microsoft will use the academy to train up its own staff, including herself. Microsoft's ambition for the academy, she said, is "to empower you and your organization to do more with AI".
Plan Online, Learn Offline: Efficient Learning and Exploration via Model-Based Control
Lowrey, Kendall, Rajeswaran, Aravind, Kakade, Sham, Todorov, Emanuel, Mordatch, Igor
We propose a "plan online and learn offline" framework for the setting where an agent, with an internal model, needs to continually act and learn in the world. Our work builds on the synergistic relationship between local model-based control, global value function learning, and exploration. We study how local trajectory optimization can cope with approximation errors in the value function, and can stabilize and accelerate value function learning. Conversely, we also study how approximate value functions can help reduce the planning horizon and allow for better policies beyond local solutions. Finally, we also demonstrate how trajectory optimization can be used to perform temporally coordinated exploration in conjunction with estimating uncertainty in value function approximation. This exploration is critical for fast and stable learning of the value function. Combining these components enables solutions to complex control tasks, like humanoid locomotion and dexterous in-hand manipulation, in the equivalent of a few minutes of experience in the real world.
Towards a Near Universal Time Series Data Mining Tool: Introducing the Matrix Profile
By Chin-Chia Michael Yeh. Doctor of Philosophy, Graduate Program in Computer Science, University of California, Riverside, September 2018. Dr. Eamonn Keogh, Chairperson. The last decade has seen a flurry of research on all-pairs-similarity-search (or, self-join) for text, DNA, and a handful of other datatypes, and these systems have been applied to many diverse data mining problems. Surprisingly, however, little progress has been made on addressing this problem for time series subsequences. In this thesis, we introduce a near universal time series data mining tool called the matrix profile, which solves the all-pairs-similarity-search problem and caches the output in an easy-to-access fashion. The proposed algorithm is not only parameter-free, exact, and scalable, but also applicable to both single and multidimensional time series. By building time series data mining methods on top of the matrix profile, many time series data mining tasks (e.g., motif discovery, discord discovery, shapelet discovery, semantic segmentation, and clustering) can be solved efficiently. Because the same matrix profile can be shared by a diverse set of time series data mining methods, the matrix profile is a versatile, computed-once-use-many-times data structure. We demonstrate the utility of the matrix profile for many time series data mining problems, including motif discovery, discord discovery, weakly labeled time series classification, and representation learning, on domains as diverse as seismology, entomology, music processing, bioinformatics, human activity monitoring, electrical power-demand monitoring, and medicine. We hope the matrix profile is not the end but the beginning of many more time series data mining projects.
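To make the idea of an all-pairs-similarity-search over subsequences concrete, here is a minimal brute-force sketch in Python. It is not the scalable algorithm the thesis proposes; it is a quadratic-time illustration that assumes z-normalized Euclidean distance and an exclusion zone of half the window length around each subsequence to skip trivial self-matches. The function names `znorm` and `matrix_profile` are hypothetical.

```python
import numpy as np

def znorm(x):
    """Z-normalize a subsequence; constant subsequences map to zeros."""
    s = x.std()
    return (x - x.mean()) / s if s > 0 else np.zeros_like(x)

def matrix_profile(ts, m):
    """Brute-force self-join: for each length-m subsequence, return the
    distance to, and the index of, its nearest non-trivial neighbor."""
    ts = np.asarray(ts, dtype=float)
    n = len(ts) - m + 1
    subs = np.array([znorm(ts[i:i + m]) for i in range(n)])
    excl = max(1, m // 2)  # assumed exclusion-zone half-width
    profile = np.full(n, np.inf)
    index = np.full(n, -1)
    for i in range(n):
        d = np.linalg.norm(subs - subs[i], axis=1)
        # Mask the subsequence itself and its overlapping neighbors.
        d[max(0, i - excl):min(n, i + excl + 1)] = np.inf
        j = int(np.argmin(d))
        profile[i], index[i] = d[j], j
    return profile, index
```

Low values in `profile` point at repeated patterns (motifs) and high values at anomalies (discords), which is why one cached result can serve several mining tasks.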
Stochastic Modified Equations and Dynamics of Stochastic Gradient Algorithms I: Mathematical Foundations
Li, Qianxiao, Tai, Cheng, E, Weinan
We develop the mathematical foundations of the stochastic modified equations (SME) framework for analyzing the dynamics of stochastic gradient algorithms, where the latter is approximated by a class of stochastic differential equations with small noise parameters. We prove that this approximation can be understood mathematically as a weak approximation, which leads to a number of precise and useful results on the approximations of stochastic gradient descent (SGD), momentum SGD and stochastic Nesterov's accelerated gradient method in the general setting of stochastic objectives. We also demonstrate through explicit calculations that this continuous-time approach can uncover important analytical insights into the stochastic gradient algorithms under consideration that may not be easy to obtain in a purely discrete-time setting. Keywords: stochastic gradient algorithms, modified equations, stochastic differential equations, momentum, Nesterov's accelerated gradient
Mesh-TensorFlow: Deep Learning for Supercomputers
Shazeer, Noam, Cheng, Youlong, Parmar, Niki, Tran, Dustin, Vaswani, Ashish, Koanantakool, Penporn, Hawkins, Peter, Lee, HyoukJoong, Hong, Mingsheng, Young, Cliff, Sepassi, Ryan, Hechtman, Blake
Batch-splitting (data-parallelism) is the dominant distributed Deep Neural Network (DNN) training strategy, due to its universal applicability and its amenability to Single-Program-Multiple-Data (SPMD) programming. However, batch-splitting suffers from problems including the inability to train very large models (due to memory constraints), high latency, and inefficiency at small batch sizes. All of these can be solved by more general distribution strategies (model-parallelism). Unfortunately, efficient model-parallel algorithms tend to be complicated to discover, describe, and implement, particularly on large clusters. We introduce Mesh-TensorFlow, a language for specifying a general class of distributed tensor computations. Where data-parallelism can be viewed as splitting tensors and operations along the "batch" dimension, in Mesh-TensorFlow, the user can specify any tensor dimensions to be split across any dimensions of a multi-dimensional mesh of processors. A Mesh-TensorFlow graph compiles into an SPMD program consisting of parallel operations coupled with collective communication primitives such as Allreduce. We use Mesh-TensorFlow to implement an efficient data-parallel, model-parallel version of the Transformer sequence-to-sequence model. Using TPU meshes of up to 512 cores, we train Transformer models with up to 5 billion parameters, surpassing state-of-the-art results on the WMT'14 English-to-French translation task and the one-billion-word language modeling benchmark. Mesh-TensorFlow is available at https://github.com/tensorflow/mesh .
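The difference between splitting along the batch dimension and splitting along another tensor dimension can be illustrated with a single matrix multiply. The sketch below uses plain NumPy and simulates the processors with a Python loop; it does not use the Mesh-TensorFlow API, and the function names are hypothetical. The final sum in the second function stands in for an Allreduce.

```python
import numpy as np

def data_parallel_matmul(x, w, n_shards):
    """Split x along the batch dimension; every shard holds all of w."""
    shards = np.array_split(x, n_shards, axis=0)
    return np.concatenate([xs @ w for xs in shards], axis=0)

def model_parallel_matmul(x, w, n_shards):
    """Split the contracted (hidden) dimension of both x and w.
    Each shard computes a partial product; summing the partial
    products plays the role of an Allreduce across processors."""
    x_parts = np.array_split(x, n_shards, axis=1)
    w_parts = np.array_split(w, n_shards, axis=0)
    return sum(xp @ wp for xp, wp in zip(x_parts, w_parts))
```

Both layouts reproduce `x @ w`. They differ in what each simulated processor must store: the full weight matrix in the first case, only a slice of it in the second.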
Reinforcement Learning based Dynamic Model Selection for Short-Term Load Forecasting
With the growing prevalence of smart grid technology, short-term load forecasting (STLF) becomes particularly important in power system operations. There is a large collection of methods developed for STLF, but selecting a suitable method under varying conditions is still challenging. This paper develops a novel reinforcement learning based dynamic model selection (DMS) method for STLF. A forecasting model pool is first built, including ten state-of-the-art machine learning based forecasting models. Then a Q-learning agent learns the optimal policy of selecting the best forecasting model for the next time step, based on the model performance. The optimal DMS policy is applied to select the best model at each time step with a moving window. Numerical simulations on two-year load and weather data show that the Q-learning algorithm converges fast, resulting in effective and efficient DMS. The developed STLF model with Q-learning based DMS improves the forecasting accuracy by approximately 50%, compared to the state-of-the-art machine learning based STLF models.
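As a rough illustration of Q-learning used for model selection, the following Python sketch trains a tabular agent on a table of per-step forecast errors. The state and reward design here are assumptions made for the example, not the paper's: the state is the model chosen at the previous step, the action is the model to use next, and the reward is the negative absolute error of the chosen model. All names and hyperparameter values are hypothetical.

```python
import random

def train_selector(errors, episodes=200, alpha=0.1, gamma=0.5,
                   epsilon=0.2, seed=0):
    """errors[t][k] is the absolute error of model k at time step t.
    Returns a Q-table q[state][action] over model indices."""
    rng = random.Random(seed)
    n_models = len(errors[0])
    q = [[0.0] * n_models for _ in range(n_models)]
    for _ in range(episodes):
        state = rng.randrange(n_models)
        for step_errors in errors:
            # Epsilon-greedy choice of the next forecasting model.
            if rng.random() < epsilon:
                action = rng.randrange(n_models)
            else:
                action = max(range(n_models), key=lambda a: q[state][a])
            reward = -step_errors[action]  # assumed reward: lower error is better
            next_state = action
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q

def greedy_policy(q):
    """Best model to pick next, for each previously chosen model."""
    return [max(range(len(row)), key=lambda a: row[a]) for row in q]
```

Because Q-learning is off-policy, `epsilon=1.0` (purely random exploration) is also a valid training setting. On real data the error table would come from running every model in the pool over a moving window.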
Practical Batch Bayesian Optimization for Less Expensive Functions
Nguyen, Vu, Gupta, Sunil, Rana, Santu, Li, Cheng, Venkatesh, Svetha
Bayesian optimization (BO) and its batch extensions are successful for optimizing expensive black-box functions. However, these traditional BO approaches are not yet ideal for optimizing less expensive functions, where the computational cost of BO can dominate the cost of evaluating the black-box function. Examples of these less expensive functions are cheap machine learning models, inexpensive physical experiments run through simulators, and acquisition function optimization in Bayesian optimization. In this paper, we consider a batch BO setting for situations where function evaluations are less expensive. Our model is based on a new exploration strategy using geometric distance that provides an alternative way to explore, selecting a point far from the observed locations. Using that intuition, we propose to use a Sobol sequence to guide exploration, which avoids running the multiple global optimization steps used in previous works. Based on the proposed distance exploration, we present an efficient batch BO approach. We demonstrate that our approach outperforms other baselines and global optimization methods when the function evaluations are less expensive.
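The distance-based exploration idea can be sketched as follows: draw candidate points from a Sobol sequence, then pick the candidates that lie farthest from everything already observed. This is only an illustration of that intuition, not the authors' algorithm. The greedy max-min rule, the unit-cube domain, and the name `select_batch` are assumptions; `scipy.stats.qmc.Sobol` is SciPy's Sobol generator.

```python
import numpy as np
from scipy.stats import qmc

def select_batch(observed, batch_size=3, m=6):
    """Greedily pick `batch_size` Sobol candidates in the unit cube,
    each as far as possible from all observed and already-picked points."""
    observed = np.asarray(observed, dtype=float)
    dim = observed.shape[1]
    # 2**m unscrambled (deterministic) Sobol points; no optimizer is run.
    candidates = qmc.Sobol(d=dim, scramble=False).random_base2(m=m)
    known = observed.copy()
    batch = []
    for _ in range(batch_size):
        # Distance from each candidate to its nearest known point.
        dists = np.linalg.norm(
            candidates[:, None, :] - known[None, :, :], axis=2)
        best = int(np.argmax(dists.min(axis=1)))
        batch.append(candidates[best])
        # Treat the pick as known so later picks spread out.
        known = np.vstack([known, candidates[best]])
    return np.array(batch)
```

Scoring a fixed candidate set costs only a distance computation per batch element, which is the kind of saving that matters when the objective itself is cheap to evaluate.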