 Energy


A Survey of Machine Learning Applied to Computer Architecture Design

arXiv.org Artificial Intelligence

Machine learning has enabled significant benefits in diverse fields but, with a few exceptions, has had limited impact on computer architecture. Recent work, however, has explored broader applicability for design, optimization, and simulation. Notably, machine-learning-based strategies often surpass prior state-of-the-art analytical, heuristic, and human-expert approaches. This paper reviews machine learning applied system-wide to simulation and run-time optimization, and in many individual components, including memory systems, branch predictors, networks-on-chip, and GPUs. The paper further analyzes current practice to highlight useful design strategies and identify areas for future work, based on optimized implementation strategies, opportune extensions to existing work, and ambitious long-term possibilities. Taken together, these strategies and techniques present a promising future for increasingly automated architectural design.


A Simulation of UAV Power Optimization via Reinforcement Learning

arXiv.org Artificial Intelligence

This paper demonstrates a reinforcement learning approach to optimizing the power consumption of a UAV system in a simplified data collection task. The architecture consists of two common reinforcement learning algorithms, Q-learning and Sarsa, implemented through a combination of the Robot Operating System (ROS) and Gazebo. The effect of wind as an influential factor was simulated. The implemented algorithms adjusted the UAV's actions to the wind field in a reasonable way, minimizing its power consumption during task completion over the domain.
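To make the difference between the two algorithms named above concrete, here is a minimal tabular sketch of their update rules. It is not the paper's ROS/Gazebo implementation: the state and action encoding, the learning rate `alpha`, the discount `gamma`, and the idea of using negative power draw as the reward are all illustrative assumptions.

```python
import random


def epsilon_greedy(q, state, actions, epsilon, rng):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q.get((state, a), 0.0))


def q_learning_update(q, s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    """Off-policy update: bootstrap from the best action in the next state."""
    best_next = max(q.get((s_next, b), 0.0) for b in actions)
    old = q.get((s, a), 0.0)
    q[(s, a)] = old + alpha * (r + gamma * best_next - old)


def sarsa_update(q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    """On-policy update: bootstrap from the action actually taken next."""
    old = q.get((s, a), 0.0)
    next_value = q.get((s_next, a_next), 0.0)
    q[(s, a)] = old + alpha * (r + gamma * next_value - old)


# Hypothetical usage: the reward is the negative power used in one step.
actions = ["north", "south", "east", "west"]
q_table = {}
rng = random.Random(0)
action = epsilon_greedy(q_table, (0, 0), actions, epsilon=0.1, rng=rng)
q_learning_update(q_table, (0, 0), action, -2.5, (0, 1), actions)
```

The only difference is the bootstrap term: Q-learning uses the maximum over next actions, while Sarsa uses the value of the action the policy actually selects, so exploration affects Sarsa's estimates directly.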


Sequential Training of Neural Networks with Gradient Boosting

arXiv.org Machine Learning

This paper presents a novel technique based on gradient boosting to train a shallow neural network (NN). Gradient boosting is an additive expansion algorithm in which a series of models are trained sequentially to approximate a given function. A neural network with one hidden layer can also be seen as an additive model, where the scalar product of the hidden-layer responses and their weights provides the final output of the network. Instead of training the network as a whole, the proposed algorithm trains the network sequentially in $T$ steps. First, the bias term of the network is initialized with a constant approximation that minimizes the average loss of the data. Then, at each step, a portion of the network, composed of $K$ neurons, is trained to approximate the pseudo-residuals on the training data computed from the previous iteration. Finally, the $T$ partial models and bias are integrated as a single NN with $T \times K$ neurons in the hidden layer. We show that the proposed algorithm is more robust to overfitting than a standard neural network with respect to the number of neurons of the last hidden layer. Furthermore, we show that the design of the proposed method allows the number of neurons in use to be reduced without a significant loss of generalization ability. This makes it possible to adapt the model to different classification speed requirements on the fly. Extensive experiments in classification and regression tasks, as well as in combination with a deep convolutional neural network, show better generalization performance than a standard neural network.
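The sequential construction described above can be written out in standard gradient boosting notation. This is a sketch for a scalar output: only $T$ and $K$ come from the abstract, and the loss $L$, the activation $\sigma$, the per-neuron parameters $(w_{t,k}, v_{t,k}, c_{t,k})$ and the step size $\rho_t$ are assumed symbols that may differ from the paper's own formulation.

```latex
% Step 0: constant bias that minimizes the average loss
F_0(x) = \arg\min_{c} \frac{1}{N} \sum_{i=1}^{N} L(y_i, c)

% Step t = 1..T: pseudo-residuals from the previous iteration
r_i^{(t)} = -\left. \frac{\partial L(y_i, F)}{\partial F} \right|_{F = F_{t-1}(x_i)}

% A block of K hidden neurons is fitted to the pseudo-residuals
h_t(x) = \sum_{k=1}^{K} w_{t,k}\, \sigma\!\left(v_{t,k}^{\top} x + c_{t,k}\right)
\approx r^{(t)}

% Additive update
F_t(x) = F_{t-1}(x) + \rho_t\, h_t(x)

% After T steps: one network with T x K hidden neurons
F_T(x) = F_0 + \sum_{t=1}^{T} \sum_{k=1}^{K} \rho_t\, w_{t,k}\,
         \sigma\!\left(v_{t,k}^{\top} x + c_{t,k}\right)
```

Written this way, truncating the outer sum at some $t < T$ still gives a valid, smaller network, which is consistent with the abstract's claim that neurons can be dropped to meet speed requirements.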


Hyperspectral Image Classification With Context-Aware Dynamic Graph Convolutional Network

arXiv.org Machine Learning

In hyperspectral image (HSI) classification, spatial context has demonstrated its significance in achieving promising performance. However, conventional spatial context-based methods simply assume that spatially neighboring pixels should correspond to the same land-cover class, so they often fail to correctly discover the contextual relations among pixels in complex situations, leading to imperfect classification results in some irregular or inhomogeneous regions such as class boundaries. To address this deficiency, we develop a new HSI classification method based on the recently proposed Graph Convolutional Network (GCN), as it can flexibly encode the relations among arbitrarily structured non-Euclidean data. Unlike the traditional GCN, our method adopts two novel strategies to further exploit the contextual relations for accurate HSI classification. First, since the receptive field of the traditional GCN is often limited to a fairly small neighborhood, we propose to capture long-range contextual relations in HSI by performing successive graph convolutions on a learned region-induced graph that is transformed from the original 2D image grids. Second, we refine the graph edge weights and the connective relationships among image regions by learning the improved adjacency matrix and the 'edge filter', so that the graph can be gradually refined to adapt to the representations generated by each graph convolutional layer. Such an updated graph will in turn result in accurate region representations, and vice versa. Experiments carried out on three real-world benchmark datasets demonstrate that the proposed method yields significant improvement in classification performance when compared with some state-of-the-art approaches.
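The point about successive graph convolutions widening the receptive field can be illustrated with a small sketch. It uses the widely known normalized propagation rule ReLU(D^-1/2 (A + I) D^-1/2 H W) on a fixed, hand-written region graph; the learned adjacency matrix and 'edge filter' from the abstract are not reproduced, and every name and number below is an assumption for illustration.

```python
import math


def normalize_adjacency(adj):
    """Add self-loops and apply symmetric degree normalization."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Plain dense matrix product on nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def gcn_layer(adj, features, weights):
    """One graph convolution: normalized neighbor averaging, linear map, ReLU."""
    propagated = matmul(normalize_adjacency(adj), features)
    return [[max(0.0, v) for v in row] for row in matmul(propagated, weights)]


# Three image regions in a chain: 0 -- 1 -- 2. Only region 0 carries a signal.
adj = [[0.0, 1.0, 0.0],
       [1.0, 0.0, 1.0],
       [0.0, 1.0, 0.0]]
features = [[1.0], [0.0], [0.0]]
weights = [[1.0]]

h1 = gcn_layer(adj, features, weights)  # region 2 still sees nothing
h2 = gcn_layer(adj, h1, weights)        # region 2 now receives region 0's signal
```

After one layer, region 2 has no information from region 0 because they are two hops apart; after the second layer it does, which is the sense in which stacking graph convolutions captures longer-range context.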


PyDEns: a Python Framework for Solving Differential Equations with Neural Networks

arXiv.org Machine Learning

Recently, many papers have proposed using neural networks to approximately solve partial differential equations (PDEs). Yet there has been a lack of flexible frameworks for convenient experimentation. In an attempt to fill the gap, we introduce the PyDEns module, open-sourced on GitHub. Coupled with the capabilities of BatchFlow, an open-source framework for convenient and reproducible deep learning, the PyDEns module allows users to (1) solve partial differential equations from a large family, including the heat equation and the wave equation; (2) easily search for the best neural-network architecture among a zoo that includes ResNet and DenseNet; and (3) fully control the model-training process by testing different point-sampling schemes. With that in mind, our main contribution is the implementation of a ready-to-use, open-source numerical solver of PDEs of a novel format, based on neural networks.


Locally adaptive activation functions with slope recovery term for deep and physics-informed neural networks

arXiv.org Machine Learning

Ameya D. Jagtap 1, Kenji Kawaguchi 2 and George Em Karniadakis 1,3. 1 Division of Applied Mathematics, Brown University, 182 George Street, Providence, RI 02912, USA. 2 Massachusetts Institute of Technology, 77 Massachusetts Ave, Cambridge, MA 02139, USA. 3 Pacific Northwest National Laboratory, Richland, WA 99354, USA.

Abstract: We propose two approaches to locally adaptive activation functions, namely layer-wise and neuron-wise locally adaptive activation functions, which improve the performance of deep and physics-informed neural networks. The local adaptation of the activation function is achieved by introducing scalable hyper-parameters in each layer (layer-wise) and for every neuron separately (neuron-wise), and then optimizing them using the stochastic gradient descent algorithm. The neuron-wise activation function acts like a vector activation function, as opposed to the traditional scalar activation function given by fixed, global and layer-wise activations. To further increase the training speed, a slope recovery term based on the activation slope is added to the loss function, which further accelerates convergence, thereby reducing the training cost. For numerical experiments, a nonlinear discontinuous function is approximated using a deep neural network with layer-wise and neuron-wise locally adaptive activation functions, with and without the slope recovery term, and compared with its global counterpart. Moreover, a solution of the nonlinear Burgers equation, which exhibits steep gradients, is also obtained using the proposed methods. On the theoretical side, we prove that in the proposed method the gradient descent algorithms are not attracted to sub-optimal critical points or local minima under practical conditions on the initialization and learning rate. Furthermore, the proposed adaptive activation functions with the slope recovery term are shown to accelerate the training process in standard deep learning benchmarks using the CIFAR-10, CIFAR-100, SVHN, MNIST, KMNIST, Fashion-MNIST, and Semeion data sets, with and without data augmentation.

Keywords: Machine learning, bad minima, stochastic gradients, accelerated training, PINN, deep learning benchmarks.

1. Introduction. In recent years, research on neural networks (NNs) has intensified around the world due to their successful applications in many diverse fields such as speech recognition [13], computer vision [16], natural language translation [25], etc. An NN is trained on data sets before it is used in actual applications.
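As a rough illustration of what a trainable activation slope does, the sketch below fits a single slope parameter `a` in `tanh(n * a * z)` to a steeper target by gradient descent. The abstract does not give the functional form, so the form used here, the fixed scale factor `n`, the learning rate and the toy target are all assumptions. Plain full-batch gradient descent stands in for SGD, and the slope recovery term is deliberately left out.

```python
import math


def adaptive_tanh(z, a, n=1.0):
    """Activation with a trainable slope a and a fixed scale factor n (assumed form)."""
    return math.tanh(n * a * z)


def mse(a, xs, ys, n=1.0):
    """Mean squared error of the adaptive activation against the targets."""
    return sum((adaptive_tanh(x, a, n) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def grad_a(a, xs, ys, n=1.0):
    """Derivative of the MSE with respect to the slope parameter a."""
    total = 0.0
    for x, y in zip(xs, ys):
        pred = adaptive_tanh(x, a, n)
        # d/da tanh(n*a*x) = n * x * (1 - tanh^2(n*a*x))
        total += 2.0 * (pred - y) * (1.0 - pred ** 2) * n * x
    return total / len(xs)


def fit_slope(xs, ys, a=1.0, n=1.0, lr=0.5, steps=200):
    """Full-batch gradient descent on the slope only."""
    for _ in range(steps):
        a -= lr * grad_a(a, xs, ys, n)
    return a


# Toy target with a steep gradient near zero: tanh(5x) on [-1, 1].
xs = [i / 10.0 for i in range(-10, 11)]
ys = [math.tanh(5.0 * x) for x in xs]
a_fit = fit_slope(xs, ys)
```

In a real network the slope would be optimized jointly with the weights, one parameter per layer (layer-wise) or per neuron (neuron-wise); here it is isolated so its effect on fitting a steep function is visible.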


Probabilistic Forecasting using Deep Generative Models

arXiv.org Machine Learning

The Analog Ensemble (AnEn) method tries to estimate the probability distribution of the future state of the atmosphere with a set of past observations that correspond to the best analogs of a deterministic Numerical Weather Prediction (NWP). This model post-processing method has been successfully used to improve forecast accuracy for several weather-related applications, including air quality and short-term wind and solar power forecasting. To provide a meaningful probabilistic forecast, the AnEn method requires storing a historical set of past predictions and observations in memory for a period of at least several months, spanning the seasons relevant to the prediction of interest. Although the memory and computing costs of the AnEn method are lower than those of a brute-force dynamical ensemble approach, for a large number of stations and large datasets the amount of memory required for AnEn can easily become prohibitive. Furthermore, to find the best analogs associated with a certain prediction produced by an NWP model, the current approach applies a certain metric over the entire historical dataset, which may take a substantial amount of time. In this work, we investigate an alternative way to implement the AnEn method using deep generative models. A generative model can entirely or partially replace the dataset of prediction-observation pairs, reducing the amount of memory required to produce the probabilistic forecast by several orders of magnitude. Furthermore, the generative model can generate a meaningful set of analogs associated with a certain forecast in constant time without performing any search, saving a considerable amount of time even in the presence of huge historical datasets.
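The search step that the generative model is meant to replace can be sketched as follows. This is a simplified illustration of the classic analog search, not the paper's code: the Euclidean metric, the data layout and the numbers are assumptions, since the abstract only refers to "a certain metric".

```python
def analog_ensemble(history, forecast, k):
    """Return the observations paired with the k past forecasts closest to `forecast`.

    history:  list of (past_forecast, observation) pairs, where past_forecast
              is a list of predictor values (hypothetical layout).
    forecast: list of predictor values for the current NWP prediction.
    """
    def distance(past_forecast):
        # Euclidean distance between predictor vectors (assumed metric).
        return sum((p - f) ** 2 for p, f in zip(past_forecast, forecast)) ** 0.5

    # The whole history is scanned and ranked for every query, so both the
    # memory footprint and the query time grow with the size of the archive.
    ranked = sorted(history, key=lambda pair: distance(pair[0]))
    return [observation for _, observation in ranked[:k]]


# Hypothetical archive: ([wind speed forecast, pressure anomaly], observed wind speed)
history = [
    ([10.0, 1.0], 9.5),
    ([3.0, 0.2], 2.8),
    ([9.0, 1.1], 9.9),
    ([5.0, 0.5], 4.0),
]
members = analog_ensemble(history, [9.5, 1.0], k=2)
```

The returned observations form the ensemble from which a probability distribution is estimated. The sketch makes both costs mentioned in the abstract visible: the full archive must be kept in memory and scanned on every query.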


Determining offshore wind installation times using machine learning and open data

arXiv.org Machine Learning

The installation process of offshore wind turbines requires the use of expensive jack-up vessels. These vessels regularly report their position via the Automatic Identification System (AIS). This paper introduces a novel approach of applying machine learning to AIS data from jack-up vessels. We apply the new method to 13 offshore wind farms in Danish, German and British waters. For each of the wind farms we identify individual turbine locations, individual installation times, time in transit and time in harbor for the respective vessel. This is done in an automated way exclusively using AIS data with no prior knowledge of turbine locations, thus enabling a detailed description of the entire installation process.


Which wildfires will burn out of control? Machine learning can help

#artificialintelligence

A satellite image of Alaska captured in August 2005 shows the extent of smoke coverage from wildfires in the state's boreal forests. The blazes are likely to become large in exceptionally hot and dry conditions and when there's a high percentage of black spruce trees in the affected areas – key factors in a new predictive model developed by UCI scientists. An interdisciplinary team of scientists at the University of California, Irvine has developed a new technique for predicting the final size of a wildfire from the moment of ignition. Built around a machine learning algorithm, the model can help in forecasting whether a blaze is going to be small, medium or large by the time it has run its course – knowledge useful to those in charge of allocating scarce firefighting resources. The researchers' work is highlighted in a study published today in the International Journal of Wildland Fire.


AI Drones Risk Mitigation for Midstream Operations

#artificialintelligence

Drones are all the rage today. Not a day goes by that we don't read about someone using a drone for something (good or bad) somewhere on Earth. AI or Artificial Intelligence has also made a resurgence. I say that because there was a time, not too long ago (about 7 years ago) when AI was also very popular and a number of movies featuring AI were produced by Hollywood. Now if we combine the two, what do we get?