Energy
How Parallel Processing Solves Our Biggest Computational Problems
Take all the help you can get. If parallel computing has a central tenet, that might be it. Some of the crazy-complex computations asked of today's hardware are so demanding that the compute burden must be borne by multiple processors, effectively "parallelizing" whatever task is being performed. Perhaps the most notable push toward parallelism happened around 2006, when tech hardware powerhouse Nvidia approached Wen-mei Hwu, a professor of electrical and computer engineering at the University of Illinois Urbana-Champaign. Nvidia was designing graphics processing units (GPUs), which, thanks to large numbers of threads and cores, had far higher memory bandwidth than traditional central processing units (CPUs), as a way to process huge numbers of pixels.
Smart Cities are Getting Smarter in Surprising Ways - IoT Business News
New technologies and approaches spur five key smart cities strategy shifts. It's time for smart cities to embrace new technologies and approaches to combat a growing list of challenges, states global tech market advisory firm, ABI Research. Cities have faced challenges like congestion, pollution, and safety for decades, and most have a plan to combat them. While they continue to face these traditional issues, new threats such as cyberattacks, climate change, and other emerging problems are mounting. "This new reality requires new approaches, leveraging a range of new technologies to create true strategy shifts," says Dominique Bonte, Vice President at ABI Research.
How to train artificial intelligence that won't destroy the environment
There's been a reckoning in recent years when it comes to measuring bias in machine learning. We now know that these "unbiased" automated tools are actually far from unprejudiced, and there's a growing demand that researchers think about how their products might screw over or endanger the lives of others before they unleash them on society. It's not just the final products we should be worried about, however, but also the consequences of building them. As the world burns in Facebook feeds and in backyards, the carbon footprints of even the most innocuous things are coming under scrutiny. It's sparked debates around AC units, straws, face scrubs, plastic bags, air travel.
Transfer Learning in Spatial-Temporal Forecasting of the Solar Magnetic Field
Machine learning techniques have been widely used in attempts to forecast several solar datasets. Most of these approaches employ supervised machine learning algorithms which are, in general, very data hungry. This hampers the attempts to forecast some of these data series, particularly the ones that depend on (relatively) recent space observations. Here we focus on an attempt to forecast the solar surface longitudinally averaged radial magnetic field distribution using a form of spatial-temporal neural networks. Given that these spatial-temporal datasets have only been recorded since 1975 and are therefore quite short, the forecasts are predictably quite modest. However, given that there is a potential physical relationship between sunspots and the magnetic field, we employ another machine learning technique called transfer learning, which has recently received considerable attention in the literature. Here, this approach consists in first training the source spatial-temporal neural network on the much longer time/latitude sunspot area dataset, which starts in 1874, then transferring the trained set of layers to a target network, and continuing to train the latter on the magnetic field dataset. The employment of transfer learning in the field of computer vision is known to obtain a generalized set of feature filters that can be reused for other datasets and tasks. Here we obtain a similar result: once the network has been trained on the spatial-temporal sunspot area data, its first few layers are able to identify the two main features of the solar cycle, i.e. the amplitude variation and the migration to the equator. These layers can therefore be reused when training on the magnetic field dataset, and the resulting forecast is better than a prediction based only on the historical magnetic field data.
Degrees of freedom for off-the-grid sparse estimation
Poon, Clarice, Peyré, Gabriel (November 12, 2019)
A central question in modern machine learning and imaging sciences is to quantify the number of effective parameters of vastly over-parameterized models. The degrees of freedom is a mathematically convenient way to define this number of parameters. Its computation and properties are well understood when dealing with discretized linear models, possibly regularized using sparsity. In this paper, we argue that this way of thinking is plagued when dealing with models having very large parameter spaces. In this case it makes more sense to consider "off-the-grid" approaches, using a continuous parameter space. This type of approach is the one favoured when training multi-layer perceptrons, and is also becoming popular to solve super-resolution problems in imaging. Training these off-the-grid models with a sparsity-inducing prior can be achieved by solving a convex optimization problem over the space of measures, which is often called the Beurling Lasso (Blasso), and is the continuous counterpart of the celebrated Lasso parameter selection method. In previous works [41, 19], the degrees of freedom for the Lasso was shown to coincide with the size of the smallest solution support. Our main contribution is a proof of a continuous counterpart to this result for the Blasso. While in dimension d, each of the k nonzero recovered atoms in the recovered measure carries d + 1 parameters (d for the position and 1 for the weight), a surprising implication of our new formula is that the degrees of freedom for these off-the-grid models is in general strictly smaller than (d + 1)k. Our findings thus suggest that discretized methods actually vastly overestimate the number of intrinsic continuous degrees of freedom. Our second contribution is a detailed study of the case of sampling Fourier coefficients in 1D, which corresponds to a super-resolution problem.
We show that our formula for the degrees of freedom is valid outside of a set of measure zero of observations, which in turn justifies its use to compute an unbiased estimator of the prediction risk using the Stein Unbiased Risk Estimator (SURE). We also report numerical results for both the case of Fourier sampling and the learning of a multilayer perceptron with a single hidden layer.
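For orientation, the standard Gaussian-noise definitions behind the terms "degrees of freedom" and "SURE" can be written as follows. The notation (y, \mu, \hat{\mu}, \sigma, n) is assumed here for illustration and is not taken from the paper, and the paper's own formula for the Blasso is not reproduced.

```latex
% Assumed setting: y = \mu + \varepsilon with \varepsilon \sim \mathcal{N}(0, \sigma^2 I_n),
% and \hat{\mu}(y) a weakly differentiable estimator of \mu.
\begin{align*}
\mathrm{df}(\hat{\mu})
  &= \frac{1}{\sigma^2} \sum_{i=1}^{n} \operatorname{cov}\bigl(\hat{\mu}_i(y),\, y_i\bigr)
   = \mathbb{E}\bigl[\operatorname{div} \hat{\mu}(y)\bigr], \\
\mathrm{SURE}(y)
  &= \lVert y - \hat{\mu}(y) \rVert^2 - n\sigma^2 + 2\sigma^2 \operatorname{div} \hat{\mu}(y), \\
\mathbb{E}\bigl[\mathrm{SURE}(y)\bigr]
  &= \mathbb{E}\bigl[\lVert \hat{\mu}(y) - \mu \rVert^2\bigr].
\end{align*}
```

In this reading, an unbiased estimate of the divergence term is all that is needed to estimate the prediction risk without access to \mu. For the Lasso that estimate is the support size mentioned in the abstract, and for the Blasso the abstract states only that it is in general strictly smaller than (d + 1)k.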
Hierarchical Clustering for Smart Meter Electricity Loads based on Quantile Autocovariances
Alonso, Andrés M., Nogales, F. Javier, Ruiz, Carlos
In order to improve the efficiency and sustainability of electricity systems, most countries worldwide are deploying advanced metering infrastructures, and in particular household smart meters, in the residential sector. This technology is able to record electricity load time series at very high frequency, information that can be exploited to develop new clustering models to group individual households by similar consumption patterns. To this end, in this work we propose three hierarchical clustering methodologies that allow capturing different characteristics of the time series. These are based on a set of "dissimilarity" measures computed over different features: quantile auto-covariances, and simple and partial autocorrelations. The main advantage is that they allow summarizing each time series in a few representative features so that they are computationally efficient, robust against outliers, easy to automate, and scalable to hundreds of thousands of smart meter series. We evaluate the performance of each clustering model on a real-world smart meter dataset with thousands of half-hourly time series. The results show how the obtained clusters identify relevant consumption behaviors of households and capture part of their geo-demographic segmentation. Moreover, we apply a supervised classification procedure to explore which features are more relevant to define each cluster.
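The idea of summarizing each series by quantile autocovariances and comparing the summaries can be sketched as below. This is a minimal illustration, not the authors' implementation: the lags, the probability levels, the squared Euclidean dissimilarity, and the synthetic AR(1) series standing in for smart meter loads are all assumptions, and every function name is hypothetical. The hierarchical clustering step that would consume these dissimilarities is not shown.

```python
import math
import random
from itertools import product


def empirical_quantile(xs, tau):
    """Order-statistic estimate of the tau-quantile of xs (0 < tau < 1)."""
    ordered = sorted(xs)
    idx = min(len(ordered) - 1, max(0, math.ceil(tau * len(ordered)) - 1))
    return ordered[idx]


def quantile_autocovariances(xs, lags=(1, 2), taus=(0.1, 0.5, 0.9)):
    """Feature vector of sample quantile autocovariances.

    For each lag l and each pair (tau, tau'), estimate
    P(X_t <= q_tau, X_{t+l} <= q_tau') - tau * tau'.
    """
    q = {tau: empirical_quantile(xs, tau) for tau in taus}
    n = len(xs)
    features = []
    for lag in lags:
        for tau_a, tau_b in product(taus, repeat=2):
            joint = sum(
                1
                for t in range(n - lag)
                if xs[t] <= q[tau_a] and xs[t + lag] <= q[tau_b]
            )
            features.append(joint / (n - lag) - tau_a * tau_b)
    return features


def dissimilarity(features_a, features_b):
    """Squared Euclidean distance between two feature vectors (assumed choice)."""
    return sum((a - b) ** 2 for a, b in zip(features_a, features_b))


def simulate_ar1(phi, n, rng):
    """Toy stand-in for a load series: x_t = phi * x_{t-1} + Gaussian noise."""
    x, series = 0.0, []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, 1.0)
        series.append(x)
    return series


rng = random.Random(0)
# Two strongly autocorrelated series and two white-noise series.
persistent = [quantile_autocovariances(simulate_ar1(0.8, 3000, rng)) for _ in range(2)]
noisy = [quantile_autocovariances(simulate_ar1(0.0, 3000, rng)) for _ in range(2)]

within = dissimilarity(persistent[0], persistent[1])
between = dissimilarity(persistent[0], noisy[0])
print(f"within-group: {within:.4f}, between-group: {between:.4f}")
```

Each series is reduced to 18 numbers regardless of its length, which is what makes a pairwise dissimilarity matrix cheap to build for many meters. Because the features depend only on indicator events relative to quantiles, a few extreme readings change them very little.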
Deep Transfer Learning for Thermal Dynamics Modeling in Smart Buildings
Jiang, Zhanhong, Lee, Young M.
Thermal dynamics modeling has been a critical issue in building heating, ventilation, and air-conditioning (HVAC) systems, which can significantly affect the control and maintenance strategies. Due to the uniqueness of each specific building, traditional thermal dynamics modeling approaches that depend heavily on physics knowledge cannot generalize well. This study proposes a deep supervised domain adaptation (DSDA) method for thermal dynamics modeling of building indoor temperature evolution and energy consumption. A sequence-to-sequence scheme based on a long short-term memory network is pre-trained on a large amount of data collected from one building and then adapted, through model fine-tuning, to another building that has only a limited amount of data. We use four publicly available datasets: SML and AHU for temperature evolution, and long-term datasets from two different commercial buildings, termed Building 1 and Building 2, for energy consumption. We show that the deep supervised domain adaptation is effective at adapting the pre-trained model from one building to another and has better predictive performance than learning from scratch with only a limited amount of data.
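The pre-train and fine-tune workflow can be illustrated with a deliberately tiny stand-in. The sketch below swaps the LSTM sequence-to-sequence model for a one-variable linear model and uses synthetic data, so it shows only the warm-start mechanism, not the DSDA method itself. The "source" and "target" relationships, the learning rate, and all names are assumptions made for illustration.

```python
def mse(w, b, data):
    """Mean squared error of the linear model y = w * x + b on data."""
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)


def gradient_descent(w, b, data, steps, lr=0.5):
    """Run plain gradient descent on the MSE, starting from (w, b)."""
    n = len(data)
    for _ in range(steps):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w, b = w - lr * grad_w, b - lr * grad_b
    return w, b


# Hypothetical source building: plenty of data, relationship y = 2.0 * x + 1.0.
source = [(i / 100, 2.0 * (i / 100) + 1.0) for i in range(101)]
# Hypothetical target building: five samples, slightly different relationship.
target = [(x, 2.2 * x + 0.8) for x in (0.0, 0.25, 0.5, 0.75, 1.0)]

# Step 1: pre-train on the source building until convergence.
w_src, b_src = gradient_descent(0.0, 0.0, source, steps=500)

# Step 2: fine-tune on the target building for a few steps from the pre-trained weights.
w_ft, b_ft = gradient_descent(w_src, b_src, target, steps=5)

# Baseline: the same few steps on the target building, starting from scratch.
w_sc, b_sc = gradient_descent(0.0, 0.0, target, steps=5)

finetuned_loss = mse(w_ft, b_ft, target)
scratch_loss = mse(w_sc, b_sc, target)
print(f"fine-tuned: {finetuned_loss:.5f}, from scratch: {scratch_loss:.5f}")
```

With the same small training budget on the target data, the warm-started model ends with the lower error because it begins close to the target solution. In a deep model the same pattern is usually applied to the network weights, often with a reduced learning rate during fine-tuning.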
FANN-on-MCU: An Open-Source Toolkit for Energy-Efficient Neural Network Inference at the Edge of the Internet of Things
Wang, Xiaying, Magno, Michele, Cavigelli, Lukas, Benini, Luca
The growing number of low-power smart devices in the Internet of Things is coupled with the concept of "Edge Computing", that is, moving some of the intelligence, especially machine learning, towards the edge of the network. Enabling machine learning algorithms to run on resource-constrained hardware, typically on low-power smart devices, is challenging in terms of hardware (optimized and energy-efficient integrated circuits), algorithmic and firmware implementations. This paper presents FANN-on-MCU, an open-source toolkit built upon the Fast Artificial Neural Network (FANN) library to run lightweight and energy-efficient neural networks on microcontrollers based on both the ARM Cortex-M series and the novel RISC-V-based Parallel Ultra-Low-Power (PULP) platform. The toolkit takes multi-layer perceptrons trained with FANN and generates code targeted at execution on low-power microcontrollers either with a floating-point unit (i.e., ARM Cortex-M4F and M7F) or without (i.e., ARM Cortex M0-M3 or PULP-based processors). This paper also provides an architectural performance evaluation of neural networks on the most popular ARM Cortex-M family and the parallel RISC-V processor called Mr. Wolf. The evaluation includes experimental results for three different applications using a self-sustainable wearable multi-sensor bracelet. Experimental results show a measured latency on the order of only a few microseconds and a power consumption of a few milliwatts while keeping the memory requirements below the limitations of the targeted microcontrollers. In particular, the parallel implementation on the octa-core RISC-V platform reaches a speedup of 22x and a 69% reduction in energy consumption with respect to a single-core implementation on Cortex-M4 for continuous real-time classification.
Deep geometric knowledge distillation with graphs
Lassance, Carlos, Bontonou, Myriam, Hacene, Ghouthi Boukli, Gripon, Vincent, Tang, Jian, Ortega, Antonio
In most cases deep learning architectures are trained disregarding the number of operations and the energy consumption. However, some applications, like embedded systems, can be resource-constrained during inference. A popular approach to reduce the size of a deep learning architecture consists in distilling knowledge from a bigger network (teacher) to a smaller one (student). Directly training the student to mimic the teacher representation can be effective, but it requires that both share the same latent space dimensions. In this work, we focus instead on relative knowledge distillation (RKD), which considers the geometry of the respective latent spaces, allowing for dimension-agnostic transfer of knowledge. Specifically, we introduce a graph-based RKD method, in which graphs are used to capture the geometry of latent spaces. Using classical computer vision benchmarks, we demonstrate the ability of the proposed method to efficiently distill knowledge from the teacher to the student, leading to better accuracy for the same budget as compared to existing RKD alternatives.
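One way to see why a graph makes the comparison dimension-agnostic is the sketch below. It builds a similarity graph over a batch of samples for the teacher and for the student, then penalizes the difference between the two graphs. The cosine similarity and the squared Frobenius distance are assumptions chosen for illustration; the abstract does not specify the paper's graph construction or loss, and all names here are hypothetical.

```python
import math


def cosine_similarity_graph(embeddings):
    """Adjacency matrix whose (i, j) entry is the cosine similarity of samples i and j.

    Assumes no embedding is the zero vector.
    """
    norms = [math.sqrt(sum(v * v for v in e)) for e in embeddings]
    n = len(embeddings)
    return [
        [
            sum(a * b for a, b in zip(embeddings[i], embeddings[j]))
            / (norms[i] * norms[j])
            for j in range(n)
        ]
        for i in range(n)
    ]


def graph_distillation_loss(teacher_embeddings, student_embeddings):
    """Squared Frobenius distance between the teacher and student graphs."""
    g_teacher = cosine_similarity_graph(teacher_embeddings)
    g_student = cosine_similarity_graph(student_embeddings)
    return sum(
        (a - b) ** 2
        for row_t, row_s in zip(g_teacher, g_student)
        for a, b in zip(row_t, row_s)
    )


# Three samples embedded in a 4-dimensional teacher space.
teacher = [(1.0, 0.0, 0.0, 0.0), (0.0, 1.0, 0.0, 0.0), (1.0, 1.0, 0.0, 0.0)]
# A 2-dimensional student that preserves the angles between samples.
aligned_student = [(2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
# A 2-dimensional student that arranges the samples differently.
misaligned_student = [(1.0, 0.0), (1.0, 0.1), (0.0, 1.0)]

aligned_loss = graph_distillation_loss(teacher, aligned_student)
misaligned_loss = graph_distillation_loss(teacher, misaligned_student)
print(f"aligned: {aligned_loss:.6f}, misaligned: {misaligned_loss:.6f}")
```

Both graphs are n-by-n for a batch of n samples, so the loss is defined even though the teacher and student latent spaces have different widths. In training, such a term would be minimized with respect to the student's parameters, typically alongside the task loss.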
AI Ethics for Systemic Issues: A Structural Approach
van der Loeff, Agnes Schim, Bassi, Iggy, Kapila, Sachin, Gamper, Jevgenij
The debate on AI ethics largely focuses on technical improvements and stronger regulation to prevent accidents or misuse of AI, with solutions relying on holding individual actors accountable for responsible AI development. While useful and necessary, we argue that this "agency" approach disregards more indirect and complex risks resulting from AI's interaction with the socio-economic and political context. This paper calls for a "structural" approach to assessing AI's effects in order to understand and prevent such systemic risks where no individual can be held accountable for the broader negative impacts. This is particularly relevant for AI applied to systemic issues such as climate change and food security which require political solutions and global cooperation. To properly address the wide range of AI risks and ensure 'AI for social good', agency-focused policies must be complemented by policies informed by a structural approach.