Energy
Continuous-in-Depth Neural Networks
Queiruga, Alejandro F., Erichson, N. Benjamin, Taylor, Dane, Mahoney, Michael W.
Recent work has attempted to interpret residual networks (ResNets) as one step of a forward Euler discretization of an ordinary differential equation, focusing mainly on syntactic algebraic similarities between the two systems. Discrete dynamical integrators of continuous dynamical systems, however, have a much richer structure. We first show that ResNets fail to be meaningful dynamical integrators in this richer sense. We then demonstrate that neural network models can learn to represent continuous dynamical systems, with this richer structure and properties, by embedding them into higher-order numerical integration schemes, such as the Runge-Kutta schemes. Based on these insights, we introduce ContinuousNet as a continuous-in-depth generalization of ResNet architectures. ContinuousNets exhibit an invariance to the particular computational graph manifestation. That is, the continuous-in-depth model can be evaluated with different discrete time step sizes, which changes the number of layers, and with different numerical integration schemes, which changes the graph connectivity. We show that this can be used to develop an incremental-in-depth training scheme that improves model quality while significantly decreasing training time. We also show that, once trained, the number of units in the computational graph can even be decreased, for faster inference with little-to-no accuracy drop.
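The contrast between a forward Euler step and a higher-order Runge-Kutta step can be illustrated with a minimal sketch. This is not the ContinuousNet implementation: the vector field `f` below is a hypothetical stand-in (dx/dt = -x) for a learned residual block, and the helper names are assumptions made for illustration. The same `f` is integrated over a fixed "depth" with different step counts and schemes, which is the sense in which the number of layers and the graph connectivity can change while the underlying model stays the same.

```python
import math


def f(t, x):
    # Hypothetical stand-in for a learned residual block: dx/dt = -x.
    return -x


def euler_step(f, t, x, h):
    # Forward Euler: x_{n+1} = x_n + h * f(t_n, x_n), the ResNet-like update.
    return x + h * f(t, x)


def rk4_step(f, t, x, h):
    # Classical fourth-order Runge-Kutta: four evaluations of f per step.
    k1 = f(t, x)
    k2 = f(t + h / 2, x + h / 2 * k1)
    k3 = f(t + h / 2, x + h / 2 * k2)
    k4 = f(t + h, x + h * k3)
    return x + (h / 6) * (k1 + 2 * k2 + 2 * k3 + k4)


def integrate(step, f, x0, depth, n_steps):
    # Integrate from t = 0 to t = depth using n_steps equal steps ("layers").
    h = depth / n_steps
    t, x = 0.0, x0
    for _ in range(n_steps):
        x = step(f, t, x, h)
        t += h
    return x


exact = math.exp(-1.0)  # closed-form solution of dx/dt = -x at t = 1, x(0) = 1
err_euler_4 = abs(integrate(euler_step, f, 1.0, 1.0, 4) - exact)
err_euler_8 = abs(integrate(euler_step, f, 1.0, 1.0, 8) - exact)
err_rk4_4 = abs(integrate(rk4_step, f, 1.0, 1.0, 4) - exact)

print(err_euler_4, err_euler_8, err_rk4_4)
```

On this toy problem, halving the step size reduces the Euler error, and the Runge-Kutta scheme is far more accurate at the same step count.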
Bayesian learning of orthogonal embeddings for multi-fidelity Gaussian Processes
Tsilifis, Panagiotis, Pandita, Piyush, Ghosh, Sayan, Andreoli, Valeria, Vandeputte, Thomas, Wang, Liping
We present a Bayesian approach to identify optimal transformations that map model input points to low-dimensional latent variables. The "projection" mapping consists of an orthonormal matrix that is considered a priori unknown and needs to be inferred jointly with the GP parameters, conditioned on the available training data. The proposed Bayesian inference scheme relies on a two-step iterative algorithm that samples from the marginal posteriors of the GP parameters and the projection matrix respectively, both using Markov Chain Monte Carlo (MCMC) sampling. To take into account the orthogonality constraints imposed on the orthonormal projection matrix, a Geodesic Monte Carlo sampling algorithm is employed, which is suitable for exploiting probability measures on manifolds. We extend the proposed framework to multi-fidelity models using GPs, including the scenario of training multiple outputs together. We validate our framework on three synthetic problems with a known lower-dimensional subspace. The benefits of our proposed framework are illustrated on the computationally challenging three-dimensional aerodynamic optimization of a last-stage blade for an industrial gas turbine, where we study the effect of an 85-dimensional airfoil shape parameterization on two output quantities of interest, specifically the aerodynamic efficiency and the degree of reaction.
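To make the idea of an orthonormal projection concrete, the following sketch builds a matrix with orthonormal columns and uses it to map a high-dimensional input to a low-dimensional latent variable. It is only an illustration of the constraint: the columns are obtained by Gram-Schmidt orthonormalization of made-up vectors, not inferred by the Geodesic Monte Carlo sampler described above, and the function names and dimensions are assumptions.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def gram_schmidt(columns):
    # Orthonormalize a list of column vectors (modified Gram-Schmidt).
    basis = []
    for col in columns:
        v = list(col)
        for q in basis:
            coeff = dot(q, v)
            v = [a - coeff * b for a, b in zip(v, q)]
        norm = math.sqrt(dot(v, v))
        if norm < 1e-12:
            raise ValueError("columns are linearly dependent")
        basis.append([a / norm for a in v])
    return basis


def project(basis, x):
    # Latent variable z = W^T x, where the columns of W are the basis vectors.
    return [dot(q, x) for q in basis]


# Made-up example: project a 4-dimensional input onto a 2-dimensional subspace.
raw_columns = [[1.0, 1.0, 0.0, 0.0], [0.0, 1.0, 1.0, 0.0]]
W = gram_schmidt(raw_columns)
z = project(W, [2.0, 0.0, -1.0, 5.0])
print(z)
```

In the setting described above, the latent variable `z` would then be the input to the GP, and the columns of the projection matrix would be sampled rather than fixed.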
Physics-informed Tensor-train ConvLSTM for Volumetric Velocity Forecasting
Huang, Yu, Tang, Yufei, Zhuang, Hanqi, VanZwieten, James, Cherubin, Laurent
According to the National Academies, a weekly forecast of velocity, vertical structure, and duration of the Loop Current (LC) and its eddies is critical for understanding the oceanography and ecosystem, and for mitigating outcomes of anthropogenic and natural disasters in the Gulf of Mexico (GoM). However, this forecast is a challenging problem since the LC behaviour is dominated by long-range spatial connections across multiple timescales. In this paper, we extend spatiotemporal predictive learning, showing its effectiveness beyond video prediction, to a 4D model, i.e., a novel Physics-informed Tensor-train ConvLSTM (PITT-ConvLSTM) for forecasting temporal sequences of 3D geospatial data. Specifically, we propose 1) a novel 4D higher-order recurrent neural network with empirical orthogonal function analysis to capture the hidden uncorrelated patterns of each hierarchy, 2) a convolutional tensor-train decomposition to capture higher-order space-time correlations, and 3) the incorporation of prior physics knowledge provided by domain experts, which informs the learning in latent space. The advantage of our proposed method is clear: constrained by physical laws, it simultaneously learns good representations for frame dependencies (both short-term and long-term high-level dependency) and inter-hierarchical relations within each time frame. Experiments on geospatial data collected from the GoM demonstrate that PITT-ConvLSTM outperforms the state-of-the-art methods in forecasting the volumetric velocity of the LC and its eddies for a period of over one week.
Generative Ensemble-Regression: Learning Stochastic Dynamics from Discrete Particle Ensemble Observations
Yang, Liu, Daskalakis, Constantinos, Karniadakis, George Em
We propose a new method for inferring the governing stochastic ordinary differential equations by observing particle ensembles at discrete and sparse time instants, i.e., multiple "snapshots". Particle coordinates at a single time instant, possibly noisy or truncated, are recorded in each snapshot but are unpaired across the snapshots. By training a generative model that generates "fake" sample paths, we aim to fit the observed particle ensemble distributions with a curve in the probability measure space, which is induced from the inferred particle dynamics. We employ different metrics to quantify the differences between distributions, like the sliced Wasserstein distances and the adversarial losses in generative adversarial networks. We refer to this approach as generative "ensemble-regression", in analogy to the classic "point-regression", where we infer the dynamics by performing regression in the Euclidean space, e.g. linear/logistic regression. We illustrate the ensemble-regression by learning the drift and diffusion terms of particle ensembles governed by stochastic ordinary differential equations with Brownian motions and Lévy processes up to 20 dimensions. We also discuss how to treat cases with noisy or truncated observations, as well as the scenario of paired observations, and we prove a theorem for the convergence in Wasserstein distance for continuous sample spaces.
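As an illustration of how two unpaired particle snapshots can be compared, here is a minimal Monte Carlo sketch of a sliced Wasserstein-2 distance between two equally sized point clouds. It is not the authors' code: the function names, the number of random projections, and the equal-size restriction are assumptions made to keep the example short. Each cloud is projected onto random unit directions, the one-dimensional projections are sorted, and the sorted values are compared, so the order in which particles are listed does not matter.

```python
import math
import random


def random_unit_vector(dim, rng):
    # Draw a direction uniformly on the unit sphere via normalized Gaussians.
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def sliced_wasserstein2(xs, ys, n_projections=64, seed=0):
    # Monte Carlo estimate of the sliced W2 distance between two point clouds.
    # Assumes both clouds contain the same number of points.
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equally sized ensembles")
    rng = random.Random(seed)
    dim = len(xs[0])
    total = 0.0
    for _ in range(n_projections):
        d = random_unit_vector(dim, rng)
        px = sorted(sum(a * b for a, b in zip(x, d)) for x in xs)
        py = sorted(sum(a * b for a, b in zip(y, d)) for y in ys)
        # In 1D, optimal transport between equal-size samples matches sorted values.
        total += sum((a - b) ** 2 for a, b in zip(px, py)) / len(px)
    return math.sqrt(total / n_projections)


# Made-up 2D ensembles: a cloud, a reordered copy, and a shifted copy.
rng = random.Random(1)
cloud = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(200)]
reordered = list(reversed(cloud))
shifted = [[x + 3.0, y] for x, y in cloud]

print(sliced_wasserstein2(cloud, reordered), sliced_wasserstein2(cloud, shifted))
```

The reordered copy is at distance zero from the original, while the shifted copy is not, which is the behaviour one wants from a loss on unpaired snapshots.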
Event Prediction in the Big Data Era: A Systematic Survey
Events are occurrences in specific locations, time, and semantics that nontrivially impact either society or nature, such as civil unrest, system failures, and epidemics. It is highly desirable to be able to anticipate the occurrence of such events in advance in order to reduce the potential social upheaval and damage caused. Event prediction, which has traditionally been prohibitively challenging, is now becoming a viable option in the big data era and is thus experiencing rapid growth. There is a large amount of existing work that focuses on addressing the challenges involved, including heterogeneous multi-faceted outputs, complex dependencies, and streaming data feeds. Most existing event prediction methods were initially designed to deal with specific application domains, though the techniques and evaluation procedures utilized are usually generalizable across different domains. However, it is imperative yet difficult to cross-reference the techniques across different domains, given the absence of a comprehensive literature survey for event prediction. This paper aims to provide a systematic and comprehensive survey of the technologies, applications, and evaluations of event prediction in the big data era. First, a systematic categorization and summary of existing techniques is presented, which facilitates domain experts' searches for suitable techniques and helps model developers consolidate their research at the frontiers. Then, a comprehensive categorization and summary of major application domains is provided. Evaluation metrics and procedures are summarized and standardized to unify the understanding of model performance among stakeholders, model developers, and domain experts in various application domains. Finally, open problems and future directions for this promising and important domain are elucidated and discussed.
State-of-the-art Techniques in Deep Edge Intelligence
Lodhi, Ahnaf Hannan, Akgün, Barış, Özkasap, Öznur
The potential held by the gargantuan volumes of data being generated across networks worldwide has been truly unlocked by machine learning techniques and more recently Deep Learning. The advantages offered by the latter have seen it rapidly becoming a framework of choice for various applications. However, the centralization of computational resources and the need for data aggregation have long been limiting factors in the democratization of Deep Learning applications. Edge Computing is an emerging paradigm that aims to utilize the hitherto untapped processing resources available at the network periphery. Edge Intelligence (EI) has quickly emerged as a powerful alternative to enable learning using the concepts of Edge Computing. Deep Learning-based Edge Intelligence or Deep Edge Intelligence (DEI) lies in this rapidly evolving domain. In this article, we provide an overview of the major constraints in operationalizing DEI. The major research avenues in DEI have been consolidated under Federated Learning, Distributed Computation, Compression Schemes and Conditional Computation. We also present some of the prevalent challenges and highlight prospective research avenues.
Human brain has a 'limit' on how much information it can process
The human brain has a limit on how much information it can process at once due to a finite energy supply, a new study reveals. UK neuroscientists say that energy supply to the brain remains constant and can't exceed an upper limit, however challenging a task is. But as the brain uses more energy in processing the task at hand, less energy is supplied to processing outside our immediate focus, they say. This results in what's known as 'inattentional blindness': a stimulus that's in plain sight doesn't register, even if it's valuable to us. This can help explain why we are sometimes unable to concentrate on what our family members are telling us while we're playing video games or watching TV.
WORX Landroid M robotic mower Review: Automatic electronic yard care - IAM Network
This summer I've been testing several lawn mowers, the most unusual of which is this robot from WORX. This is the WORX Landroid M robotic mower, a fully automated, cordless, rechargeable battery-powered piece of equipment that'll do all your work for you. The biggest obstacle you'll face is setup, and that's pretty straightforward if you follow the directions step by step. Our review unit includes the basic WORX Landroid M robotic mower and a few add-ons. If you're looking at the WORX website (or WORX in a store), there are at least two versions of this Landroid M: one with GPS, one without.
Predicting Energy Production
For AI4IMPACT's Deep Learning Datathon 2020, TEAM DEFAULT created a neural-network-based deep learning model for predicting energy production demand in France. The model was built using Smojo, on AI4IMPACT's cloud-based learning and model deployment system. Our model achieved a test loss of 0.131, beating the persistence loss of 0.485 by a wide margin (roughly 73% lower). As the energy market becomes increasingly liberalized across the world, the free and open market has placed growing importance on optimizing for energy demand. New and existing entrants turn to data and various methods to forecast energy consumption in hopes of turning a profit.
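For readers unfamiliar with the persistence baseline mentioned above, the sketch below shows one common way to compute it: the forecast for a future time step is simply the most recently observed value. The write-up does not state which loss function or forecast horizon was used, so mean squared error and a one-step horizon are assumptions here, and the series is made up purely for illustration.

```python
def persistence_forecast(series, horizon=1):
    # Predict each future value as the value observed `horizon` steps earlier.
    if horizon < 1 or horizon >= len(series):
        raise ValueError("horizon must be between 1 and len(series) - 1")
    return series[:-horizon]


def mse(predictions, targets):
    # Mean squared error (assumed metric; the source does not name its loss).
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


def persistence_loss(series, horizon=1):
    targets = series[horizon:]
    return mse(persistence_forecast(series, horizon), targets)


# Made-up series, not the datathon data.
toy_series = [1.0, 2.0, 3.0, 4.0, 5.0]
baseline = persistence_loss(toy_series, horizon=1)

# Relative improvement implied by the two losses reported in the text.
relative_reduction = 1.0 - 0.131 / 0.485

print(baseline, round(relative_reduction, 3))
```

A model is only useful if its loss falls below this baseline on the same data and horizon, which is the comparison the reported figures make.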
What's that? Reinforcement Learning in the Real-world?
Reinforcement Learning offers a distinctive way of solving the Machine Learning puzzle. Its sequential decision-making ability and its suitability for tasks requiring a trade-off between immediate and long-term returns are some of the features that make it desirable in settings where supervised or unsupervised learning approaches would, in comparison, not fit as well. By having agents start with zero knowledge and then learn qualitatively good behaviour through interaction with the environment, it's almost fair to say Reinforcement Learning (RL) is the closest thing we have to Artificial General Intelligence yet. We can see RL being used in robotics control and treatment design in healthcare, among other areas; but why aren't we boasting of many RL agents being scaled up to real-world production systems? There's a reason why games, like Atari, are such nice RL benchmarks -- they let us care only about maximizing the score and not worry about designing a reward function.