Energy
Artificial intelligence helps prevent disruptions in fusion devices
Fusion devices called tokamaks run an increased risk of disruptions as researchers push against the operational limits of the facilities in an effort to maximize fusion power and recreate on Earth the fusion that powers the sun and stars. Scientists must therefore be able to boost fusion power without hitting those limits. This capability will be crucial for ITER, the large international tokamak under construction in France to demonstrate the practicality of fusion energy. Fusion reactions combine light elements in the form of plasma -- the hot, charged state of matter composed of free electrons and atomic nuclei that makes up 99 percent of the visible universe -- to generate massive amounts of energy. Scientists around the world are seeking to create fusion for a virtually inexhaustible supply of safe and clean power to generate electricity.
Search-Guided, Lightly-Supervised Training of Structured Prediction Energy Networks
Rooshenas, Amirmohammad, Zhang, Dongxu, Sharma, Gopal, McCallum, Andrew
In structured output prediction tasks, labeling ground-truth training output is often expensive. However, for many tasks, even when the true output is unknown, we can evaluate predictions using a scalar reward function, which may be easily assembled from human knowledge or non-differentiable pipelines. But searching through the entire output space to find the best output with respect to this reward function is typically intractable. In this paper, we instead use efficient truncated randomized search in this reward function to train structured prediction energy networks (SPENs), which provide efficient test-time inference using gradient-based search on a smooth, learned representation of the score landscape, and have previously yielded state-of-the-art results in structured prediction. In particular, this truncated randomized search in the reward function yields previously unknown local improvements, providing effective supervision to SPENs, avoiding their traditional need for labeled training data. Papers published at the Neural Information Processing Systems Conference.
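As a rough illustration of the truncated randomized search described in the abstract above, the sketch below flips random bits of a binary output until the reward improves or a step budget runs out. It is not the authors' implementation: the `reward` function, the hidden `target`, the single-bit-flip proposal, and the `max_steps` budget are all hypothetical choices made for this example.

```python
import random


def reward(y, target):
    """Hypothetical scalar reward: number of positions that match a hidden target."""
    return sum(int(a == b) for a, b in zip(y, target))


def truncated_random_search(y, reward_fn, max_steps=20, rng=None):
    """Try up to max_steps random single-bit flips of y.

    Returns the first candidate with strictly higher reward, or None if the
    budget is exhausted without finding an improvement.
    """
    rng = rng or random.Random(0)
    base = reward_fn(y)
    for _ in range(max_steps):
        i = rng.randrange(len(y))
        candidate = list(y)
        candidate[i] = 1 - candidate[i]  # flip one binary output variable
        if reward_fn(candidate) > base:
            return candidate
    return None


# Illustrative values only (assumed for this sketch).
target = [1, 0, 1, 1, 0, 1]
y_pred = [0, 0, 0, 1, 0, 0]
better = truncated_random_search(y_pred, lambda y: reward(y, target))
```

When an improvement is found, the pair (`y_pred`, `better`) is the kind of local improvement that could serve as supervision in place of a labeled output; how the pair enters the training loss is left out here.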
Differentiable Convex Optimization Layers
Agrawal, Akshay, Amos, Brandon, Barratt, Shane, Boyd, Stephen, Diamond, Steven, Kolter, J. Zico
Recent work has shown how to embed differentiable optimization problems (that is, problems whose solutions can be backpropagated through) as layers within deep learning architectures. This method provides a useful inductive bias for certain problems, but existing software for differentiable optimization layers is rigid and difficult to apply to new settings. In this paper, we propose an approach to differentiating through disciplined convex programs, a subclass of convex optimization problems used by domain-specific languages (DSLs) for convex optimization. We introduce disciplined parametrized programming, a subset of disciplined convex programming, and we show that every disciplined parametrized program can be represented as the composition of an affine map from parameters to problem data, a solver, and an affine map from the solver's solution to a solution of the original problem (a new form we refer to as affine-solver-affine form). We then demonstrate how to efficiently differentiate through each of these components, allowing for end-to-end analytical differentiation through the entire convex program.
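To make the idea of backpropagating through a solution concrete, here is a toy sketch that differentiates the minimizer of a one-dimensional convex problem with respect to a parameter. It uses the implicit function theorem on the optimality condition, which is a simpler stand-in and not the affine-solver-affine construction from the paper. The objective, the constants `a` and `b`, and all function names are assumptions chosen for illustration.

```python
def solve(theta, a=2.0, b=3.0):
    """Minimizer of f(x) = 0.5*(a*x - b)**2 + 0.5*theta*x**2, in closed form.

    The problem is convex for theta >= 0.
    """
    return a * b / (a * a + theta)


def grad_implicit(theta, a=2.0, b=3.0):
    """d x*/d theta from the optimality condition g(x, theta) = 0.

    g(x, theta) = a*(a*x - b) + theta*x, so by the implicit function theorem
    dx/dtheta = -(dg/dx)^(-1) * dg/dtheta.
    """
    x = solve(theta, a, b)
    dg_dx = a * a + theta
    dg_dtheta = x
    return -dg_dtheta / dg_dx


def grad_finite_diff(theta, eps=1e-6):
    """Central finite-difference check of the same derivative."""
    return (solve(theta + eps) - solve(theta - eps)) / (2 * eps)


theta = 1.0
analytic = grad_implicit(theta)
numeric = grad_finite_diff(theta)
```

The analytic and finite-difference values should agree closely, which is the basic sanity check one would also run on a real differentiable optimization layer.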
Seeing the Wind: Visual Wind Speed Prediction with a Coupled Convolutional and Recurrent Neural Network
Cardona, Jennifer, Howland, Michael, Dabiri, John
Wind energy resource quantification, air pollution monitoring, and weather forecasting all rely on rapid, accurate measurement of local wind conditions. Visual observations of the effects of wind---the swaying of trees and flapping of flags, for example---encode information regarding local wind conditions that can potentially be leveraged for visual anemometry that is inexpensive and ubiquitous. Here, we demonstrate a coupled convolutional neural network and recurrent neural network architecture that extracts the wind speed encoded in visually recorded flow-structure interactions of a flag and tree in naturally occurring wind. Predictions for wind speeds ranging from 0.75-11 m/s showed agreement with measurements from a cup anemometer on site, with a root-mean-squared error approaching the natural wind speed variability due to atmospheric turbulence. Generalizability of the network was demonstrated by successful prediction of wind speed based on recordings of other flags in the field and in a controlled wind tunnel test.
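The agreement metric mentioned in the abstract above, root-mean-squared error between predicted and measured wind speed, can be computed as in the short sketch below. The speed values are made up for illustration and are not data from the paper.

```python
import math


def rmse(predicted, measured):
    """Root-mean-squared error between two equal-length sequences."""
    if len(predicted) != len(measured) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - m) ** 2 for p, m in zip(predicted, measured)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical wind speeds in m/s (assumed values, not from the paper).
predicted = [2.0, 4.0, 6.0]
measured = [1.0, 4.0, 8.0]
error = rmse(predicted, measured)
```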
Graph Structured Prediction Energy Networks
Graber, Colin, Schwing, Alexander
For joint inference over multiple variables, a variety of structured prediction techniques have been developed to model correlations among variables and thereby improve predictions. However, many classical approaches suffer from one of two primary drawbacks: they either lack the ability to model high-order correlations among variables while maintaining computationally tractable inference, or they do not allow known correlations to be modeled explicitly. To address this shortcoming, we introduce 'Graph Structured Prediction Energy Networks,' for which we develop inference techniques that model both explicit local and implicit higher-order correlations while maintaining tractability of inference. We apply the proposed method to tasks from the natural language processing and computer vision domains and demonstrate its general utility. Papers published at the Neural Information Processing Systems Conference.
VisuoSpatial Foresight for Multi-Step, Multi-Task Fabric Manipulation
Hoque, Ryan, Seita, Daniel, Balakrishna, Ashwin, Ganapathi, Aditya, Tanwani, Ajay Kumar, Jamali, Nawid, Yamane, Katsu, Iba, Soshi, Goldberg, Ken
Robotic fabric manipulation has applications in cloth and cable management, senior care, surgery and more. Existing fabric manipulation techniques, however, are designed for specific tasks, making it difficult to generalize across different but related tasks. We address this problem by extending the recently proposed Visual Foresight framework to learn fabric dynamics, which can be efficiently reused to accomplish a variety of different fabric manipulation tasks with a single goal-conditioned policy. We introduce VisuoSpatial Foresight (VSF), which extends prior work by learning visual dynamics on domain randomized RGB images and depth maps simultaneously and completely in simulation. We experimentally evaluate VSF on multi-step fabric smoothing and folding tasks both in simulation and on the da Vinci Research Kit (dVRK) surgical robot without any demonstrations at train or test time. Furthermore, we find that leveraging depth significantly improves performance for cloth manipulation tasks, and results suggest that leveraging RGBD data for video prediction and planning yields an 80% improvement in fabric folding success rate over pure RGB data. Supplementary material is available at https://sites.google.com/view/fabric-vsf/.
Enhancing the Locality and Breaking the Memory Bottleneck of Transformer on Time Series Forecasting
Li, Shiyang, Jin, Xiaoyong, Xuan, Yao, Zhou, Xiyou, Chen, Wenhu, Wang, Yu-Xiang, Yan, Xifeng
Time series forecasting is an important problem across many domains, including prediction of solar plant energy output, electricity consumption, and traffic congestion. In this paper, we propose to tackle such forecasting problems with the Transformer. Although impressed by its performance in our preliminary study, we found two major weaknesses: (1) locality-agnostic attention: the point-wise dot-product self-attention in the canonical Transformer architecture is insensitive to local context, which can make the model prone to anomalies in time series; (2) memory bottleneck: the space complexity of the canonical Transformer grows quadratically with sequence length L, making direct modeling of long time series infeasible. To solve these two issues, we first propose convolutional self-attention, which produces queries and keys with causal convolution so that local context can be better incorporated into the attention mechanism. Then, we propose the LogSparse Transformer with only O(L(log L)^2) memory cost, improving forecasting accuracy for time series with fine granularity and strong long-term dependencies under a constrained memory budget.
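One way to picture the sparse attention pattern behind the O(L(log L)^2) figure is sketched below: each position attends to itself and to earlier positions at exponentially growing distances (1, 2, 4, ...), so it touches only O(log L) cells per layer. This is an illustrative reading of the abstract, not the authors' code. The function name `logsparse_indices` and the 0-based indexing are assumptions.

```python
import math


def logsparse_indices(t):
    """Positions that cell t attends to: itself plus t-1, t-2, t-4, ... (>= 0)."""
    indices = {t}
    step = 1
    while t - step >= 0:
        indices.add(t - step)
        step *= 2  # exponentially growing look-back distance
    return sorted(indices)


def attended_pairs(length):
    """Total attended (query, key) pairs in one layer for a sequence of the given length."""
    return sum(len(logsparse_indices(t)) for t in range(length))


pattern = logsparse_indices(8)  # [0, 4, 6, 7, 8]
full_causal = 64 * 65 // 2      # pairs used by dense causal attention at L = 64
sparse = attended_pairs(64)     # far fewer pairs under the sparse pattern
```

Under this pattern a single layer stores O(L log L) attention entries instead of O(L^2). Information from any earlier cell can still reach cell `t` after stacking O(log L) such layers, which is where the second log factor comes from.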
Intel to Release Neuromorphic-Computing System
Neuromorphic chips are expected to be the predominant computing architecture for new, advanced forms of artificial-intelligence deployments by 2025, according to technology research firm Gartner Inc. By that year, Gartner predicts, the technology is expected to displace graphics processing units, one of the main computer chips used for AI systems, especially neural networks. Neural networks are used in speech recognition and understanding, as well as computer vision. With neuromorphic computing, it is possible to train machine-learning models using a fraction of the data it takes to train them on traditional computing hardware. That means the models learn similarly to the way human babies learn, by seeing an image or toy once and being able to recognize it forever, said Mike Davies, director of Intel's Neuromorphic Computing Lab.
Why Truly Smart Future Tech Needs Radically New AI Chips - insideBIGDATA
Much distance remains between today's AI hardware and the technologies needed to enable smart cities and autonomous vehicles (AVs) to function smoothly. To bring new innovations to reality, AI chipmakers must concentrate on honing the building-block technologies needed to enable the truly smart AV and IoT systems of the future. That means building processors efficient and compact enough to compute and interpret reams of data in real-time. To achieve maximum benefit, these processors must be embedded within the edge devices themselves. Improving performance and speed while keeping compute cost and power demands low is the central challenge.