AITopics

arXiv.org Artificial Intelligence Jun-15-2020

Learn to Effectively Explore in Context-Based Meta-RL

Zhang, Jin, Wang, Jianhao, Hu, Hao, Chen, Yingfeng, Fan, Changjie, Zhang, Chongjie

Meta reinforcement learning (meta-RL) provides a principled approach for fast adaptation to novel tasks by extracting prior knowledge from previous tasks. Under such settings, it is crucial for the agent to perform efficient exploration during adaptation to collect useful experiences. However, existing methods suffer from poor adaptation performance caused by inefficient exploration mechanisms, especially in sparse-reward problems. In this paper, we present a novel off-policy context-based meta-RL approach that efficiently learns a separate exploration policy to support fast adaptation, as well as a context-aware exploitation policy to maximize extrinsic return. The explorer is motivated by an information-theoretical intrinsic reward that encourages the agent to collect experiences that provide rich information about the task. Experimental results on both MuJoCo and Meta-World benchmarks show that our method significantly outperforms baselines by exploring more efficiently.

deep learning, exploration, upstream oil & gas, (20 more...)

2006.0817

Genre: Research Report (0.50)

Industry:

Energy > Oil & Gas > Upstream (0.50)
Leisure & Entertainment > Games (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Cideron, Geoffrey, Pierrot, Thomas, Perrin, Nicolas, Beguir, Karim, Sigaud, Olivier

QD-RL: Efficient Mixing of Quality and Diversity in Reinforcement Learning

arXiv.org Artificial Intelligence Jun-15-2020

We propose a novel reinforcement learning algorithm, QD-RL, that incorporates the strengths of off-policy RL algorithms into Quality-Diversity (QD) approaches. Quality-Diversity methods contribute structural biases by decoupling the search for diversity from the search for high return, resulting in efficient management of the exploration-exploitation trade-off. However, these approaches generally suffer from sample inefficiency as they call upon evolutionary techniques. QD-RL removes this limitation by relying on off-policy RL algorithms. More precisely, we train a population of off-policy deep RL agents to simultaneously maximize diversity inside the population and the return of the agents. QD-RL selects agents from the diversity-return Pareto front, resulting in stable and efficient population updates. Our experiments on the Ant-Maze environment show that QD-RL can solve challenging exploration and control problems with deceptive rewards while being more than 15 times more sample efficient than its evolutionary counterparts.
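As a rough illustration of selecting agents from a diversity-return Pareto front, the sketch below keeps every agent that no other agent beats on both criteria. The dictionary layout, the field names `diversity` and `ret`, and the example scores are assumptions made for this sketch; the abstract does not describe how QD-RL represents or scores its population.

```python
def pareto_front(agents):
    """Return the agents that are not dominated on (diversity, ret).

    Agent b dominates agent a when b is at least as good on both
    criteria and strictly better on at least one of them.
    """
    front = []
    for a in agents:
        dominated = any(
            b["diversity"] >= a["diversity"]
            and b["ret"] >= a["ret"]
            and (b["diversity"] > a["diversity"] or b["ret"] > a["ret"])
            for b in agents
        )
        if not dominated:
            front.append(a)
    return front


# Hypothetical population: the scores are made up for illustration.
population = [
    {"name": "a", "diversity": 0.9, "ret": 1.0},
    {"name": "b", "diversity": 0.5, "ret": 3.0},
    {"name": "c", "diversity": 0.4, "ret": 2.0},  # beaten by "b" on both criteria
    {"name": "d", "diversity": 0.1, "ret": 4.0},
]
front_names = [agent["name"] for agent in pareto_front(population)]
```

Here agent "c" is dropped because "b" is better on both criteria, while "a", "b" and "d" each represent a different trade-off and stay on the front.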

artificial intelligence, diversity, upstream oil & gas, (18 more...)

2006.08505

Genre: Research Report > New Finding (0.68)

Industry:

Energy > Oil & Gas > Upstream (0.49)
Leisure & Entertainment > Games (0.46)
Education (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Deep covariate-learning: optimising information extraction from terrain texture for geostatistical modelling applications

Kirkwood, Charlie

Where data is available, it is desirable in geostatistical modelling to make use of additional covariates, for example terrain data, in order to improve prediction accuracy in the modelling task. While elevation itself may be important, additional explanatory power for any given problem can be sought (but not necessarily found) by filtering digital elevation models to extract higher-order derivatives such as slope angles, curvatures, and roughness. In essence, it would be beneficial to extract as much task-relevant information as possible from the elevation grid. However, given the complexities of the natural world, chance dictates that the use of 'off-the-shelf' filters is unlikely to derive covariates that provide strong explanatory power to the target variable at hand, and any attempt to manually design informative covariates is likely to be a trial-and-error process -- not optimal. In this paper we present a solution to this problem in the form of a deep learning approach to automatically deriving optimal task-specific terrain texture covariates from a standard SRTM 90m gridded digital elevation model (DEM). For our target variables we use point-sampled geochemical data from the British Geological Survey: concentrations of potassium, calcium and arsenic in stream sediments. We find that our deep learning approach produces covariates for geostatistical modelling that have surprisingly strong explanatory power on their own, with R-squared values around 0.6 for all three elements (with arsenic on the log scale). These results are achieved without the neural network being provided with easting, northing, or absolute elevation as inputs, and purely reflect the capacity of our deep neural network to extract task-specific information from terrain texture. We hope that these results will inspire further investigation into the capabilities of deep learning within geostatistical applications.
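For context, the kind of 'off-the-shelf' filter that the abstract contrasts with learned covariates can be sketched as a slope-angle calculation by central differences. This is not the paper's deep learning method; the function name, the list-of-lists grid layout, and the example grid are assumptions for illustration.

```python
import math


def slope_degrees(dem, cell_size):
    """Slope angle in degrees for each interior cell of an elevation grid.

    dem is a list of rows of elevations; cell_size is the grid spacing
    in the same units as the elevations. Border cells are skipped because
    central differences need a neighbour on each side.
    """
    rows, cols = len(dem), len(dem[0])
    slopes = [[0.0] * (cols - 2) for _ in range(rows - 2)]
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            dz_dx = (dem[i][j + 1] - dem[i][j - 1]) / (2 * cell_size)
            dz_dy = (dem[i + 1][j] - dem[i - 1][j]) / (2 * cell_size)
            gradient = math.hypot(dz_dx, dz_dy)
            slopes[i - 1][j - 1] = math.degrees(math.atan(gradient))
    return slopes


# Hypothetical 3x3 grid with 90 m spacing that rises 90 m per cell eastward.
dem = [
    [0.0, 90.0, 180.0],
    [0.0, 90.0, 180.0],
    [0.0, 90.0, 180.0],
]
slopes = slope_degrees(dem, cell_size=90.0)
```

A fixed filter like this produces the same covariate regardless of the target variable, which is the limitation the abstract argues a learned, task-specific covariate can avoid.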

deep learning, neural network, upstream oil & gas, (19 more...)

2005.11194

Country:

Europe > United Kingdom > England (0.68)
North America > United States (0.14)
North America > Canada > Ontario > Toronto (0.14)

Genre: Research Report (1.00)

Industry:

Materials > Metals & Mining (0.68)
Energy > Oil & Gas > Upstream (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Kirichenko, Polina, Izmailov, Pavel, Wilson, Andrew Gordon

Why Normalizing Flows Fail to Detect Out-of-Distribution Data

Detecting out-of-distribution (OOD) data is crucial for robust machine learning systems. Normalizing flows are flexible deep generative models that often surprisingly fail to distinguish between in- and out-of-distribution data: a flow trained on pictures of clothing assigns higher likelihood to handwritten digits. We investigate why normalizing flows perform poorly for OOD detection. We demonstrate that flows learn local pixel correlations and generic image-to-latent-space transformations which are not specific to the target image dataset. We show that by modifying the architecture of flow coupling layers we can bias the flow towards learning the semantic structure of the target data, improving OOD detection. Our investigation reveals that properties that enable flows to generate high-fidelity images can have a detrimental effect on OOD detection.

deep learning, likelihood, upstream oil & gas, (18 more...)

2006.08545

Genre: Research Report (0.64)

Industry: Energy > Oil & Gas > Upstream (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

Chowdhury, Sayeed Shafayet, Lee, Chankyu, Roy, Kaushik

Towards Understanding the Effect of Leak in Spiking Neural Networks

Over the past few years, the advancements of deep artificial neural networks (ANNs) have led to remarkable success in various cognitive tasks (e.g., vision, language and behavior). In some cases, neural networks have outperformed the conventional algorithms and achieved human-level performance [1, 2]. However, recent ANNs are becoming extremely compute-intensive and often do not generalize well to data not seen during training. On the other hand, the human brain can reliably learn and compute intricate cognitive tasks with only a few watts of power budget. Recently, Spiking Neural Networks (SNNs) have been explored toward realizing robust and energy-efficient machine intelligence guided by the cues from neuroscience experiments [3]. SNNs are categorized as the new generation of neural networks [4] based on their neuronal functionalities. A variety of spiking neuron models largely resemble biological neuronal mechanisms, which transmit information through discrete spatiotemporal events (or spikes). These spiking neuron models can be characterized by their internal state called the membrane potential. A spiking neuron integrates the inputs over time and fires a spike-output whenever the membrane potential exceeds a threshold.
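The integrate-and-fire behaviour described above can be sketched in a few lines. The multiplicative `leak` factor, the hard reset to zero after a spike, and the example inputs are assumptions of this sketch rather than details taken from the paper.

```python
def simulate_lif(inputs, leak=0.9, threshold=1.0):
    """Simulate one leaky integrate-and-fire neuron over discrete time steps.

    At each step the membrane potential decays by the factor `leak`,
    the input is added, and a spike (1) is emitted when the potential
    exceeds `threshold`. leak=1.0 means no leak at all.
    """
    potential = 0.0
    spikes = []
    for current in inputs:
        potential = leak * potential + current
        if potential > threshold:
            spikes.append(1)
            potential = 0.0  # hard reset (an assumption of this sketch)
        else:
            spikes.append(0)
    return spikes


# The same constant input with and without leak.
no_leak = simulate_lif([0.6, 0.6, 0.6, 0.6], leak=1.0)
with_leak = simulate_lif([0.6, 0.6, 0.6, 0.6], leak=0.5)
```

With no leak the neuron fires on every second step; with `leak=0.5` part of the accumulated potential is lost at each step, so the first spike arrives one step later.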

deep learning, neural network, neuron model, (19 more...)

2006.08761

Country:

North America > United States > Indiana > Tippecanoe County (0.14)
North America > United States > California (0.14)

Genre: Research Report (0.82)

Industry:

Energy > Oil & Gas (1.00)
Health & Medicine > Therapeutic Area > Neurology (0.89)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Wandel, Nils, Weinmann, Michael, Klein, Reinhard

Unsupervised Deep Learning of Incompressible Fluid Dynamics

Fast and stable fluid simulations are an essential prerequisite for applications ranging from computer-aided aerodynamic design of automobiles or airplanes to simulations of physical effects in CGI to research in meteorology. Recent differentiable fluid simulations allow gradient-based methods to optimize e.g. fluid control systems in an informed manner. Solving the partial differential equations that govern the dynamics of the underlying physical systems, however, is a challenging task and current numerical approximation schemes still come at high computational costs. In this work, we propose an unsupervised framework that allows powerful deep neural networks to learn the dynamics of incompressible fluids end to end on a grid-based representation. For this purpose, we introduce a loss function that penalizes residuals of the incompressible Navier-Stokes equations. After training, the framework yields models that are capable of fast and differentiable fluid simulations and can handle various fluid phenomena such as the Magnus effect and Kármán vortex streets. Besides demonstrating its real-time capability on a GPU, we exploit our approach in a control optimization scenario.
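To give a feel for a residual-based penalty on a grid, the sketch below computes the mean squared divergence of a 2D velocity field, which is zero for an incompressible flow. It covers only the incompressibility constraint, not the momentum equation, and it is not the paper's loss; the function name, the central-difference scheme, and the example fields are assumptions.

```python
def divergence_penalty(u, v, dx=1.0):
    """Mean squared divergence of a 2D velocity field on interior cells.

    u[i][j] and v[i][j] are the x- and y-velocity components at row i
    (y direction) and column j (x direction). The divergence
    du/dx + dv/dy is estimated with central differences.
    """
    rows, cols = len(u), len(u[0])
    total, count = 0.0, 0
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            du_dx = (u[i][j + 1] - u[i][j - 1]) / (2 * dx)
            dv_dy = (v[i + 1][j] - v[i - 1][j]) / (2 * dx)
            total += (du_dx + dv_dy) ** 2
            count += 1
    return total / count


n = 4
# Divergence-free field: u = x, v = -y.
u_free = [[float(j) for j in range(n)] for i in range(n)]
v_free = [[-float(i) for j in range(n)] for i in range(n)]
# Expanding field: u = x, v = y, with divergence 2 everywhere.
v_expanding = [[float(i) for j in range(n)] for i in range(n)]

penalty_free = divergence_penalty(u_free, v_free)
penalty_expanding = divergence_penalty(u_free, v_expanding)
```

In a training setting a term like this would be evaluated on the network's predicted velocity field, so no ground-truth simulation data is needed, which is what makes the approach unsupervised.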

deep learning, neural network, upstream oil & gas, (19 more...)

2006.08762

Country:

North America > United States (0.14)
Africa > Ethiopia (0.14)

Genre: Research Report (0.50)

Industry:

Energy > Oil & Gas > Upstream (1.00)
Transportation (0.73)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Sivaraman, Aishwarya, Farnadi, Golnoosh, Millstein, Todd, Broeck, Guy Van den

Counterexample-Guided Learning of Monotonic Neural Networks

The widespread adoption of deep learning is often attributed to its automatic feature construction with minimal inductive bias. However, in many real-world tasks, the learned function is intended to satisfy domain-specific constraints. We focus on monotonicity constraints, which are common and require that the function's output increases with increasing values of specific input features. We develop a counterexample-guided technique to provably enforce monotonicity constraints at prediction time. Additionally, we propose a technique to use monotonicity as an inductive bias for deep learning. It works by iteratively incorporating monotonicity counterexamples in the learning process. Contrary to prior work in monotonic learning, we target general ReLU neural networks and do not further restrict the hypothesis space. We have implemented these techniques in a tool called COMET. Experiments on real-world datasets demonstrate that our approach achieves state-of-the-art results compared to existing monotonic learners, and can improve the model quality compared to those that were trained without taking monotonicity constraints into account.
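The idea of a monotonicity counterexample can be illustrated with a simple sampled check: a pair of inputs that differ only by an increase in one feature, where the output nevertheless decreases. The abstract does not say how COMET finds counterexamples, so the point-by-point search below is a stand-in that gives no guarantee; the function name and the toy model are hypothetical.

```python
def find_counterexample(f, points, feature, step):
    """Search sampled points for a violation of monotonicity in one feature.

    Returns a pair (x, x_up) where x_up equals x except that the chosen
    feature is increased by `step` and f(x_up) < f(x), or None if no
    sampled point violates monotonicity. Finding nothing proves nothing
    about unsampled inputs.
    """
    for x in points:
        x_up = list(x)
        x_up[feature] += step
        if f(x_up) < f(x):
            return (list(x), x_up)
    return None


def toy_model(x):
    # Hypothetical model: increasing in feature 0, decreasing in feature 1.
    return x[0] - x[1]


samples = [[0.0, 0.0], [1.0, 2.0]]
check_feature_0 = find_counterexample(toy_model, samples, feature=0, step=1.0)
check_feature_1 = find_counterexample(toy_model, samples, feature=1, step=1.0)
```

A counterexample found this way could be added to the training data with a corrected label, which is the general shape of the iterative scheme the abstract describes.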

artificial intelligence, deep learning, machine learning, (17 more...)

2006.08852

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > Canada > Quebec > Montreal (0.04)

Genre: Research Report (0.50)

Industry:

Information Technology (0.46)
Energy > Power Industry (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Rasouli, Mohammad, Sun, Tao, Rajagopal, Ram

FedGAN: Federated Generative Adversarial Networks for Distributed Data

We propose Federated Generative Adversarial Network (FedGAN) for training a GAN across distributed sources of non-independent-and-identically-distributed data subject to communication and privacy constraints. Our algorithm uses local generators and discriminators which are periodically synced via an intermediary that averages and broadcasts the generator and discriminator parameters. We theoretically prove the convergence of FedGAN with both equal and two time-scale updates of generator and discriminator, under standard assumptions, using stochastic approximations and communication-efficient stochastic gradient descent. We experiment with FedGAN on toy examples (2D system, mixed Gaussian, and Swiss roll), image datasets (MNIST, CIFAR-10, and CelebA), and time series datasets (household electricity consumption and electric vehicle charging sessions). We show that FedGAN converges and has similar performance to a general distributed GAN, while reducing communication complexity. We also show its robustness to reduced communications.
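The synchronization step, in which an intermediary averages and broadcasts the generator and discriminator parameters, can be sketched as follows. Flat lists of floats as parameters, equal weighting of all agents, and the names `average_parameters` and `sync` are assumptions of this sketch; the abstract does not specify the weighting or the parameter layout.

```python
def average_parameters(local_params):
    """Element-wise mean of several equally long parameter vectors."""
    n = len(local_params)
    return [sum(values) / n for values in zip(*local_params)]


def sync(agents):
    """Average generator and discriminator parameters and broadcast them.

    Each agent is a dict with "generator" and "discriminator" lists.
    Every agent receives its own copy of the averaged parameters.
    """
    for key in ("generator", "discriminator"):
        averaged = average_parameters([agent[key] for agent in agents])
        for agent in agents:
            agent[key] = list(averaged)


# Two hypothetical agents after some local training steps.
agents = [
    {"generator": [1.0, 2.0], "discriminator": [0.0]},
    {"generator": [3.0, 4.0], "discriminator": [1.0]},
]
sync(agents)
```

Between synchronizations each agent would train on its own local data only, so raw data never leaves its source and communication is limited to the periodic parameter exchange.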

artificial intelligence, fedgan, machine learning, (15 more...)

2006.07228

Country:

North America > United States > California > Santa Clara County > Stanford (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)

Genre: Research Report (0.82)

Industry:

Transportation > Ground > Road (1.00)
Transportation > Electric Vehicle (1.00)
Energy > Power Industry (1.00)
Automobiles & Trucks (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.69)

Vass, Johannes, Lackner, Marie-Louise, Musliu, Nysret

Exact and Metaheuristic Approaches for the Production Leveling Problem

arXiv.org Artificial Intelligence Jun-15-2020

In this paper we introduce a new problem in the field of production planning which we call the Production Leveling Problem. The task is to assign orders to production periods such that the load in each period and on each production resource is balanced, capacity limits are not exceeded and the orders' priorities are taken into account. Production Leveling is an important intermediate step between long-term planning and the final scheduling of orders within a production period, as it is responsible for selecting good subsets of orders to be scheduled within each period. A formal model of the problem is proposed and NP-hardness is shown by reduction from Bin Packing. As an exact method for solving moderately sized instances we introduce a MIP formulation. For solving large problem instances, metaheuristic local search is investigated. A greedy heuristic and two neighborhood structures for local search are proposed, in order to apply them using Variable Neighborhood Descent and Simulated Annealing. Regarding exact techniques, the main research question is up to which size instances are solvable within a fixed amount of time. For the metaheuristic approaches the aim is to show that they produce near-optimal solutions for smaller instances, but also scale well to very large instances. A set of realistic problem instances from an industrial partner is contributed to the literature, as well as random instance generators. The experimental evaluation shows that the proposed MIP model works well for instances with up to 250 orders. Out of the investigated metaheuristic approaches, Simulated Annealing achieves the best results. It is shown to produce solutions with less than 3% average optimality gap on small instances and to scale well up to thousands of orders and dozens of periods and products. The presented metaheuristic methods are already being used in the industry.
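A greedy assignment of orders to periods, of the general kind that could seed a local search, might look like the sketch below. It is a simplified illustration and not the heuristic from the paper: it uses a single resource, one shared capacity per period, and the hypothetical function `greedy_level` with orders given as (name, size, priority) tuples.

```python
def greedy_level(orders, n_periods, capacity):
    """Assign orders to periods, keeping period loads balanced.

    Orders are handled from highest to lowest priority; each one goes to
    the currently least-loaded period. An order that would exceed the
    capacity of that period is left unassigned in this sketch.
    Returns (assignment, loads).
    """
    loads = [0] * n_periods
    assignment = {}
    for name, size, priority in sorted(orders, key=lambda order: -order[2]):
        period = min(range(n_periods), key=lambda p: loads[p])
        if loads[period] + size > capacity:
            continue  # does not fit even in the least-loaded period
        loads[period] += size
        assignment[name] = period
    return assignment, loads


# Hypothetical instance: four orders, two periods, capacity 6 per period.
orders = [("o1", 4, 3), ("o2", 3, 2), ("o3", 2, 1), ("o4", 1, 1)]
assignment, loads = greedy_level(orders, n_periods=2, capacity=6)
```

A metaheuristic such as Simulated Annealing would then start from a solution like this and try moves, for example shifting one order to another period, to improve the balance further.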

artificial intelligence, machine learning, optimization problem, (16 more...)

2006.08731

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > Oregon > Multnomah County > Portland (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
(3 more...)

Genre:

Research Report > New Finding (0.67)
Research Report > Experimental Study (0.46)

Industry: Energy (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)