AITopics

arXiv.org Artificial Intelligence, Jul-15-2020

Active World Model Learning with Progress Curiosity

Kim, Kuno, Sano, Megumi, De Freitas, Julian, Haber, Nick, Yamins, Daniel

World models are self-supervised predictive models of how the world evolves. Humans learn world models by curiously exploring their environment, in the process acquiring compact abstractions of high bandwidth sensory inputs, the ability to plan across long temporal horizons, and an understanding of the behavioral patterns of other agents. In this work, we study how to design such a curiosity-driven Active World Model Learning (AWML) system. To do so, we construct a curious agent building world models while visually exploring a 3D physical environment rich with distillations of representative real-world agents. We propose an AWML system driven by $\gamma$-Progress: a scalable and effective learning progress-based curiosity signal. We show that $\gamma$-Progress naturally gives rise to an exploration policy that directs attention to complex but learnable dynamics in a balanced manner, thus overcoming the "white noise problem". As a result, our $\gamma$-Progress-driven controller achieves significantly higher AWML performance than baseline controllers equipped with state-of-the-art exploration strategies such as Random Network Distillation and Model Disagreement.
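As a rough illustration of a learning-progress curiosity signal, the sketch below rewards the agent by how much a freshly updated model outperforms a lagging copy of itself on the same sample. The abstract does not give the definition of $\gamma$-Progress, so the exponential-moving-average "old" model with mixing rate `gamma`, the scalar linear model, and all function names here are assumptions made for illustration only.

```python
def squared_error(w, x, y):
    """Prediction loss of the toy world model y_hat = w * x."""
    return (w * x - y) ** 2


def progress_step(w_new, w_old, x, y, gamma=0.9, lr=0.1):
    """One step of a hypothetical progress-driven update.

    Returns (reward, updated w_new, updated w_old).
    """
    # Curiosity reward: loss of the lagging model minus loss of the current model.
    reward = squared_error(w_old, x, y) - squared_error(w_new, x, y)
    # Plain gradient step on the current model.
    grad = 2.0 * (w_new * x - y) * x
    w_new = w_new - lr * grad
    # Assumed rule: the old model trails the new one as a moving average.
    w_old = gamma * w_old + (1.0 - gamma) * w_new
    return reward, w_new, w_old


def run(samples, gamma=0.9, lr=0.1):
    """Collect the reward sequence over a list of (x, y) samples."""
    w_new, w_old, rewards = 0.0, 0.0, []
    for x, y in samples:
        reward, w_new, w_old = progress_step(w_new, w_old, x, y, gamma, lr)
        rewards.append(reward)
    return rewards
```

On a learnable target such as `run([(1.0, 2.0)] * 3)`, the reward is zero on the first step (both models are identical) and positive afterwards, because the current model pulls ahead of its lagging copy. On targets that are pure noise, one would expect neither model to keep improving, so a signal of this kind should stay near zero on average.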

agent, deep learning, upstream oil & gas, (18 more...)

arXiv.org Artificial Intelligence

2007.07853

Country: North America > United States (0.46)

Genre: Research Report > Experimental Study (0.46)

Industry:

Education (0.93)
Health & Medicine > Therapeutic Area > Neurology (0.68)
Energy > Oil & Gas > Upstream (0.66)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Siahkoohi, Ali, Rizzuti, Gabrio, Witte, Philipp A., Herrmann, Felix J.

Faster Uncertainty Quantification for Inverse Problems with Conditional Normalizing Flows

arXiv.org Machine Learning, Jul-15-2020

In inverse problems, we often have access to data consisting of paired samples $(x,y)\sim p_{X,Y}(x,y)$ where $y$ are partial observations of a physical system, and $x$ represents the unknowns of the problem. Under these circumstances, we can employ supervised training to learn a solution $x$ and its uncertainty from the observations $y$. We refer to this problem as the "supervised" case. However, the data $y\sim p_{Y}(y)$ collected at one point could be distributed differently than observations $y'\sim p_{Y}'(y')$, relevant for a current set of problems. In the context of Bayesian inference, we propose a two-step scheme, which makes use of normalizing flows and joint data to train a conditional generator $q_{\theta}(x|y)$ to approximate the target posterior density $p_{X|Y}(x|y)$. Additionally, this preliminary phase provides a density function $q_{\theta}(x|y)$, which can be recast as a prior for the "unsupervised" problem, e.g., when only the observations $y'\sim p_{Y}'(y')$, a likelihood model $y'|x$, and a prior on $x'$ are known. We then train another invertible generator with output density $q'_{\phi}(x|y')$ specifically for $y'$, allowing us to sample from the posterior $p_{X|Y}'(x|y')$. We present some synthetic results that demonstrate considerable training speedup when reusing the pretrained network $q_{\theta}(x|y')$ as a warm start or preconditioning for approximating $p_{X|Y}'(x|y')$, instead of learning from scratch. This training modality can be interpreted as an instance of transfer learning. This result is particularly relevant for large-scale inverse problems that employ expensive numerical simulations.
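To make the idea of a conditional generator with a tractable density concrete, here is a minimal sketch of a one-dimensional conditional affine flow, the simplest invertible map of this kind. It is not the architecture used in the paper: the linear conditioning `w * y + b`, the parameter layout, and the `warm_start` helper are all hypothetical and chosen only to show the change-of-variables density and the reuse of pretrained parameters.

```python
import math
import random


def log_density(x, y, params):
    """log q(x|y) for the invertible map x = (w*y + b) + exp(log_sigma) * z, z ~ N(0, 1)."""
    w, b, log_sigma = params
    z = (x - (w * y + b)) / math.exp(log_sigma)
    # Change of variables: log N(z; 0, 1) minus log|dx/dz|, where log|dx/dz| = log_sigma.
    return -0.5 * z * z - 0.5 * math.log(2.0 * math.pi) - log_sigma


def sample(y, params, rng):
    """Draw x ~ q(x|y) by pushing a standard normal sample through the map."""
    w, b, log_sigma = params
    return w * y + b + math.exp(log_sigma) * rng.gauss(0.0, 1.0)


def warm_start(theta):
    """Initialize the second generator's parameters from the pretrained ones (a copy)."""
    return list(theta)
```

In this toy setting, `phi = warm_start(theta)` would be the starting point for fitting to the new observations $y'$, rather than a random initialization.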

deep learning, inverse problem, upstream oil & gas, (19 more...)

2007.07985

Genre: Research Report (0.41)

Industry: Energy > Oil & Gas > Upstream (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.35)

#artificialintelligence, Jul-14-2020, 18:17:21 GMT

Breakthrough ML Approach Produces 50X Higher-Resolution Climate Data – IAM Network

Researchers at the US Department of Energy's (DOE's) National Renewable Energy Laboratory (NREL) have developed a novel machine learning approach to quickly enhance the resolution of wind velocity data by 50 times and solar irradiance data by 25 times--an enhancement that has never been achieved before with climate data. The researchers took an alternative approach by using adversarial training, in which the model produces physically realistic details by observing entire fields at a time, providing high-resolution climate data at a much faster rate. This approach will enable scientists to complete renewable energy studies in future climate scenarios faster and with more accuracy. "To be able to enhance the spatial and temporal resolution of climate forecasts hugely impacts not only energy planning, but agriculture, transportation, and so much more," said Ryan King, a senior computational scientist at NREL who specializes in physics-informed deep learning. King and NREL colleagues Karen Stengel, Andrew Glaws, and Dylan Hettinger authored a new article detailing their approach, titled "Adversarial super-resolution of climatological wind and solar data," which appears in the journal Proceedings of the National Academy of Sciences of the United States …

artificial intelligence, iam network, machine learning, (4 more...)

#artificialintelligence

Country: North America > United States (1.00)

Industry:

Government > Regional Government > North America Government > United States Government (1.00)
Energy > Renewable (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

#artificialintelligence, Jul-14-2020, 16:38:38 GMT

AI Being Applied to Improve Health, Better Predict Life of Batteries - AI Trends

Researchers are applying AI techniques to extend the life and monitor the health of batteries, with the aim of powering the next generation of electric vehicles and consumer electronics. Researchers at Cambridge and Newcastle Universities have designed a machine learning method that can predict battery health with ten times the accuracy of the current industry standard, according to an account in ScienceDaily. The promise is to develop safer and more reliable batteries. In their new approach to monitoring batteries, the researchers sent electrical pulses into them and monitored the response. The measurements were then processed by a machine learning algorithm to predict the battery's health and useful life.

artificial intelligence, battery, machine learning, (14 more...)

#artificialintelligence

Country: Europe > France > Auvergne-Rhône-Alpes > Isère > Grenoble (0.05)

Genre: Research Report > New Finding (0.33)

Industry:

Energy > Energy Storage (1.00)
Electrical Industrial Apparatus (1.00)
Transportation > Ground > Road (0.62)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Khurjekar, Ishan D., Harley, Joel B.

Uncertainty Aware Deep Neural Network for Multistatic Localization with Application to Ultrasonic Structural Health Monitoring

arXiv.org Artificial Intelligence, Jul-14-2020

Guided ultrasonic wave localization uses spatially distributed multistatic sensor arrays and generalized beamforming strategies to detect and locate damage across a structure. The propagation channel is often very complex. Methods can compare data with models of wave propagation to locate damage. Yet, environmental uncertainty (e.g., temperature or stress variations) often degrades accuracy. This paper uses an uncertainty-aware deep neural network framework to learn robust localization models and represent uncertainty. We use mixture density networks to generate damage location distributions based on training data uncertainty. This is in contrast with most localization methods, which output point estimates. We compare our approach with matched field processing (MFP), a generalized beamforming framework. The proposed approach achieves a localization error of 0.0625 m as compared to 0.1425 m with MFP when data has environmental uncertainty and noise. We also show that the predictive uncertainty scales as environmental uncertainty increases to provide a statistically meaningful metric for assessing localization accuracy.
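The difference between a point estimate and a mixture density output can be sketched as follows. A mixture density network typically emits mixture weights, means, and standard deviations, and is trained by minimizing the negative log-likelihood of the true location under that mixture. The one-dimensional Gaussian mixture and the function names below are illustrative assumptions, not the authors' network or coordinate system.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a 1-D normal distribution (std must be positive)."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def mixture_nll(x, weights, means, stds):
    """Negative log-likelihood of location x under a Gaussian mixture.

    The weights are assumed to be non-negative and to sum to one.
    """
    density = sum(w * gaussian_pdf(x, m, s) for w, m, s in zip(weights, means, stds))
    return -math.log(density)


def mixture_mean_and_std(weights, means, stds):
    """Point estimate (mean) and predictive spread (std) of the mixture."""
    mean = sum(w * m for w, m in zip(weights, means))
    second_moment = sum(w * (s * s + m * m) for w, m, s in zip(weights, means, stds))
    return mean, math.sqrt(second_moment - mean * mean)
```

The spread returned by `mixture_mean_and_std` is the kind of quantity that can grow when the inputs become more uncertain, which a single point estimate cannot express.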

artificial intelligence, localization, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2007.06814

Country: North America > United States (1.00)

Genre: Research Report (0.50)

Industry:

Energy > Oil & Gas > Upstream (0.69)
Health & Medicine > Consumer Health (0.65)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Curi, Sebastian, Berkenkamp, Felix, Krause, Andreas

Efficient Model-Based Reinforcement Learning through Optimistic Policy Search and Planning

Model-based reinforcement learning algorithms with probabilistic dynamical models are amongst the most data-efficient learning methods. This is often attributed to their ability to distinguish between epistemic and aleatoric uncertainty. However, while most algorithms distinguish these two uncertainties for learning the model, they ignore the distinction when optimizing the policy. In this paper, we show that ignoring the epistemic uncertainty leads to greedy algorithms that do not explore sufficiently. In turn, we propose a practical optimistic-exploration algorithm, which enlarges the input space with hallucinated inputs that can exert as much control as the epistemic uncertainty in the model affords. We analyze this setting and construct a general regret bound for well-calibrated models, which is provably sublinear in the case of Gaussian Process models. Based on this theoretical foundation, we show how optimistic exploration can be easily combined with state-of-the-art reinforcement learning algorithms and different probabilistic models. Our experiments demonstrate that optimistic exploration significantly speeds up learning when there are penalties on actions, a setting that is notoriously difficult for existing model-based reinforcement learning algorithms.
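One way to read "hallucinated inputs that can exert as much control as the epistemic uncertainty affords" is sketched below for a scalar state. An extra input `eta` in [-1, 1] shifts the predicted next state within a band whose width is set by the model's epistemic standard deviation, and the planner picks `eta` optimistically. The scaling factor `beta`, the grid search, and the function names are assumptions for illustration, not the paper's implementation.

```python
def hallucinated_next_state(mean, epistemic_std, eta, beta=1.0):
    """Next state chosen inside the model's confidence band by the extra input eta."""
    eta = max(-1.0, min(1.0, eta))  # the hallucinated input is bounded
    return mean + beta * epistemic_std * eta


def optimistic_eta(mean, epistemic_std, reward_fn, beta=1.0, grid=21):
    """Pick the hallucinated input that maximizes reward of the resulting state."""
    candidates = [-1.0 + 2.0 * i / (grid - 1) for i in range(grid)]
    return max(
        candidates,
        key=lambda eta: reward_fn(hallucinated_next_state(mean, epistemic_std, eta, beta)),
    )
```

When `epistemic_std` is zero the hallucinated input has no effect, so the optimism vanishes as the model becomes certain. For example, with a goal at state 1.0 and `reward_fn = lambda s: -abs(s - 1.0)`, a prediction of 0.5 with standard deviation 0.2 leads the planner to choose `eta = 1.0`, the most favorable state the model still considers plausible.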

algorithm, deep learning, upstream oil & gas, (18 more...)

2006.08684

Country:

Asia (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > Massachusetts > Middlesex County (0.14)
(3 more...)

Genre: Research Report (1.00)

Industry:

Leisure & Entertainment (1.00)
Media > Television (0.93)
Energy > Oil & Gas > Upstream (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Tarbouriech, Jean, Pirotta, Matteo, Valko, Michal, Lazaric, Alessandro

A Provably Efficient Sample Collection Strategy for Reinforcement Learning

A common assumption in reinforcement learning (RL) is to have access to a generative model (i.e., a simulator of the environment), which allows generating samples from any desired state-action pair. Nonetheless, in many settings a generative model may not be available and an adaptive exploration strategy is needed to efficiently collect samples from an unknown environment by direct interaction. In this paper, we study the scenario where an algorithm based on the generative model assumption defines the (possibly time-varying) number of samples $b(s,a)$ required at each state-action pair $(s,a)$ and an exploration strategy has to learn how to generate $b(s,a)$ samples as fast as possible. Building on recent results for regret minimization in the stochastic shortest path (SSP) setting (Cohen et al., 2020; Tarbouriech et al., 2020), we derive an algorithm that requires $\tilde{O}( B D + D^{3/2} S^2 A)$ time steps to collect the $B = \sum_{s,a} b(s,a)$ desired samples, in any unknown and communicating MDP with $S$ states, $A$ actions and diameter $D$. Leveraging the generality of our strategy, we readily apply it to a variety of existing settings (e.g., model estimation, pure exploration in MDPs) for which we obtain improved sample-complexity guarantees, and to a set of new problems such as best-state identification and sparse reward discovery.
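To get a feel for how the two terms of the bound trade off, the following sketch evaluates $B D + D^{3/2} S^2 A$ for a given requirement function. The $\tilde{O}$ notation hides constants and logarithmic factors, so the returned number is only an order-of-magnitude indicator, and the dictionary representation of $b(s,a)$ and the function names are hypothetical.

```python
def total_samples(requirements):
    """B: the sum of b(s, a) over all state-action pairs."""
    return sum(requirements.values())


def bound_order(requirements, diameter, num_states, num_actions):
    """Evaluate B*D + D^(3/2) * S^2 * A, ignoring constants and log factors."""
    collection_term = total_samples(requirements) * diameter
    exploration_term = diameter ** 1.5 * num_states ** 2 * num_actions
    return collection_term + exploration_term
```

For instance, with `requirements = {(0, 0): 2, (0, 1): 3}`, diameter 4, 2 states, and 2 actions, the first term is 20 and the second is 64. The second term does not depend on $B$, so as the number of requested samples grows the first term dominates and the cost per sample is of order $D$.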

optimization problem, requirement, upstream oil & gas, (19 more...)

2007.06437

Country:

North America > United States (0.28)
Europe > United Kingdom > England (0.14)

Genre:

Research Report (0.82)
Instructional Material > Course Syllabus & Notes (0.34)

Industry: Energy > Oil & Gas > Upstream (0.54)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.85)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.67)

Abeille, Marc, Lazaric, Alessandro

Efficient Optimistic Exploration in Linear-Quadratic Regulators via Lagrangian Relaxation

We study the exploration-exploitation dilemma in the linear quadratic regulator (LQR) setting. Inspired by the extended value iteration algorithm used in optimistic algorithms for finite MDPs, we propose to relax the optimistic optimization of OFU-LQ and cast it into a constrained extended LQR problem, where an additional control variable implicitly selects the system dynamics within a confidence interval. We then move to the corresponding Lagrangian formulation for which we prove strong duality. As a result, we show that an $\epsilon$-optimistic controller can be computed efficiently by solving at most $O\big(\log(1/\epsilon)\big)$ Riccati equations. Finally, we prove that relaxing the original OFU problem does not impact the learning performance, thus recovering the $\tilde{O}(\sqrt{T})$ regret of OFU-LQ. To the best of our knowledge, this is the first computationally efficient confidence-based algorithm for LQR with worst-case optimal regret guarantees.
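For readers unfamiliar with the basic building block, the sketch below solves a single discrete-time Riccati equation for a scalar system $x_{t+1} = a x_t + b u_t$ with stage cost $q x_t^2 + r u_t^2$, using plain fixed-point iteration. It illustrates one Riccati solve only; it is not the extended LQR or the Lagrangian procedure from the paper, and the function names and the choice of solver are assumptions.

```python
def solve_scalar_dare(a, b, q, r, tol=1e-12, max_iter=10000):
    """Iterate p <- q + a^2 p - (a b p)^2 / (r + b^2 p) until it stops changing."""
    p = q
    for _ in range(max_iter):
        p_next = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
        if abs(p_next - p) < tol:
            return p_next
        p = p_next
    raise RuntimeError("Riccati iteration did not converge")


def lqr_gain(a, b, r, p):
    """Optimal feedback gain k for the control law u = -k * x."""
    return a * b * p / (r + b * b * p)
```

With `a = b = q = r = 1.0`, the fixed point satisfies $p^2 - p - 1 = 0$, so `solve_scalar_dare` converges to the golden ratio and the resulting gain is its reciprocal. For matrix-valued systems a library routine such as SciPy's `scipy.linalg.solve_discrete_are` would be the usual choice.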

constraint-based reasoning, efficient optimistic exploration, upstream oil & gas, (17 more...)

2007.06482

Country:

Europe > Sweden (0.14)
Europe > Austria > Vienna (0.14)

Genre: Research Report (0.64)

Industry: Energy > Oil & Gas > Upstream (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)

Araújo, João Pedro, Figueiredo, Mário, Botto, Miguel Ayala

Single-partition adaptive Q-learning

This paper introduces single-partition adaptive Q-learning (SPAQL), an algorithm for model-free episodic reinforcement learning (RL), which adaptively partitions the state-action space of a Markov decision process (MDP), while simultaneously learning a time-invariant policy (i.e., the mapping from states to actions does not depend explicitly on the episode time step) for maximizing the cumulative reward. The trade-off between exploration and exploitation is handled by using a mixture of upper confidence bounds (UCB) and Boltzmann exploration during training, with a temperature parameter that is automatically tuned as training progresses. The algorithm is an improvement over adaptive Q-learning (AQL). It converges faster to the optimal solution, while also using fewer arms. Tests on episodes with a large number of time steps show that SPAQL has no problems scaling, unlike AQL. Based on this empirical evidence, we claim that SPAQL may have a higher sample efficiency than AQL, thus being a relevant contribution to the field of efficient model-free RL methods.
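One plausible way to combine upper confidence bounds with Boltzmann exploration is to sample actions from a softmax over their UCB values, with the temperature controlling how greedy the choice is. The abstract does not specify SPAQL's exact rule or its temperature schedule, so the sketch below, including its function names and the convention that a non-positive temperature means greedy selection, is an assumption for illustration.

```python
import math
import random


def boltzmann_probs(values, temperature):
    """Softmax over values at a positive temperature (shifted by the max for stability)."""
    top = max(values)
    exps = [math.exp((v - top) / temperature) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def select_action(ucb_values, temperature, rng):
    """Sample an action index from the softmax over UCB values.

    A non-positive temperature falls back to the greedy (argmax) choice.
    """
    if temperature <= 0.0:
        return max(range(len(ucb_values)), key=lambda i: ucb_values[i])
    probs = boltzmann_probs(ucb_values, temperature)
    return rng.choices(range(len(probs)), weights=probs)[0]
```

A high temperature spreads probability almost uniformly across actions, while lowering it toward zero concentrates the choice on the action with the largest upper confidence bound. A call might look like `select_action([0.2, 1.5, 0.9], 0.5, random.Random(0))`.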

agent, artificial intelligence, upstream oil & gas, (20 more...)

2007.06741

Country:

Europe > Portugal (0.14)
Europe > France (0.14)
North America > United States > New York > New York County > New York City (0.14)
(3 more...)

Genre: Research Report (0.83)

Industry: Energy > Oil & Gas > Upstream (0.88)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)