Energy
Distribution Regression for Sequential Data
Lemercier, Maud, Salvi, Cristopher, Damoulas, Theodoros, Bonilla, Edwin V., Lyons, Terry
Distribution regression refers to the supervised learning problem where labels are only available for groups of inputs instead of individual inputs. In this paper, we develop a rigorous mathematical framework for distribution regression where inputs are complex data streams. Leveraging properties of the expected signature and a recent signature kernel trick for sequential data from stochastic analysis, we introduce two new learning techniques, one feature-based and the other kernel-based. Each is suited to a different data regime in terms of the number of data streams and the dimensionality of the individual streams. We provide theoretical results on the universality of both approaches and demonstrate empirically their robustness to irregularly sampled multivariate time-series, achieving state-of-the-art performance on both synthetic and real-world examples from thermodynamics, mathematical finance and agricultural science.
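The feature-based idea can be illustrated with a minimal sketch, assuming piecewise-linear interpolation of each stream and a signature truncated at level 2. The function names `signature_level2` and `expected_signature_features` are hypothetical and are not taken from the paper; the truncation level and the plain group average are simplifying assumptions.

```python
import numpy as np

def signature_level2(path):
    """Signature of a piecewise-linear path, truncated at level 2.

    path: array-like of shape (length, dim).
    Returns (level1, level2) with shapes (dim,) and (dim, dim).
    """
    path = np.asarray(path, dtype=float)
    increments = np.diff(path, axis=0)
    dim = path.shape[1]
    level1 = np.zeros(dim)
    level2 = np.zeros((dim, dim))
    for delta in increments:
        # Chen's relation for appending one linear segment:
        # the new level-2 term is (old level 1) x delta + delta x delta / 2.
        level2 += np.outer(level1, delta) + 0.5 * np.outer(delta, delta)
        level1 += delta
    return level1, level2

def expected_signature_features(streams):
    """Average the truncated signatures over one group of streams.

    Streams may have different lengths, since each is summarized
    by a fixed-size signature before averaging.
    """
    feats = []
    for stream in streams:
        s1, s2 = signature_level2(stream)
        feats.append(np.concatenate([s1, s2.ravel()]))
    return np.mean(feats, axis=0)

# One group containing two 2-dimensional streams of different lengths.
group = [[[0, 0], [1, 0]], [[0, 0], [0, 1], [1, 1]]]
features = expected_signature_features(group)
```

The resulting vector (one per group) could then be passed to any standard regressor against the group-level label; that downstream step is left out here.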
An Asymptotically Optimal Primal-Dual Incremental Algorithm for Contextual Linear Bandits
Tirinzoni, Andrea, Pirotta, Matteo, Restelli, Marcello, Lazaric, Alessandro
In the contextual linear bandit setting, algorithms built on the optimism principle fail to exploit the structure of the problem and have been shown to be asymptotically suboptimal. In this paper, we follow recent approaches of deriving asymptotically optimal algorithms from problem-dependent regret lower bounds and we introduce a novel algorithm improving over the state-of-the-art along multiple dimensions. We build on a reformulation of the lower bound, where context distribution and exploration policy are decoupled, and we obtain an algorithm robust to unbalanced context distributions. Then, using an incremental primal-dual approach to solve the Lagrangian relaxation of the lower bound, we obtain a scalable and computationally efficient algorithm. Finally, we remove forced exploration and build on confidence intervals of the optimization problem to encourage a minimum level of exploration that is better adapted to the problem structure. We demonstrate the asymptotic optimality of our algorithm, while providing both problem-dependent and worst-case finite-time regret guarantees. Our bounds scale with the logarithm of the number of arms, thus avoiding the linear dependence common in all related prior works. Notably, we establish minimax optimality for any learning horizon in the special case of non-contextual linear bandits. Finally, we verify that our algorithm obtains better empirical performance than state-of-the-art baselines.
Instance-Wise Minimax-Optimal Algorithms for Logistic Bandits
Abeille, Marc, Faury, Louis, Calauzènes, Clément
Logistic Bandits have recently attracted substantial attention, by providing an uncluttered yet challenging framework for understanding the impact of non-linearity in parametrized bandits. It was shown by Faury et al. (2020) that the learning-theoretic difficulties of Logistic Bandits can be embodied by a (sometimes prohibitively) large problem-dependent constant $\kappa$, characterizing the magnitude of the reward's non-linearity. In this paper we introduce a novel algorithm for which we provide a refined analysis. This allows for a better characterization of the effect of non-linearity and yields improved problem-dependent guarantees. In the most favorable cases this leads to a regret upper bound scaling as $\tilde{\mathcal{O}}(d\sqrt{T/\kappa})$, which dramatically improves over the $\tilde{\mathcal{O}}(d\sqrt{T}+\kappa)$ state-of-the-art guarantees. We prove that this rate is minimax-optimal by deriving an $\Omega(d\sqrt{T/\kappa})$ problem-dependent lower bound. Our analysis identifies two regimes (permanent and transitory) of the regret, which ultimately reconciles Faury et al. (2020) with the Bayesian approach of Dong et al. (2019). In contrast to previous works, we find that in the permanent regime non-linearity can dramatically ease the exploration-exploitation trade-off. While it also impacts the length of the transitory phase in a problem-dependent fashion, we show that this impact is mild in most reasonable configurations.
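To get a feel for the gap between the two rates, the following sketch evaluates both expressions with all constants and logarithmic factors dropped. The values of `d`, `T` and `kappa` are illustrative assumptions, not figures from the paper, and the function names are hypothetical.

```python
import math

def improved_rate(d, T, kappa):
    """d * sqrt(T / kappa), ignoring constants and log factors."""
    return d * math.sqrt(T / kappa)

def previous_rate(d, T, kappa):
    """d * sqrt(T) + kappa, ignoring constants and log factors."""
    return d * math.sqrt(T) + kappa

# Illustrative values only: dimension 10, horizon 10^6, kappa = 100.
d, T, kappa = 10, 10**6, 100
print(improved_rate(d, T, kappa))   # 1000.0
print(previous_rate(d, T, kappa))   # 10100.0
```

Under these assumed values the first expression is roughly ten times smaller, which matches the factor $\sqrt{\kappa}$ one would expect when the $d\sqrt{T}$ term dominates.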
Gradient Flows in Dataset Space
Alvarez-Melis, David, Fusi, Nicolò
The current practice in machine learning is traditionally model-centric, casting problems as optimization over model parameters, all the while assuming the data is either fixed, or subject to extrinsic and inevitable change. On the one hand, this paradigm fails to capture important existing aspects of machine learning, such as the substantial data manipulation (e.g., augmentation) that goes into most state-of-the-art pipelines. On the other hand, this viewpoint is ill-suited to formalize novel data-centric problems, such as model-agnostic transfer learning or dataset synthesis. In this work, we view these and other problems through the lens of dataset optimization, casting them as optimization over data-generating distributions. We approach this class of problems through Wasserstein gradient flows in probability space, and derive practical and efficient particle-based methods for a flexible but well-behaved class of objective functions. Through various experiments on synthetic and real datasets, we show that this framework provides a principled and effective approach to dataset shaping, transfer, and interpolation.
Super-Resolution Reconstruction of Interval Energy Data
High-resolution data are desired in many data-driven applications; however, in many cases only lower-resolution data are available. The challenge is then to extract as much useful information as possible from the low-resolution data. In this paper, we target interval energy data collected by Advanced Metering Infrastructure (AMI), and propose a Super-Resolution Reconstruction (SRR) approach that uses deep learning to upsample low-resolution (hourly) interval data into higher-resolution (15-minute) data. Our preliminary results show that the proposed SRR approaches perform substantially better than the baseline model.
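The abstract does not say which baseline was used. As a point of reference only, the sketch below shows two naive ways to turn hourly energy readings into 15-minute readings: an even split and a linear interpolation between hour midpoints. Both function names are hypothetical, and neither function is the paper's SRR model or its baseline.

```python
import numpy as np

def upsample_flat(hourly_kwh, factor=4):
    """Split each hourly energy reading evenly into `factor` sub-intervals."""
    hourly_kwh = np.asarray(hourly_kwh, dtype=float)
    return np.repeat(hourly_kwh / factor, factor)

def upsample_linear(hourly_kwh, factor=4):
    """Interpolate linearly between hour midpoints, then scale to sub-interval energy.

    Values before the first midpoint and after the last one are held
    constant (np.interp clamps at the ends). Unlike the flat split,
    this does not guarantee that each hour's total is preserved.
    """
    hourly_kwh = np.asarray(hourly_kwh, dtype=float)
    n = len(hourly_kwh)
    t_hour = np.arange(n) + 0.5                       # hour midpoints
    t_sub = (np.arange(n * factor) + 0.5) / factor    # sub-interval midpoints
    return np.interp(t_sub, t_hour, hourly_kwh) / factor

hourly = [4.0, 8.0]
print(upsample_flat(hourly))   # [1. 1. 1. 1. 2. 2. 2. 2.]
```

A learned model would be expected to recover within-hour variation that neither of these simple schemes can produce.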
A Perspective on Machine Learning Methods in Turbulence Modelling
This work presents a review of the current state of research in data-driven turbulence closure modeling. It offers a perspective on the challenges and open issues, but also on the advantages and promises of machine learning methods applied to parameter estimation, model identification, closure term reconstruction and beyond, mostly from the perspective of Large Eddy Simulation and related techniques. We stress that consistency of the training data, the model, the underlying physics and the discretization is a key issue that needs to be considered for a successful ML-augmented modeling strategy. In order to make the discussion useful for non-experts in either field, we introduce both the modeling problem in turbulence and the prominent ML paradigms and methods in a concise and self-consistent manner. We then present a survey of the current data-driven model concepts and methods, highlight important developments and put them into the context of the discussed challenges.
Towards Safe Policy Improvement for Non-Stationary MDPs
Chandak, Yash, Jordan, Scott M., Theocharous, Georgios, White, Martha, Thomas, Philip S.
Many real-world sequential decision-making problems involve critical systems with financial risks and human-life risks. While several works in the past have proposed methods that are safe for deployment, they assume that the underlying problem is stationary. However, many real-world problems of interest exhibit non-stationarity, and when stakes are high, the cost associated with a false stationarity assumption may be unacceptable. We take the first steps towards ensuring safety, with high confidence, for smoothly-varying non-stationary decision problems. Our proposed method extends a type of safe algorithm, called a Seldonian algorithm, through a synthesis of model-free reinforcement learning with time-series analysis. Safety is ensured using sequential hypothesis testing of a policy's forecasted performance, and confidence intervals are obtained using wild bootstrap.
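The forecasting-plus-bootstrap step can be sketched as follows, assuming a linear trend fitted by least squares to past performance estimates and Rademacher (random sign) multipliers on the residuals. This is a generic wild bootstrap with a percentile interval, not necessarily the procedure used in the paper; the function name and its parameters are hypothetical, and the Seldonian safety test itself is not shown.

```python
import numpy as np

def wild_bootstrap_forecast_ci(y, horizon=1, n_boot=2000, alpha=0.05, seed=0):
    """Forecast the next value of y with a linear trend and a wild-bootstrap CI.

    Returns (point_forecast, lower, upper).
    """
    y = np.asarray(y, dtype=float)
    t = np.arange(len(y), dtype=float)
    X = np.column_stack([np.ones_like(t), t])        # intercept + time
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    fitted = X @ beta
    resid = y - fitted
    x_new = np.array([1.0, len(y) - 1 + horizon])    # future time index

    rng = np.random.default_rng(seed)
    forecasts = np.empty(n_boot)
    for b in range(n_boot):
        # Flip the sign of each residual independently, which keeps
        # each point's own noise scale (heteroskedasticity) intact.
        signs = rng.choice([-1.0, 1.0], size=len(y))
        y_star = fitted + signs * resid
        beta_star, *_ = np.linalg.lstsq(X, y_star, rcond=None)
        forecasts[b] = x_new @ beta_star

    lower, upper = np.quantile(forecasts, [alpha / 2, 1 - alpha / 2])
    return float(x_new @ beta), float(lower), float(upper)

# Hypothetical past performance estimates of a policy over 10 episodes.
past = np.array([1.0, 1.2, 1.1, 1.5, 1.4, 1.8, 1.7, 2.1, 2.0, 2.3])
point, lower, upper = wild_bootstrap_forecast_ci(past)
```

In a safety test of the kind the abstract describes, one would compare the lower end of such an interval for a candidate policy against the performance of the current policy; the exact decision rule is left unspecified here.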
How Technology is Transforming the Circular Economy
When it comes to expanding the circular economy, the IT industry is one of the effort's greatest enablers -- especially when technology provides scalable solutions that drive real value. In fact, global sustainability experts have identified seven distinct types of digital technology that already do or soon will play a critical role in implementing and furthering thousands of circularity initiatives: digital access, cloud, cognitive, blockchain, fast internet, IoT and digital reality. However, the computing power that drives these technologies can use an enormous amount of resources -- from electricity to power equipment and water for cooling facilities to physical materials and manufacturing impacts. To take leadership and develop credibility as a circular solution, IT-based businesses need to apply their expertise towards developing solutions that not only help other circularity initiatives in other industries, but also "clean their own house." Cloud technology holds perhaps the greatest potential for promoting sustainability within the IT industry itself.
Batch Exploration with Examples for Scalable Robotic Reinforcement Learning
Chen, Annie S., Nam, HyunJi, Nair, Suraj, Finn, Chelsea
Learning from diverse offline datasets is a promising path towards learning general-purpose robotic agents. However, a core challenge in this paradigm lies in collecting large amounts of meaningful data, while not depending on a human in the loop for data collection. One way to address this challenge is through task-agnostic exploration, where an agent attempts to explore without a task-specific reward function, and collect data that can be useful for any downstream task. While these approaches have shown some promise in simple domains, they often struggle to explore the relevant regions of the state space in more challenging settings, such as vision-based robotic manipulation. This challenge stems from an objective that encourages exploring everything in a potentially vast state space. To mitigate this challenge, we propose to focus exploration on the important parts of the state space using weak human supervision. Concretely, we propose an exploration technique, Batch Exploration with Examples (BEE), that explores relevant regions of the state space, guided by a modest number of human-provided images of important states. These human-provided images only need to be collected once at the beginning of data collection and can be collected in a matter of minutes, allowing us to scalably collect diverse datasets, which can then be combined with any batch RL algorithm. We find that BEE is able to tackle challenging vision-based manipulation tasks both in simulation and on a real Franka robot, and observe that compared to task-agnostic and weakly-supervised exploration techniques, it (1) interacts more than twice as often with relevant objects, and (2) improves downstream task performance when used in conjunction with offline RL.
Necessary and sufficient conditions for causal feature selection in time series with latent common causes
Mastakouri, Atalanti A., Schölkopf, Bernhard, Janzing, Dominik
We study the identification of direct and indirect causes in time series and provide conditions in the presence of latent variables, which we prove to be necessary and sufficient under some graph constraints. Our theoretical results and estimation algorithms require two conditional independence tests for each observed candidate time series to determine whether or not it is a cause of an observed target time series. We provide experimental results in simulations, as well as on real data. Our results show that our method leads to very low false positive and relatively low false negative rates, outperforming the widely used Granger causality.