Dahmen, Manuel
Task-optimal data-driven surrogate models for eNMPC via differentiable simulation and optimization
Mayfrank, Daniel, Ahn, Na Young, Mitsos, Alexander, Dahmen, Manuel
We present a method for end-to-end learning of Koopman surrogate models for optimal performance in control. In contrast to previous contributions that employ standard reinforcement learning (RL) algorithms, we use a training algorithm that exploits the potential differentiability of environments based on mechanistic simulation models. We evaluate the performance of our method by comparing it to that of other controller type and training algorithm combinations on a literature known eNMPC case study. Our method exhibits superior performance on this problem, thereby constituting a promising avenue towards more capable controllers that employ dynamic surrogate models.
Multivariate Scenario Generation of Day-Ahead Electricity Prices using Normalizing Flows
Hilger, Hannes, Witthaut, Dirk, Dahmen, Manuel, Gorjao, Leonardo Rydin, Trebbien, Julius, Cramer, Eike
Abstract: Trading on electricity markets requires accurate information about the realization of electricity prices and the uncertainty attached to the predictions. We present a probabilistic forecasting approach for day-ahead electricity prices using the fully data-driven deep generative model called normalizing flows. Our modeling approach generates full-day scenarios of day-ahead electricity prices based on conditional features such as residual load forecasts. Furthermore, we propose extended feature sets of prior realizations and a periodic retraining scheme that allows the normalizing flow to adapt to the changing conditions of modern electricity markets. In particular, we investigate the impact of the energy crisis ensuing from the Russian invasion of Ukraine. Our results highlight that the normalizing flow generates high-quality scenarios that reproduce the true price distribution and yield highly accurate forecasts. Additionally, our analysis highlights how our improvements towards adaptations in changing regimes allow the normalizing flow to adapt to changing market conditions and enables continued sampling of high-quality day-ahead price scenarios.
End-to-End Reinforcement Learning of Koopman Models for Economic Nonlinear Model Predictive Control
Mayfrank, Daniel, Mitsos, Alexander, Dahmen, Manuel
(Economic) nonlinear model predictive control ((e)NMPC) requires dynamic system models that are sufficiently accurate in all relevant state-space regions. These models must also be computationally cheap enough to ensure real-time tractability. Data-driven surrogate models for mechanistic models can be used to reduce the computational burden of (e)NMPC; however, such models are typically trained by system identification for maximum average prediction accuracy on simulation samples and perform suboptimally as part of actual (e)NMPC. We present a method for end-to-end reinforcement learning of dynamic surrogate models for optimal performance in (e)NMPC applications, resulting in predictive controllers that strike a favorable balance between control performance and computational demand. We validate our method on two applications derived from an established nonlinear continuous stirred-tank reactor model. We compare the controller performance to that of MPCs utilizing models trained by the prevailing maximum prediction accuracy paradigm, and model-free neural network controllers trained using reinforcement learning. We show that our method matches the performance of the model-free neural network controllers while consistently outperforming models derived from system identification. Additionally, we show that the MPC policies can react to changes in the control setting without retraining.
A Recursively Recurrent Neural Network (R2N2) Architecture for Learning Iterative Algorithms
Doncevic, Danimir T., Mitsos, Alexander, Guo, Yue, Li, Qianxiao, Dietrich, Felix, Dahmen, Manuel, Kevrekidis, Ioannis G.
Abstract: Meta-learning of numerical algorithms for a given task consists of the data-driven identification and adaptation of an algorithmic structure and the associated hyperparameters. To limit the complexity of the meta-learning problem, neural architectures with a certain inductive bias towards favorable algorithmic structures can, and should, be used. We generalize our previously introduced Runge-Kutta neural network to a recursively recurrent neural network (R2N2) superstructure for the design of customized iterative algorithms. In contrast to off-the-shelf deep learning approaches, it features a distinct division into modules for generation of information and for the subsequent assembly of this information towards a solution. Local information in the form of a subspace is generated by subordinate, inner, iterations of recurrent function evaluations starting at the current outer iterate. The update to the next outer iterate is computed as a linear combination of these evaluations, reducing the residual in this space, and constitutes the output of the network. We demonstrate that regular training of the weight parameters inside the proposed superstructure on input/output data of various computational problem classes yields iterations similar to Krylov solvers for linear equation systems, Newton-Krylov solvers for nonlinear equation systems, and Runge-Kutta integrators for ordinary differential equations. Due to its modularity, the superstructure can be readily extended with functionalities needed to represent more general classes of iterative algorithms traditionally based on Taylor series expansions.
Nonlinear Isometric Manifold Learning for Injective Normalizing Flows
Cramer, Eike, Rauh, Felix, Mitsos, Alexander, Tempone, Raรบl, Dahmen, Manuel
Some of the published Normalizing flows are deep generative models approaches assume a dimensionality-reducing (DGM) that represent the probability distribution map to be known and available a priori [12, 13]. of high-dimensional data sets as a change of variables Other works use compositions of manifold learning of a multivariate Gaussian [1, 2]. Using the inverse models and normalizing flows that are trained simultaneously, of this transformation, normalizing flows can e.g., the M-Flow [6], Noisy Injective compute the probability density functions (PDFs) Flows [14], piecewise injective flows called Trumpets explicitly, thus enabling training via the statistically [15], and neural manifold ordinary differential consistent and asymptotically efficient [3] likelihood equations [16].
Multivariate Probabilistic Forecasting of Intraday Electricity Prices using Normalizing Flows
Cramer, Eike, Witthaut, Dirk, Mitsos, Alexander, Dahmen, Manuel
Electricity is traded on various markets with different time horizons and regulations. Short-term intraday trading becomes increasingly important due to the higher penetration of renewables. In Germany, the intraday electricity price typically fluctuates around the day-ahead price of the European Power EXchange (EPEX) spot markets in a distinct hourly pattern. This work proposes a probabilistic modeling approach that models the intraday price difference to the day-ahead contracts. The model captures the emerging hourly pattern by considering the four 15 min intervals in each day-ahead price interval as a four-dimensional joint probability distribution. The resulting nontrivial, multivariate price difference distribution is learned using a normalizing flow, i.e., a deep generative model that combines conditional multivariate density estimation and probabilistic regression. Furthermore, this work discusses the influence of different external impact factors based on literature insights and impact analysis using explainable artificial intelligence (XAI). The normalizing flow is compared to an informed selection of historical data and probabilistic forecasts using a Gaussian copula and a Gaussian regression model. Among the different models, the normalizing flow identifies the trends with the highest accuracy and has the narrowest prediction intervals. Both the XAI analysis and the empirical experiments highlight that the immediate history of the price difference realization and the increments of the day-ahead price have the most substantial impact on the price difference.
Graph Machine Learning for Design of High-Octane Fuels
Rittig, Jan G., Ritzert, Martin, Schweidtmann, Artur M., Winkler, Stefanie, Weber, Jana M., Morsch, Philipp, Heufer, K. Alexander, Grohe, Martin, Mitsos, Alexander, Dahmen, Manuel
Fuels with high-knock resistance enable modern spark-ignition engines to achieve high efficiency and thus low CO2 emissions. Identification of molecules with desired autoignition properties indicated by a high research octane number and a high octane sensitivity is therefore of great practical relevance and can be supported by computer-aided molecular design (CAMD). Recent developments in the field of graph machine learning (graph-ML) provide novel, promising tools for CAMD. We propose a modular graph-ML CAMD framework that integrates generative graph-ML models with graph neural networks and optimization, enabling the design of molecules with desired ignition properties in a continuous molecular space. In particular, we explore the potential of Bayesian optimization and genetic algorithms in combination with generative graph-ML models. The graph-ML CAMD framework successfully identifies well-established high-octane components. It also suggests new candidates, one of which we experimentally investigate and use to illustrate the need for further auto-ignition training data.
Physical Pooling Functions in Graph Neural Networks for Molecular Property Prediction
Schweidtmann, Artur M., Rittig, Jan G., Weber, Jana M., Grohe, Martin, Dahmen, Manuel, Leonhard, Kai, Mitsos, Alexander
Graph neural networks (GNNs) are emerging in chemical engineering for the end-to-end learning of physicochemical properties based on molecular graphs. A key element of GNNs is the pooling function which combines atom feature vectors into molecular fingerprints. Most previous works use a standard pooling function to predict a variety of properties. However, unsuitable pooling functions can lead to unphysical GNNs that poorly generalize. We compare and select meaningful GNN pooling methods based on physical knowledge about the learned properties. The impact of physical pooling functions is demonstrated with molecular properties calculated from quantum mechanical computations. We also compare our results to the recent set2set pooling approach. We recommend using sum pooling for the prediction of properties that depend on molecular size and compare pooling functions for properties that are molecular size-independent. Overall, we show that the use of physical pooling functions significantly enhances generalization.
Graph Neural Networks for Temperature-Dependent Activity Coefficient Prediction of Solutes in Ionic Liquids
Rittig, Jan G., Hicham, Karim Ben, Schweidtmann, Artur M., Dahmen, Manuel, Mitsos, Alexander
Ionic liquids (ILs) are important solvents for sustainable processes and predicting activity coefficients (ACs) of solutes in ILs is needed. Recently, matrix completion methods (MCMs), transformers, and graph neural networks (GNNs) have shown high accuracy in predicting ACs of binary mixtures, superior to well-established models, e.g., COSMO-RS and UNIFAC. GNNs are particularly promising here as they learn a molecular graph-to-property relationship without pretraining, typically required for transformers, and are, unlike MCMs, applicable to molecules not included in training. For ILs, however, GNN applications are currently missing. Herein, we present a GNN to predict temperature-dependent infinite dilution ACs of solutes in ILs. We train the GNN on a database including more than 40,000 AC values and compare it to a state-of-the-art MCM. The GNN and MCM achieve similar high prediction performance, with the GNN additionally enabling high-quality predictions for ACs of solutions that contain ILs and solutes not considered during training.