React to Surprises: Stable-by-Design Neural Feedback Control and the Youla-REN

Barbara, Nicholas H., Wang, Ruigang, Megretski, Alexandre, Manchester, Ian R.

arXiv.org Artificial Intelligence 

We study parameterizations of stabilizing nonlinear policies for learning-based control. We propose a structure based on a nonlinear version of the Youla-Kučera parameterization combined with robust neural networks such as the recurrent equilibrium network (REN). The resulting parameterizations are unconstrained, and hence can be searched over with first-order optimization methods, while always ensuring closed-loop stability by construction. We study the combination of (a) nonlinear dynamics, (b) partial observation, and (c) incremental closed-loop stability requirements (contraction and Lipschitzness). We find that with any two of these three difficulties, a contracting and Lipschitz Youla parameter always leads to contracting and Lipschitz closed loops. However, if all three hold, then incremental stability can be lost with exogenous disturbances. Instead, a weaker condition is maintained, which we call d-tube contraction and Lipschitzness. We further obtain converse results showing that the proposed parameterization covers all contracting and Lipschitz closed loops for certain classes of nonlinear systems. Numerical experiments illustrate the utility of our parameterization when learning controllers with built-in stability certificates for: (i) "economic" rewards without stabilizing effects; (ii) short training horizons; and (iii) uncertain systems.

Deep reinforcement learning (RL) is an emerging technology for general-purpose nonlinear control design via simulation. It has been successfully applied in many complex domains ranging from strategy games [1] to robotics [2], [3] to nuclear fusion [4]. The standard approach is to minimize empirical estimates of an expected cost over repeated episodes with random disturbances, typically using deep neural networks for black-box policy parameterizations [5].
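The core Youla idea behind the abstract can be sketched on a minimal scalar example (the plant, gains, and first-order filter below are illustrative assumptions, not the paper's REN construction): an observer-based stabilizing controller is augmented by feeding the innovation through a stable "Youla parameter" Q, and the closed loop remains stable for every stable Q, so Q can be searched over without constraints.

```python
import math

# Hypothetical 1-D example of the Youla parameterization.
# Plant: x+ = a*x + b*u with full-state measurement y = x.
# Base controller: u = -k*x_hat, observer gain l.
# Youla augmentation: the innovation e = y - y_hat drives any
# contracting parameter Q (here a first-order filter with a tanh
# readout); its output is added to the control input.
a, b = 1.2, 1.0  # open-loop unstable plant (|a| > 1)
k, l = 0.7, 0.6  # stabilizing feedback and observer gains

def simulate(q_gain, steps=200):
    """Return |x| after `steps` for a given Youla output gain."""
    x, x_hat, q_state = 5.0, 0.0, 0.0
    for _ in range(steps):
        e = x - x_hat                       # innovation (observer error)
        q_state = 0.5 * q_state + e         # Q: contracting filter (assumption)
        u = -k * x_hat + q_gain * math.tanh(q_state)  # Youla-augmented control
        x_hat = a * x_hat + b * u + l * e   # observer update
        x = a * x + b * u                   # plant update
    return abs(x)

# The observer error obeys e+ = (a - l)*e regardless of u, so it decays,
# Q's output decays with it, and x converges for every stable Q:
for q_gain in (0.0, 0.3, -0.3):
    assert simulate(q_gain) < 1e-3
```

The key structural point, mirrored in the paper's nonlinear setting, is that the innovation dynamics are independent of the Youla parameter, so stability of Q transfers to stability of the whole loop by construction.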