Proximal Reliability Optimization for Reinforcement Learning

Patwardhan, Narendra; Wang, Zequn

arXiv.org Machine Learning 

In recent years, reinforcement learning has increasingly replaced classical dynamic programming in control engineering, because it makes few or no assumptions about the dynamics of the system. Instead, it relies on the universal approximation capability of the control structure to develop a good control function through trial-and-error experimentation. The challenge of this approach is to carry out the exploration efficiently, so that the controller converges on a control strategy with satisfactory global performance. Directly applying a reinforcement learning approach to design a controller for a physical system is implausible: the controller may crash during the thousands or even tens of thousands of trials needed before it finds a stable control function, which makes the approach impractical for designing robust controllers. Since conducting trials on the real system is often infeasible, a mathematical model of the physical system is usually constructed in the form of a simulator, the controller is designed for the model, and the controller is then implemented on the physical system. If there are substantial differences between the model and the physical system, often called the reality gap, the controller may operate with compromised performance and may even be unstable. Physical systems often possess underlying dynamics that are difficult to measure accurately, such as friction, density distribution, and unknown torques. Furthermore, the dynamics of the system often change over time. The change can be gradual, as when devices wear or new systems break in, or abrupt, as in the catastrophic failure of a sub-component or the replacement of an old part with a new one.
