Policy Gradient Methods for Discrete Time Linear Quadratic Regulator With Random Parameters
arXiv.org Artificial Intelligence
The linear quadratic (LQ) control problem in discrete time with random parameters, whose study goes back to Kalman [12], finds applications in a wide range of practical problems, such as random sampling of a diffusion process in digital control [17], sampling of systems subject to noise [6], and economic systems [1]. Consequently, extensive research has been carried out in this area [6, 2, 4, 15, 3]. However, the literature cited above assumes a priori knowledge of the model parameters, which is unrealistic in many practical scenarios. Solving such problems without statistical information about the model parameters is therefore of great importance from both theoretical and practical perspectives.

Recent years have witnessed rapid growth in learning approaches, among which reinforcement learning (RL) has garnered a great deal of attention from researchers [8, 18, 9, 10, 7, 14]. RL methods fall into two categories: model-based and model-free. A model-based RL approach estimates the transition dynamics from observations or experiments and then designs the control policy using the estimated parameters [16, 5].
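As a minimal sketch of the model-based pipeline just described, the snippet below (a hypothetical illustration, not the paper's method; the dimensions, noise level, and cost matrices are assumptions) first estimates the transition matrices of a linear system by least squares from exploratory trajectories, then designs an LQR policy on the estimates by iterating the discrete-time Riccati recursion:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, T = 3, 2, 500  # state dim, input dim, number of observed transitions

# Unknown true dynamics: x_{t+1} = A x_t + B u_t + w_t (assumed for the demo).
A_true = np.array([[0.9, 0.1, 0.0],
                   [0.0, 0.8, 0.2],
                   [0.1, 0.0, 0.7]])
B_true = 0.5 * rng.standard_normal((n, m))

# Step 1: collect excitation data and fit [A B] by least squares.
X, U, Xn = [], [], []
x = np.zeros(n)
for _ in range(T):
    u = rng.standard_normal(m)  # exploratory input
    x_next = A_true @ x + B_true @ u + 0.01 * rng.standard_normal(n)
    X.append(x); U.append(u); Xn.append(x_next)
    x = x_next
Z = np.hstack([np.array(X), np.array(U)])       # regressors [x_t, u_t]
Theta, *_ = np.linalg.lstsq(Z, np.array(Xn), rcond=None)
A_hat, B_hat = Theta.T[:, :n], Theta.T[:, n:]   # estimated dynamics

# Step 2: design the policy on the estimates via Riccati value iteration:
# K = (R + B'PB)^{-1} B'PA,  P <- Q + A'P(A - BK).
Q, R = np.eye(n), np.eye(m)
P = np.eye(n)
for _ in range(500):
    K = np.linalg.solve(R + B_hat.T @ P @ B_hat, B_hat.T @ P @ A_hat)
    P = Q + A_hat.T @ P @ (A_hat - B_hat @ K)

# The certainty-equivalent controller is u_t = -K x_t.
```

With enough sufficiently exciting data, the estimation error shrinks and the resulting gain stabilizes the true system; quantifying how many samples this takes is the central question in the finite-sample analyses cited above.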
Mar-29-2023