Green Simulation Assisted Policy Gradient to Accelerate Stochastic Process Control
Zheng, Hua; Xie, Wei; Feng, M. Ben
arXiv.org Artificial Intelligence
This study is motivated by critical challenges in biopharmaceutical manufacturing, including high complexity, high uncertainty, and very limited process data. Each experimental run is often very expensive. To support optimal and robust process control, we propose a general green simulation assisted policy gradient (GS-PG) framework for both online and offline learning settings. To address key limitations of state-of-the-art reinforcement learning (RL), such as sample inefficiency and low reliability, we create a mixture likelihood ratio based policy gradient estimator that leverages information from historical experiments conducted under different inputs, including process model coefficients and decision policy parameters. Then, to accelerate the learning of an optimal and robust policy, we further propose a variance reduction based sample selection method that allows GS-PG to intelligently select and reuse the most relevant historical trajectories. The selection rule automatically updates the samples to be reused during the learning of process mechanisms and the search for the optimal policy. Our theoretical and empirical studies demonstrate that the proposed framework can outperform the state-of-the-art policy gradient approach and accelerate optimal robust process control for complex stochastic systems under high uncertainty.
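The idea of reusing historical runs through a mixture likelihood ratio can be illustrated with a small sketch. This is not the GS-PG estimator from the paper, whose exact form the abstract does not give. It is a generic mixture (multiple importance sampling) likelihood ratio estimate of a score-function policy gradient. It assumes a toy one-dimensional Gaussian "policy" N(theta, 1) and a made-up quadratic reward. The names `gaussian_pdf`, `mlr_policy_gradient`, `behavior_means`, and `reward` are hypothetical and chosen only for illustration.

```python
import numpy as np


def gaussian_pdf(x, mean, std=1.0):
    """Density of N(mean, std^2) evaluated at x."""
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))


def mlr_policy_gradient(samples, behavior_means, target_mean, reward):
    """Mixture likelihood ratio estimate of d/d(theta) E_theta[reward(x)].

    samples[k] holds historical draws from N(behavior_means[k], 1).
    All draws are pooled and reweighted by the target density divided by
    the mixture of the behavior densities (toy assumption: unit variance).
    """
    x = np.concatenate(samples)
    counts = np.array([len(s) for s in samples], dtype=float)
    mix_weights = counts / counts.sum()
    # Mixture density of the pooled historical samples.
    mixture = sum(w * gaussian_pdf(x, m) for w, m in zip(mix_weights, behavior_means))
    ratio = gaussian_pdf(x, target_mean) / mixture
    # Score function of N(theta, 1): d/d(theta) log p_theta(x) = x - theta.
    score = x - target_mean
    return np.mean(ratio * reward(x) * score)


rng = np.random.default_rng(0)
behavior_means = [0.0, 0.4, 1.0]  # hypothetical past policy parameters
samples = [rng.normal(m, 1.0, size=20_000) for m in behavior_means]


def reward(x):
    # Hypothetical reward, maximized at x = 2.
    return -(x - 2.0) ** 2


target_mean = 0.5
grad_estimate = mlr_policy_gradient(samples, behavior_means, target_mean, reward)
# For this toy problem E_theta[reward] = -((theta - 2)^2 + 1), so the
# exact gradient is -2 * (theta - 2).
exact_grad = -2.0 * (target_mean - 2.0)
print(grad_estimate, exact_grad)
```

In this toy setting no new samples are drawn at the target parameter: the gradient at `target_mean` is estimated entirely from runs collected under the other parameters, and the estimate can be compared against the closed-form value. Dividing by the mixture density rather than each run's own density keeps the weights bounded here, which is the usual motivation for mixture weighting.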
Oct-17-2021
- Country:
  - North America
    - United States
      - New York > New York County
        - New York City (0.04)
      - Massachusetts
        - Suffolk County > Boston (0.04)
        - Middlesex County
      - California > San Diego County
        - San Diego (0.04)
    - Canada > Ontario
      - Waterloo Region > Waterloo (0.04)
  - Europe
    - France (0.04)
    - United Kingdom > England
      - Cambridgeshire > Cambridge (0.04)
- Genre:
  - Research Report
    - New Finding (0.45)
    - Experimental Study (0.45)