Reinfier and Reintrainer: Verification and Interpretation-Driven Safe Deep Reinforcement Learning Frameworks

Yang, Zixuan, Zheng, Jiaqi, Chen, Guihai

Oct-19-2024–arXiv.org Artificial Intelligence

Ensuring verifiable and interpretable safety of deep reinforcement learning (DRL) is crucial for its deployment in real-world applications. Existing approaches like verification-in-the-loop training, however, face challenges such as difficulty in deployment, inefficient training, lack of interpretability, and suboptimal performance in property satisfaction and reward performance. In this work, we propose a novel verification-driven interpretation-in-the-loop framework Reintrainer to develop trustworthy DRL models, which are guaranteed to meet the expected constraint properties. Specifically, in each iteration, this framework measures the gap between the on-training model and predefined properties using formal verification, interprets the contribution of each input feature to the model's output, and then generates the training strategy derived from the on-the-fly measure results, until all predefined properties are proven. Additionally, the low reusability of existing verifiers and interpreters motivates us to develop Reinfier, a general and fundamental tool within Reintrainer for DRL verification and interpretation. Reinfier features breakpoints searching and verification-driven interpretation, associated with a concise constraint-encoding language DRLP. Evaluations demonstrate that Reintrainer outperforms the state-of-the-art on six public benchmarks in both performance and property guarantees. Our framework can be accessed at https://github.com/Kurayuri/Reinfier.

artificial intelligence, machine learning, reinforcement learning, (20 more...)

arXiv.org Artificial Intelligence

Oct-19-2024

arXiv.org PDF

Add feedback

Country:
- North America > United States
  - Illinois > Cook County
    - Chicago (0.04)
  - California > San Francisco County
    - San Francisco (0.04)
- Europe > United Kingdom
  - England > Cambridgeshire > Cambridge (0.04)
- Asia > China
  - Jiangsu Province > Nanjing (0.04)

Genre:
- Research Report > New Finding (0.46)

Industry:
- Information Technology (0.45)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning
  - Reinforcement Learning (1.00)
  - Neural Networks > Deep Learning (0.67)

Duplicate Docs Excel Report

Title
None found

Similar Docs Excel Report more

Title	Similarity	Source
None found