Reviews: Text-Based Interactive Recommendation via Constraint-Augmented Reinforcement Learning

Neural Information Processing Systems 

Eq. (3), Eq. (5) and its model details) is consistent with the target task. The reward and constraints are reasonably designed. The experimental setup is strong, especially the simulator-based online evaluation and the four proposed evaluation metrics, and the results are positive. However, the paper still has the following minor issues.