Review for NeurIPS paper: Learning to summarize with human feedback

May-28-2025, 16:22:42 GMT–Neural Information Processing Systems

Weaknesses: However, I have two major concerns: 1. As the authors themselves acknowledge, this paper is essentially an expanded analysis of [3, 58]. The key techniques, a classification-based reward and PPO, were already explored in [58]; the main extensions are a larger, better-engineered model and the adaptation of an online setting to the offline setting. I therefore feel the paper offers very little novelty from a machine learning perspective, though the authors are very honest about this in the Related Work (Line 86).
