Automatic Analysis of Substantiation in Scientific Peer Reviews

Guo, Yanzhu, Shang, Guokan, Rennard, Virgile, Vazirgiannis, Michalis, Clavel, Chloé

Nov-20-2023–arXiv.org Artificial Intelligence

With the increasing number of problematic peer reviews in top AI conferences, the community urgently needs automatic quality control measures. In this paper, we restrict our attention to substantiation, a popular quality aspect indicating whether the claims in a review are sufficiently supported by evidence, and provide a solution that automates this evaluation process. To achieve this goal, we first formulate the problem as claim-evidence pair extraction in scientific peer reviews and collect SubstanReview, the first annotated dataset for this task. SubstanReview consists of 550 reviews from NLP conferences annotated by domain experts. On the basis of this dataset, we train an argument mining system to automatically analyze the level of substantiation in peer reviews. We also perform data analysis on the SubstanReview dataset to obtain meaningful insights on peer reviewing quality in NLP conferences over recent years.

annotator, dataset, substantiation

Country:
- North America
  - United States
    - Washington > King County
      - Seattle (0.04)
    - Minnesota > Hennepin County
      - Minneapolis (0.14)
    - Louisiana > Orleans Parish
      - New Orleans (0.04)
  - Canada > Ontario
    - Toronto (0.04)
- Europe
  - United Kingdom > England
    - Cambridgeshire > Cambridge (0.04)
  - France
    - Île-de-France > Paris
      - Paris (0.04)
    - Provence-Alpes-Côte d'Azur > Bouches-du-Rhône
      - Marseille (0.04)
- Asia
  - China > Hong Kong (0.04)
  - Middle East > UAE
    - Abu Dhabi Emirate > Abu Dhabi (0.04)

Genre:
- Research Report > New Finding (0.46)

Technology:
- Information Technology > Artificial Intelligence
  - Natural Language
    - Large Language Model (1.00)
    - Explanation & Argumentation (0.69)
  - Machine Learning > Neural Networks
    - Deep Learning (1.00)