review process
- North America > United States > Texas > Brazos County > College Station (0.04)
- Asia > Macao (0.04)
- Law (0.68)
- Government (0.46)
Reducing research bureaucracy in UK higher education: Can generative AI assist with the internal evaluation of quality?
Fletcher, Gordon, Khan, Saomai Vu, Fletcher, Aldus Greenhill
This paper examines the potential for generative artificial intelligence (GenAI) to assist with internal review processes for research quality evaluations in UK higher education, particularly in preparation for the Research Excellence Framework (REF). Using the lens of function substitution in the Viable Systems Model, we present an experimental methodology using ChatGPT to score and rank business and management papers from REF 2021 submissions, "reverse engineering" the assessment by comparing AI-generated scores with known institutional results. Through rigorous testing of 822 papers across 11 institutions, we established scoring boundaries that aligned with reported REF outcomes: 49% between 1* and 2*, 59% between 2* and 3*, and 69% between 3* and 4*. The results demonstrate that AI can provide consistent evaluations that help identify borderline cases requiring additional human scrutiny while reducing the substantial resource burden of traditional internal review processes. We argue for a nuanced hybrid approach that maintains academic integrity while addressing the multi-million pound costs associated with research evaluation bureaucracy. While acknowledging limitations, including potential AI biases, the research presents a promising framework for more efficient, consistent evaluations that could transform current approaches to research assessment.
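The boundary logic the abstract describes can be sketched as a simple band mapping: an AI-generated score (as a percentage) falls into a star band at the reported cut points of 49, 59, and 69, and scores near a boundary are flagged for human review. This is an illustrative reconstruction only; the `window` parameter and function name are assumptions, not from the paper.

```python
def ref_star_band(score_pct, window=3.0):
    """Map an AI score (0-100) to a REF star band using the boundaries
    reported in the abstract: 49 (1*/2*), 59 (2*/3*), 69 (3*/4*).
    Scores within `window` points of a boundary are flagged as borderline
    cases needing additional human scrutiny (`window` is illustrative)."""
    boundaries = [49.0, 59.0, 69.0]
    band = 1 + sum(score_pct >= b for b in boundaries)       # count boundaries cleared
    borderline = any(abs(score_pct - b) <= window for b in boundaries)
    return band, borderline

print(ref_star_band(71.0))  # (4, True): rated 4* but close to the 69 boundary
print(ref_star_band(55.0))  # (2, False): comfortably mid-band 2*
```

The borderline flag is the point of the hybrid approach: most papers are banded automatically, and only near-boundary cases are routed to human reviewers.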
- Information Technology > Security & Privacy (0.46)
- Education > Educational Setting (0.35)
How to Find Fantastic AI Papers: Self-Rankings as a Powerful Predictor of Scientific Impact Beyond Peer Review
Su, Buxin, Collina, Natalie, Wen, Garrett, Li, Didong, Cho, Kyunghyun, Fan, Jianqing, Zhao, Bingxin, Su, Weijie
Peer review in academic research aims not only to ensure factual correctness but also to identify work of high scientific potential that can shape future research directions. This task is especially critical in fast-moving fields such as artificial intelligence (AI), yet it has become increasingly difficult given the rapid growth of submissions. In this paper, we investigate an underexplored measure for identifying high-impact research: authors' own rankings of their multiple submissions to the same AI conference. Grounded in game-theoretic reasoning, we hypothesize that self-rankings are informative because authors possess unique understanding of their work's conceptual depth and long-term promise. To test this hypothesis, we conducted a large-scale experiment at a leading AI conference, where 1,342 researchers self-ranked their 2,592 submissions by perceived quality. Tracking outcomes over more than a year, we found that papers ranked highest by their authors received twice as many citations as their lowest-ranked counterparts; self-rankings were especially effective at identifying highly cited papers (those with over 150 citations). Moreover, we showed that self-rankings outperformed peer review scores in predicting future citation counts. Our results remained robust after accounting for confounders such as preprint posting time and self-citations. Together, these findings demonstrate that authors' self-rankings provide a reliable and valuable complement to peer review for identifying and elevating high-impact research in AI.
- North America > United States > Pennsylvania (0.04)
- North America > United States > North Carolina (0.04)
- North America > United States > New York (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.95)
3D Printing Supplementary Material
Figure 1: The Slice-100K dataset consists of STL files and their G-code counterparts.
However, we do foresee some potential negative societal impacts.
We provide additional visualizations to understand the distribution of STL models in Slice-100K.
Slicing: We utilize Prusa's Slicer for generating G-code from STL files.
Finetuning implementation: For finetuning our translation model, we use a batch size of 32 with 8 gradient accumulation steps.
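The finetuning note above ("batch size of 32 with 8 gradient accumulation steps") implies an effective batch of 256 samples per optimizer update. A framework-agnostic sketch of that accumulation pattern, with `compute_grad` and `apply_update` standing in for a real framework's backward pass and optimizer step (both names are illustrative, not from the Slice-100K code):

```python
MICRO_BATCH = 32   # per-step batch size from the supplementary note
ACCUM_STEPS = 8    # gradient accumulation steps
EFFECTIVE_BATCH = MICRO_BATCH * ACCUM_STEPS  # 256 samples per update

def train_epoch(batches, compute_grad, apply_update):
    """Accumulate gradients over ACCUM_STEPS micro-batches before each
    optimizer update. Scaling each gradient by 1/ACCUM_STEPS makes the
    accumulated update match one large-batch step."""
    accum, updates = 0.0, 0
    for i, batch in enumerate(batches, start=1):
        accum += compute_grad(batch) / ACCUM_STEPS
        if i % ACCUM_STEPS == 0:   # flush: one optimizer step per 8 micro-batches
            apply_update(accum)
            accum, updates = 0.0, updates + 1
    return updates

# 32 micro-batches -> 4 optimizer updates, each over 256 effective samples
n_updates = train_epoch(range(32), lambda b: 1.0, lambda g: None)
print(EFFECTIVE_BATCH, n_updates)  # 256 4
```

Accumulation like this is the standard way to reach a large effective batch when GPU memory caps the per-step batch at 32.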
- Machinery > Industrial Machinery (0.50)
- Law (0.47)
- Government (0.47)
Insights from the ICLR Peer Review and Rebuttal Process
Kargaran, Amir Hossein, Nikeghbal, Nafiseh, Yang, Jing, Ousidhoum, Nedjma
Peer review is a cornerstone of scientific publishing, including at premier machine learning conferences such as ICLR. As submission volumes increase, understanding the nature and dynamics of the review process is crucial for improving its efficiency, effectiveness, and the quality of published papers. We present a large-scale analysis of the ICLR 2024 and 2025 peer review processes, focusing on before- and after-rebuttal scores and reviewer-author interactions. We examine review scores, author-reviewer engagement, temporal patterns in review submissions, and co-reviewer influence effects. Combining quantitative analyses with LLM-based categorization of review texts and rebuttal discussions, we identify common strengths and weaknesses for each rating group, as well as trends in rebuttal strategies that are most strongly associated with score changes. Our findings show that initial scores and the ratings of co-reviewers are the strongest predictors of score changes during the rebuttal, pointing to a degree of reviewer influence. Rebuttals play a valuable role in improving outcomes for borderline papers, where thoughtful author responses can meaningfully shift reviewer perspectives. More broadly, our study offers evidence-based insights to improve the peer review process, guiding authors on effective rebuttal strategies and helping the community design fairer and more efficient review processes. Our code and score-change data are available at https://github.com/papercopilot/iclr-insights.
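The basic quantity behind the score-change analysis described above is the per-submission delta between before- and after-rebuttal scores. A minimal sketch with toy records; the data shape and numbers are illustrative assumptions (the real data lives in the linked papercopilot repository):

```python
from statistics import mean

# Toy review records: (submission_id, reviewer_id, before_score, after_score).
reviews = [
    ("p1", "r1", 5, 6), ("p1", "r2", 6, 6),
    ("p2", "r1", 3, 3), ("p2", "r3", 4, 5),
]

def rebuttal_deltas(reviews):
    """Average before/after-rebuttal score change per submission."""
    per_paper = {}
    for pid, _, before, after in reviews:
        per_paper.setdefault(pid, []).append(after - before)
    return {pid: mean(deltas) for pid, deltas in per_paper.items()}

print(rebuttal_deltas(reviews))  # {'p1': 0.5, 'p2': 0.5}
```

Regressing these deltas on initial scores and co-reviewer ratings is what surfaces the reviewer-influence effect the abstract reports.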
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > United States > California (0.14)
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- North America > United States > New York > Richmond County > New York City (0.14)
- North America > United States > New York > Queens County > New York City (0.14)
- North America > United States > New York > New York County > New York City (0.14)
- Research Report > Experimental Study (1.00)
- Questionnaire & Opinion Survey (1.00)
We thank the reviewers for their encouraging and instructive comments, and the AC for guiding the review process.
Gray (2013), and may look a bit too complicated. We will add a remark in line with our comment above. Note that the assumption on the encoder gap is very mild.
R2: It is not clear that sparsity-promoting encoders are the right models to be studying. Ours is the first work to address this.