Scientific Peer Review Process
The job of your group is to consider the peer review process.
There have been several high-profile incidents of dramatic failures in the process. In the Sokal Affair, a physics professor submitted a paper to Social Text to investigate whether “a leading North American journal of cultural studies – whose editorial collective includes such luminaries as Fredric Jameson and Andrew Ross – [would] publish an article liberally salted with nonsense if (a) it sounded good and (b) it flattered the editors’ ideological preconceptions”. The paper was published, of course.
In response to this, several MIT scholars designed a tool, SCIgen, which generated gibberish papers, and had one accepted at WMSCI 2005, admittedly a weak conference. What was less publicized was that, in the decade that followed, over 120 papers generated by SCIgen appeared in credible conferences from IEEE and Springer.
These are not isolated incidents. In my experience, the reviewers of all but one of my papers gave no indication of having read or understood them (the exception being IEEE Transactions on Circuits and Systems II, where the reviewers not only read the paper, but re-did all of my derivations, caught errors, and provided very thoughtful feedback on both the concept and the writing). I include sample reviews below.
- If you’ve published articles, what has your experience with the peer review process been?
- What incentives does academia give reviewers to do a good job? What pressures are there on academics’ time? Will academics who spend a lot of time on reviews get academic jobs? Tenure?
- Consider three otherwise identical academics: (1) one who does careful work, free of errors; (2) one who does sloppy work with many accidental errors; (3) one who occasionally fakes results. Who is most likely to be selected for academic jobs? For tenure?
- Does this process result in good science? If not, how would you improve on it? Consider the incentives of academics, university administrators, conference organizers, reviewers, and other participants in the system.
For participants unfamiliar with the review process, I attach a sample set of reviews from a paper about deployment of a MOOC recommender system submitted to an IEEE conference.
----------------------- REVIEW 1 ---------------------
PAPER: 184
TITLE: Point-of-need-help at Scale: Recommender Systems
OVERALL EVALUATION: 1 (weak accept)
Relevance: 5 (excellent)
Originality: 4 (good)
Research significance: 2 (poor)
Technical Quality: 2 (poor)
Research context/knowledge of the field: 4 (good)
Form - Organization and readability: 4 (good)
Form - Grammar and style: 4 (good)
Best Paper Nomination: 1 (Definitely not)
----------- REVIEW -----------
The paper aims to promote the use of student crowsourcing to obtain quality recomendations. Authors claim that students who arrived at an incorrect answer and later a correct answer could submit a remediation which would be seen by future students who made the same mistake.
Comment: This sentence confuses our system with one described in the prior-work section, and the review continues to confuse the two throughout.
however, the authors speak nothing about how are testing the student remediation to be sure that they are correct. This is a crucial point but it is not treated in the paper. It would be very interesting to see a grafic with percentage of correct remediation and incorrect remediation. Explanation of Figure 3 is not understood. Which are the correlations shown in Figure 3? It is missing the explanation about the recomendation algorithm used. The page contained the source code, https://github.com/ANONYMIZED/ANONYMIZED is not operative.
Comment: We replaced identifying information throughout the paper with ‘ANONYMIZED.’
----------------------- REVIEW 2 ---------------------
PAPER: 184
TITLE: Point-of-need-help at Scale: Recommender Systems
OVERALL EVALUATION: -2 (reject)
Relevance: 5 (excellent)
Originality: 3 (fair)
Research significance: 2 (poor)
Technical Quality: 2 (poor)
Research context/knowledge of the field: 3 (fair)
Form - Organization and readability: 3 (fair)
Form - Grammar and style: 4 (good)
Best Paper Nomination: 1 (Definitely not)
----------- REVIEW -----------
The paper describes a system for resource suggestion by participants on online courses. The authors do not describe in depth the theoretical foundations of their work, and the analysis of the results is very preliminary and somewhat ad-hoc. The presented qualitative results are limited to a very restricted scale, given that the number of participants was small
Comment: The paper described the first use of a recommender system in a MOOC, and the largest deployment of such a recommender system to date, with thousands of students; prior work in RecSys used residential classrooms, typically with dozens of students. The reviewer somehow missed that this was a MOOC, even though the number of students was stated in the submitted paper and shown in all of the plots. The rest of the review presumes we did this in a single residential classroom.
and the observations were limited to one course. Thus, the main point of the paper, the applicability of the proposed recommender in large-scale environments, is not proved by the presented experiments. Furthermore, the presented results did not show any significant effect of the system in student performance. The paper is well-written and the goal and methodology of the presented research is clear (disregarding the aforementioned limitations). Overall, the discussed work needs a complete, more detailed and in larger scale redesign of the experiments in order to assess the main hypothesis posed by the authors.