Sunday, December 3, 2023

More on why I am not a fan of pre-registration

This is a draft follow-up to my earlier post on prediction, accommodation, and pre-registration: https://judgmentmisguided.blogspot.com/2018/05/prediction-accommodation-and-pre.html

I argued there that some of the appeal of pre-registration results from a philosophical mistake: the idea that prediction of a result is better than post-hoc accommodation of the result once it is found, holding constant the fit of the result to its explanation.

Here I comment on pre-registration from the perspective of editor of Judgment and Decision Making, a journal concerned largely with applied cognitive psychology. I try to answer some common points made by the defenders of pre-registration.

1. As editor, I find myself arguing with authors who pre-registered their data analysis, when I think that their pre-registration is just wrong. Typically, the pre-reg (pre-registration document) ignores our statistics guidelines at https://jbaron.org/journal/stat.htm. For example, it proposes some sort of statistical control, or a test of an interaction that may be removable. Although it is true that authors do not need to do just what they say in the pre-reg, they must still explain why they changed, and some authors still want to fully report both their pre-registered analysis and what I think is the correct one.

I don't see why pre-registration matters here. For example, one common issue is what to exclude from the main analysis. Often the pre-reg specifies what will be excluded, such as the longest 10% of response times, but I often judge this idea to be seriously inferior to something else, such as using a log transform. (The longest times may even reflect the most serious responding, and their outsized influence on statistics can usually be largely eliminated by transformation.)  The author might argue that both should be reported because the 10% idea was thought of beforehand. But does it matter when you think of it? If it is such an obvious alternative to using the log, then you could think of it after collecting the data. (This is related to my blog post mentioned earlier.) If the main analysis will now be based on logs, it doesn't even matter if the decision to use 10% was thought of after finding that it yielded clearer results (p-hacking).
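To make the exclusion point concrete, here is a small sketch with invented data (the lognormal response times, sample size, and skew measure are all assumptions for illustration, not from any real study). Trimming the longest 10% discards data yet leaves the distribution skewed; taking logs keeps every response while taming the influence of the longest times.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical right-skewed response times (seconds); RTs are
# often roughly lognormal, which motivates the log transform.
rt = rng.lognormal(mean=0.0, sigma=0.6, size=1000)

# Option A (common in pre-regs): drop the longest 10% of responses.
trimmed = np.sort(rt)[: int(0.9 * len(rt))]

# Option B (often preferable): keep all responses, analyze log(RT).
log_rt = np.log(rt)

def skew(x):
    # Standardized third moment: 0 for a symmetric distribution.
    x = np.asarray(x)
    return float(np.mean(((x - x.mean()) / x.std()) ** 3))

print(f"raw skew:     {skew(rt):.2f}")      # strongly right-skewed
print(f"trimmed skew: {skew(trimmed):.2f}") # less skewed, but 10% of data gone
print(f"log skew:     {skew(log_rt):.2f}")  # near zero, nothing discarded
```

The point is not that logs are always right, but that the choice should rest on the shape of the data, not on which option was written down first.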

2. It may be argued that pre-registration encourages researchers to think ahead. It might do that, but the effect would be subtle, since it mostly prompts thinking about issues that would be considered anyway.

The most common failure to think ahead is to neglect alternative explanations of an expected result. You can find that in pre-registrations as well as in submitted papers. Maybe pre-registration helps a little, like a nudge. But the most common alternative explanations I see are things like reversed causality (mediator vs. DV), or third-variable causality, in mediation analysis. Pre-regs sometimes propose mediation analyses without considering these potential problems. Another common alternative explanation is that interactions are due to scaling effects (hence "removable"). I have never seen anyone think of this in advance. Most people haven't heard of this problem (despite Loftus's 1978 paper in Memory and Cognition). The same goes for the problem with statistical control (again pointed out decades ago, by Kahneman among many others), which authors also put in pre-regs.
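The scaling problem behind removable interactions can be shown with a toy example (the 2x2 cell means below are invented purely for illustration): a nonzero difference-of-differences on the raw scale can vanish entirely under a monotonic transform such as the log, so the "interaction" is an artifact of the measurement scale rather than a real crossover of effects.

```python
import numpy as np

# Hypothetical cell means for a 2x2 design (made-up numbers in which
# the treatment effect is multiplicative: a 50% increase in each group).
means = np.array([[1.0, 1.5],   # group A: control, treatment
                  [2.0, 3.0]])  # group B: control, treatment

def interaction(m):
    # Difference of differences: zero means no interaction.
    return (m[0, 1] - m[0, 0]) - (m[1, 1] - m[1, 0])

print(interaction(means))          # nonzero: an apparent interaction
print(interaction(np.log(means)))  # ~0: it vanishes under a log transform,
                                   # so it is "removable" in Loftus's sense
```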

3. Does pre-registration protect against p-hacking anyway?  Psychology papers are usually multi-study. You can pre-register one study at a time, and that is what I usually (always?) see. So you don't have to report the related studies you did that didn't work, even if you pre-registered each one, although honest reporting would do that anyway. This is a consequence of the more general problem that pre-registration does not require making the results public, whether the study works or not. Unlike some clinical trials, you can pre-register a study, do it, find that the result fails to support the hypothesis tested, and put the study in the file drawer. In principle, you can even pre-register two ways of doing the same study or analysis and then refer to the pre-registration that fits better when you write the paper. (I suspect that this has never happened. Unlike failing to report those studies that failed, this would probably be considered unethical. But, if a journal starts to REQUIRE pre-registration, the temptation might be much greater.)

4. What do you do to detect p-hacking, without a pre-reg?  I ask whether the analysis done is reasonable or whether some alternative approach would be much better. If a reasonable analysis yields p=.035 for the main hypothesis test, this is a weak result anyway, and it doesn't matter whether it was chosen because some other reasonable analysis yielded p=.051. Weak results are often so strongly consistent with what we already know that they are still very likely to be real. If they are surprising, it is time to ask for another study. Rarely, I find that it is helpful to look at the data; this sometimes happens when the result is reasonable but the analysis looks contrived, so I wonder what is going on.

Pre-registration inhibits the very helpful process of looking at the data before deciding how to proceed with the analysis. This exploration is so much a part of my own approach to research that I could not possibly pre-register anything about data analysis. For example, in determining exclusions I often look at something like the distribution of (mean log) response times for the responses to individual items. I often find a cluster of very fast responders, separate from the rest. Sometimes the subjects in these clusters give the same response to every question, or their responses are insensitive to compelling variations that ought to affect everyone. I do this before looking at the effects of removing these subjects on the final results.
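A rough sketch of this kind of exploration (with invented data; the subject counts, cluster locations, and largest-gap heuristic are all assumptions for illustration, and the real process is more informal and visual): a separate cluster of fast responders shows up as a large gap in the sorted per-subject mean log response times.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical per-subject mean log response times: most subjects
# cluster around 1.5 log-seconds, plus a small group of very fast
# responders around -0.5 (these numbers are made up).
careful = rng.normal(loc=1.5, scale=0.3, size=90)
fast = rng.normal(loc=-0.5, scale=0.2, size=10)
mean_log_rt = np.concatenate([careful, fast])

# Look for the largest gap in the sorted distribution; a big gap below
# the bulk of subjects suggests a separate cluster of fast responders,
# worth inspecting (e.g., for identical or insensitive responses).
srt = np.sort(mean_log_rt)
gaps = np.diff(srt)
cut = np.argmax(gaps)
fast_cluster = srt[: cut + 1]

print(f"largest gap: {gaps[cut]:.2f} log-units; "
      f"{len(fast_cluster)} subjects below it")
```

The decision to exclude would still rest on inspecting what those subjects actually did, not on the gap alone.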

5. It seems arrogant to put your own judgment ahead of the authors'.

When it comes to judging other people's papers as editor, I think the relationship between author and editor is not one of equality. I do not need to give equal weight to the author's judgment as reflected in the pre-reg, just as I do not need to give equal weight to the reviewers' opinions and my own.

When I handle a paper, I am the editor. It is my job to enforce my own standards, not to enforce some consensus in which everyone is equal. (I see no point in that when the apparent consensus produces so much junk. "Peer reviewers" are in such demand that any warm body will do. The situation is worst in grant-review panels, which usually don't have an "editor" in charge.) Some editors are better than others. There is nothing I can do about that. (Two of the best were Ray Nickerson and Frank Yates, both of whom died recently.) Journal editors are like judges in the legal system, CEOs of corporations, or deans of universities. They are given special status. We hope they live up to their status.