The GRIM test is very simple. The name stands for “granularity-related inconsistency of means”. It evaluates whether a reported mean is arithmetically possible given its reported sample size, and it works on integer data and small N.

Here is how this works:

Let’s make a pretend sample of twelve undergraduates, with ages as follows:

17,19,19,20,20,21,21,21,21,22,24,26

The average age is 20.92 (2dp), and we run the experiment on a Monday.

However, the youngest person in our sample is about to turn 18. At midnight, their age ticks over, […]

[W]e run the experiment again on Tuesday. Now our sample has the following age data:

18,19,19,20,20,21,21,21,21,22,24,26

The average age is 21 exactly.

Now, consider this: the sum of ages just changed by one unit, which is the smallest amount possible. It was 251 (which divided by 12 is 20.92), and with the birthday of the youngest member, became 252 (which divided by 12 is 21 exactly).
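The arithmetic above can be reproduced in a few lines. This is only an illustrative sketch: the variable names are made up for the example, and exact fractions are used so that floating-point rounding does not blur the step size.

```python
from fractions import Fraction

# Ages from the example above (Monday), then after the youngest turns 18 (Tuesday).
monday = [17, 19, 19, 20, 20, 21, 21, 21, 21, 22, 24, 26]
tuesday = [18] + monday[1:]

n = len(monday)  # 12 participants

# Exact means as fractions of an integer sum over n.
mean_monday = Fraction(sum(monday), n)
mean_tuesday = Fraction(sum(tuesday), n)

# The smallest possible change in the mean is one unit of the sum, i.e. 1/n.
step = mean_tuesday - mean_monday

print(sum(monday), round(float(mean_monday), 2))   # 251 20.92
print(sum(tuesday), float(mean_tuesday))           # 252 21.0
print(step, round(float(step), 4))                 # 1/12 0.0833
```

With twelve people, no mean between 20.92 and 21 is reachable, because no whole-number sum lies between 251 and 252.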

So with 12 participants, the mean can only move in steps of 1/12 (about 0.083). If a reported mean cannot be the result of dividing a whole-number sum by 12, it cannot be right as reported. The authors collected 260 psychology papers and checked whether the reported statistics are even possible. Many are not.
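The check described above might be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `grim_consistent` is hypothetical, the mean is passed as a string so that the number of reported decimal places is preserved, and a mean is accepted if any whole-number sum lands within half a unit of the last reported decimal place (so either rounding convention passes).

```python
from decimal import Decimal, ROUND_HALF_UP


def grim_consistent(reported_mean: str, n: int) -> bool:
    """Hypothetical helper: can a mean of n integers round to reported_mean?"""
    # Number of decimal places the mean was reported to, e.g. "20.92" -> 2.
    decimals = len(reported_mean.partition(".")[2])
    target = Decimal(reported_mean)

    # Half a unit in the last reported decimal place, e.g. 0.005 for 2 dp.
    half_unit = Decimal(5) / Decimal(10) ** (decimals + 1)

    # The integer sum closest to target * n; check its neighbours too.
    nearest = int((target * n).to_integral_value(rounding=ROUND_HALF_UP))
    for total in (nearest - 1, nearest, nearest + 1):
        if abs(Decimal(total) / n - target) <= half_unit:
            return True
    return False


print(grim_consistent("20.92", 12))  # True  (251 / 12)
print(grim_consistent("21.00", 12))  # True  (252 / 12)
print(grim_consistent("20.95", 12))  # False (no integer sum fits)
```

This also shows why the test needs small N: for means reported to two decimal places, once N reaches 100 the step 1/N is no larger than 0.01, so every two-decimal value becomes reachable and the check can no longer flag anything.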

Published in Science