Statistical reasoning in the Post Office trial

In the Post Office Trial, one ground of the Post Office’s defence was based on statistical reasoning about losses reported by subpostmasters and how likely it was that these resulted from bugs in Horizon. My criticism of this approach below was originally published on Twitter on 14 June 2019.

… different conclusions
PG you’ve relied on a false premise to reach a conclusion.
DW no
PG quotes @sjmurdoch’s tweets about lottery winners. Says you walk into a room and meet a load of lottery winners and you’re really surprised until you see a sign on the door saying...
— Nick Wallis (@nickwallis) June 13, 2019

Reasoning about causes given effects is hard because it requires thinking backwards in time. Let’s instead think forwards, assuming a cause and working out the effect. We can use the result to work out the relative likelihood of causes (through Bayes’ theorem).

Scenario 1: assume claimants’ shortfalls are not due to Horizon bugs. We, therefore, expect the effect of bugs to be evenly spread over claimants and non-claimants.

It’s likely that the large shortfalls reported by claimants are hence not due to their small share of Horizon-bug effects. But remember, we assumed this in the first place, so this exercise didn’t teach us anything.

The Post Office expert witness only considered this scenario, but it misses the alternative explanation – Scenario 2: assume claimants’ shortfalls are due to Horizon bugs.

Bugs may affect subpostmasters randomly, but the affected subpostmasters are much more likely to subsequently join the litigation. It’s therefore not statistically valid to share the effect of bugs equally over claimants and non-claimants.

It’s therefore likely that Horizon bugs are indeed the cause of a large proportion of claimants’ shortfalls. But we assumed this in the first place, so we don’t learn anything new in this case either.
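
The selection effect behind Scenario 2 can be illustrated with a small calculation. The sketch below is not based on any figures from the trial: the population size, the chance of a bug and the chances of joining the litigation are all hypothetical numbers, and `claimant_bug_share` is a name made up for this illustration. It shows that even when bugs strike at random, the group of claimants can contain a much higher share of bug-affected subpostmasters than the population as a whole.

```python
def claimant_bug_share(n_spms, p_bug, p_join_if_bug, p_join_if_no_bug):
    """Expected fraction of claimants whose branch was hit by a bug.

    All parameters are hypothetical; none are taken from the trial.
    """
    affected = n_spms * p_bug                  # bugs strike at random
    unaffected = n_spms - affected
    claimants_affected = affected * p_join_if_bug
    claimants_unaffected = unaffected * p_join_if_no_bug
    return claimants_affected / (claimants_affected + claimants_unaffected)


# Hypothetical inputs: 5% of 10,000 subpostmasters are hit by a bug;
# half of those join the litigation, against 1% of everyone else.
share = claimant_bug_share(10_000, 0.05, 0.5, 0.01)
print(f"Bug-affected share of all subpostmasters: {0.05:.0%}")
print(f"Bug-affected share of claimants: {share:.0%}")
```

With these made-up inputs, about 72% of claimants are bug-affected, compared with 5% of subpostmasters overall. If the two joining probabilities were equal, the claimants would be a random sample and the two shares would match.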

The likelihood of seeing the effects (not in dispute) is roughly the same regardless of which cause we assume. Bayes’ theorem tells us that we, therefore, don’t learn anything new about the likelihood of the cause (which is the topic of dispute) either.

Bottom line: if the statistics are done properly, this line of argument is a waste of time, and the judge was right to bring it to an end.

Postscript

The judgment, published in December 2019, indeed largely dismisses the statistical argument made by the Post Office, for reasons including the one I gave above:

Dr Worden [(expert witness for the Post Office)] also assumed that the claimants are a random sample of [subpostmasters (SPMs)]. That this is one of his assumptions can be identified in the following way. His analysis of likelihood of any bug impacting any SPM’s branch accounts would apply equally to any group of SPMs chosen at random from those SPMs with smaller branches. The only adjustment or account taken of the claimant SPMs’ particular characteristics is that they are, generally, from smaller branches, and for this Dr Worden applies what he calls a scaling factor. However, the claimants are not a random sample of SPMs, nor are they even a random sample of SPMs from smaller branches. As a sample, they have already been filtered or selected in that these particular SPMs already complain of bugs, defects and errors in Horizon having affected their branch accounts. This means that they are not a random sample. The way this would be expressed in statistical terms is that the claimant SPMs do not accurately represent the population of SPMs as a whole (or even the population of SPMs who had smaller branches). The claimants are essentially self-selected, from those who believe they have experienced shortfalls and discrepancies in their accounts from the impact of bugs, errors and defects, and who have been prepared to join the litigation. The group has a bias, in statistical terms. They plainly cannot be treated, in statistical terms, as though they are a random group of 587 SPMs.
...
The section 8 analysis is, in my judgment, so riddled with plainly insupportable assumptions as to make it of no evidential value. It is the mathematical or arithmetic equivalent of stating that, given there are 3 million sets of branch accounts, and given there are so many sets of branch accounts of which no complaint is made, the Horizon system is mostly right, most of the time. It is a little more sophisticated than that, but not by very much.

Mathematical interpretation

We can also express this discussion mathematically. Let’s call the claimants’ position, that their shortfalls are due to Horizon bugs, \(h\) and call the Post Office’s position, that there’s some other cause for the shortfalls, \(o\). To decide whether the claimants or the Post Office are correct, we would like to know the relative likelihood of the two parties’ positions, given the shortfalls experienced by the claimants \(s\):

\[\mathrm{P}(h|s) \over \mathrm{P}(o|s)\]

Evaluating the above requires reasoning backwards in time (i.e. what’s the likelihood of the scenario given the evidence), but Bayes’ theorem tells us how to invert the conditional probabilities to give us an easier problem to solve (i.e. what’s the likelihood of the evidence given the scenario):

\[{\mathrm{P}(h|s) \over \mathrm{P}(o|s)} = {\mathrm{P}(s|h) \over \mathrm{P}(s|o)} \times {\mathrm{P}(h) \over \mathrm{P}(o)}\]

In other words, the relative likelihood of the two parties’ positions (the posterior) is equal to \(\mathrm{P}(s|h) \over \mathrm{P}(s|o)\) (the Bayes factor) multiplied by our prior belief about the parties’ positions. Let’s calculate the Bayes factor.

First, consider \(\mathrm{P}(s|o)\). This is the probability of seeing these shortfalls, assuming that Horizon bugs did not cause them. This is scenario 1 discussed above and is quite plausible because bugs will be evenly spread over claimants and non-claimants alike, and so the large shortfalls likely have a cause that’s not Horizon bugs.

Then consider \(\mathrm{P}(s|h)\). This is the probability of seeing these shortfalls, assuming that Horizon bugs did cause them. This is scenario 2 discussed above and is also quite plausible since bugs will disproportionately affect claimants because that’s how the group was formed.

Therefore \(\mathrm{P}(s|o)\) and \(\mathrm{P}(s|h)\) are approximately equal, and so \(\mathrm{P}(s|h) \over \mathrm{P}(s|o)\) is close to 1. The posterior we want to find out, \(\mathrm{P}(h|s) \over \mathrm{P}(o|s)\), is therefore approximately \(\mathrm{P}(h) \over \mathrm{P}(o)\). This is our prior belief before seeing the evidence, and so the argument put forward by the Post Office doesn’t tell us anything we didn’t already know.
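
As a rough numerical check of this argument, the odds form of Bayes’ theorem above can be evaluated directly. The probabilities below are placeholders chosen only for illustration, not estimates from the trial, and `posterior_odds` is a hypothetical helper name. When the two likelihoods are equal the Bayes factor is 1, so the posterior odds equal whatever prior odds went in.

```python
def posterior_odds(p_s_given_h, p_s_given_o, prior_odds):
    """Posterior odds P(h|s)/P(o|s) = Bayes factor * prior odds P(h)/P(o)."""
    bayes_factor = p_s_given_h / p_s_given_o
    return bayes_factor * prior_odds


# Placeholder likelihoods: the shortfalls are equally probable under h and o.
for prior in (0.1, 1.0, 10.0):
    print(prior, posterior_odds(0.8, 0.8, prior))
```

Whatever prior odds are chosen, the evidence leaves them unchanged, which is the sense in which the statistical argument tells us nothing new.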
