Statistics are ubiquitous, used to convey information about everything from money markets to medicine. With their clear scales and round numbers, they appeal to our intuition and seem simple to grasp. Yet this veneer of simplicity is often misleading.
Imagine you’re given an HIV test, which you’re told is 99.99 per cent accurate. It comes back positive. So what are the odds that you have HIV? Instinct tells most of us that we almost certainly have the virus. Yet for most patients the actual answer is closer to 50 per cent.
If you’re perplexed by that result you’re in good company; most people, including many medical professionals, tend to be equally flummoxed.
Being HIV-negative
The seemingly bizarre result is a consequence of Bayes’ theorem, a mathematical framework for combining conditional probabilities. What a positive result actually tells us depends on the chance that a person has the virus in the first place.
For a typical low-risk person, the chance of having HIV is about 1 in 10,000. Imagine that 10,000 people walk in for an HIV test; one of them has the virus and will almost certainly test positive. But among the remaining 9,999, one more will test positive because of the limits of the test’s accuracy. That leaves two positive tests, only one of which – in other words, 50 per cent – is a true positive.
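The same counting argument can be written out as a short calculation. Here is a minimal sketch in Python, assuming the quoted 99.99 per cent accuracy applies both to detecting true infections (sensitivity) and to ruling out healthy people (specificity):

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """Chance of truly being infected given a positive test (Bayes' theorem)."""
    true_positives = prevalence * sensitivity
    false_positives = (1 - prevalence) * (1 - specificity)
    return true_positives / (true_positives + false_positives)

# Low-risk population: 1 in 10,000 infected, test 99.99 per cent accurate.
ppv_low = positive_predictive_value(prevalence=1 / 10_000,
                                    sensitivity=0.9999,
                                    specificity=0.9999)
print(f"{ppv_low:.0%}")  # -> 50%
```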
This result is not down to the test being inadequate. The HIV test in our example is very accurate, but the rarity of the illness makes the conditional probability much lower than we might intuitively expect.
In fact, the prior likelihood that a particular subject is infected is inextricably entangled with what a positive result actually means.
Consider the same test administered to a high-risk population, such as drug users. Here the infection rate is about 1.5 per cent, so if 10,000 high-risk people get tested, about 150 will have the virus and test positive. Among the remaining patients there will be roughly one false positive. In this instance the odds of having HIV given a positive test are 150 out of 151, or 99.34 per cent – markedly different from the low-risk situation.
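Reusing the sketch above with the high-risk prevalence shows the swing:

```python
# High-risk population: 1.5 per cent infected, same test as before.
ppv_high = positive_predictive_value(prevalence=0.015,
                                     sensitivity=0.9999,
                                     specificity=0.9999)
print(f"{ppv_high:.2%}")  # -> 99.35%, in line with the 150-out-of-151 count
```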
Despite their ubiquity, statistical concepts can be difficult to grasp. Bayes’ theorem in the HIV-test example illustrates the way that statistics’ seeming simplicity hides layers of complexity and can lead us to erroneous conclusions in science, politics, economics and other arenas.
Death by statistics
This is not just academic: we live in an age when statistics decide everything from medical treatments to government action. When mistakes are made there can be a high human cost.
This is well illustrated by the case of Roy Meadow, the British paediatrician well known for his conjecture that "one sudden infant death is a tragedy, two is suspicious and three is murder, until proved otherwise". Meadow's work was highly regarded, influencing for a time the thinking of many social workers. Yet some of his conclusions were based on a misreading of statistics.
In the UK in 1999, Sally and Stephen Clark lost two young sons to what appeared to be sudden infant death syndrome. Exacerbating this grief, Sally was accused of murder, and Meadow gave evidence against her, saying that for a middle-class, nonsmoking family such as the Clarks, the likelihood of an occurrence of the syndrome was one in 8,543. He further asserted that the chance of two cases in the one family was one in 8,543 squared – or almost one in 73 million.
The media snapped up this soundbite as proof of guilt. Based largely on Meadow’s testimony, Clark was vilified by the press and convicted of murdering her children.
This verdict horrified statisticians, for good reason: multiplying probabilities together is correct for independent events, such as coin flips and roulette-wheel spins. But it fails horribly when the events are not independent. Sudden infant death syndrome tends to run in families, perhaps because of a genetic or environmental factor, and in this case the assumption that the two deaths were independent was nonsensical.
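A rough sketch shows how much the independence assumption matters; the tenfold dependence factor below is purely illustrative, not an estimate from the actual case:

```python
p_single = 1 / 8_543  # Meadow's figure for one death in a family like the Clarks

# Meadow's flawed assumption: treat the two deaths as independent coin flips.
p_independent = p_single ** 2
print(f"1 in {1 / p_independent:,.0f}")  # -> 1 in 72,982,849 (the 'one in 73 million')

# If a shared genetic or environmental factor made a second death, say,
# ten times more likely, the joint probability is ten times higher too.
dependence_factor = 10  # illustrative assumption only
p_dependent = p_single * (p_single * dependence_factor)
print(f"1 in {1 / p_dependent:,.0f}")    # -> 1 in 7,298,285
```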
The prosecution and national media reasoned that Sally Clark was guilty, but they committed a statistical faux pas so common in courts that it is known as the prosecutor's fallacy; even if the probability given by Meadow had been correct, the inference drawn from it was quite wrong.
Although multiple cases of sudden infant death syndrome may be rare, so too are multiple maternal infanticides. Both explanations need to be compared with each other to determine which is more likely. On their own, figures such as one in 73 million tell us nothing about which hypothesis to prefer.
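In code, the fallacy amounts to quoting one tiny probability instead of weighing the rival explanations against each other. Both figures below are illustrative placeholders, not estimates from the Clark case:

```python
# Two rare hypotheses that both explain the same evidence.
p_double_sids = 1 / 73_000_000     # Meadow's contested figure
p_double_murder = 1 / 600_000_000  # hypothetical rate of double maternal infanticide

# What matters is the ratio between them, not either number on its own.
likelihood_ratio = p_double_sids / p_double_murder
print(f"Double SIDS is roughly {likelihood_ratio:.0f} times more likely here")
```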
The Royal Statistical Society strongly criticised the prosecution’s abuse of statistics, but its protestations fell on deaf ears.
Stephen Clark dedicated himself to the case full time, and his wife’s conviction was finally overturned in 2003. By this stage she had spent more than three years in jail and suffered protracted grief and a number of serious psychological disorders. She died in 2007 of acute alcohol intoxication.
Her appalling story is a reminder that numbers matter, and it is vital to understand that statistics divorced from context and qualification are fertile ground for confusion.
If a newspaper headline screams that eating a certain type of food doubles your risk of a particular cancer, it might seem reasonable to stop eating it. But if the initial odds of getting this cancer are only one in 10 million, then the subsequent odds become one in five million – a very minor increase.
Both figures are correct, but the former is relative risk and the latter absolute risk.
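A two-line check makes the distinction concrete:

```python
baseline = 1 / 10_000_000  # absolute risk before the dietary scare
doubled = 2 * baseline     # the headline's 'doubled' risk: 1 in 5 million
print(f"Extra absolute risk: {doubled - baseline:.8f}")  # -> 0.00000010, one case per 10 million
```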
Guessing lottery numbers
Our psychology, too, can deceive us. Truly random events have no “memory” of previous events, yet our tendency to extrapolate from past observations leads us astray. For example, in a national lottery the sequence 1-2-3-4-5-6 is as likely to tumble from the machine as any other combination, yet picking this combination intuitively feels less likely than a wider spread of numbers.
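Counting the combinations confirms it; a minimal sketch, assuming a six-from-47 draw (formats vary from lottery to lottery):

```python
from math import comb

tickets = comb(47, 6)  # number of possible six-number combinations
print(tickets)         # -> 10737573
print(1 / tickets)     # the chance of 1-2-3-4-5-6 – and of every other ticket
```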
House price rises and falls
In economics and politics, statistical mistakes are rife, often perpetuated by an uncritical media. If a house valued at €200,000 falls in value by 50 per cent in one year and rises by 50 per cent the next, it may be reported that the house has recovered its former market value. Yet this is false: at the end of the first year, the house is worth only €100,000. Increasing by 50 per cent the next year, it rises to €150,000: 75 per cent of its initial value.
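The arithmetic is trivial but routinely fumbled:

```python
value = 200_000  # starting price in euro
value *= 0.5     # falls 50 per cent: 100,000
value *= 1.5     # rises 50 per cent the next year: 150,000
print(value)     # -> 150000.0, just 75 per cent of the original value
```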
Numbers devoid of context can convey misleading impressions, and it can take some finesse and clever questions to see the true message. If we are to benefit from statistical analysis we owe it to ourselves to improve our understanding, lest we fall victim to misconception.