Saturday, July 21, 2012

How to update your beliefs in light of new evidence

This post is the second in a series on probability and statistical inference, and follows this one, in which I asked readers a question that I had once put to applicants to Oxford in physics interviews. If you have not yet answered the question in the poll, you should click on the link above and try it first! The poll will remain open for anyone who would like to try answering the question before reading the answer below.

The question I asked was:
You learn about the existence of some rare disease, X, which is known to affect 1 in every 10,000 people. Being a bit of a hypochondriac, you are afraid that you may have disease X, so you go to your doctor for a blood test. The doctor tells you that the test for this disease is accurate 99.9% of the time. To your horror, the test result comes back positive. What is the probability you have disease X?
The correct answer to this question is 0.091, or a probability of 9.1%. (At the time of writing, this is the most popular answer in the poll, but more than half of the respondents chose one of the other four answers.)

Someone suggested to me that the question was a little unfair, because it could only be answered by someone who had already studied the relevant mathematics (Bayes' theorem). I don't think this is true, which is why I chose this problem as an introduction to this series of posts in which I want to explain and discuss the importance of the Bayesian approach to evidence! To demonstrate this, I will first show how to arrive at the answer without any mention of the theorem, and then move to a more advanced discussion.

Suppose that everyone in the country underwent the same blood test. Based on the incidence rate of disease X, we would expect on average 1 person in every 10,000 to actually have the disease, and the remaining 9,999 to be completely healthy. Since the test to detect disease X is accurate 99.9% of the time, the probability that a healthy person would receive a false positive result from the test is 0.1%. So on average 0.1% of the 9,999 healthy people receive a test result that wrongly says they are suffering from X. 0.1% of 9,999 is 9.999, or almost exactly 10 people. Let's assume that the test has exactly zero probability of giving a false negative result (though you could take this to be 0.1% as well without making any real difference to the calculation). This means the one genuinely ill person in every 10,000 also gets a positive test result.

Overall, then, out of every 10,000 people taking the test, on average 11 will receive a positive test result, though in reality only one of them has disease X. So, given your test result, the probability that you are the unlucky one actually suffering from X is simply 1 in 11, or approximately 9.1%.
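As a minimal sketch, the counting argument above can be written out in a few lines of Python. The function name and its parameters are illustrative assumptions, not anything from the original question; the numbers are the ones given in the post, with the false negative rate defaulting to zero as assumed above.

```python
def expected_positives(population, prevalence, false_positive_rate,
                       false_negative_rate=0.0):
    """Return the expected (true positives, false positives) in a tested population."""
    sick = population * prevalence
    healthy = population - sick
    # Ill people who are correctly flagged by the test.
    true_positives = sick * (1.0 - false_negative_rate)
    # Healthy people who are wrongly flagged by the test.
    false_positives = healthy * false_positive_rate
    return true_positives, false_positives


true_pos, false_pos = expected_positives(10_000, 1 / 10_000, 0.001)
fraction_ill = true_pos / (true_pos + false_pos)

print(f"expected true positives:  {true_pos:.3f}")
print(f"expected false positives: {false_pos:.3f}")
print(f"fraction of positives who are ill: {fraction_ill:.4f}")
```

Passing `false_negative_rate=0.001` instead of the default changes the final fraction only in the fourth decimal place, which is the sense in which that choice makes no real difference.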

Of course all questions are simple once you know the answer. The key point here was to be clear about exactly what question was being asked. Taking any person at random before they had been tested, if you were to ask "what is the probability that the test will give the correct result for this person?" the answer would indeed be 99.9%. But what we wanted to know was really an after-the-fact assessment: "Given that the test result has come back positive, and knowing how accurate the test is, how should we update our prior belief that this person has disease X?" I chose the numbers specifically to make the posterior probability (9.1%) come out to be quite different from both the prior probability (0.01%) and the probability that the test should give a correct result for any random person (99.9%).

The method for obtaining the answer sketched out above is easy to understand, but it is a bit tedious to count out all possibilities in this manner every time you want to answer such a question. Bayes' theorem is a general mathematical formulation that encodes the same logic and can quickly and conveniently be used to obtain the answer any time we need to update our prior beliefs after being confronted by new data. For a simple statement of the theorem as applicable to our example, let us denote by $P(A)$ the prior probability we associate with some proposition $A$, in this case $A$ being "have disease X", by $P(\bar{A})$ the probability of its negation ("do not have disease X"), and by $P(B)$ the probability of statement $B$, in this case "test result for X is positive". Conditional probabilities, such as "the probability that $A$ is true given that $B$ is known to be true," are written as $P(A|B)$. The probability that two statements are both true is obtained by multiplying the probability of one by the conditional probability of the other, $P(A \text{ and } B)=P(B|A)P(A)$ (which reduces to $P(A)P(B)$ only when the statements are independent), and the total probability for a statement to be true is the sum of the probabilities of the individual ways in which it can be true so that, for instance, $P(B)=P(B|A)P(A)+P(B|\bar{A})P(\bar{A})$.

Bayes' theorem then tells us that the answer to the question we posed is: $$P(A|B)=\frac{P(B|A)P(A)}{P(B)}\;.$$ Plug in the numbers for the example above and you should get approximately $P(A|B)=0.091$ out at the end. The difference made by asking the right question is the difference between calculating $P(A|B)$ and $P(B|A)$.
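For readers who want to check the arithmetic, here is the substitution written out, taking $P(A)=0.0001$, $P(\bar{A})=0.9999$, $P(B|\bar{A})=0.001$, and $P(B|A)=1$ (the zero false negative assumption made earlier; using $0.999$ instead changes the result only slightly): $$P(B)=1\times 0.0001+0.001\times 0.9999=0.0010999\;,$$ $$P(A|B)=\frac{1\times 0.0001}{0.0010999}\approx 0.091\;.$$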

So far so good, and I hope I haven't bored anyone who already knew this. Now the more interesting part. Bayes' theorem itself is of course uncontroversial and incontrovertible — it follows immediately from the basic laws of probability. What is less universally accepted is the correct interpretation of it. You might have noticed that I didn't refer to $A$ and $B$ as 'events' which may 'occur,' but as logical statements or propositions which may be true or false. This is the Bayesian interpretation, in which probability is regarded as a measure of our confidence in the truth of a given statement. The alternative 'frequentist' interpretation is that the probability of an event is a measure of the frequency with which it occurs in a given ensemble.

The problem with the frequentist interpretation can be understood by thinking of an experiment involving tossing an unbiased coin. Our naive expectation is that the probability of getting heads on any given toss should be 0.5, or 50%, which is entirely consistent with the Bayesian interpretation. A strict frequentist interpretation, however, requires us to actually toss the coin many times and count the fraction of tosses that result in heads. But if I toss a coin 10 times and get 10 heads, does this mean the coin is biased? What about 20 straight heads? Even with a very large number of tosses, not only will the percentage of heads obtained almost never be precisely 50%, it is also possible (though less likely the larger the number of tosses) to get a number very different from 50%! On the other hand, some events are inherently non-repeatable, so that the entire ensemble of trials from which the frequentist would divine his probability must necessarily be hypothetical.
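To put rough numbers on the coin example, the following Python sketch computes how likely a run of straight heads is for a fair coin, how likely it is to see exactly half heads, and what fraction of heads a simulated run actually gives. The function names are illustrative assumptions, and the simulation uses Python's standard pseudo-random generator with an arbitrary fixed seed, so it is only a stand-in for real coin tosses.

```python
import math
import random


def prob_all_heads(n_tosses):
    """Probability that a fair coin gives heads on every one of n_tosses."""
    return 0.5 ** n_tosses


def prob_exactly_half_heads(n_tosses):
    """Probability of exactly n_tosses/2 heads from a fair coin (n_tosses even)."""
    return math.comb(n_tosses, n_tosses // 2) / 2 ** n_tosses


def observed_heads_fraction(n_tosses, rng):
    """Simulate n_tosses fair tosses and return the observed fraction of heads."""
    heads = sum(rng.random() < 0.5 for _ in range(n_tosses))
    return heads / n_tosses


rng = random.Random(0)  # fixed seed so the run is repeatable
for n in (10, 100, 10_000):
    print(n, observed_heads_fraction(n, rng), prob_exactly_half_heads(n))

print("10 straight heads:", prob_all_heads(10))
print("20 straight heads:", prob_all_heads(20))
```

Ten straight heads from a fair coin has probability 1 in 1,024, so it is unusual but hardly impossible, while the chance of landing on exactly 50% heads shrinks as the number of tosses grows.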

The Bayesian interpretation lends itself more readily to the understanding that we actually have some degree of belief in a proposition that is dependent on the information available to us. When the information changes, our confidence in the truth or falsity of the proposition changes: in the example above, the information from the test result causes us to update our prior belief, which in turn was predicated on some other information (the prevalence rate of disease X that I gave you).
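One way to see this updating at work is to feed the posterior back in as the new prior when further evidence arrives. The sketch below does this for a hypothetical second positive test, which is not part of the original question; it assumes the second result is independent of the first given the person's true status and has the same error rates. The function and variable names are illustrative.

```python
def bayes_update(prior, p_positive_if_ill, p_positive_if_healthy):
    """Return P(ill | positive test) from the prior and the two conditional rates."""
    # Total probability of a positive result, P(B).
    p_positive = (p_positive_if_ill * prior
                  + p_positive_if_healthy * (1.0 - prior))
    # Bayes' theorem: P(A|B) = P(B|A) P(A) / P(B).
    return p_positive_if_ill * prior / p_positive


prior = 1 / 10_000                      # prevalence of disease X
after_first = bayes_update(prior, 1.0, 0.001)
# Assumption: a second, independent positive test with the same accuracy.
after_second = bayes_update(after_first, 1.0, 0.001)

print(f"after one positive test:  {after_first:.4f}")   # 0.0909
print(f"after two positive tests: {after_second:.4f}")  # 0.9901
```

Under these assumptions the first positive result moves the belief from 0.01% to about 9%, and a second one moves it to about 99%, because the second update starts from a much larger prior.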

Thinking about things this way highlights the importance of prior beliefs, in a way that the frequentist approach does not make explicitly clear. Take for instance the recent "faster-than-light" neutrino anomaly from the OPERA experiment. The quality of the evidence was very convincing — a result with 6$\sigma$ significance, corresponding to a much higher confidence that the observation was not due to chance than the 99.9% that I gave you for the test for disease X. And yet physicists around the world disbelieved the result, in one case offering to eat a pair of boxer shorts on TV if it turned out to be true. Why? Because their prior belief in the correctness of Einstein's special relativity (built up as a result of countless previous experiments) was so strong that a single experimental result like this was not able to alter the posterior much.

Of course, they were right to disbelieve: the experimental result turned out to be due to mistakes, including a faulty cable.

Outside of science too I believe we are all instinctively, though naively, Bayesian. I say naively because questions like the one I posed are liable to seem counter-intuitive to anyone who does not approach them explicitly using Bayes' Theorem. But we also all know people with very strong prior beliefs in some proposition who do not allow evidence to the contrary to change their minds. Equally, they may require only flimsy evidence to confirm a belief that is already quite firmly held — think Othello and Desdemona. In fact it appears there is some neuroscience research suggesting that people do tend to approach logical decision-making from a Bayesian perspective.

Acknowledging Bayes' theorem and explicitly using this process of updating beliefs in the face of new evidence would probably help people understand and justify the prior beliefs they hold, thus making their decisions more rational. Of all the lessons that science can teach us, I'd argue that this is the most useful one for daily life.

In the next post in this series (which will not be for some time!) I'll talk about Bayesian inference in cosmology.


  1. Interesting. Suppose that you have a compelling piece of evidence to the effect that a therapeutic recommendation (say, a statin) will reduce your risk of heart-related events. The prior would be the likelihood of an event without therapy and the posterior would be the likelihood of an event given treatment. Question: suppose you now have a second piece of evidence (say, a new randomized clinical trial) that says the same thing as the first piece of evidence. How does this change the probability? Finally, suppose instead that you do some research and come across a clinical trial that disagrees with the first clinical trial (i.e., it says that statins do not reduce the risk of heart events). How does this change the probability? Responses welcome.

  2. Before responding, a quick note on terminology: "likelihood" has a very specific meaning in probability and especially Bayesian methods, so to avoid confusion, the convention is to talk of "prior odds" for your level of belief before the new data arrive, and "posterior odds" for what your updated level of belief is.

    So, simply put: if the new trial is in line with the original, then it adds evidential weight to the case for effectiveness. If the new trial contradicts the first, it subtracts evidential weight. And as common sense would suggest, the amount of evidential weight depends on both the size of the effect found (e.g., an 80% improvement is more impressive than 10%) AND the size of the trial (so an 80% improvement found from a trial involving just 10 patients does not pack the evidential punch of a 10% improvement found in a trial of 1,000 patients).

    There's a vast literature on how these qualitative features are captured quantitatively.