## Probabilities are counterintuitive

Why are probabilities so unintuitive? Check this out, from “Mammogram Math” in The New York Times Magazine, discussing how misunderstandings about probabilities affect the public’s ability to understand policy choices (via Brett):

Assume there is a screening test for a certain cancer that is 95 percent accurate; that is, if someone has the cancer, the test will be positive 95 percent of the time. Let’s also assume that if someone doesn’t have the cancer, the test will be positive just 1 percent of the time. Assume further that 0.5 percent — one out of 200 people — actually have this type of cancer. Now imagine that you’ve taken the test and that your doctor somberly intones that you’ve tested positive. Does this mean you’re likely to have the cancer? Surprisingly, the answer is no.

To see why, let’s suppose 100,000 screenings for this cancer are conducted. Of these, how many are positive? On average, 500 of these 100,000 people (0.5 percent of 100,000) will have cancer, and so, since 95 percent of these 500 people will test positive, we will have, on average, 475 positive tests (.95 x 500). Of the 99,500 people without cancer, 1 percent will test positive for a total of 995 false-positive tests (.01 x 99,500 = 995). Thus of the total of 1,470 positive tests (995 + 475 = 1,470), most of them (995) will be false positives, and so the probability of having this cancer given that you tested positive for it is only 475/1,470, or about 32 percent! This is to be contrasted with the probability that you will test positive given that you have the cancer, which by assumption is 95 percent.

Counterintuitive, right?
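If you’d rather check the arithmetic than trust it, here is a minimal Python sketch of the same calculation using Bayes’ rule. The function and parameter names are my own, not from the article; the three numbers are the ones the article assumes.

```python
def posterior_given_positive(prevalence, sensitivity, false_positive_rate):
    """P(has cancer | tested positive), computed with Bayes' rule."""
    true_pos = prevalence * sensitivity                 # sick and flagged
    false_pos = (1 - prevalence) * false_positive_rate  # healthy but flagged
    return true_pos / (true_pos + false_pos)


# Numbers from the article: 0.5% prevalence, 95% detection, 1% false positives.
p = posterior_given_positive(
    prevalence=0.005, sensitivity=0.95, false_positive_rate=0.01
)
print(f"{p:.1%}")  # about 32%, matching 475/1,470
```

Try raising the prevalence to 5 percent: the answer changes a lot, which shows how heavily the result depends on how rare the disease is.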

Here’s another one:
Last week, Boing Boing carried a post about how delicious raw eggs are (apparently very delicious) and the risks of getting salmonella from them. According to the post, 1 in 20,000 raw eggs has salmonella in it. So how risky is it to eat them? The discussion went like this. Note that each of these people *thinks* they know something about calculating probabilities and is confident enough to comment about it on Boing Boing. For what it’s worth, I’m with @SamSam on this question. But then, probability was my lowest math grade for exactly this reason.

Here’s the thread from Boing Boing:

peterbruells | #3 | 08:09 on Fri, Dec. 4
1:20,000 huh? So that’s 55 years of eating one raw egg per day. (Not that I like raw eggs)

BCJ replied to comment from peterbruells | #57 | 09:53 on Fri, Dec. 4
you mean 27.5 years, not 55 years. At 27.5 years you will have eaten 10000 eggs. Once you have eaten 10001 eggs, it will be more likely than not that you will have eaten an egg with salmonella.

SamSam replied to comment from BCJ | #66 | 10:13 on Fri, Dec. 4
@BCJ: Nope. After eating 10,000 eggs, what is the probability that you will have had at least one with salmonella? It’s 1 – the probability that every egg has been disease free.

The probability of one egg being disease free is 1-(1/20000) = 0.99995.

The probability of 10,000 eggs being disease free is 0.99995 ^ 10,000 = 0.6065 = 60.7%

So after eating 10,000 eggs, you will still have only a 40% chance of having ever consumed salmonella.

However, by 20,000 eggs you will have had a 63% chance of having eaten salmonella, so your magic 50.1% number is between those two values.
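(An aside from me, not part of the thread: a quick Python sketch of the complement calculation, assuming every egg is independent and the 1-in-20,000 rate from the post holds. The function names are made up for illustration.)

```python
import math

P_BAD = 1 / 20000  # per-egg contamination rate quoted in the post


def prob_at_least_one(n, p=P_BAD):
    """Chance that at least one of n independent eggs is contaminated."""
    return 1 - (1 - p) ** n


def eggs_for_even_odds(p=P_BAD):
    """Smallest n at which the chance of a bad egg first exceeds 50%."""
    return math.ceil(math.log(0.5) / math.log(1 - p))


print(round(prob_at_least_one(10_000), 2))  # about 0.39
print(round(prob_at_least_one(20_000), 2))  # about 0.63
print(eggs_for_even_odds())                 # 13863
```

(Under those assumptions the break-even point is 13,863 eggs, or roughly 38 years at one egg a day.)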

mattxb replied to comment from SamSam | #109 | 15:46 on Fri, Dec. 4
@SamSam: Nope.

> @BCJ: Nope. After eating 10,000 eggs, what is the probability that you will have had at least one with salmonella? It’s 1 – the probability that every egg has been disease free.

No, that’s the probability that all the eggs you’ve eaten are disease free; you’re calculating [egg 1 is disease free] AND [egg 2 is disease free] AND … which is (1-1/20000)*(1-1/20000)… = (1 – 1/20000)^N, N being the number of eggs and ^N meaning to the power of N.
What you want to find is the probability of at least one egg being diseased. This is [egg 1 is diseased] OR [egg 2 is diseased] … That calculation is (1/20000)+(1/20000)+… = (1/20000)*N, N being the number of eggs consumed.

The value of N for which you are 50% likely to have eaten a diseased egg is the solution of (1/20000)*N = 0.5, which is N = 10000.

SamSam replied to comment from mattxb | #139 | 20:40 on Sun, Dec. 6
@mattxb: No no no no.

> What you want to find is the probability of at least one egg being diseased. This is [egg 1 is diseased] OR [egg 2 is diseased] … That calculation is (1/20000)+(1/20000)+… = (1/20000)*N, N being the number of eggs consumed.

I’m sorry, but that is completely incorrect.

Here, how about this: what is the probability of throwing at least one six with three throws of a dice?

Your method will say 1/6 + 1/6 + 1/6 = 1/2, which sounds correct if you are new to statistics, but is completely wrong.

What is the probability of throwing at least 1 six with six throws? 100%? That’s what your method will predict. (1/6) * 6 = 1. But I’m sure you agree that that’s incorrect.

How about the probability of getting a head with two flips of a coin? You agree that it’s not 1/2 * 2 = 100%, right?

Does the probability of eating a salmonella egg double if you eat two of them? Well, does the probability of getting a head double (from 50%, remember) if you flip twice?

No, the probability of getting at least 1 head is exactly equal to 1 – [the probability of flipping no heads], which is 1 – (1/2 * 1/2) = 3/4.

The probability of getting at least 1 salmonella egg is exactly 1 – [the probability of getting no salmonella eggs], which is the calculation I performed for you before.

You can’t add probabilities together, or soon you’ll start proving that there’s a 110% chance of something happening…

🙂
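For anyone who wants to see the two methods side by side, here is a small Python sketch comparing the add-them-up approach with the complement approach on the dice and coin examples from the thread. It assumes independent trials, and the function names are mine. Exact fractions avoid any rounding arguments.

```python
from fractions import Fraction


def naive_sum(p, n):
    """The add-them-up method: p + p + ... (n times)."""
    return p * n


def exact_at_least_one(p, n):
    """1 minus the chance that none of n independent trials succeeds."""
    return 1 - (1 - p) ** n


examples = [
    ("six in 3 throws", Fraction(1, 6), 3),
    ("six in 6 throws", Fraction(1, 6), 6),
    ("head in 2 flips", Fraction(1, 2), 2),
]
for label, p, n in examples:
    print(f"{label}: naive={naive_sum(p, n)}, exact={exact_at_least_one(p, n)}")
```

The naive sum always comes out at or above the exact answer. The two are close only when the total is small; for 10,000 eggs the sum gives 50% while the complement gives about 39%.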
