# An Easier Counterintuitive Conditional Probability Problem (with and Without Bayes’ Theorem)


Given my recent posts about difficult counterintuitive probability problems (a topic from which I’ll now take a break for a while1), I thought it’d be fun to briefly look at a problem that ceases to be counterintuitive once explained. This is a variation on a question commonly given when teaching Bayes’ theorem. I’ll apply the theorem at the end of the post, but will mostly rely on more intuitive methods. Here’s the question:

One percent of 40-year-old women have breast cancer. The chance that a mammography machine correctly diagnoses breast cancer is 80%. That same machine has a 9.6% chance of giving a false positive. Suppose a 40-year-old woman goes in for a regular mammography screening and is diagnosed with breast cancer. What is the probability she has breast cancer?

Most people answer something like: “The machine detects the cancer 80% of the time, so the probability must be 80%!… Or maybe 70% given the false positive rate.” The correct answer is around 7.8%.
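If you'd like to verify that figure before reading on, here's a quick sketch in Python using the exact 9.6% false-positive rate (the variable names are mine, not from any particular library):

```python
# Sanity check of the ~7.8% figure, using the exact 9.6% false-positive rate.
p_cancer = 0.01               # P(cancer) among 40-year-old women
p_pos_given_cancer = 0.8      # true-positive rate of the machine
p_pos_given_healthy = 0.096   # false-positive rate of the machine

# Total probability of a positive test, across both groups of women:
p_positive = (p_cancer * p_pos_given_cancer
              + (1 - p_cancer) * p_pos_given_healthy)

# Probability of cancer given a positive test:
p_cancer_given_positive = p_cancer * p_pos_given_cancer / p_positive
print(round(p_cancer_given_positive, 4))  # → 0.0776
```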

Apparently, even a great majority of medical doctors guess too high on these sorts of questions. According to this video, Explaining Bayesian Problems Using Visualizations (whose visualizations, by Luana Micallef, are useful for developing an intuition for how the numbers in these problems are derived), 95% of doctors surveyed about the above question guessed 70% to 80%.

I don’t know where that study came from, but here’s a reference, at a Cornell course blog, to another study involving a similar question: Doctors don’t understand Bayes’ Theorem. In that study, the correct answer was 10%. Only 21% of the doctors surveyed (1000 gynecologists) gave the correct answer, and nearly half answered 90%.

There’s also an excellent discussion of doctors’ lack of explicit Bayesian probability skills in Daniel Levitin’s 2014 book The Organized Mind. Levitin also notes, however, that some doctors intuitively “apply Bayesian inferencing without really knowing they’re doing it” (page 248). This points to an important feature of Bayesian probability: it really does have intuitive underpinnings. Indeed, the solution to the above problem is intuitive—perhaps even obvious—once you see it.

## Scenario 1: Solving with Frequencies

I’ll first solve the problem by applying its theoretical frequencies to 1000 randomly chosen 40-year-old women. Note that, from here on, I round 9.6% to 10% for convenience; this will change the answer slightly.

Also notice that there are further questions we might like to ask, such as about the woman’s socio-economic situation or medical history. We might ask about what country she’s from, or whether she’s one hour away from her 41st birthday. We might ask whether her doctor’s mammography machine tends to go on the fritz, or if the technicians in that office are sufficiently skilled, or if the staff sometimes accidentally mixes up patient records. We can’t answer these questions, so we’ll work with what we have.

[1] Suppose 1000 40-year-old women are randomly selected for mammography testing.
[2] Of those 1000 women, 1%—or ten—will have breast cancer, and the remaining 990 won’t.
[3] Of the ten women with breast cancer, 80% will be correctly diagnosed. That’s eight women.
[4] Of the 990 women without breast cancer, 10% will get a false positive. That’s 99 women.
[5] We now have 107 women diagnosed with breast cancer. Of those: eight have it, 99 don’t.
[6] We have our answer: eight out of 107 diagnosed women will turn out to have cancer. That’s 8/107, or about 7.5%. In other words, if a randomly screened 40-year-old woman is diagnosed with breast cancer, she is essentially now put into that set of 107 women; for eight of those 107, the diagnosis is correct. So, we could rephrase the initial question as: If a woman makes it into the set of 40-year-old women diagnosed with breast cancer, what is the probability that she’s within the portion of those correctly diagnosed? It’s 8/107.
[7] We could now repeat this process. That is, suppose the 107 women went to get a second opinion at a different clinic (with the same probabilities in place for accuracy and false positives). Of those, eight have breast cancer, while 99 don’t. Of the eight, 80%, or 6.4, will be correctly re-diagnosed. Of the 99, 10%, or 9.9, will be incorrectly re-diagnosed. We can now calculate the probability that a re-diagnosis is correct by dividing 6.4 by the total number of women re-diagnosed, which is 6.4+9.9=16.3. That’s 6.4/16.3, which is about 39%.
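The steps above can be sketched as plain arithmetic in Python (the variable names are mine):

```python
# Steps [1]-[7] above, as arithmetic on 1000 women
# (false-positive rate rounded from 9.6% to 10%, as in the text).
women = 1000
with_cancer = women * 0.01            # [2] ten women have breast cancer
without_cancer = women - with_cancer  # [2] 990 women don't

true_pos = with_cancer * 0.8       # [3] eight correctly diagnosed
false_pos = without_cancer * 0.1   # [4] 99 false positives

diagnosed = true_pos + false_pos   # [5] 107 women diagnosed in total
print(true_pos / diagnosed)        # [6] 8/107, about 0.075

# [7] A second screening of the 107 diagnosed women:
re_true = true_pos * 0.8     # 6.4 correctly re-diagnosed
re_false = false_pos * 0.1   # 9.9 incorrectly re-diagnosed
print(re_true / (re_true + re_false))  # 6.4/16.3, about 0.39
```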

It seems to me that, put this way, the answer becomes quite intuitive. I can also express this with basic probability math, which may be less intuitive on its own, but makes more sense in light of the above scenario.

## Scenario 2: Solving with Basic Probability Math

In a sense, you can think of what I’ll do here as examining a hypothetical scenario that has one person instead of 1000. So, instead of starting with 1% of 1000 people, we’re dealing with 1% of one person. The basic rules I’m using here are: we find the probability of two events both occurring by multiplying the probability of the first by the probability of the second given the first (i.e., P(A and B) = P(A)×P(B|A)); and we find the probability of one or the other of two mutually exclusive events occurring by adding their individual probabilities (i.e., P(A or B) = P(A)+P(B)).

[1] 1/100 40-year-old women have breast cancer, 4/5 of whom will be correctly diagnosed. In other words, this is the probability of a 40-year-old woman having breast cancer and being correctly diagnosed (given that she has breast cancer). That’s (1/100)×(4/5) = 1/125.
[2] 99/100 women do not have breast cancer, 1/10 of whom will get a false positive. In other words, this is the probability of a 40-year-old woman not having breast cancer and getting a false positive. This comes to (99/100)×(1/10) = 99/1000.
[3] If a 40-year-old woman is diagnosed, the probability of the diagnosis being correct is the probability worked out in [1] divided by the sum of the probabilities worked out in [1] and [2]. This is like dividing the number of women correctly diagnosed by the total number of women diagnosed, correctly or not: (1/125)/[(1/125)+(99/1000)] = (8/1000)/(107/1000) = 8/107, or around 7.5%.

Feel free to repeat this process, as above, to get 64/163.
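If you'd rather not push the fractions around by hand, Python's `fractions` module reproduces the exact arithmetic above (a minimal sketch; the variable names are mine):

```python
from fractions import Fraction

# Scenario 2's products and quotient, done with exact fractions.
correct = Fraction(1, 100) * Fraction(4, 5)      # [1] = 1/125
false_pos = Fraction(99, 100) * Fraction(1, 10)  # [2] = 99/1000

posterior = correct / (correct + false_pos)      # [3]
print(posterior)  # → 8/107

# Repeating the process with the updated prior:
correct2 = posterior * Fraction(4, 5)
false2 = (1 - posterior) * Fraction(1, 10)
print(correct2 / (correct2 + false2))  # → 64/163
```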

## Scenario 3: Solving with Bayes’ Theorem

I’ll now demonstrate the problem using Bayes’ theorem (or “rule” or “law”). Bayes’ theorem allows us to update our probabilities based on new evidence. It goes like this:

P(A|B) = [P(B|A)×P(A)] / P(B)

Where:

P(A) is the probability of having breast cancer as a 40-year-old woman;
P(B) is the probability of being diagnosed with breast cancer via mammography, regardless of having breast cancer or not;
P(A|B) can be read several ways: “the probability of A given B” or “the probability of A conditional on B” or “the probability of A happening given that B has happened” or “the probability of A being true given that B is true” or “the probability of hypothesis A being true given evidence B” and so on. In this case it’s the probability of a 40-year-old woman having breast cancer given a positive mammography diagnosis;
P(B|A) is the probability of being diagnosed via mammography given that one has breast cancer.

We’re looking for the P(A|B), so we’ll need to figure out the probabilities for the others first:

P(A) = .01
P(B) = This is a little harder to figure out. There’s a formula for it: P(A)×P(B|A) + P(not-A)×P(B|not-A). Instead, I’ll just think it through. We already know that, out of 1000 women, 8 will be correctly diagnosed and 99 will be incorrectly diagnosed. That’s 107/1000.
P(B|A) = .8

Now we just plug those in:

P(A|B) = (.8×.01)/(107/1000)=8/107

Suppose she goes in for a re-test. We can now re-run the formula with the updated P(A), which has gone from 1/100 to 8/107 (in Bayesian terms, we’ve updated our “prior” probability). This gives us a new “posterior” probability, P(A|B), where B is again a positive diagnosis, but now with the fact of the earlier diagnosis built into P(A)’s change from 1/100 to 8/107; in other words, the probability of having cancer given a positive re-test.

P(A) = 8/107
P(B) = Again, the formula for this is: P(A)×P(B|A) + P(not-A)×P(B|not-A). I’ll use it this time, keeping in mind that P(B|A) hasn’t changed: (8/107)(.8) + (99/107)(.1)=163/1070. I can also just think it through to get the same thing. We know that, of the 107 diagnosed out of 1000, 8 are correctly diagnosed to begin with, and that .8 of those will be correctly re-diagnosed, or 6.4.  And we know that .1 of the 99 of those 107 women incorrectly diagnosed will be incorrectly re-diagnosed, which comes to 9.9. What we need to do is divide the number of all those re-diagnosed—16.3—by the number of all those diagnosed in the first place: 16.3/107, or 163/1070 if we want whole numbers.
P(B|A) = .8

Now we just plug in:

P(A|B) = (.8×(8/107))/(163/1070) = 64/163, or about 39%. That’s the same thing we got in [7] of Scenario 1.
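The updating pattern in this section can be captured in a small helper function: the posterior from one screening is fed back in as the prior for the next. This is just a sketch of Scenario 3's arithmetic (the function name and defaults are mine, with the false-positive rate rounded to 10% as throughout):

```python
def update(prior, p_pos_given_cancer=0.8, p_pos_given_healthy=0.1):
    """Return P(cancer | positive test) via Bayes' theorem.

    P(B) in the denominator is computed with the total-probability
    formula: P(A)×P(B|A) + P(not-A)×P(B|not-A).
    """
    p_positive = (prior * p_pos_given_cancer
                  + (1 - prior) * p_pos_given_healthy)
    return prior * p_pos_given_cancer / p_positive

first = update(0.01)    # 8/107, about 0.075
second = update(first)  # 64/163, about 0.39
print(first, second)
```

Chaining the calls like this is exactly the "updating the prior" move described above: each positive test narrows the reference class the woman belongs to.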

## Conclusion and a Harder Problem

This was a pretty straightforward application of Bayes’ theorem. For one thing, we didn’t need to estimate any subjective probabilities—i.e., probabilities formed in response to limited evidence, which amount to, at best, rigorously formulated inductive inferences and, at worst (though sometimes this is the best we can do), on-the-spot intuitive guesses. If you’d like to learn more about this, I recommend Dan Morris’s fun and easy-to-read 2016 book Bayes’ Theorem Examples: A Visual Introduction For Beginners. In it, you’ll find the following challenging problem, which I won’t solve here; notice the role of estimating and guesswork:

You are a soldier and have recently shipped out across the Atlantic on your fourth peacekeeping tour. A few weeks into your mission you are on patrol and see an injured family across the road from you.

You are about to go to them when suddenly there is a surprise attack and you find yourself pigeonholed against a burnt out vehicle. You stop to listen and are suddenly filled with horror as you see a truck turning the corner. There is no doubt it is an enemy vehicle, but you didn’t have time to see if the truck was rigged with a gunner on the back—and if there is you don’t want to be caught in the open.

You quickly do some mental calculations and recall what you learned in your debrief. The rebels have roughly 54 dilapidated trucks and 22 of them are rigged with guns in the back. Rebels in a truck are one thing, but rebels in a truck rigged with a gun on the back? You don’t want to be caught in the open with that.

You pop your head out to get a better look and a wave of bullets hits the vehicle in front of you. The rebel truck is now about 150 yards away, but you are still uncertain if the shots came from the truck or somewhere else. If the truck is rigged with a gun, the chance of it having fired at you is pretty high, maybe at 80%.

You continue to think. Considering how heavy the firepower was and the environment you are in, you peg the possibility of being shot at 50%. What should you do? Should you risk crossing the street to help the family?2

#### Footnotes:

1. I’ll continue thinking about probability, but, before writing more, I need to dig deeper into its rigorous mathematical and philosophical literature.
2. Morris, Dan. Bayes Theorem: A Visual Introduction For Beginners (Kindle Locations 380-392). Blue Windmill. Kindle Edition.