The Infinite Monkey Cage podcast episode “Science’s Epic Fails” (1/31/2017) ends with a rushed discussion of the classic two-child probability problem, posed as follows by one of the show’s two hosts:
You bump into someone in the street, they have two children, one is a boy. What are the chances that the other is also a boy?
A guest on the show says 1/2. The host who stated the question corrects him: it’s 1/3. Another person—the other host, I presume—interjects that it depends on how the question is posed.
The interjector is correct: the answer to a two-child problem may be 1/2 or 1/3, depending on how it’s posed. The above formulation is ambiguous, so there are no grounds for answering either 1/2 or 1/3. I suspect that the host claims 1/3 because that’s (incorrectly) the default expected answer for two-child type problems. And I suspect the guest answered 1/2 on the rationale that we generally assume even odds for a randomly chosen child being a boy or a girl; however, when the answer is indeed 1/2, it’s not due to that rationale.
I’ll give a brief and, I hope, intuitive explanation of both answers, but if you’re still not convinced or would like to go deeper, check out my more extensive post, which also features a variation in which the observed child is born on a Tuesday (the “competing” probabilities there become 13/27 and 1/2): Two-Child Problem (when one is a girl named Florida born on a Tuesday).
Two-Child Problem = 1/2
Let’s first consider a 1/2 formulation. For convenience, I’ll use frequencies and will imagine the person we bump into is named Tina. To disambiguate the statement of the problem, we need only add that, when we bump into Tina, she has a boy with her whom we know to be one of her two children (how we know this isn’t important, so long as it doesn’t give us more evidence than the problem grants; for example, it shouldn’t be because we see her at a mother-son picnic, for reasons that will be apparent in a moment).
Imagine there are 80 instances in which we bump into Tina. If we conveniently assume that birth rates of boys and girls are practically the same, then the 80 instances can be grouped into four equally probable scenarios:
In 20 instances, Tina has two boys.
In 20 instances, Tina has a girl and a boy, and the boy was born first.
In 20 instances, Tina has a girl and a boy, and the girl was born first.
In 20 instances, Tina has two girls.
We can abbreviate this sample space as follows*:
BB = 20
BG = 20
GB = 20
GG = 20
To be clear, what we’re imagining here is obviously not that we actually bump into Tina on 80 different occasions. Rather, we’re imagining bumping into her exactly one time in each of 80 different possible worlds. It’s like tossing two fair coins 80 times.
(*If you have a problem with including BG and GB in the sample space, check out my post explaining why both must be counted: Omega Hungers: Skeptics about Heads-Tails and Tails-Heads in the Sample Space. I also point out there that both need to be included even if the two children were born at precisely the same instant. Conceiving of one being born before the other is just a conceptual convenience. This is similar to constructing the sample space of a tossed quarter and nickel, whether they are flipped sequentially or simultaneously.
I’ll also note that it doesn’t matter if Tina is the stepmother or adoptive parent of one or both children, nor does it matter if Tina identifies strictly as a caregiver. The models constructed here need only accord with our evidence in the mathematically relevant ways.)
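For readers who like to see a sample space built mechanically, here is a minimal Python sketch of the four-scenario model described above. The variable names (families, encounters, counts, unordered) are my own illustrative choices, and the even 80-way split simply restates the equal-birth-rate assumption.

```python
from itertools import product
from collections import Counter

# Each child is independently a boy (B) or a girl (G).
# The first letter is the first-born, which is only a labeling convenience.
families = ["".join(pair) for pair in product("BG", repeat=2)]

# Spread 80 imagined encounters evenly over the four equally likely families.
encounters = 80
counts = Counter({family: encounters // len(families) for family in families})

# Merging BG and GB into a single unordered "one of each" case shows
# that the mixed family is twice as common as either same-sex family.
unordered = Counter()
for family, n in counts.items():
    unordered["".join(sorted(family))] += n

print(dict(counts))
print(dict(unordered))
```

The second printout is the reason BG and GB both belong in the sample space: dropping one of them would treat "one of each" as if it were no more likely than "two boys."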
Note that the above sample space still doesn’t reflect what we actually know about Tina’s children. Namely, we know that at least one of them is a boy. So we can remove GG from the sample space, leaving us with:
BB = 20
BG = 20
GB = 20
Now for the crucial move, which reflects another assumption: the odds are even for bumping into Tina with either of her children. For example, if she has a boy and a girl, then, for all we know, we could have just as easily bumped into Tina with a girl. This is our most reasonable presumption, as we have no evidence for giving a higher probability to bumping into Tina with one of her particular children. (This is why I ruled out running into her at a mother-son picnic.)
That in mind, we update the sample space one last time:
BB = 20 (because every time we bump into her in this scenario, she’ll be with a boy)
BG = 10 (because she’ll only be with a boy half the time in this scenario, and 10 is half of 20)
GB = 10 (same reasoning as in the BG scenario)
We can now calculate the probability that Tina has two boys by asking what proportion of the time she has two boys given an instance in which we’ve seen her with a boy. That is, we’ll see her with a boy 40 times, and 20 of those times, she has two boys. Put another way, she has two boys 20 out of the 40 times we see her with a boy. Or you can put it in possible-world terms: in 20 out of the 40 possible worlds we’re in, Tina has two boys. That’s half the time: 20/40 = 1/2.
And so, the probability that Tina has two boys is 1/2.
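The bookkeeping for this formulation can be checked with a short Python sketch using exact fractions. The helper seen_with_boy is a hypothetical name of mine; it encodes the assumption that either of Tina’s two children is equally likely to be the one with her.

```python
from fractions import Fraction

# Starting counts from the text: 80 encounters, 20 per family type.
counts = {"BB": 20, "BG": 20, "GB": 20, "GG": 20}

def seen_with_boy(family: str) -> Fraction:
    """Chance that the child with Tina is a boy, assuming each of her
    two children is equally likely to be the one accompanying her."""
    return Fraction(family.count("B"), len(family))

# Weight each scenario by how often we would actually see a boy in it.
weighted = {family: n * seen_with_boy(family) for family, n in counts.items()}

total_boy_sightings = sum(weighted.values())
p_two_boys = weighted["BB"] / total_boy_sightings

print(weighted)
print(p_two_boys)
```

GG drops out on its own here because its weight is zero, so removing it by hand and then halving BG and GB amounts to a single weighting step.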
Two-Child Problem = 1/3
So when is a two-child problem’s answer 1/3? This is easier to demonstrate, and is in fact the explanation given on The Infinite Monkey Cage (though it fails due to ambiguity).
Once again, we bump into Tina. But this time she has no child with her. We ask, “Is at least one of your two children a boy?” She replies, “yes” (and we believe her).
So, we are again back at the following sample space:
BB = 20
BG = 20
GB = 20
And, again, we can further adjust the sample space based on what we’ve learned. But the numbers are different this time:
BB = 20 (because every time we ask Tina if she has a boy, she says “yes”)
BG = 20 (same as above)
GB = 20 (same as above)
(Notice that this is the same sample space we’d presume had we bumped into Tina participating in a mother-son picnic.)
There are now 60 instead of 40 instances of learning that one of Tina’s children is a boy. Twenty of those instances are BB scenarios, which gives a proportion of 20/60 = 1/3.
And so, the probability that Tina has two boys is 1/3.
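In this formulation the update is a plain filter rather than a reweighting, which a few lines of Python can make explicit. This is only a sketch: yes_counts is my own name, and it assumes Tina answers truthfully.

```python
from fractions import Fraction

# Starting counts from the text: 80 encounters, 20 per family type.
counts = {"BB": 20, "BG": 20, "GB": 20, "GG": 20}

# Tina says "yes" in every scenario with at least one boy (assumed truthful),
# so those scenarios keep their full counts and only GG is removed.
yes_counts = {family: n for family, n in counts.items() if "B" in family}

p_two_boys = Fraction(yes_counts["BB"], sum(yes_counts.values()))

print(yes_counts)
print(p_two_boys)
```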
And that’s that.
Two-child problems are often posed ambiguously. When that happens, I (and others) think the proper answer is, “the question is ambiguous,” but the standard default assumption seems to be a 1/3 scenario. I suppose that, superficially, a 1/3 answer just makes the question seem more counterintuitive and thus both more fun and more instructive for thinking about conditional probability; though, as I think has been demonstrated here, the well-posed 1/2 formulation is at least as interesting, if not more so.
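If the frequency argument still feels slippery, a quick simulation can stand in for the “possible worlds.” The Python sketch below is illustrative only: the function name simulate, the trial count, and the seed are my own choices. It generates random two-child families and scores both ways of learning that Tina has a boy.

```python
import random

def simulate(trials: int = 200_000, seed: int = 0) -> tuple[float, float]:
    """Estimate P(two boys) under both ways of learning about a boy."""
    rng = random.Random(seed)
    seen_boy = seen_boy_bb = 0   # we bump into Tina with a randomly chosen child
    said_yes = said_yes_bb = 0   # Tina answers "Is at least one a boy?"
    for _ in range(trials):
        family = [rng.choice("BG"), rng.choice("BG")]
        both_boys = family == ["B", "B"]
        # Formulation 1: the child who happens to be with her is a boy.
        if rng.choice(family) == "B":
            seen_boy += 1
            seen_boy_bb += both_boys
        # Formulation 2: she truthfully confirms at least one boy.
        if "B" in family:
            said_yes += 1
            said_yes_bb += both_boys
    return seen_boy_bb / seen_boy, said_yes_bb / said_yes

with_child, asked = simulate()
print(f"seen with a boy:    {with_child:.3f}")
print(f"asked, answers yes: {asked:.3f}")
```

The two estimates should land near 1/2 and 1/3 respectively, drawn from the very same families. Only the way the information reaches us differs.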
Finally, it’s possible that my careful attempts at clarity and heading off objections (“What if both children were born at the same instant?”) overcomplicate my explanation. I think I must leave those in, however, as they at least give an acknowledging nod to the sorts of nagging counter-intuitions that are a large part of what makes probability difficult. Indeed, the problem’s intuitive, psychological dimensions are where its real value lies, rather than it just being a fun brain-teaser. There’s still more to say about the problem in that respect. But this is deeper than I’m allowing myself to go today. For that, and for diagrams and yet more examples (including with coins), see again my more in-depth post: Two-Child Problem (when one is a girl named Florida born on a Tuesday).