*Estimated read time (minus contemplative pauses): 9 min.*

I recently published a post called “Three Strange Results in Probability.” The second result I mention there strikes me as strange enough to merit its own post. I thought of it while contemplating the Two-Child Problem:

I flip two fair coins hidden from your view. I reveal one of the coins to you, a Heads, and then I ask: “What’s the probability the other coin landed Heads?” You give the justified answer of 1/2. What I don’t tell you, however, is that I intended all along to reveal a Heads, provided one of the coins landed that way. So, the answer you’d be best off giving is 1/3. That is, 1/3, rather than 1/2, is the probability of your guess being correct when you guess the other coin landed Heads. Similarly, had I revealed a Tails, you’d be justified in assigning 1/2 to the other coin being Heads, but would be best off assigning 0 given my secret intention.

Brief elaboration and discussion:

A review of the possible outcomes—given your available evidence and a basic understanding of probability—justifies your answering 1/2. Had both coins landed Heads, you’d see a Heads. Had they landed Heads-Tails or Tails-Heads, you’d equally see either a Heads or Tails, because (assuming a principle of indifference) I could equally have chosen to show you a Heads or a Tails. If they’d landed Tails-Tails, you’d have seen a Tails, so a Tails-Tails condition is out. It must, then, be Heads-Heads, Heads-Tails, or Tails-Heads. These are equally likely: 1/4 each before ruling out Tails-Tails, and 1/3 each now that Tails-Tails is ruled out.

This means, for example, that in 120 runs of the game, you’d expect 30 games to be Tails-Tails. Of the 90 remaining games, 30 would be Heads-Heads (you’d see a Heads in 30 of these), 30 would be Heads-Tails (you’d expect to see a Heads in 15 of these), and 30 would be Tails-Heads (you’d expect to see a Heads in 15). In the end, you expect 30 of the 60 times you see a Heads to be in a Heads-Heads condition. Thus, 30/60, or 1/2, is your answer.

That’s fine, but it’s not the answer you’re best off giving. Because I intend to show you a Heads, what would actually happen in 120 games is something more like: you’d see a Heads 90 times and 30 of those would be in a Heads-Heads situation. That’s 30/90, or 1/3.
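The same two answers fall out of Bayes' rule. This is only a sketch: the symbol $S_H$ (my shorthand, not used elsewhere in this post) stands for "a Heads is shown," each of the four outcomes HH, HT, TH, TT has prior probability 1/4, and the only thing that changes between the two calculations is the chance of being shown a Heads in the mixed outcomes.

If the revealed coin is chosen at random, so that $P(S_H \mid HT) = P(S_H \mid TH) = \tfrac12$:

$$
P(HH \mid S_H) = \frac{1 \cdot \tfrac14}{1 \cdot \tfrac14 + \tfrac12 \cdot \tfrac14 + \tfrac12 \cdot \tfrac14 + 0 \cdot \tfrac14} = \frac{1/4}{1/2} = \frac12
$$

If a Heads is revealed whenever one is available, so that $P(S_H \mid HT) = P(S_H \mid TH) = 1$:

$$
P(HH \mid S_H) = \frac{1 \cdot \tfrac14}{1 \cdot \tfrac14 + 1 \cdot \tfrac14 + 1 \cdot \tfrac14 + 0 \cdot \tfrac14} = \frac{1/4}{3/4} = \frac13
$$

Multiplying each denominator by 120 recovers the 60 and 90 Heads-showing games counted above.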

These results aren’t strange on their surface. Probability is all about coming up with the best measure your evidence affords. But what does strike me as captivatingly strange is that, given a *single instance* of this game, the probability you’d theoretically be best off assigning is 1/3 *due to my cognitive state*; though, in practice, my cognitive state seems to make no difference in how the game goes—e.g., whether your guess of a second Heads (or Tails) will be correct in that single game.

The strangeness of this is particularly noticeable in Heads-Heads and Tails-Tails situations. In the former, I reveal a Heads no matter my intention, and in the latter I reveal a Tails no matter my intention. Yet my intention matters. If you knew what I intended, then on seeing a Tails, you’d know to assign zero for the other coin being Heads. Just as, in any instance of guessing a single coin flip, you’d assign 1 to whatever the coin actually lands, if only you knew enough about the physical phenomena involved—i.e., precisely how the coin will respond to and interact with its environment. The difference here is that there is a single, simple, knowable rule, while in the single-coin instance, whatever rules are in place—i.e., the sorts of physical phenomena investigated by physicists—are too hard to track.

Over enough runs of the game, you’d notice the rule imposed by the flipper’s secret cognitive state. In a single run of the game, you of course would not—and it strikes me as strange to imagine the rule even being in play in that case. But it is. Because if you knew the rule, you’d apply it (just as the flipper is; as things stand, however, you and the flipper are assigning different probabilities to these outcomes).

To summarize, 1/2 is what you should assign the problem stated at the beginning of this article, but your expectation would likely turn out to be off, given just a few runs of the game. So, 1/3 is a better answer, even in a universe composed only of one instance of this game. Likewise, if you saw a Tails revealed, you should assign 1/2; however, as the flipper, my intention to reveal a Heads makes the probability of a correct “Heads” guess zero in this scenario, *despite the fact that I’m not given the opportunity to exercise that intention!*

Finally, I keep referring to the flipper’s “cognitive state,” but should clarify that the flipper need not have a mind—it could be a mindless computer programmed to *reveal a Heads when a Heads is available*. This will lead to a very different series of games than when the rule is to *randomly choose one of the coin results to reveal*. But in a single instance of the game, the rule would only make any practical difference if you knew the rule. (Hey, this could make for a fun “discover the rule” game. Maybe it already exists.)
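To see how differently the two rules behave over a series of games, here is a minimal simulation sketch in Python. The function names, the rule labels `"random"` and `"prefer_heads"`, the number of games, and the seed are all my own illustrative choices, and two fair, independent coins are assumed.

```python
import random


def play(rule, rng):
    """Flip two fair coins and reveal one according to the flipper's rule.

    Returns (revealed, other), each 'H' or 'T'.
    """
    coins = [rng.choice("HT"), rng.choice("HT")]
    if rule == "prefer_heads":
        # Reveal a Heads whenever one is available; otherwise show a Tails.
        idx = coins.index("H") if "H" in coins else 0
    elif rule == "random":
        # Reveal either coin with equal chance.
        idx = rng.randrange(2)
    else:
        raise ValueError(f"unknown rule: {rule}")
    return coins[idx], coins[1 - idx]


def other_heads_rate(rule, revealed_face, n_games=200_000, seed=0):
    """Among games where `revealed_face` was shown, the fraction of games
    in which the hidden coin is Heads."""
    rng = random.Random(seed)
    shown = hits = 0
    for _ in range(n_games):
        revealed, other = play(rule, rng)
        if revealed == revealed_face:
            shown += 1
            hits += other == "H"
    return hits / shown


if __name__ == "__main__":
    for rule in ("random", "prefer_heads"):
        for face in "HT":
            rate = other_heads_rate(rule, face)
            print(f"{rule:>12}, shown {face}: {rate:.3f}")
```

Under these assumptions the rates should land near 1/2 for either face under the random rule, and near 1/3 (Heads shown) and exactly 0 (Tails shown) under the Heads-preferring rule. Any single call to `play` looks the same to the guesser under both rules; the difference only shows up in the tallies.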

In contrast to a blind algorithm, what is special about mental states is that they’re the unpredictable products of a being with beliefs and desires, and they can change on a whim. I can intend to reveal a Heads, but then change my mind just before revealing a Tails, and the probability of your guess being correct will change accordingly. This makes the example not only strange, but bothersome. I can keep changing the rules over the course of what amounts to a series of distinctly designed games, while never being able to apply those rules (as in the aforementioned Tails-Tails scenario), and the probabilities will continue to change, even though each game might look exactly the same.

Indeed, you could play the game I describe above, and, though it’s unlikely, end up being correct exactly half the time that you guess Heads, simply due to how the coins happen to land. But 1/3 still would have been the recommended probability to assign in any given instance (given that the result of the concealed coin is unknown; if it were known, you could assign zero or 1).

In that game, the application of the principle of indifference is the point at which you go “wrong,” but its application seems reasonable. This, despite there being legitimate (game-theoretic) questions to ask about the likelihood of a human flipper favoring Heads over Tails (again, not like a robot with a randomizing instruction). That is, there seems to be special privilege given to Heads culturally (I Googled “coin lands Heads” and got 12.6 million hits, while “coin lands Tails” yielded only 6.36 million), but who knows whether that felt prevalence might lead a given individual to favor Tails rebelliously. Further, we know that people (e.g., students trying to fake a list of coin toss results for their homework) are pretty bad at intentionally producing statistically convincing results for a series of, say, binary equiprobable events. But this actually seems to bolster the 1/2 assignment: people tend to underestimate how many runs of consecutive Heads or Tails will come up in several tosses of a coin, so attempts to consciously randomize choices from a small series of tosses might end up closer to 1/2 than they usually would were the choices actually random (a computer would do better, but it might look “less random” to the human eye).*

(*Notice, though, that were I to attempt to concoct a random series of equiprobable H’s and T’s, this would indeed produce a random sequence in the sense of its being irregular and defying attempts to infer a rule for predicting a given “toss” result. So, as a virtual coin-tossing machine, I’d likely be playing with a biased “coin,” but what that bias is could change—with or without my consciously influencing that change—depending on which subset of tosses is analyzed, and thus would probably give no evidence for the hypothetical convergence to some limiting value, as is often expected of an actual fair coin, i.e., with convergence to 1/2 for Heads. I’ve read, anecdotally, that people tend to hit a H:T ratio of about 7:3, but I have no idea if that’s true. I tried it myself, inputting four strings of 100 H’s and T’s into Excel, then tallying the results. String 1, I got a H:T ratio of 51:49^{1}; String 2 was 53:47; String 3, which I did really fast [though I did do all of them pretty quickly], was 50:50; String 4, which I did late at night and faster still while very tired and distracted, was 48:52. It would also be interesting to look at other features, such as average length of H and T series. The idea was to make it “look random.”)
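For anyone who wants to look at those other features, here is a small Python sketch that tallies a hand-typed string and measures its runs of identical results. The function name `run_stats` and the sample string are hypothetical; the sample is not one of my four Excel strings.

```python
from itertools import groupby


def run_stats(tosses):
    """Summarize a string of 'H'/'T' characters.

    Returns (heads, tails, average run length, longest run), where a run
    is a maximal block of identical consecutive results.
    """
    if not tosses or set(tosses) - {"H", "T"}:
        raise ValueError("expected a non-empty string of H and T")
    runs = [len(list(group)) for _, group in groupby(tosses)]
    return (
        tosses.count("H"),
        tosses.count("T"),
        sum(runs) / len(runs),
        max(runs),
    )


if __name__ == "__main__":
    sample = "HHTHTTTH"  # made-up input for illustration
    print(run_stats(sample))
```

For a long sequence of genuinely fair tosses the average run length approaches 2, so a hand-made string whose average sits well below that is switching between H and T more often than chance would.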

At any rate, while I suspect the principle of indifference is over-applied in general^{2}, and in particular think we should take care in applying any shortcuts around the mind gap (more about which soon), I don’t think it’s the main culprit here. My concern, primarily, is with identical events having distinct probabilities, particularly when this is due to rules that cannot even be applied. The strangeness here seems, more than anything, to do with how to subjectively respond to the relationship between distinct events and the frequencies that arise over several instances of those distinct events.

What to learn from or do about this, I’m not sure. Maybe I’ll run into something that will relieve the bother in one of the books I’m currently picking through (e.g., Philosophy of Probability), but as of now, it bothers and nags. Hopefully I don’t just find: “Probability is simply a tool for tidily modeling uncertainty. Don’t expect so much from it. It’s a poor probabilist who blames their tool.” To be continued… [3/22/18 UPDATE: A series of four articles in that book has so far been somewhat instructive, but the nag lingers; namely Part V: Physical Probability, The Frequency Theory, more about which later, maybe.]


#### Footnotes:

1. Disclaimer: I accidentally only did 99, yielding 51:48, then went back and “randomly” added one more T at the end (assuming that’s what I would have done, but I can’t be sure).
2. Here are two examples from my own experience, though I would like to see them tested. It’s commonly said that when you become stuck on two possible answers on a multiple choice reading comprehension test, you have a 50% chance of guessing the correct answer. I think this might be wrong, as there seems to be more going on here than, say, were the student to flip a coin. It seems that guessing here is slightly biased in favor of the wrong answer for some psychological reason or another.

   My second example is similar. I often exit the subway in New York and have to decide whether to turn left or right. After many, many such instances, I’m convinced that I nearly always go the wrong way. I’ve even tried going the opposite direction I initially choose, and it still turns out wrong. Maybe this summer I’ll make a little project of testing this. But I’m currently convinced that I’d do much better by flipping a coin in such situations!

The way I explain it, is that probability models what we *don’t* know about a situation. Say I draw a card from a standard deck of cards. I tell Ann that it is a black suit, Bob that it is a Spade, Carl that it is an honor card (AKQJT), and Debbie that it is an Ace. I then ask each for the probability that it is the Ace of Spades (which I know it is). Ann says 1/26, Bob says 1/13, Carl says 1/20, and Debbie says 1/4.

There is no contradiction in any of this. The question is not about the card itself, it is about the information (or more specifically, what is lacking) that each has about it. Where many “paradoxes” go wrong, is assuming more about what is known than actually is.
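The four answers in the card example can be checked by brute-force enumeration. This is a minimal Python sketch that assumes a uniform draw from a standard 52-card deck; the names `DECK`, `TARGET`, `prob_target`, and `clues` are illustrative only.

```python
from fractions import Fraction
from itertools import product

RANKS = "A K Q J T 9 8 7 6 5 4 3 2".split()
SUITS = ["Spades", "Hearts", "Diamonds", "Clubs"]
DECK = list(product(RANKS, SUITS))  # 52 (rank, suit) pairs
TARGET = ("A", "Spades")


def prob_target(known):
    """P(card is the Ace of Spades | card satisfies `known`), uniform draw."""
    consistent = [card for card in DECK if known(card)]
    return Fraction(consistent.count(TARGET), len(consistent))


# Each person's clue, written as a test on a (rank, suit) pair.
clues = {
    "Ann (black suit)": lambda c: c[1] in ("Spades", "Clubs"),
    "Bob (Spade)": lambda c: c[1] == "Spades",
    "Carl (honor card)": lambda c: c[0] in "AKQJT",
    "Debbie (Ace)": lambda c: c[0] == "A",
}

if __name__ == "__main__":
    for who, known in clues.items():
        print(who, prob_target(known))
```

Each clue leaves a different set of cards consistent with it (26, 13, 20, and 4 cards respectively), and the Ace of Spades is exactly one card in each set, so the card is the same while the probabilities differ.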

In the Monty Hall Problem, you can’t assume that a door was chosen *because* it had a goat; it was chosen *from among eligible doors* that have goats. So it had a 50% chance of being opened if you picked the car, and a 100% chance if you didn’t. These 2:1 odds make for a 2/3 probability if you switch.

Your Two Child Problem is: “You bump into someone in the street, they have two children, one is a boy. What are the chances that the other is also a boy?” The answer is 1/2. Part of what is unknown, is why we were told that one was a boy. We can treat that with probability. If we do, there is only a 50% chance that we learn about a boy if he has a sister. WHETHER OR NOT this actually matches how the question was posed, it does match our information about how the question was formed. Just like each of Ann’s, Bob’s, Carl’s, and Debbie’s answers were correct.

Hi JeffJo,

As always, thank you for your thought-provoking comments.

I love your Ann, Bob, Carl, Debbie example, even more than my usual example for getting across the same idea (i.e., “I know the coin I’m hiding landed heads but for you it’s 1/2”). I agree there’s nothing strange or paradoxical or contradictory in this, though it’s nice for demonstrating the problem with the common idea that “the past event in question either happened or it didn’t so the defendant’s guilt has probability 1 or 0.” To get around that, I often cast such scenarios as looking for the probability that your guess turns out to be correct (e.g., given that you guessed heads).

That said, strange (to me) results do sometimes emerge. In the example I give in the above post, what I find most strange is that we could play a game one time wherein I flip two coins and show you the result of one flip. You see a tails, and thus reasonably assign 1/2 to the other being heads. But from my perspective that probability is 0. And not necessarily because I know how it landed (more on that in a moment). But because I secretly intended to show you a tails only if forced to.

What’s strange to me here is that I have a probability-affecting rule that I never get to apply if we play the game once (or if we play it three times and I get two tails each time, etc.). And suppose I forget the rule just when I’ve revealed the tails. We’d then both reasonably assign 1/2 to the hidden coin being heads. But the “best” assignation would be 0, not because it actually landed tails, but because I was following the reveal-heads-when-can rule when I flipped the coin and revealed the tails. The effects of this rule would become apparent over time (assuming two fair coins, etc.), but would not actually have any visible effect on the outcome given this single instance of the game; it would, rather, only pose the counterfactual scenario that, *had* the coins landed HT, you would have seen a heads.

Similarly, I do find the following variation on the standard, three-door Monty Hall game a little strange. Suppose someone plays the game once. The contestant is correctly told that Hall knows where the car is and intends to reveal a goat. But in this one instance, Hall forgets where the car is, so he hopes for the best and opens a door to (luckily) reveal a goat (revealing no signs of forgetfulness). The contestant then reasonably switches while assuming 2/3 probability of winning, but “should” assign 1/2 given Hall’s sudden forgetfulness. I find this a little less strange than the above coin game example.
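The 2/3 versus 1/2 contrast in the forgetful-Hall variation can be checked by exact enumeration. Here is a minimal Python sketch with a few assumptions of my own: the contestant always picks door 0, the car is equally likely to be behind any door, a knowing Hall picks at random when both other doors hide goats, and a forgetful Hall opens one of the other two doors blindly. The function name `switch_win_probability` is hypothetical.

```python
from fractions import Fraction


def switch_win_probability(host_knows):
    """P(switching wins | Hall opened a goat door); contestant holds door 0."""
    goat_shown = Fraction(0)
    switch_wins = Fraction(0)
    for car in range(3):  # car placement, probability 1/3 each
        if host_knows:
            # A knowing Hall never opens the car door.
            options = [door for door in (1, 2) if door != car]
        else:
            # A forgetful Hall opens either remaining door blindly.
            options = [1, 2]
        for opened in options:
            weight = Fraction(1, 3) / len(options)
            if opened == car:
                continue  # the car was revealed: not the observed case
            goat_shown += weight
            remaining = 3 - opened  # the unopened door other than door 0
            if remaining == car:
                switch_wins += weight
    return switch_wins / goat_shown


if __name__ == "__main__":
    print("Hall knows:  ", switch_win_probability(True))
    print("Hall forgets:", switch_win_probability(False))
```

In the forgetful case, the games in which Hall would have exposed the car are discarded by conditioning on the goat, and that is what pulls the switching probability down from 2/3 to 1/2 even though the single game the contestant sees looks identical.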

I like your Monty Hall comment. I’ve recently posted a variation in which Hall prefers to open Door 2 to reveal a goat, with interesting results (to me, at least).

Regarding your Two-Child problem comments. That statement of the problem came at the end of a radio show that was recorded in front of an audience. The host rushed through it and then cut the guests off as they tried to answer. I’m not sure this was intended to be how the problem was posed, or if it was meant to be stated that the person you bump into is with a single child, both children, or no children.

If we assume they’re with just the boy, I agree with the 1/2 solution. If they’re with neither child, I lean slightly to the 1/3 solution, but am inclined to instead say the problem is poorly posed. I don’t like the idea of trying to figure out why a generic, and in this case rushed, prompt mentions the child it mentions; it’s a thrown-together toy problem… there’s no actual second child sitting around waiting to vindicate our evaluation. So, if I had to guess why we’re told a boy: it’s because the person who put the prompt together needed to pick a gender given that this classic problem requires one. And let’s remember why this problem is usually trotted out: so that people will say 1/2 when the “real” answer is 1/3. I imagine the prompt-designer had that in mind as well. I agree that this hasn’t been clearly accomplished here.

Finally, if we’re to assume that both children were with the person (maybe you’ve forgotten the encounter and are now being told, “you saw the person with her two children, one of whom was a boy…”), I’d say this requires us to fill in too many details to come to a subjective agreement on the best answer. In fact, I feel that way about this particular statement of the problem overall.

I’m curious, are there any results in probability you find strange or counterintuitive or surprising—particularly even after one thoroughly understands the underlying math?