Probability is known for its power to embarrass our intuitions. In most cases, math and careful observation bear out counterintuitive results. After many such experiences, one’s intuition improves (sometimes perhaps crossing into a kind of overcorrection—see the Optional Endnote for some inchoate thoughts on that). But some results stay strange, and it’s not always clear whether our rebelling intuitions signal a problem with formal probability, or simply confirm that human cognition has evolved to concoct tidy stories amounting to illusory—if sophisticated—representations of the world rather than to deal head on with complexity, chance, and uncertainty.
Here, briefly, are three results I find particularly interesting because, despite being strange (if not problematic), their solutions are simple within their given models. My favorite is (2).
(1) When the Monty Hall problem really is 1/2: The Monty Hall problem contains a strangeness I’ve not seen explicitly discussed, though it’s often implied by explanations of why the common-sense answer is wrong, including in my post, “Monty Hall Problem and Variations: Intuitive Solutions.” In case you need a refresher, the problem goes:
There are three closed doors. Behind one is a car. Behind each of the other two is a goat. You pick a door, hoping to find the car. The host, Monty Hall, then does what he always does in this game: he reveals a goat by opening one of the doors you didn’t pick. He then offers you the chance to switch to the door he didn’t open. (When you chose the car-concealing door, he chooses among the two goat-concealing doors with equal probability.) If you switch, what is the probability you find the car?
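The answer is 2/3. Nothing strange about that. However, if we change the problem so that Monty Hall opens one of the two unpicked doors at random rather than always revealing a goat, the probability of winning by switching, given that he happens to reveal a goat, becomes 1/2. This is because 1/3 of the time he’ll reveal the car, thus ending the game, and every one of those lost games is one in which you had first picked a goat. [1]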
This strikes me as strange not because of the math, but because it allows for games that appear identical—or at least seem significantly similar—to yield different probabilities.
For example, imagine you guess Door #1, behind which stands a goat. Monty Hall—for the first and only time during his long hosting career—forgets which door conceals the car, so he acts naturally, hopes for the best, and randomly chooses Door #2, which happens to reveal a goat. You know that you had a 1/3 probability of having chosen the car. If you knew Hall had guessed, you would now update that to 1/2. But you don’t know Hall guessed, so you quietly continue to presume a 2/3 chance of winning by switching. You switch, you win, and nobody’s the wiser.
Now imagine the same scenario, but this time Monty Hall doesn’t forget, and knowingly reveals the goat. This time, the chance you chose a goat conditional on Hall’s revealing a goat stays 2/3 rather than updating to 1/2, because, given that you chose a goat, the probability that Hall reveals a goat is 1, rather than 1/2 as it was above. So, you’re in what externally appears to be the exact same scenario as above, but this time you’re correct to assign a 2/3 chance of winning by switching.
You’d see those probabilities borne out were you to run many instances of the game, but I’m especially interested in how we conceive of a single run of the game.
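For readers who want to check the many-runs claim, here is a minimal simulation sketch. The function names (`play`, `switch_win_rate`) and the trial count are my own illustrative choices, and the forgetful host is assumed to open either unpicked door with equal probability; games in which he reveals the car are discarded, since the question is only about games where a goat appears.

```python
import random

def play(host_knows, rng):
    """Play one three-door game.

    Returns None if the host reveals the car (game void),
    otherwise True/False for whether switching wins.
    """
    doors = [0, 1, 2]
    car = rng.choice(doors)
    pick = rng.choice(doors)
    others = [d for d in doors if d != pick]
    if host_knows:
        # Informed host: always opens a goat door, choosing at random
        # when both unpicked doors hide goats.
        opened = rng.choice([d for d in others if d != car])
    else:
        # Forgetful host (assumption): opens either unpicked door at random.
        opened = rng.choice(others)
    if opened == car:
        return None
    remaining = next(d for d in others if d != opened)
    return remaining == car

def switch_win_rate(host_knows, trials=200_000, seed=0):
    """Fraction of goat-revealing games in which switching wins."""
    rng = random.Random(seed)
    results = [play(host_knows, rng) for _ in range(trials)]
    valid = [r for r in results if r is not None]
    return sum(valid) / len(valid)

if __name__ == "__main__":
    print("informed host: ", round(switch_win_rate(True), 3))
    print("forgetful host:", round(switch_win_rate(False), 3))
```

Under these assumptions the first rate should land near 2/3 and the second near 1/2, even though any single goat-revealing game looks the same from the contestant's seat.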
It strikes me as endlessly fascinating that two given scenarios could be externally identical, with the exact same promised outcome (i.e., switching will win), yet those final outcomes have different chances—or at least demand different subjective assignations—due to the mental state of one actor during the middle stage of the scenario. Once Hall makes and acts on the decision to open that door, the influence of his mental state passes.
Interestingly, my intuition tells me the opposite of what most claim to be the intuitive answer here. That is, it always feels like it should be 2/3 to me, once a goat is revealed. I figure that I most likely chose the wrong door in the first place, so getting the chance to switch now that one wrong answer has been removed seems smart. But this fails to account for the additional randomness introduced by Hall’s guessing (where “random” means he’s equally likely not to open the door with a goat, or, at the very least, that his opening that door could not have been predicted with certainty).
To jiggle that intuition around like a loose baby tooth ready to come out, I can think of the following. You play the game with a million doors. You pick a door. Chances are you chose a goat door. Hall, who’s forgotten where the car is, randomly opens 999,998 doors, leaving one door closed. Luckily, he only reveals goats. Do you switch? My intuition here is, overwhelmingly, that you should stay. Why? Well, you’d have to get really lucky to pick the car door on your first try, but not as lucky as Hall would have to be to pick a goat 999,998 times! In other words, chances are you chose the car door, thus making it easier for him to reveal all goats.
So, my intuition has changed from switch to stay. But there’s something very wrong with this intuition as well. Here’s how I’ll correct it.
Assume you chose a goat door. The first door Hall opens, he has a greater than 99% chance of revealing a goat, given that there are 999,999 doors to choose from, and only one hides a car. Once he’s down to ten doors, he’s still got a 9/10 = 90% chance of revealing a goat. Of course, that’s still a lot of opportunity for accidentally revealing the car. There’s only a 1/999,999 chance he’ll reveal only goats; or, put another way, will choose the correct door to keep closed (he could have just chosen one door at the outset and then said, “Open all the others”). His probability for keeping the car door closed is low, but a little higher than yours was for picking that same door to begin with.
Now let’s assume we don’t know whether you chose a goat or car door. You started with a 1/1,000,000 chance of having chosen the car. Every time he reveals a goat, you get evidence that you’ve made it easier for him (than had you chosen a goat) to not accidentally reveal the car (a little easier at first, a lot easier by the end). The upshot of this is that if he reveals all goats, the probability of winning by switching becomes, as in the three-door case, 50%. That’s a big if, but, in the unlikely-to-occur run of the game you played, he did indeed make it that far without revealing a car, and you must update your probabilities to match that world. The challenge here (for me) is to not increase my confidence in having initially chosen the car to above 50%.
Still, this helps improve my intuitions about the three-door case (where I don’t feel tempted towards overconfidence). That is, if I learn he randomly revealed a goat, I feel my confidence rise a little for having made it easier for him to not reveal the car. If I’m appropriately careful about putting a number on that increase, I’d say it rises from 1/3 to 1/2.
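The size of that update can be pinned down with Bayes’ rule. The sketch below is my own illustration (the name `p_you_hold_car` is hypothetical), and it assumes the forgetful host opens doors uniformly at random from the ones you didn’t pick. It uses exact fractions, so there is no simulation noise.

```python
from fractions import Fraction

def p_you_hold_car(n_doors, goats_revealed):
    """Posterior that your first pick hides the car, given that a forgetful
    host has opened `goats_revealed` other doors at random, all goats."""
    prior_car = Fraction(1, n_doors)
    prior_goat = 1 - prior_car
    # If you hold the car, every other door hides a goat: he cannot miss.
    like_car = Fraction(1)
    # If you hold a goat, the car sits among the n_doors - 1 other doors,
    # and every door opened so far must have avoided it.
    like_goat = Fraction(n_doors - 1 - goats_revealed, n_doors - 1)
    evidence = prior_car * like_car + prior_goat * like_goat
    return prior_car * like_car / evidence

if __name__ == "__main__":
    n = 1_000_000
    for k in (0, 500_000, 999_990, 999_998):
        print(k, p_you_hold_car(n, k))
    print("three doors, one goat shown:", p_you_hold_car(3, 1))
```

Under that assumption the posterior works out to 1/(doors still closed): it creeps up slowly at first, reaches 1/10 with ten doors left, and stops at exactly 1/2 when only two remain. It never passes 1/2, which is the overconfidence the million-door picture tempts me towards.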
(2) When it’s right to assign 1/2 to a 1/3 situation: While recently writing about the Two-Child problem, I noticed a variation on the above example. For a quick refresher on that problem:
A couple has two children, one of whom is a girl. What is the probability both children are girls?
The answer is 1/3 when it’s assumed one learned of the child through some deterministic means, and 1/2 when learned randomly. Most discussions of the problem assume a 1/3 circumstance. I won’t get into that here. You can read about it in my post on the topic, or you can see how those solutions play out by considering the following strange example (which also appears in the last section of that post).
If I flip two coins behind your back and then tell you one of them landed Heads, you should assign a 1/2 chance to both having landed Heads, due to the assumption that, in a Heads-Tails or Tails-Heads condition, I’m equally likely to reveal a Heads or Tails to you. So, for instance, in 120 runs of the game, 30 of the 60 times you hear me say “at least one landed Heads,” we’ll be in a Heads-Heads condition.
Despite this, the probability of Heads-Heads is really 1/3, because what I didn’t tell you is that it was my plan all along to tell you Heads if one landed that way. I would only have told you Tails had both landed Tails! This means that in 120 runs of the game, 30 of the 90 times you hear me say “at least one landed Heads,” we’ll be in a Heads-Heads condition.
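The two counts (30 of 60 versus 30 of 90) differ only in how often I say “Heads” when the coins disagree. Here is a small exact enumeration of that, offered as a sketch; the function name and its single parameter are my own labels rather than anything standard.

```python
from fractions import Fraction
from itertools import product

def p_hh_given_heads_announced(p_say_heads_when_mixed):
    """P(both Heads | I announce 'at least one landed Heads').

    p_say_heads_when_mixed: chance I announce Heads when the coins show
    one Heads and one Tails (1/2 = indifferent, 1 = always favor Heads).
    """
    said_heads = Fraction(0)
    said_heads_and_hh = Fraction(0)
    for coins in product("HT", repeat=2):
        p_outcome = Fraction(1, 4)  # four equally likely outcomes
        heads = coins.count("H")
        if heads == 2:
            p_say = Fraction(1)              # nothing else to announce
        elif heads == 1:
            p_say = p_say_heads_when_mixed   # depends on my hidden rule
        else:
            p_say = Fraction(0)              # Tails-Tails: must say Tails
        said_heads += p_outcome * p_say
        if heads == 2:
            said_heads_and_hh += p_outcome * p_say
    return said_heads_and_hh / said_heads

if __name__ == "__main__":
    print("indifferent announcer:", p_hh_given_heads_announced(Fraction(1, 2)))
    print("Heads-favoring rule:  ", p_hh_given_heads_announced(Fraction(1)))
```

Passing 1/2 reproduces the listener’s reasonable assumption and returns 1/2; passing 1 reproduces my hidden rule and returns 1/3. Intermediate values give answers between the two.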
What’s especially interesting to me here is that, in evaluating a single instance of this trial, you’d be justified in imagining a multi-trial simulation in which 1/2 of the outcomes are Heads-Heads. In a real series of such trials, however, a 1/3 result would emerge. You’d eventually catch on. For example, every time you’re told, “at least one landed Tails,” it would turn out to be in a Tails-Tails condition (a condition, interestingly, in which I’m unable to exercise the rule; so, if our first three runs were in that condition, it wouldn’t be of much help). But in the first instance of the game, you’d reasonably apply a principle of indifference that assumes even odds for my revealing a Heads or Tails. This is the same principle of indifference that leads folks to claim that the solution to the Two-Child problem is 1/2 (i.e., on the grounds that, for example, a father who mentions having a daughter might just as well have mentioned having a son).
This leads me to the third strange result…
(3) Factory Boxes and the Principle of Indifference: The principle of indifference, also known as the principle of insufficient reason, says that when you see no reason to weight competing outcomes differently, you should weight each of them as equally probable. I gave examples in (2) above. The most common application might be when we assume a given coin is fair.
Bas van Fraassen has produced a compelling paradox arising from this principle. [2] Here I’ll quote Aidan Lyon’s discussion in his 2010 paper “Philosophy of Probability” (published as a chapter in Philosophies of the Sciences: A Guide):
Consider a factory that produces cubic boxes with edge lengths anywhere between (but not including) 0 and 1 meter, and consider two possible events: (a) the next box has an edge length between 0 and 1/2 meters or (b) it has an edge length between 1/2 and 1 meters. Given these considerations, there is no reason to think either (a) or (b) is more likely than the other, so by the Principle of Indifference we ought to assign them equal probability: 1/2 each. Now consider the following four events: (i) the next box has a face area between 0 and 1/4 square meters; (ii) it has a face area between 1/4 and 1/2 square meters; (iii) it has a face area between 1/2 and 3/4 square meters; or (iv) it has a face area between 3/4 and 1 square meters. It seems we have no reason to suppose any of these four events to be more probable than any other, so by the Principle of Indifference we ought to assign them all equal probability: 1/4 each. But this is in conflict with our earlier assignment, for (a) and (i) are different descriptions of the same event (a length of 1/2 meters corresponds to an area of 1/4 square meters). So the probability assignment that the Principle of Indifference tells us to assign depends on how we describe the box factory: we get one assignment for the “side length” description, and another for the “face area” description.
There have been several attempts to save the classical interpretation and the Principle of Indifference from paradoxes like the one above, but many authors consider the paradoxes to be decisive. See Keynes [3] and van Fraassen [4] for a detailed discussion of the various paradoxes, and see Jaynes [5], Marinoff [6], and Mikkelson [7] for a defense of the principle. Also see Shackel [8] for a contemporary overview of the debate. The existence of paradoxes like the one above were one source of motivation for many authors to abandon the classical interpretation and adopt the frequency interpretation of probability.
Lyon then goes on to discuss the frequency interpretation, which comes with its own problems. Here’s a taste, which ends with a segue into what’s known as the reference class problem:
Ask any random scientist or mathematician what the definition of probability is and they will probably respond to you with an incredulous stare or, after they have regained their composure, with some version of the frequency interpretation. The frequency interpretation says that the probability of an outcome is the number of experiments in which the outcome occurs divided by the number of experiments performed (where the notion of an “experiment” is understood very broadly). This interpretation has the advantage that it makes probability empirically respectable, for it is very easy to measure probabilities: we just go out into the world and measure frequencies. For example, to say that the probability of an even number coming up on a fair roll of a fair die is 1/2 just means that out of all the fair rolls of that die, 50% of them were rolls in which an even number came up. Or to say that there is a 1/100 chance that John Smith, a consumptive Englishman aged fifty, will live to sixty-one is to say that out of all the people like John, 1% of them live to the age of sixty-one. But which people are like John? If we consider all those Englishman aged fifty, then we will include consumptive Englishman aged fifty and all the healthy ones too. Intuitively, the fact that John is sickly should mean we only consider consumptive Englishman aged fifty, but where do we draw the line?
To summarize the factory box dilemma: in the “side length” description, there’s a 1/2 chance of getting a box with an edge length in the 0 to 1/2 range; in the “face area” description that same range has a 1/4 chance. Rather than merely a strange result, perhaps this demonstrates a technical or theoretical problem (if so, maybe it’s similarly wrong in the Two-Child problem to assume a parent could mention either gender in a seemingly neutral context unless that’s explicitly built into the problem).
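To make the clash concrete, here is a rough sketch that imagines two hypothetical factories: one whose edge lengths are spread uniformly over 0 to 1 meter, and one whose face areas are spread uniformly over 0 to 1 square meter. Neither is “the” factory in the paradox, which is the point: indifference alone doesn’t say which to imagine. The function name and the `uniform_in` parameter are my own.

```python
import random

def fraction_with_short_edge(uniform_in, trials=200_000, seed=0):
    """Share of boxes with edge length at most 1/2 meter.

    uniform_in: "side" draws the edge length uniformly from [0, 1);
                "area" draws the face area uniformly from [0, 1)
                and takes its square root as the edge length.
    """
    rng = random.Random(seed)
    short = 0
    for _ in range(trials):
        u = rng.random()
        if uniform_in == "side":
            side = u
        elif uniform_in == "area":
            side = u ** 0.5
        else:
            raise ValueError("uniform_in must be 'side' or 'area'")
        if side <= 0.5:
            short += 1
    return short / trials

if __name__ == "__main__":
    print("uniform in side length:", round(fraction_with_short_edge("side"), 3))
    print("uniform in face area:  ", round(fraction_with_short_edge("area"), 3))
```

The first figure should come out near 1/2 and the second near 1/4 for the very same event (an edge of at most 1/2 meter), matching the two assignments above.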
But perhaps this result isn’t so worrisome, depending on what we expect from probability as a formal tool for doing better under uncertainty than we otherwise would. Maybe both those probabilities are equally recommendable so long as we are ignorant of actual data about the frequency of edge lengths produced—just as you are correct in assigning 1/2 rather than 1/3 in the coin example of (2) above, until your observations improve that assignment. And perhaps either assignment would still be better than what intuition alone would likely provide.
That an initially correct assignment can be better aligned with the world suggests to me that uncertainty is cognitive. Events in the world do not seem to entail paradox, but rather paradoxes seem to arise due to what we don’t—and perhaps cannot—know about the world (even should some of those events turn out to be knowable only probabilistically, even to an omniscient observer).
Optional Endnote: Probability and Intuition
I conclude with some rough thoughts I’ve been mulling over in recent months, other strains of which may be found in various recent writings I’ve posted on probability.
Probability isn’t just a technical discipline. It also asks us to systematically adjust the informal probabilistic worldview each of us inevitably develops, usually unawares, as we go through daily life guided by intuition and common sense. It’s in large part this relationship to intuition and the real world that makes the subject so captivating to me. Indeed, even which probability model to adopt can be a matter of disagreement between intuitions (for elaboration, see Lyon’s excerpted paper above).
The tuning of one’s intuition usually starts with coins and dice and jars of marbles and such, and gets rougher as increasing doses of complexity, interdependence, and uncertainty are injected.
Nassim Taleb has pointed out the error—what he calls the “ludic fallacy”—of turning one’s game-derived probability sense towards the complex real world. It seems that many of the results I find most intriguing rest in the hazy penumbral area between games, with their readily calculable expectations, and the real world, which consists of events lacking clearly shared features and, most inscrutably, the human mind. In other words, we may even be committing the ludic fallacy when we apply game probability to games!
And so I resist fully surrendering my intuition to formal disciplines (though by “I” here I really denote my stubborn intuition itself, rather than some central executive who gets to decide what my intuitions do from moment to moment). Rather, I think we should pay attention when a result carries a whiff of strangeness about it, especially when that strangeness endures the hard work of understanding the result’s technical dimensions. I often observe people rehearsing counterintuitive probability results as though they’re obvious. But clearly they’re not obvious [9], or else it wouldn’t have taken as long as it did for the field of probability to develop, and there wouldn’t be a history of brilliant mathematicians getting such problems wrong, as when Paul Erdős rejected the 2/3 solution to the Monty Hall problem.
Presumably Erdős rejected that solution until he understood it for himself—until it felt right—rather than accept it because other mathematicians said so. Maybe many results these days are accepted on the grounds of consensus, or even on the general grounds that counterintuitive results are the best kind of results.
Perhaps this is why many will accept the Monty Hall problem or the Two-Child problem and so on, before fully understanding the math. Before understanding, for example, why it is that if you ask a random mother of two whether “one of them is a girl born on a Tuesday,” and she says “yes,” the probability of her having two daughters is 13/27, while if a random mother of two simply tells you “one of my two children is a girl born on a Tuesday,” having been just as ready to describe her other child, the chance of both being daughters is 1/2. [10]
Maybe this is in part an attempt to avoid finding themselves in the camp of those who scolded Marilyn vos Savant for her correct answer to the Monty Hall problem. Though I imagine something different goes on with the layperson and the professional (the former can still be seen all over YouTube rejecting the 2/3 Monty Hall solution, often emphatically offering their own “obvious” solution; while the latter, as well as ambitious amateurs such as myself, take 2/3 as obvious).
I’ve seen articles in reputable popular publications attempting to explain the above-noted 13/27 “born on a Tuesday” result, ultimately getting it wrong. Those do sometimes carry a tinge of modesty, but I prefer the modestly skeptical approach taken by George Johnson in his review of Leonard Mlodinow’s 2008 book, The Drunkard’s Walk. In that book, Mlodinow further complicates the Two-Child problem: What if you learn one of the children is a girl named Florida? Johnson writes in response: “Even weirder, and I’m still not sure I believe this, [Mlodinow] demonstrates that the odds change again if we’re told that one of the girls is named Florida.”
At any rate, I think such results should be displayed with care. I fear that treating them as obvious threatens to trivialize a field that needs to be taken more, rather than less, seriously by a general public that might think the intuitions of experts have been so beaten into submission and bent out of shape by formal training that the experts will accept absurd results over obviously common sense ones—or in simpler terms: the expert will miss what a layperson sees readily.
Sometimes this worry seems legitimate. It would be interesting to make a list of strange things believed by experts, both in terms of ideas with many adherents, and those with few. I can think of several recent philosophers and scientists who’d make both categories (one need only read a few writings on the nature of consciousness to get started).
As Agustín Rayo once put it on the Elucidations podcast, in a droll response to a question about why anyone would think there could exist no dinosaurs while the number of dinosaurs was not zero: “You would only think that after years of training as a philosopher.” Interestingly, later in that fascinating conversation, which was about the construction of logical space, Rayo himself makes an argument essentially claiming that numbers have objective, independent existence of some sort. This belief, generally referred to as “mathematical platonism,” strikes me as being as strange as a belief in ghosts, but many mathematicians, philosophers, and physicists seem to view it as obvious.
I myself take for granted that Edmund Gettier got it right in his 1963 paper [11] in which he shows that justified true belief is not always knowledge, as it had for centuries been believed to be. For reasons I won’t get into here, I doubt his paper would have convinced pre-20th-century experts (or perhaps even pre-WWII experts). To me and seemingly most contemporary epistemologists, though, there’s nothing strange about it.
More aptly, this worry can be found in the context of statistics as well. Taleb, for example, writes in The Black Swan (2010, second edition): “[Chris] Anderson is lucky enough not to be a professional statistician (people who have had the misfortune of going through conventional statistical training think we live in Mediocristan)” (page 223). He also urges “the intelligent reader” without formal training to skip a section “meant to prove a point to those who studied too much to be able to see things with clarity” (page 352).
1. More thorough explanations can be found at my aforementioned post.
2. van Fraassen, B. (1989). Laws and Symmetry. Oxford: Oxford University Press.
3. Keynes, J. M. (1921). A Treatise on Probability. London: Macmillan.
4. van Fraassen, B. (1989). Laws and Symmetry. Oxford: Oxford University Press.
5. Jaynes, E. T. (1973). The Well Posed Problem. Foundations of Physics, 4(3):477–492.
6. Marinoff, L. (1994). A Resolution of Bertrand’s Paradox. Philosophy of Science, 61(1):1–24.
7. Mikkelson, J. M. (2004). Dissolving the Wine/Water Paradox. British Journal for the Philosophy of Science, 55:137–145.
8. Shackel, N. (2007). Bertrand’s Paradox and the Principle of Indifference. Philosophy of Science, 74:150–175.
9. Admittedly, though I’ll forgo attempting one, I likely owe some explanation here of what I mean by “obvious,” as many of these things seem like they couldn’t be anything but true to an intuition that’s been developed in a certain way. I hope context is enough to understand what I mean here by “obvious.”
10. I won’t explore this here, but there’s a political dimension to this as well: people often seem to accept or reject certain scientific claims according to (or sometimes as a corollary to) their political affiliation rather than due to understanding those claims.
11. Gettier, E. L. (1963). Is Justified True Belief Knowledge? Analysis, 23(6):121–123.