In his book The Black Swan1, Nassim Nicholas Taleb, a fellow urban slow-walker, describes a scenario in which he poses the following question to two characters, the rational & educated Dr. John and the intuitive & streetwise Fat Tony:
Assume that a coin is fair, i.e., has an equal probability of coming up heads or tails when flipped. I flip it ninety-nine times and get heads each time. What are the odds of my getting tails on my next throw? 2
Dr. John refers to the question as trivial and gives the mathematically correct answer of one half. Fat Tony calls Dr. John a sucker and says, “no more than 1 percent, of course … the coin gotta be loaded.”
This gets at a critical disconnect, noted often by Taleb in The Black Swan, that arises when we endeavor to generalize a real-world-applicable probability calculus from neatly devised games (application of which he aptly calls the ludic fallacy). This distinction between probability models and the real world is one I often struggle with in my ongoing attempts to understand the tense relations between formal probability, intuition (what I sometimes call informal probability)3, complexity, and epistemology (i.e., belief, opinion, knowledge). In short, I’m with Fat Tony: If I ever saw someone throw 99 Heads in a row, I’d think the game rigged.
To be clear, Taleb’s example urges us to go further than simply suspecting foul play should we encounter a real-world instance of 99 Heads in a row. I presume the rational Dr. John would also be skeptical in that situation. What’s questioned in the example, rather, is whether we should accept such a scenario even on conceptual or theoretical grounds. This is what Fat Tony refuses to do by rejecting the thought experiment itself.
I’d like to explore this theme further, starting with a similar question: What is the probability of throwing 100 Heads in a row?
Some thoughts (I used Wolfram|Alpha for the math):
This is an easy question to answer. The probability of flipping a fair coin and getting 100 Heads in a row is 1 in 2^100. That’s 1 in 1,267,650,600,228,229,401,496,703,205,376.
Or, written out: 1 in 1 nonillion 267 octillion 650 septillion 600 sextillion 228 quintillion 229 quadrillion 401 trillion 496 billion 703 million 205 thousand 376
Or, in decimal form: .0000000000000000000000000000007888609052210118054117285652827862296732064351090230047702789306640625
In other words, the probability is very, very, very, very low. Not zero, but might as well be.4
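These figures can be reproduced with exact arithmetic. Here is a minimal Python sketch (Python is an assumption on my part, as the numbers above came from Wolfram|Alpha; the variable names are illustrative only):

```python
from fractions import Fraction
from decimal import Decimal, getcontext

# Exact probability of 100 Heads in a row with a fair coin.
p_all_heads = Fraction(1, 2) ** 100

# 2**-100 terminates after 100 decimal places, so 120 significant
# digits are more than enough to print it without any rounding.
getcontext().prec = 120
p_decimal = Decimal(1) / Decimal(2 ** 100)

print(f"1 in {p_all_heads.denominator:,}")
print(format(p_decimal, "f"))
```

Using `Fraction` and `Decimal` rather than floats keeps every digit exact, which matters when the point is to look at the whole number.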
And the probability of getting at least one Tails in 100 flips is: 1 – (1/2)^100.
Or, as a fraction: 1,267,650,600,228,229,401,496,703,205,375/1,267,650,600,228,229,401,496,703,205,376. (The only difference here is that the denominator ends with a 6 instead of a 5.)
Or, as a decimal: 0.9999999999999999999999999999992111390947789881945882714347172137703267935648909769952297210693359375
In other words, very, very, very, very high. Practically (i.e., might as well be) 1.
(For the record, the number of flips you need to execute in order to breach 99% confidence in landing at least one Heads is seven: 1 – (1/2)^7 or .9921875.)
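The "at least one" calculation and the seven-flip threshold can be checked the same way. This is a small sketch; the function names `p_at_least_one` and `min_flips_for` are my own, not from any library:

```python
from fractions import Fraction

def p_at_least_one(n_flips: int) -> Fraction:
    """Probability of at least one Heads (equally, at least one Tails) in n fair flips."""
    return 1 - Fraction(1, 2) ** n_flips

def min_flips_for(confidence: Fraction) -> int:
    """Smallest number of flips n with p_at_least_one(n) >= confidence."""
    n = 1
    while p_at_least_one(n) < confidence:
        n += 1
    return n

print(p_at_least_one(100))               # the exact fraction for 100 flips
print(min_flips_for(Fraction(99, 100)))  # 7
print(float(p_at_least_one(7)))          # 0.9921875
```

Six flips fall just short (63/64 is about 0.984), which is why the threshold lands on seven.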
These odds don’t stop many of us from saying, “It could happen.” I don’t believe it could happen, not in a predefined sample space. By which I mean each coin flip could be spaced 100 years apart, or could happen in a different country or you could flip the 100 coins all at once. So long as you are clear about which coins and which flips before you know the results of those flips, you’re never going to get 100 Heads. In other words, it wouldn’t do to find 100 Heads that came up somewhere in the world over the last 24 hours and designate a space around them.
I’m of course even more skeptical about flipping one thousand or one million or one trillion and so on Heads, though all of those are mathematically intelligible. The math is easy: each throw has a .5 chance of coming up Heads, even after 999,999,999,999 Heads in a row. But this doesn’t mean that getting even as few as 100 Heads in a row is not practically (perhaps also physically; see below for the distinction) impossible without rigging the game.5 It seems to me that it is impossible.
There is a certain sense, however, in which the seemingly impossible—according, that is, to the probabilistic terms I’m outlining here—does happen in the real world: If you flip a coin 100 times, you will get some arrangement of Heads and Tails, and that arrangement will have a probability of (1/2)^100 of occurring, the same probability as getting all Heads. So why would I think that arrangement is possible but all Heads isn’t?
Well, something is going to happen when you flip the coin. Just not what you happen to predict. Choosing one out of 2^100 outcomes, I’d say you have practically no shot at all of seeing that outcome come up; but, something with the same probability of occurring will come up. The number of those other outcomes is 2^100 – 1. A very big number.6
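A scaled-down simulation makes the asymmetry concrete. The sketch below uses 10 flips instead of 100 (an assumption made purely so that hits are observable at all) and a fixed seed so the run is repeatable:

```python
import random

N_FLIPS = 10             # scaled down from 100 so that hits can be observed
TRIALS = 100_000
rng = random.Random(0)   # fixed seed, only so the sketch is repeatable

predicted = "H" * N_FLIPS   # the sequence named before any flipping
hits = 0
distinct_seen = set()
for _ in range(TRIALS):
    outcome = "".join(rng.choice("HT") for _ in range(N_FLIPS))
    distinct_seen.add(outcome)
    hits += outcome == predicted

print(f"predicted sequence came up in {hits / TRIALS:.5f} of trials")
print(f"expected rate for any single sequence: {1 / 2 ** N_FLIPS:.5f}")
print(f"distinct sequences that came up: {len(distinct_seen)} of {2 ** N_FLIPS}")
```

Every trial produces some sequence, yet the one named in advance shows up only about once per 2^10 trials. At 100 flips the same ratio becomes 1 in 2^100.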
I’d like to make this more vivid. Suppose you have a bag that contains all real numbers. If you randomly grab a number out of the bag, the probability of getting any particular number is precisely zero (that is, the ratio of any given desired outcome to the total possible outcomes is 1 to infinitely many, which we treat as zero). Yet, some number will be pulled out of the bag.
Your probability of pulling that particular number was zero, but the probability was 1 that some number would come out. In other words, you have no hope of predicting what number you’ll pull. Now suppose you try to make things easier for yourself. You remove several infinitely large sets of numbers from the bag. Say you remove all but the positive numbers. The probability is still zero that you pull a particular number. Let’s say you really try to narrow the space and pull from just between 4.88 and 4.89. There are (uncountably) infinitely many numbers in that space, so the probability is again zero that you’ll pull any particular number.
If you predict that you’ll pull, say, 4.880000000001, and that number comes up, I’d say the bag was rigged. Having been rigged is FAR more likely than pulling any number you predict. Similarly, though flipping 100 Heads in a row is technically non-zero, if I saw that happen, I’d say the probability is far more likely that I’ve encountered a cheater.
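One way to put numbers on "far more likely" is Bayes' rule. The sketch below rests on two loud assumptions that are mine, not Taleb's: there are only two hypotheses (a fair coin, or a rigged coin that always lands Heads), and the prior chance of meeting a cheater is a made-up one in a billion.

```python
from fractions import Fraction

def posterior_rigged(prior_rigged: Fraction, n_heads: int) -> Fraction:
    """Posterior probability that the coin is rigged after n_heads Heads in a row.

    Assumes exactly two hypotheses: a fair coin (Heads with probability 1/2)
    and a hypothetical rigged coin that always lands Heads.
    """
    like_fair = Fraction(1, 2) ** n_heads   # P(all Heads | fair coin)
    like_rigged = Fraction(1)               # P(all Heads | always-Heads coin)
    evidence = prior_rigged * like_rigged + (1 - prior_rigged) * like_fair
    return prior_rigged * like_rigged / evidence

# Hypothetical prior: one coin-flipper in a billion is a cheater.
prior = Fraction(1, 10 ** 9)
for n in (5, 20, 30, 100):
    print(n, float(posterior_rigged(prior, n)))
```

Under these assumptions, five Heads barely move the needle, somewhere around thirty the two hypotheses are roughly even, and by 100 the fair-coin explanation is all but gone. A different prior shifts where the crossover falls, but not the overall shape.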
Following this line of thinking, I have to say that any sequence you predict ahead of time will be hopeless (there’s nothing special about all Heads, except that it’s so easy for the human mind to organize, and it is apparently something we remain on the lookout for). Further, any sequence involving what would count to a human as a recognizable pattern, from which a prediction may be made, would also suggest foul play. For example:
The probability of getting one from the set of all seemingly patterned outcomes would be hard to figure out, though it must be extremely low. Also, to be clear, there is no natural pattern here, no natural rule that’s being enforced, but only the observer-dependent illusion of one, even if that observer gets lucky enough to guess subsequent flips correctly by believing the apparent pattern. The odds, however, are incredibly in favor of the pattern going off the rails—these simply aren’t the sorts of results to expect from a random sequence of fair coin flips.
And yet, everything that happens is in some way an extremely rare event, an unsurprising thing for me to say, given my belief that any event only ever happens once. Types of events, however, do repeat. When you walk along a sidewalk, no one could have expected well in advance that you (as that particular arrangement of those particular particles, with your unique history, etc.) would at that time step on that particular arrangement of particles in that particular region of space. So, while part of what’s at the heart of my exploration here is an attempt to make sense of how to treat highly rare events, I should make clear the importance both of importance and of whether the rare event in question is a token (e.g., a particular step upon the ground, which only happens once) or a type (e.g., generic steps on the ground, which happen often).
Expectation seems to play a critical role here: you won’t get the sequence of flips you expect, you won’t pull the number from the bag you hope for, etc. But in the case of all Heads or any other apparent pattern, the expectation does not precede the run of flips. My claim is simply that 100 Heads in a row won’t happen, regardless of whether it happened to be on anyone’s mind. I claim the same of any sequence from which a reliably predictable pattern would appear to emerge, given the character of a fair coin. And yet, I’m faced with the conundrum that some equally unlikely sequence will indeed occur! Though this becomes far less of a conundrum on the grounds that Heads and Tails are equiprobable for the coin: when they each account for in the ballpark of 40–60 flips out of the 100, there’s nothing remarkable going on. Unless, again, it does so too tidily, with 50 each: THTHTHTHTHTHTHTHTH… or some such. Despite each result being equiprobable, that would be a remarkable result; and, I seem to claim, a practically impossible one.
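The difference between a type of result (40 to 60 Heads, in any order) and a token (one specific sequence) can be computed directly from binomial counts. A minimal sketch, with a function name of my own choosing:

```python
from math import comb

def p_heads_between(lo: int, hi: int, n: int = 100) -> float:
    """P(lo <= number of Heads <= hi) in n fair flips."""
    favorable = sum(comb(n, k) for k in range(lo, hi + 1))
    return favorable / 2 ** n

print(p_heads_between(40, 60))   # a 40-60 split in any order: roughly 0.96
print(p_heads_between(50, 50))   # exactly 50 Heads in any order: about 0.08
print(1 / 2 ** 100)              # one specific order, such as strict alternation
```

The unremarkable outcome is a huge collection of sequences bundled together, while all Heads and strict alternation are each a single sequence out of 2^100.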
So far, I’ve been focusing on the physical (or practical) possibility of flipping 100 Heads in a row. There may, however, also be a theoretical (or conceptual or logical) problem here. The way we know that a coin is fair is not just by declaring its possible flip outcomes (of which, frankly, there are more than two) as equiprobable in theory, but by flipping it many times and observing that it lands on each of its two faces roughly half the time.7
(Though, interestingly, not if that result is HTHTHTHTHTHT… or a similarly predictable pattern. Were a coin to yield that pattern indefinitely, supposing it starts on Heads after its first, freshly minted flip, it would not at any point be a 50–50 coin, but would have a probability of 1 of landing next on the alternate side. This, though, only works for the observer who knows what the previous flip result was! For the ignorant observer, even if the coin’s tendency is known, the probability of the first flip observed—i.e., the next flip—will revert to 50–50.)
If you flip a coin 100 times and it lands on the same side every time, it’s by definition not a fair coin. Certainly this would be the case if you threw it another 100 and then another 100 and another 100, and it continued to come up on the same side. These are supposed to be theoretically intelligible scenarios. We’re supposed to say that, after 300 throws, whatever those outcomes, the probability of getting Heads on the 301st throw is one half. But clearly this is not a fair coin.
It seems, then, that it is an oxymoron to invoke together the words “fair coin” and “100 Heads in a row.”
In other words, it may be logically incoherent to posit a coin that has an equal probability of coming up Heads or Tails, and to then describe a scenario in which that coin comes up only Heads for some huge number of flips. Just as it would not make sense to characterize a coin as heavily Heads-biased and then describe a scenario in which it comes up Heads only roughly half the time in some huge number of flips.
What counts as a huge number of flips? Five Heads from a fair coin is unremarkable.8 So, upon getting four Heads in a row, it’s far too soon to call the game fixed. And you would certainly be committing the Gambler’s Fallacy were you to assume any sort of interdependence between the flip results, or that mysterious forces (natural or otherwise) are influencing flip outcomes (cheaters notwithstanding). At any rate, four Heads in, you should assign .5 to the likelihood of the fifth throw yielding Heads, which is to say that you now have a .5 chance of ending up with HHHHH and a .5 chance of ending up with HHHHT. (To underscore the independence of the flip events, imagine throwing the first four Heads today, then returning ten years from now to make the fifth toss.)
I’d say the same about six, seven, eight, nine throws. Twenty in a row should be doable; that’ll happen in about one in 1.05 million runs. Maybe even 30 is fine. But at some point, there must be a line where the likelihood of getting all Heads in a row becomes problematic. Where is that line? Certainly far before, say, one nonillion; there is no world in which that happens (feel free to enlist billions of coin-flippers all at once so it’s not a matter of running out of time). Somewhere, the line from possible to impossible is crossed. Construe impossible how you like here; I simply mean that there is no possible world in which that happens with a fair coin. There also may be some intermediate lines, such as between logically possible and practically impossible (meaning, we think maybe it could happen, but it’s certainly not something to expect to see on this planet; while impossible, whether physically, logically, or metaphysically, explicitly means it’s never going to happen on this or any other world, even in an infinite number of flips).
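To get a feel for how quickly "doable" turns into "not on this planet," here is a rough sketch of the chance that at least one all-Heads run turns up among many independent attempts. The scale of the effort (10 billion flippers, one complete run per second, for 100 years) is a deliberately generous and entirely hypothetical assumption:

```python
from math import expm1, log1p

def p_any_all_heads(n_heads: int, attempts: float) -> float:
    """P(at least one of `attempts` independent runs of n_heads flips is all Heads).

    Computes 1 - (1 - p)**attempts via log1p/expm1 so that tiny per-run
    probabilities are not rounded away in floating point.
    """
    p_run = 0.5 ** n_heads
    return -expm1(attempts * log1p(-p_run))

# Hypothetical effort: 10 billion people, one run per second, for 100 years.
attempts = 10e9 * 60 * 60 * 24 * 365 * 100
for n in (20, 30, 76, 100):
    print(n, p_any_all_heads(n, attempts))
```

With those made-up numbers, runs of 20 or 30 Heads are a practical certainty, while 100 in a row remains vanishingly unlikely even for the whole planet flipping for a century. The calculation cannot say where the line is, only how steeply the odds fall away.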
I think the possible–to–impossible line lives somewhere before 100 Heads (or, as noted above, before any apparently predictable sequence of 100 flips occurs). Where that is, I don’t think can be intelligibly said within current probability theory. That is, if I say it’s at 35, that would mean that, after 35 Heads, a Tails would be due. But that’s nonsense, and just as much an instance of the Gambler’s Fallacy as in the unremarkable cases above. There can be no number that demarcates possible and impossible: If you make it to one Heads you have a .5 chance of making it to two Heads, and on and on up to 36 and beyond. Put in other terms, there’s no clear line where you can definitely rule the coin unfair. (At which point perhaps you might try a different route and measure the physical properties of the coin itself.) Though you can say things like, “there’s a 1-2^(-5), or 97% chance, of getting at least one Heads in five flips of a fair coin.”
And yet, I maintain that there is at least some vague line between unremarkable and remarkable results; and, somewhere beyond that, between possible and impossible (if not before nonillion Heads, how about infinitely many? In what world could that be a fair coin?). In conclusion, then, I rule in favor of Fat Tony.
In a similar vein, Evelyn Lamb, in an article for Scientific American called “Has Anyone Ever Flipped Heads 76 Times in a Row?,” examines the 76 Heads in Rosencrantz and Guildenstern Are Dead and concludes, “After crunching the numbers, I am convinced that no one in the world has ever flipped heads 76 or 90 times in a row on a fair coin…” She also writes about the topic here: “Heads I Win, Tails You Lose,” where she links a nice dialog by Ben Orlin (at Math with Bad Drawings) that conveys a similar moral to that of Taleb’s Fat Tony example: “The Swindler’s Coin.”
Some closing thoughts:
It would be nice were our probability models to map neatly onto the real world, but perhaps the best we can aim for is to be aware of their limitations (as we are, say, with Euclidean geometry) while taking care not to confuse them, or our models more generally, with the unfathomably complex real world, no matter how complicated our models get.
Probability is a model that permeates the broader models we rely on to create our world.9 It is from this perspective that I’m interested in probability—a perspective that is in line with my interest in underlying, world-making concepts in general (the word I give this perspective, as a daily practice, is philosophy).
It’s often said that to the person who has only a hammer, the whole world looks like a nail. Probability is a tool. Its form shapes how those who use it see the world. And a lot of people use it.10 Those who deal in purer and purer concepts engage in world-building. Shape the tools and you shape the world. Dismantle the tools, you dismantle the world. Barring that, one might at least try to understand the tools. It strikes me as meaningful (if insomnia-inducing) work. I’ll keep at it.
- Initially published in 2007. I reference here the 2010 Second Edition.
- Page 124.
- Intuition is one part of informal probability. Also included are psychological, political, and sociological concerns. For example, it’s possible to view certain accounts of racism and sexism as critiques about informal probability; e.g., about assigning an either outsized or otherwise inappropriate likelihood to a person fitting a particular stereotype. There’s a lot to unpack here. It would be nice to see some formal research into the cognitive and social psychology of this, of the sort we see described in current work being done on cognitive bias and behavioral economics. I think the socio-political dimension will prove challenging to study, however, given that while basic cognitive biases can be picked out given discrepancies between intuitive and objective mathematical results, there may be, for example, good reasons for ignoring sound statistical data when trying to effect positive social change.
- To be clear, having a probability of zero doesn’t mean impossible. We might think of it as there simply being no degree of certainty that it will happen. I have zero degree of certainty, or confidence, for pulling the number 2 out of a bag containing infinitely many numbers. Indeed, that probability is zero. But some number will come out of the bag. And that outcome had a probability of zero as well. More on this below.
- For rigging expertise, see the work described in Dynamical Bias in the Coin Toss by Persi Diaconis, Susan Holmes, and Richard Montgomery; SIAM Review Vol. 49, No. 2 (Jun., 2007), pp. 211-235. They made a machine that can consistently yield the same flip result.
- Among the named distinctions to be made in probability (e.g., frequentist vs. subjective) may be the event that is so unlikely to happen that it can be considered practically impossible, versus the thing that in fact does happen in that situation, and that has the same probability as the thing that was seemingly too improbable to happen. Does that phenomenon have a name?
- It’s often claimed that we could also determine that a coin is fair by measuring it. If so, what I’m claiming is that you will not find, by whatever empirical means, a coin to be fair that could then land 100 Heads in a row, much less infinitely many Heads in a row (despite the standard tendency to declare this theoretically possible; it certainly isn’t physically possible).
- Slightly remarkable is the fact that I just now picked up a quarter to see how many Heads I might get. First try, I got five in a row. The sixth toss was Tails. Six in a row, by the way, is about the longest run of either Heads or Tails you should expect to get in 100 flips.
- By our world, or just world, I mean the world we construct and inhabit through sensory perception, math, language, embodied cognition, probability models, art, and so on. The real world on the other hand is unconstructed and model-independent; in a word, it’s reality: the thing that our world is constructed to help us navigate. Our world may be complicated, but it’s not necessarily complex. The real world, though it has enough regularities for, say, consciousness to have evolved, is largely a complex mess full of uncertainty, irrational numbers (figuratively speaking), and other things tough for us humans to make sense of. These two modes may influence one another, and may overlap in part, but they are distinct. You and I might live in different worlds, but we live in the same reality.
- As I remarked in Footnote 3, many people use probability informally. Here’s a nice summary of probability’s formal applications, taken from the second paragraph of the Stanford Encyclopedia entry “Interpretations of Probability”:
It plays a role in almost all the sciences. It underpins much of the social sciences — witness the prevalent use of statistical testing, confidence intervals, regression methods, and so on. It finds its way, moreover, into much of philosophy. In epistemology, the philosophy of mind, and cognitive science, we see states of opinion being modeled by subjective probability functions, and learning being modeled by the updating of such functions. Since probability theory is central to decision theory and game theory, it has ramifications for ethics and political philosophy. It figures prominently in such staples of metaphysics as causation and laws of nature. It appears again in the philosophy of science in the analysis of confirmation of theories, scientific explanation, and in the philosophy of specific scientific theories, such as quantum mechanics, statistical mechanics, and genetics. It can even take center stage in the philosophy of logic, the philosophy of language, and the philosophy of religion. Thus, problems in the foundations of probability bear at least indirectly, and sometimes directly, upon central scientific, social scientific, and philosophical concerns. The interpretation of probability is one of the most important such foundational problems.