Dice Probability: Median Rolls to 6 Sides (Coupon Collector’s) with Stirling Numbers

Estimated read time (minus contemplative pauses): 8 min.

The other day I posted an explanation for how to find the median number of rolls for seeing all six sides of a die. I tried a few things that didn’t work, questioned my understanding of what a median is, and ultimately used a Markov chain to produce some equations that, when subjected to Newton-Raphson’s method, produced an approximate median of 12.81104728506638 rolls.

Read about that at: “Dice Probability: Median Rolls to See All 6 Sides (Coupon Collector’s Problem).” If anything here is mysterious, you might want to give it a look.

At the SAS blog post that got me thinking about the median question, post author Rick Wicklin notes in a comment that the probability of collecting n objects in m trials can be expressed with:

\displaystyle \frac{n!}{n^m}S_2(m,n)

Where, in the present context, m is the number of rolls, n is the number of faces collected, and S2(m,n) is a Stirling number of the second kind. The latter object was—but is no longer!—new to me. It is, to quote MathWorld:

The number of ways of partitioning a set of n elements into m nonempty sets (i.e., m set blocks), also called a Stirling set number. For example, the set {1,2,3} can be partitioned into three subsets in one way: {{1},{2},{3}}; into two subsets in three ways: {{1,2},{3}}, {{1,3},{2}}, and {{1},{2,3}}; and into one subset in one way: {{1,2,3}}.

I’ll say a little more about that shortly. For now, I’ll just use this calculator to work out the number for n = 6, m = 12.81104728506638. That gives 6465021.47022249742117253840516. If we plug these numbers into the formula, we get, as expected, 1/2:

\displaystyle \frac{6!}{6^{12.81104728506638}}\cdot6465021.47022249742117253840516 = 0.5

Nice! But how does it work? To consider that, I’ll use 13 as the median, which also gives exactly the expected output based on my previous post (don’t forget to update the Stirling number):

\displaystyle \frac{6!}{6^{13}}\cdot9321312 = 0.513858194042

Here’s how it works. If I say anything wrong, let me know!

First, let’s find the denominator.

There are six die faces and we’re rolling the die 13 times. That means there are 613 possible outcomes (i.e., there are six ways the first roll can go, six ways the second roll can go, and so on up to 13 rolls; multiply those to get 613). For example, we might roll this outcome: 1, 2, 3, 4, 5, 4, 3, 2, 1, 2, 3, 4, 5. Thus failing to collect all six faces.

Another way to say essentially the same thing is that there is a probability of (1/6)13 that we roll that sequence, because each roll has a 1/6 probability of coming up precisely at the spot indicated for it. This is, of course, true for any given outcome of 13 rolls. But how many of those 613 outcomes include all six faces?

In other words, let’s find the numerator.

Start with the Stirling number of the second kind, which tells us how many ways we can partition 13 rolls into six “boxes,” let’s call them. That is, let’s assign each of six boxes a number corresponding to the die’s six faces: Box1, Box2, and so on up to Box6. By stipulating six boxes, we ensure that all six faces are used.

Here’s how I’ll model this. Suppose on our first roll we get a 3; then “roll1” goes into Box3. Suppose the next roll is a 6; then roll2 goes into Box6. And so on. The Stirling number of the second kind tells us how many different ways there are to fill the six boxes with 13 rolls. For example, we might get:

Box1[roll1, roll5, roll13];  Box2[roll2, roll10];  Box3[roll6];  Box4[roll3, roll12];  Box5[roll4, roll7, roll9];  Box6[roll8, roll11].

This describes a rolling sequence of 1, 2, 4, 5, 3, 5, 6, 5, 2, 6, 4, 1. The winning roll was roll7.

How many partitions are there? There are S2(13, 6) = 9,321,312. But this isn’t our full numerator because, for a partition, order doesn’t matter. In other words, the above partition is no different than, say:

Box3[roll6];  Box1[roll1, roll5, roll13];  Box2[roll2, roll10];  Box6[roll8, roll11];  Box4[roll3, roll12];  Box5[roll4, roll7, roll9].

This may not be surprising, as it still describes the same sequence of rolls as above. We want order to matter, however, in which case we call it a “composition” rather than a “partition.”

There are different ways to conceptualize this. Here’s how I’ll do it. It’s easy to imagine keeping the roll#’s where they and rearranging only the box labels. Had we done that above, we’d have ended up with:

Box3[roll1, roll5, roll13];  Box1[roll2, roll10];  Box2[roll6];  Box6[roll3, roll12];  Box4[roll4, roll7, roll9];  Box5[roll8, roll11].

Now we have a different outcome: 3, 1, 6, 4, 3, 2, 4, 5, 4, 1, 5, 6, 3. There are 6! ways we can arrange the boxes while keep the roll#’s fixed. Same goes, of course, for every possible partition. This means that the numerator is 6! × 9,321,312.

To be clear, the box numbers are only for organizational clarity about the compositions. Technically, the above arrangements are all still the same partition, even after changing the numbers on the boxes. This is because they have the same groupings; for example, roll2 and roll10 are still in a box together. The number of ways we can make such groupings is precisely what Stirling numbers of the second kind tell us. So we can rest assured that there is no double-counting happening among the compositions.

Here’s a nicely done video if you’d like to look a little closer at Stirling numbers (also includes the first kind):

Now that we know how many successful outcomes there are, we can set that in ratio to our denominator of 613, which is another way of saying that we can multiply by the probability of any given scenario occurring—i.e., (1/6)13—to get, once again:

\displaystyle \frac{6!}{6^{13}}\cdot9321312 = 0.513858194042

And there we have verified, via (I hope) an intuitive method, the answer of 13 for the median. And we can now put in any number we like for m to get the probability of having collected all six faces after m rolls; and with an integer for m, the number will be exact (the exact answer for 13, by the way, is 485485/944784)​.

But what about setting the formula equal to something? Like:

\displaystyle \frac{6!}{6^m}S_2(m,6) = \frac{1}{2}

How to find m?

Before addressing that, I’d like to briefly ponder a question that you may or may not prefer to skip. Above, among the 6! × 9,321,312 = 6,711,344,640 outcomes that contain all six faces, there are many that look like, say: 1, 2, 3, 4, 5, 6, 1, 1, 1, 1, 1, 1, 1. Does it really make sense to include sequences that continue on for seven (or however many) rolls after collecting the sixth face?

Recall that the claim here is that ~51.4% of the time—or put another way, with ~51.4% probability—we will have collected 6 faces after the 13th roll. I conceptualized this at least two ways in the previous post.

With the Markov chain, it was that there’s ~51.4% probability of going from having collected no faces to having collected 6 faces by the 13th roll.

Another way was to express it as a cumulative distribution function, adding up the probabilities for ways to succeed until you hit or pass 50%: the probability of succeeding on 6th roll PLUS the probability of succeeding on 7th roll, and so on up to succeeding on the 13th roll, all adds up to ~.514.

What we are not finding is the probability of having succeeded on exactly the 13th roll. In other words, above, when we counted up the outcomes for the numerator, those didn’t include only those outcomes in which the sixth face is seen for the first time on the 13th roll. I worked out that probability last post by using the Markov chain to calculate the probability of having collected only 5 faces by the 12th roll (~.456), then multiplying that by (1/6) to get ~.076.

What we’ve been doing, on the other hand, involves all the ways we can succeed given 13 rolls, even if the success is on, say, the sixth roll. This is fine, as the probability of succeeding on the sixth roll is the same for rolling six times as it is for rolling 13 times. This is because after the winning roll, there is always a 100% chance of getting a repeat of an already collected face. So you’re just multiplying by 1 over and over again.

In short, we are in general looking to capture all the ways in which we have rolled 13 times and have collected all six faces. Relating all these models for arriving at the same answer has required of me a little bit of staring into space and pacing about the room, but it does make sense—one of those “convince yourself that this is true” moments math explainers so often encourage.

Moving on.

Finding m in the equation is hard. That again is:

\displaystyle \frac{6!}{6^m}S_2(m,6) = \frac{1}{2}

I suppose the best thing to do is a guess-and-check. To streamline this, we can use one of the results listed at the WikipediaCoupon Collector’s Problem” entry. This is less intuitive than the other stuff I’ve explored here, but let’s check out the result (brought to my attention by this Math StackExchange thread) listed at the top with the attribution: “Pierre-Simon Laplace, but also Paul Erdős and Alfréd Rényi, proved the limit theorem for the distribution of T.”

Where T is the time (or for us, number of rolls) it takes to collect n objects:

\displaystyle P(T < n\log_e( n) + cn) \to e^{-e^{-c}}, \text{ as } n \to \infty

Our n is 6. But we need to solve for c. We want the probability to go to 1/2, so we need e^{-e^{-c}}=1/2.

I’ll show the steps I use to solve for c (for a review of logarithm operations, see this Math Is Fun page):

\displaystyle e^{-e^{-c}}=\frac{1}{2} \\ \\ \\ -e^{-c}=\ln\left(\frac{1}{2}\right) \\ \\ \\ e^{-c}=-\ln\left(\frac{1}{2}\right) \\ \\ \\ e^{-c}=\ln\left(2\right) \\ \\ \\ \ln\left(\ln\left(2\right)\right)=-c \\ \\ \\ c=-\ln\left(\ln\left(2\right)\right)

Now plug that, along with n = 6, back into (n\log_e( n) + cn):

\displaystyle 6\ln\left(6\right)-6\ln\left(\ln\left(2\right)\right) \approx 12.9496343389

That’s a nice approximation of the ~12.811 we expect to see. Plugged into the equation, with S2(12.9496343389, 6) (found, again, with this calculator):

\displaystyle \frac{6!}{6^{12.9496343389}}\cdot8456149.43431563032204592089648 = 0.510188884844

Not bad. It’s easy to see that going closer to 12.8 might help get closer to 1/2. And if you’re looking for an integer, this is a fast way to do it that I imagine becomes increasingly useful as n grows.

But if you want to get as precise as 12.811, much less 12.81104728506638, I think you’ll need different tools. For example, I arrived at those numbers in the previous post with a Markov chain that yields equations* that can be easily graphed or popped into an application like Wolfram|Alpha to run it through the Newton-Raphson method or some such.

[*Whose recursive, inclusion-exclusion nature, however, makes me think I can get to them with Stirling numbers of the second kind.]

I did, by the way, input the argument (6!/6^x)*stirlingS2[x,6] = .5 at Wolfram|Alpha, and it recognized the input but didn’t yield a value for x. Maybe they have some dedicated machine there that can do this with the same precision as the Newton-Raphson method, though I didn’t see one. Seems it shouldn’t be particularly difficult. And there’s surely far more out there than I’ve managed to see in the few nanoseconds I’ve spent scratching—better yet, smudging—the surface of these tools.

Speaking of which, I’m not going to obsess today over where the limit theorem comes from. But if you’d like to, here’s Erdős & Rényi’s 1961 paper: “On a Classical Problem of Probability Theory” (MR0150807).

I think that’ll do for Coupon Collector problems for a while.


Enjoy or find this post useful? Please consider pitching in a dollar or three to help me do a better job of populating this website with worthwhile words and music. Let me know what you'd like to see more of while you're at it. Transaction handled by PayPal.

Further Reading

Share your thoughts: