
I have a question related to dice. Suppose there are 6 dice, and each die is rolled to receive a random value. Let the set of values obtained be represented as $\{a_1, a_2, ..., a_6\}$.

The problem I'm trying to solve is: What is the probability that, among the 6 random numbers obtained from the dice, there exists at least one subset of three numbers whose sum equals 6?

Here’s how I approached the problem:

The probability that the sum of three dice equals 6 is $10/216$. The number of ways to choose 3 numbers from 6 is $\binom{6}{3} = 20$. Therefore, the probability can be computed as $1 - (1 - 10/216)^{20}$. Thus, the probability that at least one subset of three numbers sums to 6 is $61.25\%$. To verify this, I wrote a program to simulate this process, but the actual value I obtained was quite different from my theoretical result.

Both checking all possible cases and running a simulation 100,000 times gave me a result of approximately $41$%.

Why is there such a discrepancy between the theoretical calculation and the actual result? How can I reconcile this difference?

Here is the code I used:

```python
import math
import random
import os
import itertools

n = 6  # number of dice rolled
m = 3  # size of the subset
t = 6  # target sum
print(f'Probability that the sum of {m} dice out of {n} dice is {t}')

dice = [1, 2, 3, 4, 5, 6]

# Probability that m dice sum to t, by enumerating all 6^m outcomes.
hap = 0
total = 0
ALL_Events = [[]]
for i in range(m):
    ALL_Events = [E + [e] for e in dice for E in ALL_Events]

for E in ALL_Events:
    if t == sum(E):
        hap += 1
    total += 1

one_set_prob = hap / total
print(f'Probability that the sum of {m} dice is {t} : {hap}/{total} = {one_set_prob * 100:.2f}%')

# Theoretical estimate that treats the C(n, m) subsets as independent trials.
com = math.comb(n, m)
calc_prob = 1 - ((1 - one_set_prob) ** com)
print(f'Probability that the sum of {m} dice out of {n} is {t} (calculated): {calc_prob * 100:.2f}% (= 1 - {1 - one_set_prob:.4f} ^ {com})')

# Simulation: 100,000 random rolls of n dice.
random.seed(os.urandom(16))
hap = 0
total = 0
for i in range(100000):
    E = [random.choice(dice) for _ in range(n)]
    for e in itertools.combinations(E, m):
        if t == sum(e):
            hap += 1
            break
    total += 1

test_prob = hap / total
print(f'Probability that the sum of {m} dice out of {n} is {t} (test): {test_prob * 100:.2f}%')

# Exhaustive check over all 6^n outcomes.
hap = 0
total = 0
ALL_Events = [[]]
for i in range(n):
    ALL_Events = [E + [e] for e in dice for E in ALL_Events]

for E in ALL_Events:
    for e in itertools.combinations(E, m):
        if t == sum(e):
            hap += 1
            break
    total += 1

real_prob = hap / total
print(f'Probability that the sum of {m} dice out of {n} is {t} (actual value): {real_prob * 100:.2f}%')
```

pioneer
  • Isn't it easier to just list all 19192 configurations (of 6^6=46656 total) for which there is a subset satisfying the required condition, instead of simulations? Hence the probability is exactly 19192/6^6. – van der Wolf Oct 24 '24 at 13:27
  • @vanderWolf Thanks for the suggestion! My goal, however, is to generalize the problem. I used the 6-dice, 3-subset case as an example, but I'm interested in finding the probability that, for n dice, there exists a subset of m dice whose sum equals t. Any insights on generalizing this approach would be appreciated! – pioneer Oct 24 '24 at 13:37
  • @vanderWolf I was also curious why there was a discrepancy between my mathematical calculation and the simulation results. – pioneer Oct 24 '24 at 13:38
  • Regarding large $n$ and the subset size of $m$: if $n\to\infty$ but $m$ is fixed and $m\le t\le 6m$, then the probability will go to one (law of large numbers), but there can be more intricate situations, e.g. when we have some functions $m(n)$ and $t(n)$ and want to see what happens as $n\to\infty$.... – van der Wolf Oct 24 '24 at 14:46
  • Intermediate result 10/216: ok. When we repeat an experiment 20 times, all independent of each other, the formula $1-(1-p)^{20}$ is ok. But your 20 experiments are not independent. If dice A,B,C have values 1,2,3, it is a success, and I am then much more likely to also have a success with (A,B,D) or (A,C,E) or... – Lourrran Oct 24 '24 at 15:50
  • In the general case, even the deterministic subset sum is a well-known NP-complete problem (though only weakly so). – HighDiceRoller Oct 24 '24 at 21:05
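For the generalization to $n$ dice, subsets of size $m$, and target $t$ raised in the comments above, one possible exact approach is plain enumeration. The sketch below is only an illustration: the function name `prob_subset_sum` and the `sides` parameter are assumptions, not something from the thread. It enumerates multisets of faces rather than all $6^n$ ordered rolls and weights each multiset by its multinomial count. The cost still grows quickly with $n$ and $m$, in line with the subset-sum remark.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations, combinations_with_replacement
from math import factorial


def prob_subset_sum(n, m, t, sides=6):
    """Exact probability that some m of n fair dice sum to t (hypothetical helper)."""
    favourable = 0
    for roll in combinations_with_replacement(range(1, sides + 1), n):
        if any(sum(c) == t for c in combinations(roll, m)):
            # Number of ordered rolls that produce this multiset of faces.
            ways = factorial(n)
            for k in Counter(roll).values():
                ways //= factorial(k)
            favourable += ways
    return Fraction(favourable, sides ** n)


print(prob_subset_sum(6, 3, 6))  # 2399/5832
```

For the 6-dice, 3-subset, sum-6 case this reproduces the 19192 favourable rolls mentioned in the first comment.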

2 Answers


This answer is an explicit application of the Inclusion-Exclusion referred to in the answer of mathperson314. See this article for an introduction to Inclusion-Exclusion. Then, see this answer for an explanation of and justification for the Inclusion-Exclusion formula.

Let $~S~$ denote the collection of all possible ways of rolling the $~6~$ dice. Then $~| ~S ~| = 6^6.$

Let $~S_1~$ denote the subset of $~S~$ where $~3~$ of the dice showed [1::1::4].

Let $~S_2~$ denote the subset of $~S~$ where $~3~$ of the dice showed [2::2::2].

Let $~S_3~$ denote the subset of $~S~$ where $~3~$ of the dice showed [1::2::3].

Then, the desired computation of the probability is

$$\frac{| ~S_1 \cup S_2 \cup S_3 ~|}{| ~S ~|} = \frac{| ~S_1 \cup S_2 \cup S_3 ~|}{6^6}. \tag1 $$

By Inclusion-Exclusion theory, the numerator in (1) above is equivalent to

$$\left\{ ~| ~S_1 ~| + | ~S_2 ~| + | ~S_3 ~| ~\right\}$$

$$- ~\left\{ ~| ~S_1 \cap S_2 ~| + | ~S_1 \cap S_3 ~| + | ~S_2 \cap S_3 ~| ~\right\} $$

$$+ ~| ~S_1 \cap S_2 \cap S_3 ~|.$$

So, the entire problem reduces to analytically computing the $~7~$ terms above.

Some of these $~7~$ terms will individually be computed by Inclusion-Exclusion.


$\underline{\text{Computation of} ~| ~S_1 ~|}$

[1::1::4]

Let $~A~$ denote the set of all dice rolls.

Let $~A_1~$ denote the subset of $~A~$ that does not contain at least $~2~$ 1's.

Let $~A_2~$ denote the subset of $~A~$ that does not contain a 4.

Then $$|~S_1~| = [ ~| ~A ~| - | ~A_1 \cup A_2 ~| ~]$$

$$= [ ~| ~A ~| - | ~A_1 ~| - | ~A_2 ~| + | ~A_1 \cap A_2 ~| ~]$$

$$= 6^6 - [ ~5^6 + (6 \times 5^5) ~] - 5^6 + [ ~4^6 + (6 \times 4^5) ~]$$

$$= [ ~46656 + 4096 + 6144 ~] - [ ~15625 + 18750 + 15625] = 6896.$$
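As a sanity check on this term, the closed form can be compared against a direct count. This is a minimal sketch (variable names are illustrative) that counts the rolls containing at least two 1's and at least one 4.

```python
from itertools import product

# Closed form from the inclusion-exclusion step above.
closed_form = 6**6 - (5**6 + 6 * 5**5) - 5**6 + (4**6 + 6 * 4**5)

# Direct count: ordered rolls of 6 dice with at least two 1's and at least one 4.
brute_force = sum(
    1
    for roll in product(range(1, 7), repeat=6)
    if roll.count(1) >= 2 and roll.count(4) >= 1
)

print(closed_form, brute_force)  # 6896 6896
```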


$\underline{\text{Computation of} ~| ~S_2 ~|}$

[2::2::2]

The number of ways of having fewer than three 2's is

$$5^6 + (6 \times 5^5) + (15 \times 5^4)$$

$$ = 70 \times 625 = 43750.$$

Then

$$| ~S_2 ~| = 6^6 - 43750 = 46656 - 43750 = 2906.$$


$\underline{\text{Computation of} ~| ~S_3 ~|}$

[1::2::3]

Let $~A~$ denote the set of all dice rolls.

Let $~A_1~$ denote the subset of $~A~$ that does not contain at least $~1~$ 1.

Let $~A_2~$ denote the subset of $~A~$ that does not contain at least $~1~$ 2.

Let $~A_3~$ denote the subset of $~A~$ that does not contain at least $~1~$ 3.

Then $$|~S_3~| = [ ~| ~A ~| - | ~A_1 \cup A_2 \cup A_3 ~| ~]$$

$$= [ ~| ~A ~| $$

$$ - \left\{ ~| ~A_1 ~| + | ~A_2 ~| + | ~A_3 ~| ~\right\} $$

$$ + \left\{ ~| ~A_1 \cap A_2 ~| + | ~A_1 \cap A_3 ~| + | ~A_2 \cap A_3 ~| ~\right\} $$

$$- | ~A_1 \cap A_2 \cap A_3 ~| ~]$$

$$= 6^6 - [ ~3 \times 5^6 ~] + [ ~3 \times 4^6 ~] - [3^6] = 11340. $$
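The same alternating sum works for any number of required distinct faces, which may help when adapting the argument. A small sketch, where the helper name `rolls_containing_all` and its parameters are hypothetical:

```python
from math import comb


def rolls_containing_all(required, n=6, sides=6):
    """Count ordered rolls of n dice in which each of `required` given distinct faces appears."""
    # Inclusion-exclusion over the j required faces that are forbidden from appearing.
    return sum(
        (-1) ** j * comb(required, j) * (sides - j) ** n
        for j in range(required + 1)
    )


print(rolls_containing_all(3))  # 11340
```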


$\underline{\text{Computation of} ~| ~S_1 \cap S_2 ~|}$

[1::1::4] :: [2::2::2]

$$| ~S_1 \cap S_2 ~| = \binom{6}{2} \times \binom{4}{1} = 60.$$


$\underline{\text{Computation of} ~| ~S_1 \cap S_3 ~|}$

[1::1::4] :: [1::2::3]

The direct approach is easiest here:

  • 1 :: 1 :: 4 :: 2 :: 3 :: [ add 1]
    $\displaystyle 6 \times 5 \times 4 = 120.$

  • 1 :: 1 :: 4 :: 2 :: 3 :: [ add 2]
    $\displaystyle \binom{6}{2} \times \binom{4}{2} \times \binom{2}{1} = 180.$

  • 1 :: 1 :: 4 :: 2 :: 3 :: [ add 3]
    $\displaystyle \binom{6}{2} \times \binom{4}{2} \times \binom{2}{1} = 180.$

  • 1 :: 1 :: 4 :: 2 :: 3 :: [ add 4]
    $\displaystyle \binom{6}{2} \times \binom{4}{2} \times \binom{2}{1} = 180.$

  • 1 :: 1 :: 4 :: 2 :: 3 :: [ add 5 or 6]
    $\displaystyle \binom{6}{2} \times 4! \times 2 = 720.$

$$| ~S_1 \cap S_3 ~| = 120 + (3 \times 180) + 720 = 1380.$$


$\underline{\text{Computation of} ~| ~S_2 \cap S_3 ~|}$

[2::2::2] :: [1::2::3]

The direct approach is easiest here:

  • 2 :: 2 :: 2 :: 1 :: 3 :: [ add 1]
    $\displaystyle \binom{6}{3} \times \binom{3}{2} = 60.$

  • 2 :: 2 :: 2 :: 1 :: 3 :: [ add 2]
    $\displaystyle \binom{6}{4} \times \binom{2}{1} = 30.$

  • 2 :: 2 :: 2 :: 1 :: 3 :: [ add 3]
    $\displaystyle \binom{6}{3} \times \binom{3}{2} = 60.$

  • 2 :: 2 :: 2 :: 1 :: 3 :: [ add 4, 5, or 6]
    $\displaystyle \binom{6}{3} \times 3! \times 3 = 360.$

$$| ~S_2 \cap S_3 ~| = 60 + 30 + 60 + 360 = 510.$$


$\underline{\text{Computation of} ~| ~S_1 \cap S_2 \cap S_3 ~|}$

[1::1::4] :: [2::2::2] :: [1::2::3]

1::1::2::2::2::3::4 would require seven dice, so this is not possible with six.

$$| ~S_1 \cap S_2 \cap S_3 ~| = 0.$$
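The pairwise intersection counts above can be re-derived with multinomial coefficients: five of the six faces are forced, so it is enough to sum the number of arrangements over the possible sixth face. A sketch, with `arrangements` as an illustrative helper name:

```python
from collections import Counter
from math import factorial


def arrangements(faces):
    """Number of distinct orderings of a multiset of faces (multinomial coefficient)."""
    ways = factorial(len(faces))
    for k in Counter(faces).values():
        ways //= factorial(k)
    return ways


s1_s2 = arrangements((1, 1, 4, 2, 2, 2))
s1_s3 = sum(arrangements((1, 1, 4, 2, 3, x)) for x in range(1, 7))
s2_s3 = sum(arrangements((2, 2, 2, 1, 3, x)) for x in range(1, 7))

print(s1_s2, s1_s3, s2_s3)  # 60 1380 510
```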


$\underline{\text{Final Computation}}$

$$\left\{ ~| ~S_1 ~| + | ~S_2 ~| + | ~S_3 ~| ~\right\}$$

$$- ~\left\{ ~| ~S_1 \cap S_2 ~| + | ~S_1 \cap S_3 ~| + | ~S_2 \cap S_3 ~| ~\right\} $$

$$+ ~| ~S_1 \cap S_2 \cap S_3 ~|$$

$$= \left\{ ~6896 + 2906 + 11340 ~\right\}$$

$$- \left\{ ~60 + 1380 + 510 ~\right\}$$

$$+ \left\{ ~0 ~\right\}$$

$$ = 19192.$$

Therefore, the probability of being able to find a suitable subset of 3 dice is

$$\frac{19192}{6^6} = \frac{19192}{46656} = \frac{2399}{5832}.$$
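For comparison with the question, the final arithmetic can be placed next to the estimate $1 - (1 - 10/216)^{20}$ that treats the 20 triples as independent. A short sketch using exact fractions:

```python
from fractions import Fraction

# Seven inclusion-exclusion terms computed above.
union_size = (6896 + 2906 + 11340) - (60 + 1380 + 510) + 0
exact = Fraction(union_size, 6**6)

# Estimate from the question, which assumes the 20 triples are independent.
naive = 1 - (1 - Fraction(10, 216)) ** 20

print(union_size, exact)  # 19192 2399/5832
print(f"exact = {float(exact):.4f}, independence estimate = {float(naive):.4f}")
```

The exact value is about 41%, matching the simulation in the question, while the independence estimate is about 61%.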

user2661923

Let $E_{ijk}$ be the event that $a_i+a_j+a_k=6$. The events $E_{ijk}$ are not independent, but independence is required to use the product rule for probabilities $P(A \cap B) = P(A)P(B)$.

For instance: If $E_{ijk}$ occurs then $a_i, a_j$ are small, so it is far more likely that $E_{ijl}$ also occurs for another $l$.
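The dependence can be made concrete by enumerating four dice and comparing $P(E_{124} \mid E_{123})$ with the unconditional $P(E_{123})$. A minimal sketch (variable names are illustrative):

```python
from fractions import Fraction
from itertools import product

rolls = list(product(range(1, 7), repeat=4))  # (a1, a2, a3, a4)

e123 = [r for r in rolls if r[0] + r[1] + r[2] == 6]
both = [r for r in e123 if r[0] + r[1] + r[3] == 6]

p_e123 = Fraction(len(e123), len(rolls))
p_e124_given_e123 = Fraction(len(both), len(e123))

print(p_e123, p_e124_given_e123)  # 5/108 1/6
```

Given $E_{123}$, the event $E_{124}$ occurs exactly when $a_4 = a_3$, so its conditional probability is $1/6$, well above $10/216 = 5/108$.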

To reconcile the difference and calculate the exact probability would be hard and would require conditional probability, or maybe inclusion-exclusion.

Let events $A_i$ be defined as follows:

  1. There are (at least) three $2$'s.

  2. There is a $1,2,3$.

  3. There is a $4$ and two $1$'s.

There are no other ways to produce a sum of 6 from three dice.
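This claim is easy to confirm by listing the unordered triples of faces that sum to 6. A short sketch:

```python
from itertools import combinations_with_replacement, permutations

# Unordered triples of faces 1..6 whose sum is 6.
patterns = [c for c in combinations_with_replacement(range(1, 7), 3) if sum(c) == 6]
print(patterns)  # [(1, 1, 4), (1, 2, 3), (2, 2, 2)]

# Counting the distinct orderings of each pattern recovers the 10 in 10/216.
ordered = sum(len(set(permutations(c))) for c in patterns)
print(ordered)  # 10
```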

Then could you use inclusion-exclusion to calculate $\Pr(A_1 \cup A_2 \cup A_3)$?

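Note that the pairwise intersections have positive probability: for example, any roll containing $2,2,2,1,3$ lies in $A_1 \cap A_2$. So this requires some work. The triple intersection $A_1 \cap A_2 \cap A_3$ would need the seven values $1,1,2,2,2,3,4$, so with six dice $P(A_1 \cap A_2 \cap A_3) = 0$.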

Also $P(A_1)$ is the probability that a binomial random variable $B(6,1/6)$ is at least $3$, which is a rather complicated expression to say the least. $P(A_2)$, $P(A_3)$ are even harder to express.
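Although the expression is unwieldy by hand, the binomial tail for $P(A_1)$ is short to evaluate exactly. A sketch using exact fractions:

```python
from fractions import Fraction
from math import comb

p = Fraction(1, 6)  # chance that a single die shows a 2

# P(at least three 2's among six dice) = sum of binomial terms for k = 3..6.
p_a1 = sum(comb(6, k) * p**k * (1 - p) ** (6 - k) for k in range(3, 7))

print(p_a1, p_a1 * 6**6)  # 1453/23328 2906
```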

Another approach might be to do inclusion-exclusion on the $E_{ijk}$. For instance, what is $P(E_{123} \cap E_{124})$? This will require more layers, but might actually end up being easier.
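One way to explore inclusion-exclusion on the $E_{ijk}$ numerically, without enumerating every family of events, is to count for each roll how many of the 20 events occur. A roll in which $k$ events occur lies in $\binom{k}{j}$ of the $j$-fold intersections, so the $j$-th inclusion-exclusion sum is the total of $\binom{k}{j}$ over all rolls. The sketch below is an assumption-laden illustration (the names `S` and `at_least_one` are made up here); it works with counts of rolls rather than probabilities.

```python
from itertools import combinations, product
from math import comb

n, m, t = 6, 3, 6
num_events = comb(n, m)      # the 20 events E_ijk
S = [0] * (num_events + 1)   # S[j]: total size of all j-fold intersections
at_least_one = 0             # rolls in which some E_ijk occurs

for roll in product(range(1, 7), repeat=n):
    # Number of index triples i < j < k with a_i + a_j + a_k = t.
    k = sum(1 for c in combinations(roll, m) if sum(c) == t)
    if k:
        at_least_one += 1
        for j in range(1, k + 1):
            S[j] += comb(k, j)

# Alternating inclusion-exclusion sum over all intersection orders.
union = sum((-1) ** (j + 1) * S[j] for j in range(1, num_events + 1))

print(S[1], union, at_least_one)  # 43200 19192 19192
```

The first-order term alone, $43200 / 6^6 = 20 \times 10/216$, overshoots badly, which shows how much the higher-order intersections matter here.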

  • Thanks for your answer! First, I used inclusion-exclusion as you mentioned, but since all intersections are 0, the result still matches my original calculation of $10/216$. Second, shouldn't we also consider the combinations when selecting 3 dice from 6? – pioneer Oct 24 '24 at 13:51
  • @pioneer See edit for your first question. The different combinations were implicit in my definitions of the $A_i$. For instance $A_1$ is three twos, in any position. – mathperson314 Oct 24 '24 at 14:00