Elementary statistical stuff

Eleas · Post by **Eleas** » 2005-11-17 02:19am

I've been having trouble with some fairly simple mathematical stuff. The basic trouble stems from the Infinite Dice found in an RPG I'm fond of. It uses six-sided die, but is still tricky, as I'll show you below.

First, these "infinite" dice are rolled as norma. Any result from 1 to 5 is tallied together normally. At six, however, the die is removed and instead two new dice are rolled. For example, if you roll two dice and get a 2 and a 6, you keep the 2 but roll two more dice, and if one of them should become a 6, you roll two more, et cetera.

As a consequence, the roll is open-ended and can potentially reach infinite value. This system is good for introducing uncertainty into a game.

Now, the idea in this game is to simulate abstracted "skills" and determine whether an attempted action has been successfully performed, or not. Typically, three "infinite" dice are rolled and compared to the skill value. If the result is below or equal, the action succeeded.

Now... how do I calculate probabilities based on that? For example, say that I have a "skill value" of 15; what's my probability of succeeding at a roll of three "infinite" dice?

drachefly · Post by **drachefly** » 2005-11-17 11:46am

Let's start by getting the probability of each number on a single die (we only need to go as high as 13 for purposes of this, but you'll see how it extends)

After the first roll, 1-5 have probability 1/6 and 1/6 possibility of a split.
The average value of the split is 2 times the average value of the first roll (each of the two rolls is identical to the first).

we can do a little algebra to get the mean:

AVERAGE = (1 + 2 + 3 + 4 + 5 + 2*(AVERAGE))/6
AVERAGE = AVERAGE/3 + 15/6
AVERAGE = (3/2)(5/2) = 15/4 = 3.75, not all that impressively more than the normal average of 3.5.

This is the average roll, not a specific probability...

as for the individual probabilities, you'll want to make a table calculating the probability of having a total of N splits:
no splits: 5/6
one split: 1/6 * (5/6)^2... i.e. first roll splits, neither of the splits splits.
two splits: 1/6 * 2 *(5/6)(1/6) * (5/6)^2... i.e. first roll splits, either the first of the split dice splits, or the second of the split dice splits.. and then neither of the resulting dice split.

Beyond 14 splits, you will cease to have contributions to the probability of rolling a 15 because if each die came up as a 1 then you'd still go over.

So, it's a finite problem, but large if done naively.

Is there a cleverer way to do it? Yes, using networks and some algebra... but it's more complicated.

drachefly · Post by **drachefly** » 2005-11-17 02:40pm

To be even more specific, construct a spreadsheet in which each cell represents a state of the system -- one dimension is the number of dice left to be rolled, and the other is the total score so far.
In each cell put the probability that this state will be passed through.
How do you do this? Use the formula

P(S, R) = (P(S, R-1) + sum(i=1...5)(P(S-i, R+1)))/6
with S the score so far and R the rolls remaining. The first part takes into account the possibility of increasing the rolls by rolling a six; the second part takes into account increasing the score by rolling anything else.

Spread this over a region bounded by the cells of 0 score or 0 rolls on one corner and the maximum roll you're interested in on the opposite corner

Then you put a '1' in the (0,1) slot for a single roll, or in the (0, 3) slot for a three-die roll....
and then you'll see the probability for getting any given result laid out along the 0 rolls left axis.
The answer to the question you asked would be the sum of the entries from (3,0) to (15,0).

Kuroneko · Post by **Kuroneko** » 2005-11-17 04:30pm

The expected value for n 6-sided infinite dice is indeed 15n/4, as drachefly calculated. Given n dice that do not split, the number of ways k can occur is the coefficient A(n,k) of x^k in (x+x^2+x^3+x^4+x^5)^n. Recognizing the multinomial series, this can be shown to be A(n,k) = \sum_{j=0}^n{ (-1)^k C(n,j)C(k-5j-1,k-5j-n) }, where C(⋅,⋅) is the binomial coefficient. Most of the terms in the summation do not contribute, so calculating it is actually simpler than it looks. Let's call the probability that n dice will have exactly m splits N(n,m). If one lets B(n,k) = \sum_{j=0}^n{ A(n,j) }, then the probability of getting a roll less than or equal to k is obviously P(n,k) = N(n,0)B(n,k)/5^n + N(n,1)B(n+1,k)/5^{n+1} + N(n,2)B(n+2,k)/5^{n+2} + ... . The values of B(n,15) for 0<n<16 are 5,25,125,555,1753,3745,5595,6075,4915,2993,1365,455,105,15,1, and zero everywhere else. The values of N(n,m) are not too difficult to calculate for n = 3, but I'm more interested in the general case. I'll come back to it when I have the time.

Faram · Post by **Faram** » 2005-11-17 04:38pm

drachefly, Kuroneko

You guys scare me, I cannot understand what you just did. Not that I even tried.

Kuroneko · Post by **Kuroneko** » 2005-11-17 11:06pm

If p = 5/6 and q = 1/6, the recursion N(n,m) = \sum_{k=1}^m{ C(n,k) p^{n-k} q^k N(2k,m-k) } is easy to crunch on a computer. The following are the result for N(3,k) for k=0..12: 125/216, 625/2592, 3125/31104, 109375/2519424, 390625/20155392, 4296875/483729408, 1955078125/470184984576, 1396484375/705277476864, 10791015625/11284439629824, 5125732421875/10968475320188928, 30360107421875/131621703842267136, 45343017578125/394865111526801408, 14736480712890625/255872592269367312384, while the numbers of ways for n dice (2<n<16) to give a number less than 15 being: 125, 555, 1753, 3745, 5595, 6075, 4915, 2993, 1365, 455, 105, 15, 1. For example, the probability that three dice do not split is 125/216, and given that they do not split the chance that they give a result no greater than 15 is 125/5^3 = 1. Summing up over all 13 possibilities, the exact probability that three six-sided infinite dice give a result less than or equal to 15 is 220338478078181111717/255872592269367312384, which is about 86.11%.

Kuroneko · Post by **Kuroneko** » 2005-11-18 12:24am

Here's the theoretical probability of success on a roll of three die for a given skill level. They were computed up to the 14th split, meaning the first column was computed exactly and then rounded to the given accuracy, while the second column is not so precise. But one can still expect about eight significant figures even on the last one, so it doesn't really matter--real dice are not so free of bias as to conform to the theoretical probabilities that precisely anyway. If the skill level is less than 3, the probability is obviously zero. At level 32, probability of failure is about 1:410.

Code: Select all

Skill    Probability            Skill     Probability
 3       0.00462962962963        18       0.93443738710635
 4       0.01890432098765        19       0.94887846588940
 5       0.04825745884774        20       0.95995718060006
 6       0.09857530927704        21       0.96853184309759
 7       0.17623497474026        22       0.97526809879586
 8       0.27425675761272        23       0.98053143289542
 9       0.38469468457422        24       0.98463864437347
10       0.49852451682333        25       0.98785227817520
11       0.60551935831302        26       0.99037872190187
12       0.69411151068915        27       0.99237162864844
13       0.76512796619991        28       0.99394276645260
14       0.82004602700010        29       0.99518292610286
15       0.86112575060881        30       0.99616400973047
16       0.89156328323673        31       0.99694182305504
17       0.91566803166137        32       0.99755932844732

Eleas · Post by **Eleas** » 2005-11-18 02:32am

Thanks, guys. I just found the bonus statistics chapter in one of my RPG books, and they basically do the same, so between those three explanations, I should be able to figure it out, I guess.

Kuroneko · Post by **Kuroneko** » 2005-11-18 03:01am

If you're wondering how the recursion N(n,m) = \sum_{k=1}^m{ C(n,k) p^{n-k} q^k N(2k,m-k) }, p = 5/6 (probability a die does not split), q = 1/6 (probability that a die splits), where N(n,m) is again the probability that n die will have exactly m splits, think of it this way:
(1) It is possible to have exactly one split on the first roll, and then the two replacement dice cascade into m-1 splits. Since there are C(n,1) = n possibilities as to which die split, the probability of the former event is C(n,1)p^{n-1}q^1, and of the latter N(2,m-1).
(2) It is also possible to have exactly two splits on the first roll, and then the four replacement dice giving m-2 splits. There are C(n,2) = n(n-1)/2 possibilities as to which two dice split, which probability C(n,2)p^{n-2}q^2, followed by N(4,m-2).
(3) Three dice may split on the first roll and the six replacement dice giving m-3 splits. Thus, C(n,3)p^{n-3}q^3 N(6,m-3).
...
(m) Exactly m might split on the first roll, with the 2m replacement dice giving no splits. Thus, C(n,m)p^{n-m}q^m N(2m,0).

Summing over all C(n,k) p^{n-k} q^k N(2k,m-k) gives the above equality. Note that N(n,0) = (5/6)^n. That's how the probabilities 25/216, 625/2592, 3125/31104, ... were generated (it helps to have a computer for this). Since exactly m splits on n dice gives a total of n+m dice with five possible values each, what this allows one to do is to reduce the problem to a substantially easier one: what is the probability of rolling exactly k using m five-sided die? If that number is B(m,k)/5^m, then the answer to the original question is P(n,k) = N(n,0)B(n,k)/5^n + N(n,1)B(n+1,k)/5^{n+1} + ..., i.e., the probability of no splits and rolling a k on the n dice, plus the probability of one split and rolling a k on the n+1 dice, plus ... .

drachefly · Post by **drachefly** » 2005-11-18 02:55pm

If you didn't understand my instructions on how to make the spreadsheet, I have made it myself for my own curiosity (I have a thing for random number distributions). So, if you want to get the general answer without having to grok what Kuroneko is getting at (which is very interesting, but it might not be your cup of tea), I can post it in either excel or appleworks format... on monday (must leave for class momentarily).

Kuroneko · Post by **Kuroneko** » 2005-11-23 03:02pm

Hmm... solving the recurrence exactly turned out to be quite possible. If N(n,m) is the probability that n dice have exactly m splits, with the probability of split for each die q (p = 1-q), then the recurrence relation is N(n,m) = \sum_{k=1}^m{ C(n,k) p^{n-k} q^k N(2k,m-k) }, as above. This turns out to be N(n,m) = n C(n+2m-1,m-1)/m p^{n+m} q^m. For the six-sided die above, p = 5/6 and q = 1/6. For example, N(3,5) = 3 C(12,4)/5 (5/6)^8 (1/6)^5 = 4296875/483729408 ~ 0.89% is the probability that three six-sided dice will split five times.

That simplifies computations greatly, but the number of ways the dice will give a number less than k still involves a double summation. Perhaps it has a simpler form as well.