...albeit with big-ish numbers and lots of conditions.
The back story is that product collation in Heroclix's Lord of the Rings mini-set has been a total debacle: a booster of six semi-random figures is frequently identical to another, to the extent that some stores have had entire display cases that match other cases.
There are several things I'm trying to work out:
First, the odds that two specific six-figure packs are identical with ideal clix distribution, which is really just the odds of one pack having a specific arrangement.
Each pack has a set arrangement: 3 commons, followed by 2 uncommons, followed 7/8ths of the time by a rare or 1/8th of the time by a chase.
There are 9 numbered commons; each of the 3 has a higher number than the one before, without duplication in a single pack.
There are 6 numbered uncommons, with the same conditions.
There are 5 rares and 3 chases.
It's all the numerical contingencies that throw me.
Second, I need a reminder how to work out the odds that any X packs (or cases, same method OFC) out of Y total are identical. So let's say we have 1,000,000 packs; how do I determine the odds that there are exactly 3 with arrangement A and 2 or more with arrangement B? Obviously 1 and 2 have the odds determined above, but number 3 can be set up as 1 or 2, 4 as 1 2 or 3, and so on...
I'll try to work it out on my own, but I'll probably get distracted by Skyrim before I do.
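For the second question, here is a rough sketch of one way to set it up in Python. It assumes every pack is filled independently and that arrangements A and B each have a fixed per-pack probability; the function names and the example probabilities are made up for illustration. The count of A packs is then binomial, and once exactly `a` packs are known to be A, each remaining pack is B with probability p_b / (1 - p_a).

```python
import math

def binom_pmf(n: int, k: int, p: float) -> float:
    """P(exactly k hits in n independent tries), computed in log space."""
    if k < 0 or k > n:
        return 0.0
    if p <= 0.0:
        return 1.0 if k == 0 else 0.0
    if p >= 1.0:
        return 1.0 if k == n else 0.0
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log1p(-p))
    return math.exp(log_pmf)

def prob_exactly_a_and_at_least_b(n: int, p_a: float, p_b: float,
                                  a: int = 3, b: int = 2) -> float:
    """P(exactly a packs are arrangement A and at least b packs are arrangement B)."""
    # Step 1: exactly a of the n packs are arrangement A.
    p_exact_a = binom_pmf(n, a, p_a)
    # Step 2: the other n - a packs are known not to be A,
    # so each one is B with conditional probability p_b / (1 - p_a).
    p_b_cond = p_b / (1.0 - p_a)
    p_fewer_than_b = sum(binom_pmf(n - a, j, p_b_cond) for j in range(b))
    return p_exact_a * (1.0 - p_fewer_than_b)

# Illustrative numbers only: 1,000,000 packs, both arrangements at 1 in 7,200.
example = prob_exactly_a_and_at_least_b(1_000_000, 1 / 7200, 1 / 7200)
```

The same function works for cases instead of packs if you swap in the per-case probability.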
Medium difficulty probability refresher...
Moderator: Alyrium Denryle
- Spectre_nz
- Youngling
- Posts: 121
- Joined: 2009-10-22 06:45am
Re: Medium difficulty probability refresher...
Oh, this actually turned out a little interesting...
I'm not actually a math major, so, hopefully this is right
(Sorry about number spam. Turns out I have a thing for statistics. Weirdly, I dislike math in other cases)
So, it depends on how they're selecting the first clix to put in the pack; you end up with two very different distributions.
Given that the clix are not duplicated and each must have a higher number than the one before, there are some tight constraints on the possible arrangements you could have; for example, in the common set, if your first clix is a #7, the next two can only be #8 and #9, so a 7,8,9 sequence.
The big question about how they choose their clix is: do they select the first one randomly, or do they select the entire set at once, randomly, from all potential arrangements?
First case: randomly selecting from all possible arrangements
I'll use '.' to indicate when a number has not changed from the previous
For the commons, I'll start with a #1 Clix first to show you the pattern;
123, ..4, ..5, ..6, ..7, ..8, ..9 = 7 possible arrangements
134, ..5, ..6, ..7, ..8, ..9 = 6 possible arrangements
145, ..6, ..7, ..8, ..9 = 5 possible arrangements
The pattern continues, so you end up with 7+6+5+4+3+2+1 = 28 arrangements starting with clix #1. If you start with clix #2, this removes your first row and this time you get 6+5+4+3+2+1 = 21 arrangements. Continuing on like this you get 84 possible arrangements for the commons and, doing the same for the uncommons, 35 possible arrangements. Plus the 8 arrangements for rares/chases.
So, assuming commons/uncommons/rares are selected independently, there should be 23,520 unique pack arrangements possible. The probability of each one is complicated by the fact that the last slot is a rare 7/8ths of the time and a chase 1/8th of the time. I haven't calculated that in yet.
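A quick way to double-check counts like these is to let Python enumerate the increasing sequences. This is only a sketch of the first case (every valid arrangement equally likely within its slot group), and the variable names are my own. Note that the enumeration gives 15 increasing pairs for the uncommons rather than 35, which would put the pack total at 84 x 15 x 8 = 10,080. It also folds in the 7/8 rare versus 1/8 chase split for the last slot.

```python
from fractions import Fraction
from itertools import combinations

# combinations() yields tuples in increasing order with no repeats,
# which is exactly the "each number higher than the one before" rule.
commons = list(combinations(range(1, 10), 3))    # 84 increasing triples from 9
uncommons = list(combinations(range(1, 7), 2))   # 15 increasing pairs from 6
last_slot = [("rare", r) for r in range(1, 6)] + [("chase", c) for c in range(1, 4)]

n_packs = len(commons) * len(uncommons) * len(last_slot)   # 84 * 15 * 8 = 10080

# Probability of one specific pack, assuming uniform choice within each group.
p_common = Fraction(1, len(commons))
p_uncommon = Fraction(1, len(uncommons))
p_specific_rare = Fraction(7, 8) * Fraction(1, 5)    # rare slot, then 1 of 5 rares
p_specific_chase = Fraction(1, 8) * Fraction(1, 3)   # chase slot, then 1 of 3 chases

p_pack_with_rare = p_common * p_uncommon * p_specific_rare     # 1/7200
p_pack_with_chase = p_common * p_uncommon * p_specific_chase   # 1/30240
```

So under these assumptions the packs are not all equally likely: a specific pack ending in a rare comes up about four times as often as a specific pack ending in a chase.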
But there is another scenario: they randomly select each clix as they put it in, then select the next one from the remaining possible choices, rather than taking a whole set from all possible sets. Then things are very different.
For example, there's a 1/7 chance of selecting a #7 common clix first (because #8 or #9 can never be the first selection), but then the remaining picks must be #8 and #9. This implies that fully 1/7th of the time you'll get a 7,8,9 set. The same goes for the uncommons: fully 1/5th of the time you'll get a 5,6 combination.
Things become heavily weighted towards getting the same high-numbered sets over and over, because they'll be over-represented amongst all the sets.
you end up with;
123, ..4, ..5, ..6, ..7, ..8, ..9 = 7 possible arrangements
134, ..4, ..5, ..6, ..7, ..8, ..9 = 7 possible arrangements, one of which is a duplicate arrangement.
145, ..5, ..5, ..6, ..7, ..8, ..9 = 7 possible arrangements, two of which are duplicates.
and so on
So you end up with 140 arrangements for the commons, of which 56 are duplicates of an existing set, and 55 arrangements for the uncommons, of which 20 are duplicates. So, overall we now have
61,600 sets, of which 38,080 are duplicated versions of other sets.
The consequence of this will be that, in 61,600 ideally distributed sets, there will only be 440 that have the common set 1,2,3, but 8,800 that have the common set 7,8,9.
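The slot-by-slot scenario can also be worked out exactly rather than by tallying rows. The sketch below is one possible reading of it, and the modelling choice is an assumption: each slot is drawn uniformly from the numbers that are higher than the previous pick and still leave room for the remaining slots. The exact figures depend on that choice, so they need not match a hand tally, but the 1/7 chance of 7,8,9 and the 1/5 chance of 5,6 come out the same.

```python
from collections import defaultdict
from fractions import Fraction

def sequential_probs(n_figures: int, slots: int) -> dict:
    """Exact probability of each increasing sequence under slot-by-slot picking.

    Assumption: every slot is uniform over the numbers that are higher than
    the previous pick and still leave enough higher numbers for later slots.
    """
    probs = defaultdict(Fraction)

    def fill(prefix: tuple, p: Fraction) -> None:
        remaining = slots - len(prefix)
        if remaining == 0:
            probs[prefix] += p
            return
        low = prefix[-1] + 1 if prefix else 1
        high = n_figures - remaining + 1   # highest number that leaves room
        choices = range(low, high + 1)
        for c in choices:
            fill(prefix + (c,), p / len(choices))

    fill((), Fraction(1))
    return dict(probs)

common_probs = sequential_probs(9, 3)
uncommon_probs = sequential_probs(6, 2)

p_789 = common_probs[(7, 8, 9)]     # Fraction(1, 7)
p_123 = common_probs[(1, 2, 3)]     # Fraction(1, 343)
p_56 = uncommon_probs[(5, 6)]       # Fraction(1, 5)
```

Either way the conclusion is the same: high-numbered sets turn up far more often than low-numbered ones if the packs are filled one slot at a time.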
Edit: I realize I haven't actually answered your question, but kinda pointed you at something to think about...
Re: Medium difficulty probability refresher...
The odds cannot be based on completely random distribution. This is not like card booster packs where you can have machines do stuff for you.
It's a production line of poor workers doing long shifts. Any selection process is by humans under time constraints.
So if you are the worker stuffing the clix boosters and on the bench in front of you is only one kind of mini, then you go with that or fall behind schedule.
This is not new; it's been like that ever since the first MageKnight releases. But back then the series stretched over a longer time and with more workers per mini produced. So a short series with a lean production approach will generate a lot of errors like this.
Re: Medium difficulty probability refresher...
The OP seems to want to pretend that they're statistical, so that he can compare the measured distribution against the statistical prediction.
In that case, we can group the figures into their component sets, work out the probabilities of having the same grouping within each set, and multiply the probabilities for a complete pack of six.
set Common has 3 possible correct choices: ABC, BCA, CAB, where A, B, and C are your desired cards.
set Uncommon has 2 possible correct choices: DE, ED
set Rare has 1 possible correct choice: F
set Chase has 1 possible correct choice: G
set Common has 9*8*7 possible choices: 504
set Uncommon has 6*5 possible choices: 30
set Rare has 5 possible choices: 5
set Chase has 3 possible choices: 3
Modify the latter two probabilities since they are joined by a probability of their own. In any drawing of Rare and Chase, there are 7*1 correct Rares in every 8*5 drawings, and 1*1 correct Chases in every 8*3 drawings. To get these comparable, we cross multiply and get 21 correct Rares in every 120 drawings and 5 correct Chases in every 120 drawings, for a total of 26 correct choices in every 120 possible choices.
Probability is the product of all the individual probabilities:
3/504*2/30*26/120 = 156/1814400, or approximately 0.0086%
In a rush so I can't check my work, sorry. Hope this helps.
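For anyone who wants to check the arithmetic, here it is redone with exact fractions in Python. This only reproduces the numbers as posted above (including the 3/504 figure for the commons); the variable names are just for illustration.

```python
from fractions import Fraction

# Figures exactly as posted above.
p_common = Fraction(3, 504)
p_uncommon = Fraction(2, 30)

# Last slot: 7/8 chance of a rare (1 of 5) plus 1/8 chance of a chase (1 of 3).
p_last = Fraction(7, 8) * Fraction(1, 5) + Fraction(1, 8) * Fraction(1, 3)   # 26/120

p_pack = p_common * p_uncommon * p_last    # 156/1814400, reduced to 13/151200
p_pack_percent = float(p_pack) * 100       # roughly 0.0086
```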
- Spectre_nz
Re: Medium difficulty probability refresher...
"set Common has 9*8*7 possible choices: 504
set Uncommon has 6*5 possible choices: 30"
That was my initial thought, but it's not valid in this case.
From the OP:
"There are 9 numbered commons; each of the 3 has a higher number than the one before, without duplication in a single pack.
There are 6 numbered uncommons, with the same conditions."
There aren't 9 possible choices for the first common slot, only 7. The other two choices lead to invalid sets. Then there aren't always 6 choices for the slot after, because it will depend on the previous choice. The same goes for the third position.
Which is when I brute forced the problem and just worked out all the sets. I get 84.
Also, I now realize I made an error with my uncommon set number; should be 15, not 35.
So a total of 10,080 possible sets.
Then you get two different distributions depending on how the sets are chosen.
Re: Medium difficulty probability refresher...
You are right and wrong both. I did miss that part of the OP. I also botched my counting of the correct answers for set Common. Obviously there are six, not three, non-repeating combinations: I only counted the 'forwards' ones, not the 'backwards' ones.
However, the stipulation of order has no impact on probabilities.
The arrangement rules disqualify "correct" choices and "incorrect" choices at the same rate, and therefore have no impact on the probabilities. Take set Common for instance. For every combination of three different variables, exactly one of them has the allowed arrangement out of the six possible arrangements. Consequently, both the numerator and the denominator must be divided by six. No change to probabilities occurs.
You can see this in the number you achieved by brute force. (By the way, brute forcing is unnecessary: the number of ways to count up through k numbers chosen from n is just the binomial coefficient "n choose k", so 9 choose 3 = 84 and 6 choose 2 = 15.) 1/84, your number from brute forcing, = 6/504, the number from considering unsorted choices. 1/15, your number from brute forcing, = 2/30, the number from considering unsorted choices.
EDIT: note that this (the sorting process being irrelevant to the probabilities of the choosing process) is also why we can consider the cards in discrete sets, rather than worrying about the probabilities of them arriving in the right set order, with common first, then uncommon, etc, and with contiguous sets, rather than one common followed by a rare, then two uncommons....
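If you'd rather see it than take my word for it, the claim can be checked exhaustively in a few lines of Python. This sketch assumes the three commons are drawn without replacement, all ordered draws equally likely, and then put in ascending order; the names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations

# All ordered draws of 3 distinct commons out of 9: 9*8*7 = 504.
ordered_draws = list(permutations(range(1, 10), 3))

# Sort each draw into ascending order and count how many draws land on each result.
sorted_counts = Counter(tuple(sorted(draw)) for draw in ordered_draws)

# Every sorted triple is hit by exactly 6 ordered draws, so each has probability 6/504 = 1/84.
p_sorted = {triple: Fraction(count, len(ordered_draws))
            for triple, count in sorted_counts.items()}
```

Since every sorted triple gets the same share, 7,8,9 is no more likely than 1,2,3 under this way of choosing.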
- Spectre_nz
Re: Medium difficulty probability refresher...
Ahh, I see. That's why your set is so much larger.
Admittedly, I didn't take the step to work out the probabilities, I found the two different outcomes depending on how you load the boxes more interesting.
But, underpaid, overtaxed workers loading booster packs may well be the source of their uneven distribution, not some dumb slip up with flawed distribution. I guess they'd also notice in a hurry if their loading lines were going through a disproportionate number of the higher numbered commons.
And yeah, brute force is unnecessary, but I can write down a few strings of numbers on a scrap piece of paper and idiot check my work that way...
Re: Medium difficulty probability refresher...
Thanks for the advice/explanations.
The clix community would probably be less up-in-arms about it but for the recent sorta similarly packaged Street Fighter set, where distribution was great. They have been putting out a LOT of stuff lately, so overwork could well be part of the problem.