*Added an Addendum at the end
Hopefully my Title isn't too vague, but I will try to elaborate here. I posted a similar question: Given 10 random letters where the number of repeated letters is known (i.e. 3,2,1,1,1,1,1), what's the formula for finding the number of combinations? And it looks like I understand the formula for getting permutations once I have the multiset, but I don't understand how to get the multisets to begin with. So this question is just about generating the multisets themselves.
I am not sure if I am using the correct terminology (distribution, possible, unique, combinations, etc.) as I am not a mathematician or student taking a class on this. I am a Software Performance Engineer trying to solve a problem, but I need to understand the problem first. So, please refrain from using terminology, symbols, or expressions that would only be understood by someone who has a deep understanding of probability and multisets to begin with.
My knowledge on this subject is only what I have been able to learn in the past 2-3 days. If you are going to use symbols, shorthand, or terminology specific to this type of math, then please explain what it means. As I found in my other question, (26 1) apparently means something way different than (5 3), and somehow (5 3) = 5!/(3!2!) = 10, but (26 1) = 26/1... I don't understand how I am supposed to know that or understand it. Also, the Union, Element, and Sum symbols that I have seen in formulas related to this topic (e.g. $|A| = \sum_{x \in \operatorname{Supp}(A)} m_A(x) = \sum_{x \in U} m_A(x)$) don't make sense to me, as they have completely different meanings and uses in my field of work. Please try to use math symbols and explanations that can be understood by anyone, and don't skip steps when possible.
With that out of the way, given the 26 letters of the English alphabet, how do I find the number of possible multisets for 4 random letters, and then for 5 random letters? To understand it completely, I am looking for all possible combinations of letters. I think that if I can get formulas and explanations that can be applied to those situations, then I should be able to take those and apply them to the 10 letter problem that I am really trying to solve.
Starting with 4, if each letter is unique: 1111 (apxz). Then if there is a duplicate: 211 (aact). Then if there are 2 duplicates: 22 (nnii). Then if there is a triplet: 31 (oooy). And finally a quadruplet: 4 (rrrr). Then I need to do the same for 5 letters. So, 11111 (abcde); 2111 (uuakl); 221 (ppjjx); 311 (mmmsc); 32 (hhhww); 41 (qqqqz); 5 (zzzzz).
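The list of patterns above can be generated mechanically rather than by hand. The following is a minimal Python sketch, not taken from any answer; the function name `partitions` is my own. It lists every way to split k letters into group sizes, largest group first.

```python
def partitions(k, largest=None):
    """Yield every way to write k as a sum of positive integers, largest part first."""
    if largest is None:
        largest = k
    if k == 0:
        yield ()  # nothing left to split: one (empty) way
        return
    # Pick the size of the next group, never larger than the previous group.
    for first in range(min(k, largest), 0, -1):
        for rest in partitions(k - first, first):
            yield (first,) + rest


for k in (4, 5):
    # Joining digits is only readable while every group size is below 10.
    print(k, ["".join(map(str, p)) for p in partitions(k)])
# 4 ['4', '31', '22', '211', '1111']
# 5 ['5', '41', '32', '311', '221', '2111', '11111']
```

For 4 letters this gives five patterns and for 5 letters it gives seven, including the all-same pattern 5 (e.g. zzzzz).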
Please don't get hung up on the letter examples that I have given, as those are just one potential combination based on the distribution; I am looking for all possible combinations for that distribution. For the 41 example, I could just as easily have used (aaaab), (zzzzq), (jjjjs), etc. because every letter combination is equally possible. I am just trying to figure out how many possible ways there are, given a known number of repeated letters.
Remember, I am not looking for just the answer, but rather how you go about finding the answer. I need to know which formula to use and why. If I am only given answers that apply to specific scenarios, then there is no way that I can use them to solve for future problems. I am trying to learn to fish, not just be handed a fish.
Thank you in advance.
Addendum:
I'm including a few examples to hopefully illustrate what I am trying to ask. Let's say I have a four letter pattern of "evet". In the final answer to the real world problem I am looking to solve, the order will matter, but for this question the order doesn't matter. So, "veet", "eetv", "vtee" are all the same as "evet" for the purposes of this question.
So, I have that one set of 4 letters, but I need to know how many other variations fit the same pattern. Instead of "evet", I could have gotten "avat", "bbxy", or "ossp". And "avat" = "vaat", "tvaa", etc. "bbxy" = "bbyx", "xybb", "bxyb", etc. (Probably didn't need to reiterate that, but since it seems like people are going out of their way to misunderstand my question, it might have been necessary.) I could have gotten any combination of letters so long as one of them is a couple and the other two are monuples. So, how do I write a formula where I can take the number of possible letters (26), then apply the 211 distribution of letters to it, so that I get the total number of possible 4 letter words with one letter doubled, without counting the variants that are in a different order but still contain the same letter combination?
After I have that, I will need to be able to do the same if I started with "eett" instead. Will the same formula for 211 work for 22? If not, why not? Maybe the problem is that people are answering with simplified formulas for easier versions, but I need one that I can apply universally, where n = pool of possible letters (26), k = number of selected letters (4 in these examples, but it could just as easily be 5, 10, 20, 100, 1000, etc.), and z = the known number of repeated letters ({2,2} in this example, but it could just as easily be {4}, {3,1}, {1,1,1,1}, or {2,1,1} when k=4, or it could be {51,25,14,5,3,2}, {95,3,1,1}, etc. when k=100, or {22,17,8,2,1}, {37,5,5,1,1,1}, etc. when k=50).
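For small k, any proposed formula can at least be checked by brute force. The sketch below assumes plain Python with only the standard library; the name `count_by_pattern` is my own. It lists every unordered selection of k letters from the 26 and tallies how many fall under each pattern z. It is only a checking tool, since the number of selections grows far too quickly for k = 50 or 100.

```python
from collections import Counter
from itertools import combinations_with_replacement
from string import ascii_lowercase


def count_by_pattern(k, alphabet=ascii_lowercase):
    """Tally unordered k-letter selections by repeat pattern, e.g. (2, 1, 1)."""
    counts = Counter()
    # combinations_with_replacement yields each unordered selection exactly once,
    # so "evet", "veet" and "eetv" are all represented by the single tuple (e, e, t, v).
    for letters in combinations_with_replacement(alphabet, k):
        pattern = tuple(sorted(Counter(letters).values(), reverse=True))
        counts[pattern] += 1
    return counts


for pattern, total in sorted(count_by_pattern(4).items(), reverse=True):
    print(pattern, total)
```

The printed totals for (2, 1, 1), (2, 2), (3, 1), (4,) and (1, 1, 1, 1) can then be compared against whatever a candidate formula gives for n=26, k=4.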
I keep seeing answers like (26 1)(25 2) or (26 1)×(25 2)×(23 2), but don't understand where these are coming from. I see there is a formula for binomial coefficients: n!/(k!(n−k)!), which I can use to get the (26 1) if n=26 and k=1, so 26!/(1!(26−1)!) = (26·25!)/(1·25!) = 26/1. Then I get where the 25 comes from, as a letter is already accounted for from the previous step and thus is not in the pool of possible letters. But why 2? Where does that come from? If we are using 211, why is (25 2) used? Why not (25 1)(24 1)? That gets a different answer because nothing is divided by 2, but I don't see why a 1 was used for the 2 in 211, but a 2 is used for the 11 in 211. And I haven't seen the breakdown answer for k=4 with z=22, but I suspect that neither (26 4) nor (26 2) will work based on previous responses. It's probably closer to (26 1)*(25 2) or something like that, but I have no way of knowing currently.
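The arithmetic in the paragraph above can be reproduced in a few lines of Python. This is only a sketch of the n!/(k!(n−k)!) formula; the helper name `binomial` is my own, and `math.comb` is the standard-library function (Python 3.8+) that computes the same thing.

```python
from math import comb, factorial


def binomial(n, k):
    """(n k) = n! / (k! * (n - k)!), the number of ways to pick k items from n when order is ignored."""
    return factorial(n) // (factorial(k) * factorial(n - k))


print(binomial(26, 1))                   # 26
print(binomial(5, 3))                    # 10
print(binomial(25, 2))                   # 300
print(binomial(25, 1) * binomial(24, 1)) # 600
print(binomial(25, 2) == comb(25, 2))    # True
```

Note that 600 is exactly twice 300, which is what one would expect if picking the two single letters one after another counts each pair twice (b then c, and c then b).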
a and the cases of aa vs ab) – JMoravitz Jul 11 '22 at 02:54
So, 4 random letters. We know only that they consist of a couple and two monuples (i.e. 211). How do I find the number of possible combinations? aabc=1, aacd=2, ..., zzwx=?, zzxy=?; then for 22, then 31, 4, and 1111. Then do the same for 5 letters. – Jul 11 '22 at 03:27
And yes, that is what I am trying to ask, sort of. For now I am looking for a particular partition, but I want to be able to apply something for any possible partition for any length of letters. For now I am keeping it at 4 and 5, but the final formula should work for 10 or 100. – Jul 11 '22 at 03:59
"that would be 26⋅(25 2)" How did you arrive at that? "it does simplify to the stars and bars formula of (26+k−1 26−1)"... So, is that for the 211? Or is that for 1111, 211, 22, 31, 4 all combined? I am not looking for the total combined, just how to find the number of possible multisets if given 211, 22, or 4. – Jul 11 '22 at 04:07