
Every random variable has a probability distribution. Examples include the number of heads in many coin tosses, or the number of ones that come up in many rolls of a die.

Suppose we use the binomial distribution to model such a random variable. Let us take an example: we toss a single coin $100$ times and count the heads. Plugging this into the binomial distribution gives a graph with probability on one axis and the number of heads on the other. This graph peaks at $50$ heads, with a probability of roughly $0.08$.

However, there is also a physical interpretation. I toss a coin a hundred times and note the number of heads. Then I repeat this experiment thousands and thousands of times, and note how often each number of heads occurs. This relative frequency represents the probability, i.e. the height of the binomial distribution graph. As one would expect, $50$ heads appears roughly $8$ percent of the time. This can be shown easily with a computer simulation, as 3blue1brown does with random number generators.
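To make the frequency interpretation concrete, here is a minimal sketch (not 3blue1brown's code; the function names, the seed and the choice of $20000$ repetitions are my own assumptions) that compares the binomial formula with a brute-force simulation:

```python
import math
import random

def exact_pmf(n, k, p=0.5):
    """Binomial probability of exactly k successes in n trials."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def simulated_frequency(n, k, experiments, seed=0):
    """Fraction of experiments (each one = n fair tosses) showing exactly k heads."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(experiments):
        # n random bits stand in for n fair coin tosses; 1-bits count as heads.
        heads = bin(rng.getrandbits(n)).count("1")
        if heads == k:
            hits += 1
    return hits / experiments

exact = exact_pmf(100, 50)
freq = simulated_frequency(100, 50, experiments=20000)
print(f"exact P(50 heads) = {exact:.4f}, simulated frequency = {freq:.4f}")
```

Both numbers should land near $0.08$, and the simulated one moves closer to the exact one as the number of repetitions grows.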

Now we have a mathematical as well as a physical meaning of what the binomial distribution represents.

Now imagine the following scenario.

There is a bag with $100$ balls inside it. Some of them are blue, some red, and some other colours. We don't know how many there are of each colour. The colours are fixed, of course; we just don't know the counts.

We pick a single ball at random, note its colour, and put it back. This is repeated many, many times, and blue balls appear $20$ percent of the time. Since it is impossible to repeat the experiment infinitely many times, we can never know the exact long-run percentage of blue balls. Since we know the total number of balls, but not the exact probability of drawing a blue ball, we can never know the exact number of blue balls inside our bag.

Hence the number of blue balls inside our bag is a random variable, and thus, it must have a distribution. In our sampling, we found out that the probability of getting a blue ball was $0.2$. This is not the true probability of getting a blue ball from that bag, more like our best estimate of the true probability.

Hence we can use the binomial distribution to find the probability of different numbers of blue balls being inside the bag. The mean of this distribution would be $20$ balls, and as the total number of trials tends to infinity, the actual number of blue balls would tend towards this mean. However, in our trials, the probability of getting $20$ blue balls would be about $9.93$ percent.
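The $9.93$ percent figure can be reproduced with a few lines. This is only a sketch of the binomial model with $100$ trials and success probability $0.2$; the variable names are my own:

```python
import math

n, p = 100, 0.2

# Binomial probability of each possible count of blue balls, 0..100.
pmf = [math.comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

mean = sum(k * q for k, q in enumerate(pmf))
mode = max(range(n + 1), key=lambda k: pmf[k])

print(f"P(20 blue) = {pmf[20]:.4f}, mean = {mean:.2f}, most likely count = {mode}")
```

The peak sits at $20$, yet its probability is under $10$ percent, because the remaining probability is spread over the neighbouring counts.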

Now mathematically this is all well and good. However, physically it doesn't seem to make sense.

Let us see, how would we interpret this binomial distribution in the physical sense, just like we did for our coin tosses. In case of the coin tosses, we did the experiment many many times and noted the frequency of a particular number of heads, and we used this to create a distribution.

Suppose we do the same thing here. We empty the bag, count the number of blue balls, and repeat this experiment many, many times. According to the binomial distribution, in about $9.93$ percent of the cases I should get $20$ blue balls out of the bag. In other cases, I'd get other results with different probabilities. However, if I'm doing the experiment with the same bag, this creates a problem: even though I don't know the number of blue balls in the bag, I do know that it is a constant. The same bag cannot give two different numbers of blue balls in two consecutive experiments.

So, the physical interpretation of the binomial distribution seems to fail over here.

One solution that I can think of is this: instead of checking the same bag again and again to get a frequency, what if I check thousands of different bags, each with some number of blue balls from $0$ to $100$? Then no single bag has to give different numbers of blue balls in consecutive experiments, because we are not checking the same bag; we are checking different bags. Since we don't know the exact number of blue balls in our bag, we essentially don't know which of all these bags it is.

So the binomial distribution is no longer directly about the number of blue balls in the same bag. It is more about the different bags with different numbers of blue balls in them. In a sense, the number of blue balls is not exactly the random variable in our problem, as we initially guessed. It is actually the bag that is the random variable. Different bags have different numbers of blue balls, and we basically don't know which bag is the real one. To say that a total of $20$ blue balls appears $9.93$ percent of the time would be equivalent to saying that bags with $20$ blue balls turn up $9.93$ percent of the time. This seems correct, because bags with $20$ blue balls in them would be more likely to give us a $20$ percent chance of picking a random blue ball. Bags with $100$ or $99$ blue balls would be less likely to give us a $20$ percent chance of picking a blue ball.

Would this be the correct physical interpretation of the binomial distribution? Instead of each Bernoulli trial checking a single bag for its number of blue balls, each Bernoulli trial basically checks all these different bags. I'm doing all this because a single bag cannot give two different numbers of blue balls in successive Bernoulli trials, even if we don't know the exact number of balls. So the question should be more like: there are several bags with different numbers of blue balls, from $0$ to $100$; given that the probability of picking a random blue ball is almost $0.2$, which of these bags is most probable, and so on. Hence, bags with $20$ blue balls would be the mean of this distribution of different bags. We are essentially checking how likely a certain bag is to give us exactly a $20$ percent chance of picking a blue ball at random, since that is the only information we have.

Is this interpretation correct? Mathematically it doesn't make a difference, since the binomial distribution formula describes both physical cases equally. If the colour of the balls were not constant, and we were checking the same bag, I'd have got the exact same results. However, the philosophical and the physical interpretations are somewhat different, like tossing a single coin $100$ times vs tossing $100$ coins once. Mathematically it is the same, physically not so.

Thanks for your time.

RayPalmer

2 Answers


Let $n$ be any large number, say $1000$.

Let $b$ denote the number of blue balls in the bag.

Let $f(b)$ denote the probability of exactly $20\%$ of the $n$ trials succeeding in showing a blue ball, when a ball is selected with replacement from the bag.

Let $W$ denote $\displaystyle \sum_{i = 0}^{100} f(i)$.

Then, the expected number of blue balls in the bag is

$$\frac{\sum_{i=0}^{100} \left[i \times f(i)\right]}{W}.\tag1 $$

$W$ in the denominator serves to normalize the sum of the weights (i.e. the probabilities) associated with each possible number of blue balls.

$\displaystyle f(b) = \binom{1000}{200} \times \left[\frac{b}{100}\right]^{(200)} \times \left[\frac{100 - b}{100}\right]^{(800)}.$
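A minimal numerical sketch of formula $(1)$ follows. The function name `log_f` and the use of logarithms are my own choices, made only to avoid floating-point underflow; the factor $\binom{1000}{200}$ is omitted because it appears in both numerator and denominator and cancels.

```python
import math

def log_f(b, n=1000, k=200, total=100):
    """log f(b) up to the constant log C(n, k), which cancels in formula (1)."""
    if b == 0 or b == total:
        # An all-blue or no-blue bag cannot produce a mix of colours.
        return float("-inf")
    p = b / total
    return k * math.log(p) + (n - k) * math.log(1 - p)

logs = [log_f(b) for b in range(101)]
top = max(logs)
# Rescale by the largest term before exponentiating; exp(-inf) is 0.0.
weights = [math.exp(x - top) for x in logs]

W = sum(weights)
expected = sum(b * w for b, w in enumerate(weights)) / W
most_likely = max(range(101), key=lambda b: weights[b])

print(f"expected number of blue balls = {expected:.2f}, most likely b = {most_likely}")
```

The weights concentrate sharply around $b = 20$, so the expected value comes out very close to $20$.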

user2661923
  • Thanks, but my problem is not with the mathematical formalism; as we can see, it follows a binomial distribution. My problem is how to interpret the physical simulation. An example: the number of heads that come up when one tosses a single coin $100$ times follows the exact same distribution as the number of heads that show up when $100$ coins are tossed once. However, physically, these are two completely different cases. – RayPalmer Nov 25 '21 at 17:13
  • If I want to talk physically about this particular scenario, I'd normally say that the number of blue balls inside the bag is the random variable. But on closer inspection, the number of blue balls in a given bag is fixed, even if it is unknown. So, if I want to design a simulation that leads to the same distribution by brute-force checking, I can't check the same bag. Instead I should check the distribution of different bags, to see how likely each bag is to give me exactly a $20$ percent chance of picking a blue ball. – RayPalmer Nov 25 '21 at 17:16
  • @RayPalmer I don't understand what you mean by saying that these are physically different cases. Flipping $100$ different coins, there is no difference between flipping them all at once, and flipping these $100$ different coins one at a time. Further, when flipping one at a time, since coins do not have a memory, there is no difference between flipping $100$ different coins, one at a time, and flipping one coin, over and over. – user2661923 Nov 25 '21 at 17:17
  • Yes, but in the bag case, even though the number of balls in a bag is random, we cannot exactly say that a bag doesn't have any memory. Instead of saying that the number of balls in the bag is a random variable, shouldn't we say that our choice of a particular bag is a random variable? – RayPalmer Nov 25 '21 at 17:19
  • We don't know the exact number of balls in a bag, but isn't this equivalent to saying we don't know which bag it is? Is it the one with $20$ balls in it, or the one with $30$ balls in it? – RayPalmer Nov 25 '21 at 17:19
  • @RayPalmer As far as the blue ball problem, the analogy would be that you get $101$ different bags, each with a distinct number of blue balls in it. Then, you consider $1$ trial, for each bag to be selecting a blue ball from that bag $1000$ times. Then, for each bag, you perform this trial $1000000$ times. Then, for each bag, you note the percentage of the $1000000$ trials where exactly $200$ blues were selected out of the $1000$. This is the physical analogy. – user2661923 Nov 25 '21 at 17:21
  • So, do you not think it would be better to say that it is the choice of a bag which is the true random variable, instead of the number of balls in the bag? Unlike a coin, the colour of a single ball is fixed, and the bag does remember how many balls of each colour are in it. We just don't know which bag is ours. – RayPalmer Nov 25 '21 at 17:21
  • @RayPalmer Yes, the random variable is which of the $101$ bags pertains, and you associate a probability with each of the $101$ bags. – user2661923 Nov 25 '21 at 17:23
  • Thanks, that is exactly what I was thinking. The key point is that we need to check different bags. However, instead of what you said, I was working more along the lines of getting $10000000000000$ different bags of $101$ types, and then checking each bag once. This is absolutely equivalent to selecting one of $101$ bags and checking it $10000000$ times. In the end, we'll just note how often each type of bag gives us $20$ blue balls out of $100$. – RayPalmer Nov 25 '21 at 17:25
  • @RayPalmer No need for that. Just as coins do not have a memory, bags do not have a memory. – user2661923 Nov 25 '21 at 17:26
  • Yeah thanks, but physically those two are just equivalent. This entire confusion was because someone told me that it was the number of blue balls in a single bag that was the random variable, which is clearly not the case. Thanks for confirming for me that it is indeed the choice of different bags that is the random variable, not the number of balls in a single one. – RayPalmer Nov 25 '21 at 17:28
  • One particular bag would always have the same number of blue balls. You just need to check how many times, this bag would give you a chance of getting exactly $20$ out of $100$ blue balls. – RayPalmer Nov 25 '21 at 17:30
  • @RayPalmer Actually, you can construe the random variable to be the number of blue balls in the bag. This is (in a mathematical sense) equivalent to regarding the pertinent one of the $101$ bags as the random variable. The only difficulty is that in the physical sense, it is easier to intuit that the pertinent bag is the random variable. – user2661923 Nov 25 '21 at 17:30
  • @RayPalmer "This entire confusion was because someone told me that it was the number of blue balls in a single bag that was the random variable" You are taking my Answer out of context: I clearly said, "based on the epistemic interpretation of probability........is a random variable if and only if you don't know the actual number of blue balls (regardless of whether there are indeed 20 blue balls)". "This entire confusion"...Aren't many of the ideas summarised in your above question evolved and rehashed from our recent discussions?? – ryang Nov 26 '21 at 14:40
  • @RayPalmer When posting a sequence of questions threaded together by a nice narrative & thematic arc (like these 4 or 5 recent questions), it's a great idea to mention and link the prequels: it supplies very relevant context and helps Answerers craft more useful Answers (eg. customised to your current understanding, which has crystallised significantly since the beginning). Possible previous points of confusion/error can also be more accurately referenced and fruitfully addressed. – ryang Nov 26 '21 at 15:06
  • @ryang I'm sorry for taking things out of context. What I was trying to ask was whether you think the binomial distribution is really a good way to describe the scenario. For example, I posted this same question and you used the binomial distribution, while here the answerer has used a slightly different distribution, which makes sense to me. I was just wondering which one is more accurate. – RayPalmer Nov 26 '21 at 15:59
  • @RayPalmer Not wishing to clutter up these comments, I shall only add that user2661923's above distribution is Binomial. – ryang Nov 26 '21 at 16:16
  • @ryang, yeah, but the probabilities are not constant, and instead of checking the probability of a certain number of blue balls, we are checking the probability for each of the bags. So, the individual probabilities are being updated as we move from one bag to the other. That isn't the standard binomial distribution, is it? – RayPalmer Nov 26 '21 at 16:21
  • @user2661923 What distribution is this? It looks like the simple binomial distribution, but the probability is different for each of the bags. – RayPalmer Nov 26 '21 at 17:13
  • @RayPalmer It is a binomial distribution. Further, the probability is different for each of the bags. You are trying to determine the expected value of the number of blue balls. This is equivalent to trying to determine the expected value of the bag number $~i ~: ~i \in \{0,1,2,\cdots, 100\}.~$ Using the formula for binomial distributions, a probability $~f(i)~$ is assigned to each bag number $~i.~$ Then, the expected value of the bag number $~i~$ is $$\frac{\sum_{i=0}^{100} \left[ i\times f(i) \right]}{\sum_{i=0}^{100} \left[f(i)\right]}.$$ – user2661923 Nov 26 '21 at 21:24
  • @RayPalmer Another way of saying the same thing is: Suppose for each bag number $~i~$, you run $1,000,000$ trials, where each trial consists of selecting a blue ball, with replacement from bag $~i,~ 1000~$ times. This means that you ran $~[1,000,000] \times [101]~$ trials, where each trial consisted of sampling a blue ball, with replacement, from bag $~i~ 1000~$ times. You then discarded all of the $~[1,000,000] \times [101]~$ trials where the trial did not result in a blue ball being selected exactly $200$ times (i.e. the trial failed). ...see next comment – user2661923 Nov 26 '21 at 21:34
  • @RayPalmer Then, looking only at the (small) portion of the $~[1,000,000] \times [101]~$ trials that succeeded, what is the average value of the bag number $~i~$ for those successful trials? – user2661923 Nov 26 '21 at 21:35
  • @user2661923 Yeah, I understood that. It's just that the way I'm normally used to seeing a binomial distribution, the probability of a single trial is taken to be constant. In this case, the probability of a single successful trial, where I get $200$ blue balls out of $1000$, is different for each of the $101$ bags, based on how many blue balls I know are inside them. – RayPalmer Nov 26 '21 at 21:41
  • @RayPalmer Note that the expected value of the number of blue balls is an entirely different concept from the probability that the number of blue balls has a specific value $~i.~$ By definition, if the relative probability that the number of blue balls equals $~i~$ is $~f(i)~$, then the computation of expected value in my answer fits the definition of expected value. – user2661923 Nov 26 '21 at 21:41
  • @RayPalmer +1 to the previous comment, which circles back to my Answer: Sentence 4 explains that we had empirically & indirectly obtained the expected number of blue balls to be 20, and Sentence 6 explains that we *consequently* constructed a *corresponding* distribution to then determine P(bag actually has 20 blues) to be 9.93%. This is one multi-step analysis, not two competing angles/approaches/interpretations. – ryang Nov 27 '21 at 07:14

As discussed in the chat with user2661923, there is a second way of solving this, which works best if the number of balls in the bag is very large.

Think of the following scenario. Suppose you have $N$ balls in the bag, out of which $b$ are blue. What would be the true probability of picking a blue ball from the bag? It would obviously be $b/N$. Now suppose you pick $N$ balls from the bag with replacement, and $K$ of these balls turn out to be blue. Repeat the experiment many, many times, and you'll get some other number $\epsilon$ of blue balls out of $N$. According to statistics, most of the time these $K$ or $\epsilon$ would lie within one standard deviation of the mean. They would roughly follow a Gaussian distribution about the mean.

For comparison, consider the number of heads that turn up in $100$ coin tosses. Roughly $70$ percent of the time, you'd get between $45$ and $55$ heads. For $10000$ coin tosses you'd get between $4900$ and $5100$ heads roughly $95$ percent of the time. Similarly, for $1000000$ tosses, you'd get between $499000$ and $501000$ heads $95$ percent of the time. As you can see, for larger and larger values, the range is closer and closer to the mean relative to the number of tosses. For extremely large values, we can approximate that we get exactly the mean.
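These ranges can be checked directly. For $n$ fair tosses the standard deviation is $\sqrt{n}/2$, so two standard deviations around the mean give $4900$ to $5100$ for $10000$ tosses and $499000$ to $501000$ for $1000000$ tosses. The sketch below (the helper name `prob_heads_between` is my own) sums the exact binomial probabilities over each range, working with log-gamma so that the large factorials stay manageable:

```python
import math

def prob_heads_between(n, lo, hi):
    """P(lo <= heads <= hi) for n fair tosses, summed from log-probabilities."""
    log_half_n = n * math.log(0.5)
    total = 0.0
    for k in range(lo, hi + 1):
        # log of C(n, k) * 0.5**n, written with lgamma to avoid huge integers.
        log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1)
                   - math.lgamma(n - k + 1) + log_half_n)
        total += math.exp(log_pmf)
    return total

ranges = [(100, 45, 55), (10_000, 4_900, 5_100), (1_000_000, 499_000, 501_000)]
results = {}
for n, lo, hi in ranges:
    results[n] = prob_heads_between(n, lo, hi)
    sd = math.sqrt(n) / 2
    print(f"n={n}: sd={sd:.0f}, P({lo}..{hi}) = {results[n]:.3f}, sd/n = {sd / n:.4f}")
```

The last printed column is the point of the argument: the standard deviation grows like $\sqrt{n}$, so as a fraction of $n$ it shrinks.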

Now, what is the mean? It is the total number of trials multiplied by the probability of 'success' in a single trial, i.e. $N \times b/N$. Since in our case the number of trials is exactly the total number of balls in our bag, the mean is equal to $b$, the original number of blue balls in our bag.

The approximation now is, if the total number of balls in our bag, and hence the total number of trials is extremely big, we can ignore this standard deviation, and say with reasonable accuracy, that the number of blue balls that we get i.e. $K$ or $\epsilon$ would be approximately equal to $b$.

Hence, $K \approx \epsilon \approx b$

Hence, for large systems, if we pick $N$ balls with replacement and find that exactly $b$ of them are blue, we can approximately say, using the above reasoning, that there were originally $b$ blue balls in the system of $N$ balls in total.

Hence the probability of picking up $b$ blue balls at random, would give us indirectly, the probability that there were $b$ blue balls in the system. As we have already seen by random sampling, the probability of getting a blue ball randomly is given by $p_i$. Remember, this is not equal to $b/N$, since we don't know $b$. However, using this probability, we can find out the distribution for the number of blue balls that we get, if we randomly sample $N$ balls with replacement. This would be a simple binomial distribution as you know. (Picking up coloured balls from a bag with replacement, follows a binomial distribution).

However, using the reasoning above, we can claim that the number of blue balls that we get out of $N$ total balls, would be approximately equal to the number of blue balls in the bag. Hence, the binomial distribution for the number of blue balls that we get by picking up $N$ balls with replacement, should indirectly and approximately, give us the distribution of number of blue balls in the bag.

For example, if there are $100$ balls, and we pick $20$ blue balls, our reasoning suggests that there must have been $20$ blue balls in the bag originally. So, the probability of picking exactly $20$ blue balls out of $100$ would also give us the probability that there were $20$ blue balls. Similarly, we check the probabilities for every possible number of blue balls and create a distribution. Of course, this is much more accurate when the number of balls is huge.

With this being said, remember the answer posted above is the correct and most intuitive way of solving this problem, and this current answer is nothing but an approximation, albeit a reasonable one, in some cases.

Hence, to summarize, the two ways of physically reasoning about this problem are as follows:

  1. Since you only know the estimated probability of getting a single blue ball from a bag of $N$ balls, you calculate the mean. Then you consider $N+1$ different bags with different numbers of blue balls in them, and check how likely each of these bags is to give you the mean that you calculated for the original bag. This gives you a probability distribution for each bag's chance of being the original bag. Since these bags each have a defined number of blue balls in them, this automatically gives you the probability distribution for the number of blue balls in the original bag.

  2. Again, you know the estimated probability of getting a single blue ball out of $N$. You create a binomial distribution for getting $0,1,2,3,\ldots,N$ blue balls if you pick $N$ balls from the bag with replacement. Then you use the argument that, if the number of balls is large, the number of blue balls that we get when we pick $N$ balls with replacement is approximately equal to the mean, which in this case is the actual number of blue balls in the system. So, if you get $m$ blue balls in a trial, that would mean there are about $m$ blue balls in the system. Hence the probability of getting $m$ blue balls out of $N$ is taken to be the probability of having $m$ blue balls in the original system.
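The first line of reasoning can be checked by the brute-force experiment described in the comments: pick one of the $101$ bags at random, sample from it with replacement, and keep the trial only if exactly $20$ percent of the draws were blue. The sketch below is only an illustration under my own assumptions: every bag is equally likely beforehand, and each trial uses $50$ draws with $10$ blue (rather than $1000$ and $200$) so that enough trials survive in a short run. All names are hypothetical.

```python
import random

TOTAL = 100            # balls per bag
DRAWS, BLUE = 50, 10   # assumed trial size: 10 blue out of 50 draws (20 percent)

def exact_expected_bag():
    """Weighted average of the bag number, weights proportional to the binomial likelihood."""
    weights = [(b / TOTAL) ** BLUE * (1 - b / TOTAL) ** (DRAWS - BLUE)
               for b in range(TOTAL + 1)]
    return sum(b * w for b, w in enumerate(weights)) / sum(weights)

def simulated_expected_bag(trials=20000, seed=1):
    """Average bag number over the trials that showed exactly BLUE blue draws."""
    rng = random.Random(seed)
    kept = []
    for _ in range(trials):
        b = rng.randint(0, TOTAL)   # choose one of the 101 bags uniformly
        p = b / TOTAL               # chance of blue on each draw from that bag
        blue_seen = sum(rng.random() < p for _ in range(DRAWS))
        if blue_seen == BLUE:       # discard every trial that "failed"
            kept.append(b)
    return sum(kept) / len(kept), len(kept)

exact_mean = exact_expected_bag()
sim_mean, n_kept = simulated_expected_bag()
print(f"exact = {exact_mean:.2f}, simulated = {sim_mean:.2f} from {n_kept} kept trials")
```

The two averages should agree up to sampling noise. Note that no single bag ever changes its contents here; only the choice of bag is random.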

An analogous problem: suppose I have a die that I roll many, many times, and a six comes up $1/6$ of the time. How many faces of the die are marked $6$? This becomes a similar problem, where we have to compare $7$ dice, with anywhere from $0$ to all $6$ faces marked $6$, and check which one is most likely to be our die.
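As a small sketch of the die comparison, assume (purely for illustration) that we saw $10$ sixes in $60$ rolls and that all $7$ candidate dice were equally plausible beforehand:

```python
import math

ROLLS, SIXES = 60, 10   # assumed observation: a six came up 1/6 of the time

def likelihood(faces_marked_six):
    """Binomial probability of the observation for a die with this many 6-faces."""
    p = faces_marked_six / 6
    return math.comb(ROLLS, SIXES) * p**SIXES * (1 - p)**(ROLLS - SIXES)

likes = [likelihood(j) for j in range(7)]
total = sum(likes)
# Normalising gives each die's share, under the equal-prior assumption above.
posterior = [v / total for v in likes]
best = max(range(7), key=lambda j: posterior[j])

for j, q in enumerate(posterior):
    print(f"{j} faces marked 6: {q:.4f}")
```

The die with exactly one face marked $6$ takes almost all of the weight, and the dice with $0$ or $6$ such faces are ruled out entirely, since they could never produce a mix of sixes and non-sixes.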

Based on my limited understanding, this is what is used when we find the distribution of an estimator, a sort of probability of a probability. On the other hand, if you know the exact probability of rolling a six with your die, you can use a simple binomial distribution to get the chance of rolling a certain number of sixes.

RayPalmer