
Suppose I have a bag full of $100$ balls, with some of them being blue. I randomly pick up a single ball from this bag, and note its colour. I repeat this experiment a number of times, and I conclude that $20$ percent of the time, I've picked up a blue ball. From here, I can say that the probability of me obtaining a blue ball is $0.2$.

However, we know from the definition of probability that the number of blue balls in the bag is just the total number of balls multiplied by the probability of getting a single blue ball. Doing this in the above example, I'd get $20$ blue balls in the bag. This is nothing but the expectation value of the number of blue balls in my bag.
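A small simulation can make this estimate concrete. The sketch below assumes, purely for illustration, that the bag really holds $20$ blue balls out of $100$ and that each ball is put back before the next draw; the function name `estimate_blue_fraction` and the seed are hypothetical choices, not part of the original setup.

```python
import random

def estimate_blue_fraction(n_blue, n_total, n_draws, seed=0):
    """Draw one ball at a time, with replacement, and return the observed blue fraction."""
    rng = random.Random(seed)
    # Balls numbered 0..n_total-1; the first n_blue of them are taken to be blue.
    blue_seen = sum(rng.randrange(n_total) < n_blue for _ in range(n_draws))
    return blue_seen / n_draws

# Assumed bag for this sketch: 20 blue balls out of 100.
for n_draws in (100, 10_000, 1_000_000):
    p_hat = estimate_blue_fraction(20, 100, n_draws)
    # Estimated probability, and the implied estimate of the number of blue balls.
    print(n_draws, p_hat, round(p_hat * 100))
```

With few draws the implied count can easily miss $20$; it only settles down as the number of draws grows.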

Let's now empty the bag, and check all $100$ balls. What would be the probability that there are actually $20$ blue balls in the bag? I think this would take the form of some distribution, but I don't know what or how.

However, in the initial experiment, where I picked up a single ball to check its colour and repeated this many times to get the probability of obtaining a blue ball, I got $P(b)=0.2$. From here, I calculated $\langle b\rangle=0.2\times 100=20$. Since this is the expectation value, and not the actual value, I can say $P(b=\langle b\rangle)\lt 1$.

However, if I repeated the trial infinitely many times, and noticed that I get a blue ball exactly $20$ percent of the time, can I say that the actual number of blue balls in the bag is equal to the expectation value of the number of blue balls?

That is, $P(b=\langle b\rangle)=1$, when I've done an infinite number of trials to obtain the probability of obtaining a single blue ball from a bag.

Second question: What do you mean when you say to find the probability that $20$ balls are blue? Does it ask us to find the probability that there are $20$ blue balls in the bag, or is it asking the probability that if we pick out $20$ balls at random, all of them would be blue?

In essence, is asking for the probability that there are $20$ blue balls in the bag the same as asking for the probability that, if you pick $20$ random balls, all of them would be blue?

RayPalmer
  • "What do you mean, when you say find the probability that 20 balls are blue?" Did someone ask you this question? If so, it is important to know what they said before (and maybe after) that question. You would not normally be asked such a question in a probability exercise without important other information relevant to that particular question. – David K Nov 21 '21 at 14:17
  • There really isn’t a way to perform the process an infinite number of times and take the average. Taking the average of a countably infinite set is not possible. – Thomas Andrews Nov 21 '21 at 14:25
  • @DavidK You mean they should ask something like: if we pick $20$ balls, what is the probability of getting $20$ blue ones? Or maybe: what is the probability of finding exactly $20$ blue balls in the bag? Are these valid questions? – RayPalmer Nov 21 '21 at 14:27
  • @ThomasAndrews Yes, but if we run the trial more and more times, and obtain a more and more refined probability of getting one blue ball out of a hundred, shouldn't the expectation value (number of balls multiplied by the probability of getting one blue ball) tend more and more toward the actual number of blue balls inside the bag? – RayPalmer Nov 21 '21 at 14:29
  • I have in mind questions like, "There are three bags, one containing $30$ blue and $70$ white marbles, one containing $20$ blue and $80$ white marbles, and one containing $10$ blue and $90$ white marbles. I pick a bag at random, then draw marbles with replacement ___ times and observe that ___ of them are blue. What is the probability there are $20$ blue marbles in the bag?" – David K Nov 21 '21 at 14:30
  • @DavidK Isn't that the same thing as asking what is the probability that the marbles came from the second bag, since we have already established there are $20$ blue ones there? – RayPalmer Nov 21 '21 at 14:33
  • Yes it is. And you actually can calculate the probability of that event because you start with a prior probability ($1/3,$ assuming each bag is equally likely to be chosen) before you observe any marbles. – David K Nov 21 '21 at 14:34
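The three-bag question in the comments above leaves the number of draws and the number of blue marbles blank, so the sketch below fills them with hypothetical values ($10$ draws, $2$ blue) only to show how the prior of $1/3$ is updated. The function name `bag_posterior` is likewise an illustrative choice.

```python
from math import comb

def bag_posterior(blue_counts, n_draws, n_blue_seen, bag_size=100):
    """Posterior probability of each bag, given draws with replacement and equal priors."""
    prior = 1 / len(blue_counts)
    likelihoods = []
    for n_blue in blue_counts:
        p = n_blue / bag_size
        # Binomial likelihood of seeing n_blue_seen blue marbles in n_draws draws.
        likelihoods.append(
            comb(n_draws, n_blue_seen) * p**n_blue_seen * (1 - p) ** (n_draws - n_blue_seen)
        )
    evidence = sum(prior * lik for lik in likelihoods)
    return [prior * lik / evidence for lik in likelihoods]

# Hypothetical observation (the original question leaves these numbers blank).
posterior = bag_posterior([30, 20, 10], n_draws=10, n_blue_seen=2)
print(posterior)  # posterior[1] is P(the bag has 20 blue marbles | data)
```

Without the prior over bags there would be nothing to update, which is why the bare question "what is the probability that $20$ balls are blue?" has no answer on its own.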

1 Answer

$20$ percent of the time, I've picked up a blue ball. From here, I can say that the probability of me obtaining a blue ball is $0.2$.

Yes, based on the empirical/frequentist interpretation of probability.

However, we know from the definition of probability that the number of blue balls in the bag is just the total number of balls multiplied by the probability of getting a single blue ball.

The last line should read “multiplied by the probability of getting a blue ball in a single draw” instead.

Note that here, we're relying on the classical (equal-possibility) interpretation of probability.

Doing this in the above example, I'd get $20$ blue balls in the bag. This is nothing but the expectation value of the number of blue balls in my bag.

Yes: you empirically estimated the expectation of a Binomial experiment, computed its success probability, and derived the expectation of the corresponding $100$-trial experiment, finally inferring an estimate of the expected number of blue balls in your bag.

However, if I repeated the trial infinitely many times, and noticed that I get a blue ball exactly 20 percent of the time, can I say that the actual number of blue balls in the bag is equal to the expectation value of the number of blue balls?

Yes, under the assumptions of classical probability, the actual number of blue balls in your bag equals its limiting expected value.

What would be the probability that there are actually $20$ blue balls in the bag? I think this would take the form of some distribution, but I don't know what or how.

The number of blue balls in your bag is a random variable and indeed has a probability distribution. Using the Binomial distribution and the estimated probability 0.2, $$P(\text{bag has $20$ blue balls})={100\choose20}0.2^{20}\,0.8^{80}=9.93\%.$$
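As a quick numerical check, the formula above can be evaluated directly. This is only a sketch that takes the Binomial model with $n=100$ and the estimated $p=0.2$ as given; the helper name `binomial_pmf` is an illustrative choice.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

p20 = binomial_pmf(20, 100, 0.2)
print(f"{p20:.2%}")  # 9.93%
```

So even the most likely count, $20$, has only about a one-in-ten chance under this model.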

and not the actual value, I can say $P(b=\langle b\rangle)\lt 1$.

Based on the epistemic/subjective/Bayesian interpretation of probability:

  • if you know that there are actually 20 blue balls, then $P(b=20)=1;$
  • if you know that there are not actually 20 blue balls, then $P(b=20)=0;$
  • $0<P(b=20)<1$ if and only if you don't know the actual number of blue balls (regardless of whether there are indeed $20$ blue balls).
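To see how this degree of belief can move toward $1$ as evidence accumulates, here is a minimal sketch. It assumes a uniform prior over $b\in\{0,\dots,100\}$, draws with replacement, and an observed blue fraction of exactly $20$ percent; none of these assumptions is fixed by the question, and `posterior_over_blue_count` is a hypothetical helper.

```python
from math import exp, log

def posterior_over_blue_count(n_draws, n_blue_seen, bag_size=100):
    """Posterior P(b | data) for b = 0..bag_size, assuming a uniform prior on b."""
    n_other_seen = n_draws - n_blue_seen
    log_weights = []
    for b in range(bag_size + 1):
        p = b / bag_size
        # Counts that make the observed data impossible get zero weight.
        if (p == 0 and n_blue_seen > 0) or (p == 1 and n_other_seen > 0):
            log_weights.append(float("-inf"))
            continue
        lw = 0.0
        if n_blue_seen > 0:
            lw += n_blue_seen * log(p)
        if n_other_seen > 0:
            lw += n_other_seen * log(1 - p)
        log_weights.append(lw)
    # Work in log space and subtract the maximum to avoid underflow for large n_draws.
    top = max(log_weights)
    weights = [exp(lw - top) for lw in log_weights]
    total = sum(weights)
    return [w / total for w in weights]

for n_draws in (10, 100, 10_000, 1_000_000):
    post = posterior_over_blue_count(n_draws, n_draws // 5)
    print(n_draws, post[20])  # posterior probability that b = 20
```

Under these assumptions $P(b=20)$ stays strictly between $0$ and $1$ for any finite number of draws and approaches $1$ only in the limit.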

Second question: What do you mean when you say to find the probability that $20$ balls are blue?

Does it ask us to find the probability that there are $20$ blue balls in the bag,

These are impossible to answer without more context. As opposed to $23$ balls being blue? As opposed to $20$ balls being red? What is the experiment (how many draws are there? are the balls replaced after each draw? Etc.), and what is the sample space? Etc.

or is it asking the probability that if we pick out $20$ balls at random, all of them would be blue?

Possibly. You have just supplied some context; the scenario still needs to be further filled out before we can choose some probability interpretation and work out a reply.
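For illustration, here is one way the scenario could be filled out. The sketch assumes the bag is known to hold exactly $20$ blue balls out of $100$ and that $20$ balls are drawn without replacement; both are assumptions added here, and `prob_all_drawn_blue` is a hypothetical helper.

```python
from math import comb

def prob_all_drawn_blue(n_blue, n_total, n_drawn):
    """P(every one of n_drawn balls, drawn without replacement, is blue)."""
    # Favourable selections / all selections; comb returns 0 when n_drawn > n_blue.
    return comb(n_blue, n_drawn) / comb(n_total, n_drawn)

print(prob_all_drawn_blue(20, 100, 20))
```

Under these assumptions the result is astronomically small, so this is a very different question from "what is the probability that the bag contains $20$ blue balls?".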

ryang
  • Thank you so much. However, my confusion was mostly regarding statistical mechanics and the Boltzmann distribution. If the probability of a single particle having a particular energy is given by $p_i$ using the Boltzmann distribution, then we can say that the number of particles with energy $E$ is some $n=Np_i$, where $N$ is the total number of particles in the system. – RayPalmer Nov 21 '21 at 16:52
  • However, this creates a problem. Suppose I now want to find the probability that all $N$ particles in the system have the energy $E$. Mathematically, this would be given by $p_i^N$: I multiply together the probabilities of each particle having that energy. – RayPalmer Nov 21 '21 at 16:53
  • Hence I would obtain some answer. However, I've already established that $n$ particles have the energy $E$. Then shouldn't the probability of all particles having energy $E$ be $0$? – RayPalmer Nov 21 '21 at 16:54
  • What am I missing here, if you could point that out – RayPalmer Nov 21 '21 at 16:55
  • Neither assertion in your first two comments sounds correct, according to elementary probability. – ryang Nov 24 '21 at 09:20
  • I think I get it now. Just because I don't know the actual number of balls in the bag, it becomes a distribution instead, with different probabilities. However, what if I actually knew the probability of getting a single blue ball, not by sampling but from some other source, like the Gibbs distribution? In that case, will the number of balls in the bag still be a distribution? – RayPalmer Nov 24 '21 at 09:34
  • Regarding the fact that the true number of blue balls is indeed a distribution, is there any way of figuring out what distribution it would be? I know it would not be a binomial distribution, but is there any way to find out what it actually would be? Maybe a Poisson distribution or something similar, since I know the mean. – RayPalmer Nov 24 '21 at 23:01
  • @RayPalmer Wouldn't the number of blue balls in your bag be binomially distributed, with parameters $n=100,p=$? – ryang Nov 25 '21 at 07:34
  • yeah that was my initial guess, but my reasoning is that the number of balls is a random variable only because we are not exactly sure what is the exact probability. However the answer that we know is extremely close to the true probability, so shouldn't we have a much higher probability of getting $20$ than what the distribution predicts. And the more we converge on the true probability, shouldn't the distribution predict that there is a $100$ percent chance ? – RayPalmer Nov 25 '21 at 09:29
  • Maybe one can reason that the distribution basically predicts there is a $75$ percent chance of getting between $16$ and $24$ balls (within one standard deviation), but since the probability is so close to the true one, I'd just guess a higher probability. – RayPalmer Nov 25 '21 at 09:31
  • @RayPalmer Knowing that P(obtaining six on a die roll) is exactly $\frac16$ doesn't make 'the number of sixes obtained on a die roll' less of a random variable. The lack of exactness is due merely to the empirical roots of our inference; 'random' and 'exact' are not on opposite ends of a spectrum. This is a previous answer that I wrote. – ryang Nov 25 '21 at 09:42
  • Ah so basically anything that I don't know, would be a random variable and hence, has some distribution. In this case, the number of balls in a bag, which is usually defined in most problems, is a random variable purely because I don't know. It doesn't matter, how close I am to the true probability of picking a ball at random, the total number is still going to be random until and unless I find the exact probability. Then using the classical definition I can just ditch the distribution and claim that there must be $20$ balls. Something like that ? – RayPalmer Nov 25 '21 at 09:48
  • @RayPalmer You have the exact probability $(\frac12)$ of tails in a single trial; the number of tails in $770$ trials remains random; you use (not ditch) its distribution to make the best guess $385$ of the number of tails in yesterday's/tomorrow's coin-flip tournament, and to predict the $0.00082\%$ chance of obtaining exactly $329$ tails. – ryang Nov 25 '21 at 10:13
  • Yes, but unlike a coin toss or the roll of a die, there actually is a true number of blue balls that wouldn't change from one experiment to the next. We just don't know its value, and that is what makes it random. How can we show that with infinite trials it would actually be 20? – RayPalmer Nov 25 '21 at 10:28
  • @RayP I said "yesterday/tomorrow" precisely to indicate that you can equivalently view the situation predictively/retrospectively (in your words, there also is an actual true number of tails). Your blue balls distribution was empirically-derived (as the number of trials approaches infinity, your ESTIMATED probability naturally becomes more accurate); if OTOH you had started out knowing there are 20 blue balls (⇒ prob=20%), then of course we can derive from definition that there are indeed 20 blue balls. Your confusion stems from circular reasoning. – ryang Nov 25 '21 at 11:10
  • Please slowly read my two posted answers; between them they contain the answers to your 2nd to 4th posted questions. – ryang Nov 25 '21 at 11:10
  • Comments are not for extended discussion; this conversation has been moved to chat. – TheSimpliFire Nov 28 '21 at 12:03
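The coin-flip figures discussed in the comments above can be checked with a few lines. This is a sketch assuming a fair coin and $770$ independent flips; `fair_coin_pmf` is an illustrative helper name.

```python
from math import comb

def fair_coin_pmf(k, n):
    """P(exactly k tails in n flips of a fair coin)."""
    # Exact integer count of favourable outcomes divided by the 2**n equally likely outcomes.
    return comb(n, k) / 2**n

n_flips = 770
expected_tails = n_flips * 0.5
print(expected_tails)               # best single guess for the number of tails
print(fair_coin_pmf(329, n_flips))  # chance of exactly 329 tails
```

Knowing the per-flip probability exactly does not remove the randomness of the total; it only fixes which distribution the total follows.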