Is there a theorem that states what the distribution of a function of a random variable should be given the distribution of a random variable?

For example, say $X_1, X_2, \dots, X_n$ is a sequence of iid random variables drawn from a Bernoulli distribution with parameter $p$:

$X_i \sim B(p)$.

Let's also say that $\bar{X}_n = \sum_{i=1}^{n}X_i$.

What is the distribution of $\bar{X}_n$? And of $n\bar{X}_n$?

My guess is that they are both normal distributions.

$\bar{X}_n \sim N\left(p, \frac{p(1-p)}{n}\right)$, because of the CLT, which states that for a large enough sample size the distribution of the sample mean approaches a normal distribution regardless of what the underlying distribution is. In this case the underlying distribution is Bernoulli.

Using the delta method I was able to arrive at

$$g(\bar{X}_n) \sim N\left(g(p), \frac{g'(p)^2\sigma^2}{n}\right) $$

$$n\bar{X}_n \sim N\left(np, np(1-p)\right)$$

Was it appropriate to use the delta method here?

Fei Cao
  • 2,988
Ryan J
  • 11

1 Answer

Your definition of $\overline{X}$ is missing a division by $n$: it should be $\overline{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$. With that division in place, $\overline{X}$ is asymptotically (but not exactly) distributed as $N(p,p(1-p)/n)$.
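For comparison, the exact finite-$n$ distribution can be written down directly. Here $S_n$ is just shorthand introduced for the sum of the $X_i$ (it does not appear in the question):

$$S_n = \sum_{i=1}^{n} X_i \sim \operatorname{Bin}(n,p), \qquad P\left(\overline{X} = \tfrac{k}{n}\right) = P(S_n = k) = \binom{n}{k} p^k (1-p)^{n-k}, \quad k = 0,1,\dots,n,$$

$$E[\overline{X}] = p, \qquad \operatorname{Var}(\overline{X}) = \frac{p(1-p)}{n}.$$

So the normal distribution matches the exact mean and variance of $\overline{X}$, but the distribution itself is a rescaled binomial for every finite $n$.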

With $g(x)=nx$, the delta method ostensibly yields $g(\overline{X}) \sim N(np,np(1-p))$. But this is not really correct, because in this scaling there is no real pressure to make the distribution become "closer to continuous", since the probability mass is at all times concentrated on $\{ 0,1,\dots,n \}$ and these points are not getting any closer together. With the averaging scaling, the probability mass gets concentrated on $\{ 0,1/n,\dots,1 \}$, which are getting closer together, so in this situation the notion of a continuous asymptotic distribution makes more sense.
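The arithmetic behind that formal delta-method result, using $\sigma^2 = p(1-p)$ for a single Bernoulli($p$) variable, would be roughly:

$$g(x) = nx, \qquad g'(p) = n, \qquad \frac{g'(p)^2\sigma^2}{n} = \frac{n^2\,p(1-p)}{n} = np(1-p).$$

These are exactly the mean $np$ and variance $np(1-p)$ of the sum of the $X_i$, which is why the formula looks plausible even though the hypotheses are not met.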

The hypothesis of the delta method that is violated here is that $g$ must be a fixed function: in this setting $g(x)=nx$ changes with $n$.

With all that said, even though it is technically a bit sloppy, you can still estimate probabilities under the $\operatorname{Bin}(n,p)$ distribution by approximating it with a $N(np,np(1-p))$ distribution, provided that $n$ is large enough.
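As a rough numerical check of that last claim, here is a minimal Python sketch comparing the exact binomial CDF with the normal approximation. The function names, the choice $p = 0.3$, and the continuity correction (evaluating the normal CDF at $k + 0.5$) are my own illustrative assumptions, not something taken from the question.

```python
import math


def binom_cdf(k, n, p):
    """Exact P(S <= k) for S ~ Bin(n, p), summing the pmf term by term."""
    return sum(math.comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k + 1))


def normal_cdf(x, mean, var):
    """CDF of N(mean, var), written in terms of the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / math.sqrt(2.0 * var)))


def normal_approx_cdf(k, n, p):
    """Approximate P(S <= k) using N(np, np(1-p)).

    The +0.5 is a continuity correction (an assumption of this sketch),
    compensating for approximating a discrete law by a continuous one.
    """
    return normal_cdf(k + 0.5, n * p, n * p * (1 - p))


if __name__ == "__main__":
    p = 0.3  # illustrative value only
    for n in (10, 100, 1000):
        k = int(n * p)
        exact = binom_cdf(k, n, p)
        approx = normal_approx_cdf(k, n, p)
        print(f"n={n:5d}  exact={exact:.4f}  approx={approx:.4f}  "
              f"abs diff={abs(exact - approx):.4f}")
```

Running it for increasing $n$ should show the gap between the two columns shrinking, which is the sense in which the approximation is usable for large $n$.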

Ian
  • 104,572