6

Given $X \sim \text{NBin}(n,p)$, I've seen two different calculations for $\mathbb{E} (X)$:

\begin{align*} &1.\ \mathbb{E} (X) = \frac{n}{p}, \quad \text{or}\\ &2.\ \mathbb{E} (X) = \frac{n(1-p)}{p} \end{align*}

Proof for 1: see the question "Proof for the calculation of mean in negative binomial distribution".

Proof for 2: Although I can't find a concrete proof on Stack Exchange, this is the expected value used in the Wikipedia article on the negative binomial distribution, and I have also seen this value used in some questions here.

I've heard someone say that both are valid depending on the way you define the negative binomial, but I still don't quite understand the difference between the set-ups for the two different $\mathbb{E} (X)$.

Could someone explain their differences? Thank you!

tommik
punypaw

1 Answer

9

A negative binomial random variable is the sum of $n$ i.i.d. geometric random variables.

As with the Geometric, the NBinomial also has 2 kinds of parametrizations:

  1. The variable counting the total trials to get $n$ successes

  2. The variable counting the total failures to get $n$ successes
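For reference, the two set-ups can be sketched with their standard probability mass functions. The names $X$ (total trials) and $Y$ (total failures) are labels chosen here for illustration:

$$P(X=x)=\binom{x-1}{n-1}p^n(1-p)^{x-n},\qquad x=n,n+1,n+2,\dots$$

$$P(Y=y)=\binom{y+n-1}{n-1}p^n(1-p)^{y},\qquad y=0,1,2,\dots$$

Every run of the experiment satisfies $X=Y+n$, so the two means must differ by exactly $n$: indeed $\frac{n}{p}=\frac{n(1-p)}{p}+n$.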

Thus you can prove your expectations in the following way:

  1. Start from the Geometric distribution that counts how many trials you need to get the first success:

$$P(X=x)=q^{x-1}p$$

for $x=1,2,3,\dots$, where $q=1-p$

  1. Calculate $\mathbb{E}[X]$

$$\mathbb{E}[X]=\sum_{x=1}^{\infty}x\,q^{x-1}p=p\sum_{x=1}^{\infty}\frac{d}{dq}q^x=p \frac{d}{dq}\frac{q}{1-q}=\frac{p}{(1-q)^2}=\frac{1}{p}$$
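As a quick numerical sanity check of the geometric mean, the series $\sum_x x\,q^{x-1}p$ can be summed directly. This is only a sketch: the function name `geometric_mean_partial`, the value $p=0.3$ and the cutoff of 500 terms are arbitrary choices for illustration.

```python
def geometric_mean_partial(p, terms):
    """Partial sum of x * q**(x-1) * p for x = 1..terms, with q = 1 - p."""
    q = 1 - p
    return sum(x * q ** (x - 1) * p for x in range(1, terms + 1))


p = 0.3                                   # illustrative success probability
approx = geometric_mean_partial(p, 500)   # truncated series
exact = 1 / p                             # closed form from the derivation

print(approx, exact)
```

The truncated sum should agree with $1/p$ up to floating-point error, since the tail of the series decays geometrically.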

Hence, by linearity of expectation, the mean of the NBinomial counting how many trials you need to get $k$ successes (here $k$ plays the role of $n$ in the question) is simply

$$ \bbox[5px,border:2px solid black] { \mathbb{E}\left[\sum_{i=1}^{k} X_i\right]=k\frac{1}{p} \qquad (1) } $$

  1. Note that the geometric variable counting the failures before the first success is

$$Y=X-1$$

Thus its mean is $\mathbb{E}[Y]=\frac{1}{p}-1=\frac{q}{p}$

Hence, again by linearity, the expectation of the NBinomial counting the number of failures before you get $k$ successes is

$$ \bbox[5px,border:2px solid black] { \mathbb{E}\left[\sum_{i=1}^{k} Y_i\right]=k\frac{q}{p} \qquad (2) } $$
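Both results can also be checked empirically. The following is a minimal simulation sketch using only Python's standard `random` module; the helper names, the values $k=5$, $p=0.4$, the number of runs and the seed are all assumptions made for illustration.

```python
import random


def trials_until_k_successes(k, p, rng):
    """Run Bernoulli(p) trials until the k-th success.

    Returns (total trials, total failures) for that run.
    """
    trials = 0
    successes = 0
    while successes < k:
        trials += 1
        if rng.random() < p:
            successes += 1
    return trials, trials - k


def simulate_means(k, p, runs=20000, seed=0):
    """Average the trial count and the failure count over many runs."""
    rng = random.Random(seed)
    total_trials = 0
    total_failures = 0
    for _ in range(runs):
        t, f = trials_until_k_successes(k, p, rng)
        total_trials += t
        total_failures += f
    return total_trials / runs, total_failures / runs


k, p = 5, 0.4
mean_trials, mean_failures = simulate_means(k, p)

print("trials:  ", mean_trials, "vs k/p =", k / p)
print("failures:", mean_failures, "vs kq/p =", k * (1 - p) / p)
```

The sample mean of the trial count should land close to $k/p$ and the sample mean of the failure count close to $kq/p$, with the two differing by $k$.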

...that's all!

Sha Vuklia
tommik
  • 2
    Oh I see. So the first definition is the expected number of trials (successes + failures) before n (or k) successes, while the second is the expected number of failures before n (or k) successes? – punypaw Nov 11 '20 at 00:32
  • @punypaw : Yes, absolutely correct – tommik Nov 11 '20 at 00:37