
I want to find an estimator of the probability of success $p$ of an independently repeated Bernoulli experiment, given that we observe exactly $k$ failures before the $r$-th success.

The probability for $k$ failures before the $r$-th success is given by the negative binomial distribution:

$$P_p[\{k\}] = {k + r - 1 \choose k}(1-p)^kp^r$$

This yields the log-likelihood function for the observed number of failures $k$:

$$l_k(p) = \log({k + r - 1 \choose k}) + k\log(1-p) + r\log(p)$$

With derivative

$$l_k'(p) = \frac{r}{p} - \frac{k}{1-p}$$

The derivative is zero at $\hat p = \frac{r}{r+k}$. To show that $\hat p$ really is an MLE for $p$, we need to show that it is a maximum of $l_k$. But evaluating the second derivative at this point is pretty messy. Is there an easier way to show that this is in fact an MLE for $p$?
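For a concrete sanity check (not a proof), I can maximize $l_k(p)$ numerically and compare with $\frac{r}{r+k}$; a minimal sketch, assuming SciPy is available and with $r=5$, $k=7$ as purely illustrative values:

```python
# Numerical sanity check only (not a proof): maximize l_k(p) on (0, 1)
# for one illustrative pair (r, k) and compare with r / (r + k).
from math import lgamma, log

from scipy.optimize import minimize_scalar

r, k = 5, 7  # illustrative values, not taken from any real experiment

def neg_log_lik(p: float) -> float:
    # l_k(p) = log C(k+r-1, k) + k*log(1-p) + r*log(p), negated for minimization
    log_binom = lgamma(k + r) - lgamma(k + 1) - lgamma(r)
    return -(log_binom + k * log(1 - p) + r * log(p))

res = minimize_scalar(neg_log_lik, bounds=(1e-9, 1 - 1e-9), method="bounded")
print(res.x, r / (r + k))  # both are approximately 0.4167
```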

user7802048

1 Answer


In general the method of MLE is to maximize $L(\theta;x_i)=\prod_{i=1}^n f(x_i;\theta)$. See here for instance. In the case of the negative binomial distribution we have

$$L(p;x_i) = \prod_{i=1}^{n}{x_i + r - 1 \choose x_i}p^{r}(1-p)^{x_i}$$

$$ \ell(p;x_i) = \sum_{i=1}^{n}\left[\log{x_i + r - 1 \choose x_i}+r\log(p)+x_i\log(1-p)\right]$$ $$\frac{d\ell(p;x_i)}{dp} = \sum_{i=1}^{n}\left[\dfrac{r}{p}-\frac{x_i}{1-p}\right]=\sum_{i=1}^{n} \dfrac{r}{p}-\sum_{i=1}^{n}\frac{x_i}{1-p}$$

Set it to zero and add $\sum_{i=1}^{n}\frac{x_i}{1-p}$ to both sides:

$$\sum_{i=1}^{n} \dfrac{r}{p}=\sum_{i=1}^{n}\frac{x_i}{1-p}$$

$$\frac{nr}{p}=\frac{\sum\limits_{i=1}^nx_i}{1-p}\Rightarrow \hat p=\frac{nr}{nr+\sum\limits_{i=1}^n x_i}\Rightarrow \hat p=\frac{r}{\overline x+r}$$

Now we have to check whether $\hat p$ is indeed a maximum. For this purpose we calculate the second derivative of $\ell(p;x_i)$.

$$\frac{d^2\ell(p;x_i)}{dp^2}=\underbrace{-\frac{rn}{p^2}}_{<0}\underbrace{-\frac{\sum\limits_{i=1}^n x_i}{(1-p)^2}}_{<0}<0\Rightarrow \hat p\textrm{ is a maximum}$$
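As a cross-check of the algebra above, here is a small symbolic sketch (assuming SymPy is available; the symbol $S$ stands in for $\sum_{i=1}^n x_i$, and the $p$-free binomial term is dropped since it does not affect the maximizer):

```python
# Symbolic cross-check of the stationary point and the sign of the second derivative.
import sympy as sp

p, r, n, S = sp.symbols("p r n S", positive=True)

# log-likelihood up to the additive binomial term, which does not depend on p;
# S plays the role of x_1 + ... + x_n
ell = n * r * sp.log(p) + S * sp.log(1 - p)

p_hat = sp.solve(sp.diff(ell, p), p)[0]
print(p_hat)               # n*r/(S + n*r), i.e. r/(xbar + r)
print(sp.diff(ell, p, 2))  # -n*r/p**2 - S/(1 - p)**2, negative for 0 < p < 1
```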

callculus42
  • Why do you need the product? Am I not required to look at a joint distribution of multiple negative binomial distributions? – user7802048 Jul 26 '19 at 13:28
  • You have a sample of $n$ values ($x_i$). Based on these (fixed) values you estimate the parameter. For this purpose we maximize the product $f(x_1,\theta)\cdot \ldots \cdot f(x_n, \theta)$ – callculus42 Jul 26 '19 at 13:32
  • And isn't the second derivative of $\ell$ equal to $\frac{\sum_{i=1}^nx_i}{(1-p)^2} - \frac{rn}{p^2}$ (notice the positive sign)? – user7802048 Jul 26 '19 at 13:32
  • But in my question I stated that I just have one sample. – user7802048 Jul 26 '19 at 13:33
  • If the sample size is 1 then $n=1$. Have you read the link? – callculus42 Jul 26 '19 at 13:36
  • Yes, I've read the link and I understand using the joint density, if we have multiple samples, but I don't see the point of introducing the general case of $n$ samples, when only one sample is of significance. – user7802048 Jul 26 '19 at 13:37
  • These are not multiple samples. You have one sample with a sample size of $n$! It has nothing to do with joint densities. – callculus42 Jul 26 '19 at 13:39
  • But the product you use is the joint density of $n$ independent negative binomial distributions, isn't it? I think we're talking past each other at this point. But can you explain the second derivative a little bit further? I'm confused about the negative sign in front of the second fraction. – user7802048 Jul 26 '19 at 13:49
  • By the way, the derivative of $-\frac1{1-p}=-(1-p)^{-1}$ is $(-1)\cdot -(1-p)^{-2}\cdot (-1)=-\frac{1}{(1-p)^2}$. You have to take the chain rule into account: the derivative of $(1-p)$ is $-1$. – callculus42 Jul 26 '19 at 13:50
  • Again: MLE has nothing to do with a joint density. Maybe you trust the wiki a little more here. Here they talk about "observed data". The sample could be $3,5,3,2$ if $n=4$. There is no randomness involved. – callculus42 Jul 26 '19 at 13:58
  • @callculus42 How can we show the MLE is biased here? – Jake Mar 30 '22 at 19:35
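Regarding the last comment: one quick way to get a feel for the bias is a small Monte Carlo sketch (not an analytic argument; $r$, the true $p$, and the per-replication sample size below are purely illustrative choices):

```python
# Monte Carlo sketch suggesting that the MLE r / (r + xbar) is biased upward.
import numpy as np

rng = np.random.default_rng(0)
r, p_true, n, reps = 3, 0.4, 5, 200_000   # illustrative choices

# negative_binomial draws the number of failures before the r-th success
x = rng.negative_binomial(r, p_true, size=(reps, n))
p_hat = r / (r + x.mean(axis=1))          # the MLE, one estimate per replication

print(p_hat.mean())   # comes out noticeably above 0.4, hinting at an upward bias
```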