
For $N\geq 1$, an urn contains $N$ balls labeled with the numbers $1,\dots,N$, and we want to estimate $N$ by randomly drawing $n\leq N$ balls without replacement. Determine the corresponding statistical model, a maximum likelihood estimator, and an unbiased estimator for the identity $\vartheta\mapsto\vartheta$ (that is, for $N$ itself).

I'm still a bit confused about this topic, so maybe someone could tell me whether my results are correct and help me continue.

$\ $

  • For the statistical model $(\mathcal{X}, \mathcal{F}, \{P_{\vartheta}\mid \vartheta\in\Theta\})$ I chose $\mathcal{X}=\mathbb{N}_{\geq 1}^n$, $\mathcal{F}=\mathcal{P}(\mathcal{X})$, $\Theta=\mathbb{N}_{\geq 1}$, and for $\vartheta \in \Theta$ $$P_{\vartheta}((x_1,\dots,x_n))=\prod_{i=1}^{n}\frac{1}{\vartheta +1-i}\cdot \chi_{[1,\vartheta]\setminus A_{i-1}}(x_i)=\begin{cases}\dfrac{1}{\vartheta\cdot(\vartheta-1)\cdots(\vartheta-n+1)}\cdot\chi_{[1,\vartheta]}(\max_i x_i) & \text{if } x_i\neq x_j \ \forall\, 1\leq i\neq j\leq n,\\ 0 & \text{else,}\end{cases}$$ with $A_{0}:=\varnothing$ and $A_{i}:=\{x_1,\dots,x_i\}$ for $i=1,\dots,n$.

$\ $

  • To determine the maximum likelihood estimator, I fix $(x_1,\dots,x_n)\in\mathcal{X}$ with $x_i\neq x_j$ for all $i\neq j$ and consider $$\varphi(\vartheta)=\frac{1}{\vartheta\cdot(\vartheta-1)\cdots(\vartheta-n+1)}\cdot \chi_{[\max_i(x_i), \infty)}(\vartheta),$$ which attains its maximum at $\vartheta=\max_i(x_i)$: it is zero for $\vartheta\in[1,\max_i(x_i))$, and the fraction decreases as $\vartheta$ grows. So the maximum likelihood estimator is $\max_i(x_i)$.

$\ $

  • I'm struggling with finding an unbiased estimator $T$. I know that I need $E_{\vartheta}(T)\overset{!}{=}\vartheta$. With $\mathcal{X}':= \{x=(x_1,\dots,x_n)\in \mathcal{X}\mid x_i\neq x_j \ \forall i\neq j \}$ the left-hand side is $$\int_{\mathcal{X}}T\,\mathrm{d}P_{\vartheta}=\sum_{x\in\mathcal{X}',\ \max_i(x_i)\leq\vartheta}T(x)\cdot\frac{1}{\vartheta\cdot(\vartheta-1)\cdots(\vartheta-n+1)},$$ but I don't quite know how to continue.
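A minimal sketch for sanity-checking the model and the maximum likelihood estimator numerically. The values $\vartheta=7$, $n=3$, the sample $(2,5,3)$, the search cap `theta_max`, and the helper names `likelihood`, `expectation` and `mle` are all assumptions made only for illustration. The code enumerates every ordered sample of distinct labels, so the expectation is exact rather than simulated.

```python
from fractions import Fraction
from itertools import permutations


def likelihood(x, theta):
    """P_theta(x) for an ordered sample x drawn without replacement from {1..theta}."""
    n = len(x)
    # Zero if labels repeat, fall outside [1, theta], or theta is too small to draw n balls.
    if len(set(x)) < n or min(x) < 1 or max(x) > theta or theta < n:
        return Fraction(0)
    denom = 1
    for i in range(n):
        denom *= theta - i  # theta * (theta - 1) * ... * (theta - n + 1)
    return Fraction(1, denom)


def expectation(T, theta, n):
    """Exact E_theta[T(X)], summing over all ordered samples of n distinct labels."""
    samples = permutations(range(1, theta + 1), n)
    return sum(T(x) * likelihood(x, theta) for x in samples)


def mle(x, theta_max=50):
    """Maximise the likelihood over theta in 1..theta_max (the cap is arbitrary)."""
    return max(range(1, theta_max + 1), key=lambda th: likelihood(x, th))


theta, n = 7, 3
total = expectation(lambda x: 1, theta, n)  # probabilities should sum to 1
mean_max = expectation(max, theta, n)       # exact mean of the sample maximum
print(total, mean_max, mle((2, 5, 3)))
```

For these illustrative values the probabilities sum to 1, the likelihood is maximised at the sample maximum 5, and the exact mean of the maximum is 6, which is below $\vartheta=7$. So the maximum on its own underestimates $\vartheta$ on average and cannot be the unbiased estimator.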

Note: This is not an assignment. I am working through some old exercises to study for an exam, and for this one I don't have the solutions. Thank you in advance!

Lu1998
  • I think your $P_\vartheta$ definition also needs to include an indicator function for all of the $x_i$ being distinct. Any sequence $(x_i)_{i=1}^n\in \mathbb{N}_{\geqslant 1}^n$ with a repetition should have probability zero. Additionally, it's unnecessary, but your product $\prod_{i=1}^n \frac{1}{\vartheta+1-i}$ can be written as $\frac{(\vartheta-n)!}{\vartheta!}$. – user469053 Apr 06 '24 at 18:21
  • @user469053 Thanks, you're right! I'm not quite sure how to do this, but I'll try and add my new guess to the original question. Maybe I can exclude the previously drawn elements from the set I use to define the indicator function. – Lu1998 Apr 06 '24 at 18:49
  • Alternatively, you could put it into the definition of $\mathcal{X}$: $$\mathcal{X}=\{(x_1,\ldots, x_n)\in \mathbb{N}_{\geqslant 1}^n:(\forall 1\leqslant i<j\leqslant n)(x_i\neq x_j)\}.$$ – user469053 Apr 06 '24 at 19:05
  • This is the German tank problem. The distribution of $T=\max_{1\le i\le n} X_i$ can be obtained from $P(T=t)=P(T\le t)-P(T\le t-1)$. The mean of $T$ is shown here, from which you get an unbiased estimator of $N$ based on $T$. – StubbornAtom Apr 06 '24 at 21:06
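
Following the hint in the last comment, the computation can be sketched as follows, writing $T=\max_{1\le i\le n}X_i$ and assuming $n\le\vartheta$. The event $\{T\le t\}$ means that all $n$ drawn labels lie in $\{1,\dots,t\}$, so for $n\le t\le\vartheta$ $$P_\vartheta(T\le t)=\frac{\binom{t}{n}}{\binom{\vartheta}{n}},\qquad P_\vartheta(T=t)=\frac{\binom{t}{n}-\binom{t-1}{n}}{\binom{\vartheta}{n}}=\frac{\binom{t-1}{n-1}}{\binom{\vartheta}{n}}.$$ Using $t\binom{t-1}{n-1}=n\binom{t}{n}$ and the hockey-stick identity $\sum_{t=n}^{\vartheta}\binom{t}{n}=\binom{\vartheta+1}{n+1}$, $$E_\vartheta(T)=\sum_{t=n}^{\vartheta}t\cdot\frac{\binom{t-1}{n-1}}{\binom{\vartheta}{n}}=\frac{n}{\binom{\vartheta}{n}}\sum_{t=n}^{\vartheta}\binom{t}{n}=\frac{n\binom{\vartheta+1}{n+1}}{\binom{\vartheta}{n}}=\frac{n(\vartheta+1)}{n+1}.$$ Solving for $\vartheta$ suggests the estimator $$\widehat{\vartheta}=\frac{n+1}{n}\,T-1,\qquad E_\vartheta\big(\widehat{\vartheta}\big)=\frac{n+1}{n}\cdot\frac{n(\vartheta+1)}{n+1}-1=\vartheta,$$ which is unbiased for every $\vartheta\ge n$.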
