3

I try to be rational and keep my questions as impersonal as I can in order to comply with the community guidelines, but this one is making me mad. Here it goes. Consider the uniform distribution on $[0, \theta]$. The likelihood function, using a random sample of size $n$, is $\frac{1}{\theta^{n}}$.
Now $1/\theta^n$ is decreasing in $\theta$ over the range of positive values. Hence it will be maximized by choosing $\theta$ as small as possible while still satisfying $0 \leq x_i \leq \theta$. The textbook says 'That is, we choose $\theta$ equal to $X_{(n)}$, or $Y_n$, the largest order statistic'. But if we want to minimize $\theta$ to maximize the likelihood, why do we choose the biggest $x$? Suppose we had actual numbers for the $x$'s, like $X_{1} = 2, X_{2} = 4, X_{3} = 8$. If we choose $8$, that yields $\frac{1}{8^{3}}=0.001953125$. If we choose $2$, that yields $\frac{1}{2^{3}}=0.125$. So why do we want the maximum $X_{(n)}$ in this case and not $X_{1}$, since we've just seen with actual numbers that the smaller the $x$, the bigger the likelihood? Thanks!

  • The simple answer: the probability density at $x$ is not simply $1/\theta$, it is $\begin{cases}1/\theta&\text{if $0\le x\le\theta$,}\\0&\text{otherwise.}\end{cases}$ Now tell me what the likelihood function is (hint: it's not $1/\theta^n$). –  May 13 '15 at 03:43
  • In your example, if $\theta=10$ the likelihood is proportional to $\frac1{10^3}$. If $\theta=8$ the likelihood is proportional to $\frac1{8^3}$, which is higher. But if $\theta=4$ the likelihood is $0$ since there is no possibility of having seen $X_3=8$ when $\theta=4$. Similarly if $\theta=2$ the likelihood is $0$. – Henry May 16 '15 at 09:03
  • https://math.stackexchange.com/q/649678/321264, https://math.stackexchange.com/q/2941187/321264 – StubbornAtom Feb 09 '20 at 20:05

3 Answers

1

What you are doing is wrong: you must find the actual likelihood function. You found $1/\theta^n$, but where is that expression valid? It is true that $X_{(n)}$ is the maximum likelihood estimator, because it maximizes the true likelihood function. How do you find it?

Added: Your answer is actually in the right direction, but as I mentioned it is missing a crucial point which alters everything. The right way of writing down the likelihood function is as follows:

\begin{align}L(x_n;\theta)&=\prod_{n=1}^N\theta^{-1}\mathbf{1}_{0\leq x_n\leq \theta}(x_n)\\&=\theta^{-N}\prod_{n=1}^N\mathbf{1}_{0\leq x_n\leq \theta}(x_n)\end{align}

Until now, $L$ is a function of $x_n$; now let's write it as a function of $\theta$:

\begin{align}L(\theta;x_n)=\theta^{-N}\prod_{n=1}^N\mathbf{1}_{\theta \geq x_n}(x_n)\end{align}

Observe that $L(\theta;x_n)$ is zero if $\theta<x_{(N)}$, where $x_{(N)}$ denotes the largest observation, and it is a positive, decreasing function of $\theta$ if $\theta\geq x_{(N)}$. Hence for any choice $\theta>x_{(N)}$ we have $L(x_{(N)};x_n)>L(\theta;x_n)$, which means the maximum is reached at $\hat\theta=x_{(N)}$.
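The indicator-times-$\theta^{-N}$ form above can be checked numerically. Here is a minimal Python sketch (the helper name `likelihood` is my own, not from the answer), using the question's data $2, 4, 8$:

```python
# L(theta; x) = theta^(-N) * prod of indicators 1{0 <= x_n <= theta}
def likelihood(theta, xs):
    if theta <= 0 or any(x < 0 or x > theta for x in xs):
        return 0.0          # an observation outside [0, theta] kills the likelihood
    return theta ** (-len(xs))

xs = [2, 4, 8]
print(likelihood(2, xs))    # 0.0 -- theta = 2 excludes the observations 4 and 8
print(likelihood(4, xs))    # 0.0 -- theta = 4 excludes the observation 8
print(likelihood(8, xs))    # 1/8^3 = 0.001953125, the maximum
print(likelihood(10, xs))   # 1/10^3 = 0.001, smaller than at theta = 8
```

The indicator is what the $1/\theta^n$ formula alone forgets: below the largest observation the likelihood is not large, it is zero.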

  • The likelihood function is $\frac{1}{\theta^{n}}$, as I said. Do you mean the log-likelihood function? And I've just shown with numbers that choosing $X_{(n)}$ would yield a smaller value for the likelihood function than choosing $X_{1}$. I appreciate that you want to help, but you just said I'm wrong, restated what was in the question, and gave no explanation. Thanks for your effort – matt_zarro May 13 '15 at 00:19
  • Yes, it means I know the answer already but I want to hear something from you. It is always better for you, not for me. Okay, I give you another hint: you seem to forget $0 \leq x_n \leq \theta$ – Seyhmus Güngören May 13 '15 at 00:45
  • It will soon be exactly 3 days that I've been wondering about this question; trust me, I'm not a lazy person. If I've asked the question, it's because I've tried with all my strength to find the answer. Could you give at least a more meaningful clue? I've already thought about the interval of values $x$ can assume. – matt_zarro May 13 '15 at 00:52
  • My exam is tomorrow morning and it's very late, by the way. – matt_zarro May 13 '15 at 00:54
  • Okay just wait. Then I edit the answer. – Seyhmus Güngören May 13 '15 at 00:55
0

$\theta$ is the parameter to estimate, which corresponds to the upper bound of the $U(0,\theta)$. The observed samples are $x_1=2,\,x_2=4$, and $x_3=8$. The likelihood function to maximize is $\mathcal{L}(\theta|X) = \frac{1}{\theta^n}$, valid only when every observation lies in $[0,\theta]$ (and $0$ otherwise), with $X$ corresponding to the observed values (a vector, really), $\theta$ the upper margin of the interval on which this uniform distribution is defined, and $n$ the number of samples.

In order to "see" this intuitively, it's important to realize that $\theta$ has to be at least as big as your largest sampled value ($8$) to avoid leaving samples out of the interval for which your probability density function is defined.

Picking $\theta=8$ would make the value of the pdf at any observed point equal to $\frac{1}{8}$, and the joint density of the sample (you are observing three values, a vector) equal to $\frac{1}{8^3}$. That is certainly the maximum of $\mathcal{L}(\theta|X)$: among all admissible choices $\theta \geq 8$, it has the smallest possible denominator. The seemingly better candidates $\frac{1}{2^3}$ and $\frac{1}{4^3}$ are ruled out, because $\theta=2$ or $\theta=4$ would leave the sample $8$ outside the support and make the likelihood $0$.
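As a sanity check on the arithmetic above (my own sketch, not part of the answer), the joint density can also be computed with `scipy.stats.uniform`, which parameterizes $U(0,\theta)$ as `uniform(loc=0, scale=theta)`:

```python
import numpy as np
from scipy.stats import uniform

xs = np.array([2.0, 4.0, 8.0])

def joint_pdf(theta):
    # joint density = product of the individual U(0, theta) pdf values
    return float(np.prod(uniform(loc=0, scale=theta).pdf(xs)))

print(joint_pdf(8))   # 1/8^3 = 0.001953125
print(joint_pdf(10))  # 1/10^3 = 0.001
print(joint_pdf(4))   # 0.0 -- the pdf at x = 8 is 0 when theta = 4
```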

0

Suppose the observed order statistics are $1,2,3.$

You have $L(\theta) = \dfrac 1 {\theta^3}$ for $\theta\ge3.$

And $L(\theta)=0$ for $\theta<3.$

As $\theta$ gets smaller, $L(\theta)$ gets bigger until $\theta$ gets as small as $3.$

$3$ is the largest order statistic.
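The shape of $L(\theta)$ described above can be traced numerically; this is a quick sketch of my own (the helper name is mine) for the observations $1, 2, 3$:

```python
# L(theta) for a sample from U(0, theta): zero unless theta covers every observation
def likelihood(theta, xs):
    if theta <= 0 or theta < max(xs):
        return 0.0                 # some observation falls outside [0, theta]
    return theta ** (-len(xs))     # 1/theta^n on the admissible range

xs = [1, 2, 3]
for theta in [2.5, 2.9, 3.0, 3.5, 4.0]:
    print(theta, likelihood(theta, xs))
# L is 0 for theta < 3, jumps up to 1/27 at theta = 3, then decreases,
# so the maximizer is the largest order statistic, 3.
```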