
The Fisher-Neyman Factorisation Theorem states that if we have a statistical model for $X$ with PDF/PMF $f_{\theta}$, then $T(X)$ is a sufficient statistic for $\theta$ if and only if there exist nonnegative functions $g_{\theta}$ and $h$ such that for all $x,\theta$ we have $f_{\theta}(x)=g_{\theta}(T(x))\,h(x)$.
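For context, here is how the factorisation is usually applied (a standard textbook example, added for illustration): for an i.i.d. sample $X_1,\dots,X_n$ from a $\text{Poisson}(\theta)$ distribution,
$$f_\theta(x_1,\dots,x_n)=\prod_{i=1}^n \frac{e^{-\theta}\theta^{x_i}}{x_i!}=\underbrace{e^{-n\theta}\,\theta^{\sum_i x_i}}_{g_\theta(T(x))}\cdot\underbrace{\prod_{i=1}^n \frac{1}{x_i!}}_{h(x)},$$
so $T(X)=\sum_{i=1}^n X_i$ is sufficient for $\theta$: the data enter the $\theta$-dependent factor only through $T$.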

Computationally, this makes sense to me. However, recently I have started to have some doubts about when and where I can apply this theorem.

For example, if I have the PDF of a uniform distribution, $f_{\theta}(x)=\frac{1}{\theta}$, doesn't this allow me to make any sufficient statistic that I like by rewriting the PDF as $$f_{\theta}(x)=\Big(\frac{x^2+x^4}{\theta}\Big)\Big(\frac{1}{x^2+x^4}\Big),$$ which would make our sufficient statistic $T(X)=X^2+X^4$? And if we replace this particular choice of $T(X)$ with something else, doesn't this allow us to construct almost any choice of $T(X)$ as a valid sufficient statistic?

Is this correct, or am I doing something wrong here by arbitrarily adding in functions of $X$ that cancel out in order to create sufficient statistics?

FD_bfa

2 Answers


The PDF of the uniform distribution is actually $f_\theta(x) = \frac{1}{\theta} \mathbf{1}_{[0, \theta]}(x)$. This indicator term is important because it ties $x$ and $\theta$ together, and prevents you from doing your cancellation.
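To make this concrete (a standard sketch, not part of the original answer): once the indicator is included, a valid factorization must keep $\mathbf{1}_{[0,\theta]}$ inside the $\theta$-dependent factor. For a single observation,
$$f_\theta(x)=\underbrace{\frac{1}{\theta}\mathbf{1}_{[0, \theta]}(x)}_{g_\theta(T(x)),\ T(x)=x}\cdot\underbrace{1}_{h(x)},$$
and for an i.i.d. sample $x_1,\dots,x_n$,
$$f_\theta(x_1,\dots,x_n)=\underbrace{\frac{1}{\theta^n}\mathbf{1}_{[0, \theta]}\big(\max_i x_i\big)}_{g_\theta(T(x))}\cdot\underbrace{\mathbf{1}_{[0,\infty)}\big(\min_i x_i\big)}_{h(x)},$$
so $T(X)=\max_i X_i$ is sufficient. The $\theta$-dependent factor can only see the data through some statistic $T(x)$, and the indicator forces that statistic to carry real information about where the sample sits relative to $\theta$.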

angryavian
  • Can't there still be cancellation if we multiply by $(X^2+X^4)\big( \frac{1}{X^2+X^4}\big)$? This would make the sufficient statistic $T(X)=(X^2+X^4)\mathbf{1}_{[0, \theta]}(X)$. – FD_bfa May 18 '22 at 02:19
  • 1
    $T(X)$ cannot depend on $\theta$. @FD_bfa – angryavian May 18 '22 at 03:58

Let $X$ be distributed $\text{Uniform}(0, \theta]$.

We have that $$f_{X}(x) = \dfrac{1}{\theta}\mathbf{1}_{(0, \theta]}(x) $$ where $\mathbf{1}_{(0, \theta]}$ denotes the indicator function on $(0, \theta]$.

Now multiply and divide by $x^2 + x^4$; indeed, $$f_{X}(x) = \dfrac{x^2 + x^4}{\theta}\mathbf{1}_{(0, \theta]}(x) \cdot \dfrac{1}{x^2 + x^4}\text{.}$$

We must write the above in the form $g_\theta(T(x)) \cdot h(x)$ for some $T$.

Indeed, note that $\mathbf{1}_{(0, \theta]}(x)$ involves both $\theta$ and $x$, so by necessity, $\mathbf{1}_{(0, \theta]}(x)$ must be part of $g_\theta(T(x))$. We could thus approach this problem in quite a few ways; I outline two of them below.


Factorization $1$: let $$g_\theta(x) = \dfrac{x^2 + x^4}{\theta}\mathbf{1}_{(0, \theta]}(x)$$ and $$h(x) = \dfrac{1}{x^2 + x^4}\text{.}$$ Then this implies that $T(X) = X$ is sufficient for $\theta$.


Factorization $2$: let $Y = X^2$, then $$g_\theta(y) = \dfrac{y + y^2}{\theta}\mathbf{1}_{(0, \theta]}(\sqrt{y})$$ and $$h(x) = \dfrac{1}{x^2 + x^4}\text{.}$$ Then this implies that $T(X) = X^2$ is sufficient for $\theta$.


Main point:

Theorem. Let $g: \mathbb{R} \to \mathbb{R}$ be a one-to-one mapping and $T(X)$ be sufficient for $\theta$. Then $g(T(X))$ is also sufficient for $\theta$.

See Injective functions and sufficient statistics for a proof. Consequently, a sufficient statistic for a parameter is not unique.
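For instance (an illustration tying this back to the factorizations above): on the support $(0, \theta]$ the map
$$x \mapsto x^2$$
is one-to-one, so sufficiency of $T(X) = X$ from Factorization $1$ immediately gives sufficiency of $X^2$, recovering Factorization $2$ without any new factoring. The same argument shows that $X^2 + X^4$ is sufficient, since $x \mapsto x^2 + x^4$ is also one-to-one on the positive reals; what it does not allow is any function of $\theta$ inside the statistic.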

Clarinetist
  • Doesn't Factorization $1$ also imply that $T(X)=(X^2+X^4)\mathbf{1}_{[0, \theta]}(X)$ is a valid sufficient statistic? – FD_bfa May 18 '22 at 03:00
  • 1
    @FD_bfa By definition, that cannot be a statistic because it has the (unknown) parameter of interest $\theta$ in it. – Clarinetist May 18 '22 at 03:05