59

Let $X$ be a random variable with a continuous and strictly increasing c.d.f. $F$ (so that the quantile function $F^{−1}$ is well-defined). Define a new random variable $Y$ by $Y = F(X)$. Show that $Y$ follows a uniform distribution on the interval $[0, 1]$.

My initial thought is that $Y$ is distributed on the interval $[0,1]$ because this is the range of $F$. But how do you show that it is uniform?

  • 3
    This is not true in cases where there's a discrete component. For example, suppose $X=\begin{cases} 1/2 & \text{with probability }1/2, \\ W & \text{with probability }1/2,\end{cases}$ where $W$ is uniformly distributed on $[0,1]$, and the choice of whether $X=1/2$ is independent of $W$. Then the cdf of $X$ takes no values between $1/4$ and $3/4$, so $F(X)$ cannot be uniformly distributed on $[0,1]$. It is, however, true of continuous distributions. – Michael Hardy Jul 15 '14 at 21:41
  • 5
    see the text of the question. X is continuous! – user162381 Jul 15 '14 at 21:44
  • 11
    By the way, it is not necessary that $F$ is a strictly increasing CDF; continuity is sufficient. Just define the quantile function the usual way as a generalized inverse via $F^{-}(y)=\inf\{x\in\mathbb{R}: F(x)\geq y\}$. See the proof of Proposition 3.1 in Embrechts, P., Hofert, M.: A note on generalized inverses. Mathematical Methods of Operations Research 77(3), 423–432, for a very careful and detailed explanation. – binkyhorse Jul 16 '14 at 14:38
  • 2
    Thanks @binkyhorse - that reference is really good. – Math1000 Apr 05 '15 at 22:33
  • @binkyhorse so if $X$ is a continuous random variable and $Y=F_{X}(X)$, then $Y$ must be a $U(0,1)$ random variable? (since continuity of CDF is guaranteed by the fact that $X$ is continuous) – s0ulr3aper07 Feb 23 '19 at 15:04
  • 1
    @s0ulr3aper07 By Proposition 3.1 in the paper I linked above, yes.

    Prop. 3.1: Let $F$ be a distribution function and $X \sim F$. (a) If $F$ is continuous, then $F(X)\sim \mathrm{U}[0,1]$. The paper includes a detailed proof.

    – binkyhorse Mar 27 '19 at 20:10
  • Is there an equivalent result for discrete distributions? – Asupollo Apr 24 '19 at 22:16
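Michael Hardy's mixed example above can also be checked numerically. A minimal sketch (assuming NumPy is available), with the CDF of the mixed distribution coded by hand:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# X = 1/2 with probability 1/2, otherwise W ~ U[0,1] (independent choice).
w = rng.uniform(0.0, 1.0, n)
coin = rng.uniform(0.0, 1.0, n) < 0.5
x = np.where(coin, 0.5, w)

def F(x):
    # CDF of the mixed distribution: x/2 below the atom at 1/2,
    # and x/2 + 1/2 at or above it.
    x = np.asarray(x, dtype=float)
    return np.where(x < 0.5, x / 2, x / 2 + 1 / 2)

y = F(x)
# F jumps from 1/4 to 3/4 at x = 1/2, so F(X) never lands strictly
# inside (1/4, 3/4); it cannot be U[0,1].
print(((y > 0.26) & (y < 0.74)).mean())  # 0.0
```

The empty gap in the histogram of $F(X)$ is exactly the jump of the cdf at the atom.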

7 Answers

61

Let $F_Y(y)$ be the CDF of $Y = F(X)$. Then, for any $y \in (0,1)$ we have:

$F_Y(y) = \Pr[Y \le y] = \Pr[F(X) \le y] = \Pr[X \le F^{-1}(y)] = F(F^{-1}(y)) = y,$

where the third equality holds because $F$ is strictly increasing, so $F(X) \le y \iff X \le F^{-1}(y)$.

What distribution has this CDF?
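The identity $F_Y(y) = y$ is easy to check by simulation. A minimal sketch (assuming NumPy), using the logistic distribution since its CDF has a simple closed form:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

# X logistic: F(x) = 1 / (1 + exp(-x)) is continuous and strictly increasing.
x = rng.logistic(size=n)
u = 1.0 / (1.0 + np.exp(-x))  # Y = F(X)

# The empirical F_Y(y) = Pr[Y <= y] should come out close to y itself.
ys = np.linspace(0.1, 0.9, 9)
fy = np.array([(u <= yi).mean() for yi in ys])
print(np.abs(fy - ys).max())  # shrinks like 1/sqrt(n)
```

Any other continuous, strictly increasing CDF (exponential, normal, ...) gives the same flat result.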

JimmyK4542
  • 55,969
11

$$ \Pr(Y\leq x)=\Pr(F(X)\leq x)=\Pr(X\leq F^{-1}(x))=F(F^{-1}(x))=x $$ The second equality follows from the definition of the quantile function; the last holds because $F$ is continuous and strictly increasing.

Juanito
  • 2,482
  • 12
  • 25
4

Let $y\in(0,1)$. Since $F$ is continuous, there exists $x\in\mathbb{R}$ s.t. $F(x)=y$. Thus, $$ \mathsf{P}(Y\le y)=\mathsf{P}(F(X)\le F(x))=F(x)=y, $$ i.e., $Y\sim\text{U}[0,1]$. In order to see the first equality we don't need continuity. Specifically, since any cdf is right-continuous, \begin{align} \{F(X)\le F(x)\}&=\{\{F(X)\le F(x)\}\cap\{X\le x\}\}\cup \{\{F(X)\le F(x)\}\cap\{X>x\}\} \\ &=\{X\le x\}\cup \{\{F(X)=F(x)\}\cap\{X>x\}\}, \end{align} and $\mathsf{P}(\{F(X)=F(x)\}\cap\{X>x\})=0$.

3

Let $y=g(x)$ be a mapping of the random variable $x$, distributed according to the density $f(x)$. The mapping $y=g(x)$ preserves probability (you count the same number of events in the corresponding bins):

$$ h(y)dy=f(x)dx $$

where $h(y)$ is the probability density of $y$.

If $h(y)=1$ (the uniform density on $[0,1]$), we have

$$ dy=g'(x)dx=f(x)dx $$

This means that $$ g(x)=\int_{-\infty}^{x} f(t)\,dt, $$

namely, the function $g(x)$ that maps a random variable $x$ distributed according to $f(x)$ into a uniformly distributed random variable $y$ is precisely the cumulative distribution function $\int_{-\infty}^{x} f(t)\,dt$.
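This change-of-variables argument can be sanity-checked numerically: build $g$ as a numerical integral of a density $f$ (standard normal here, chosen just for illustration), push samples of $x$ through it, and the histogram of $y$ comes out flat. A sketch assuming NumPy:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000
x = rng.normal(size=n)

# Build g(x) = integral of f from -inf to x, numerically, for standard normal f.
grid = np.linspace(-6.0, 6.0, 4001)
f = np.exp(-grid**2 / 2) / np.sqrt(2 * np.pi)
g = np.cumsum(f) * (grid[1] - grid[0])  # crude Riemann-sum CDF
y = np.interp(x, grid, g)               # y = g(x) for each sample

# The histogram of y should be approximately flat with density close to 1.
hist, _ = np.histogram(y, bins=20, range=(0.0, 1.0), density=True)
print(np.abs(hist - 1.0).max())
```

The grid width and sample size here are arbitrary choices; the deviation from a flat density shrinks as both grow.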

emanuele
  • 369
2

Here is an approach that does not use the quantile function whatsoever - the only property used is that independent copies of $X$ have zero probability of being equal. (The main ingredient in my argument is conditional expectation.)

Consider the cumulative distribution function of $X$, namely $$ F(t)=\mathbb P(X\leq t). $$ Your random variable - which I will suggestively call $U$ instead of $Y$ - can be described by starting with two independent and identically distributed random variables $X,Z$ and considering the conditional probability $$ U=\mathbb P(X\leq Z\mid Z). $$ Then, for all integers $n\geq 1$, we can represent $U^n$ as follows. Let $X_1,X_2,\ldots,X_n,Z$ be independent and identically distributed. By independence, $$ \mathbb P\bigl(X_1\leq Z,X_2\leq Z,\ldots, X_n\leq Z\bigm\vert Z\bigr)=U^n, $$ and thus by the tower property $$ \mathbb EU^n=\mathbb P(X_1\leq Z,X_2\leq Z,\ldots, X_n\leq Z)=\mathbb P\bigl(Z=\max(X_1,X_2,\ldots,X_n,Z)\bigr). $$ Since $X_1,\ldots,X_n,Z$ are iid, each of them is equally likely to be the maximum and therefore $$ \mathbb EU^n=\frac{1}{n+1}. $$ Thus $U$ has the same moments as a uniformly distributed random variable on $[0,1]$. Since $U$ is supported in $[0,1]$ as well, it follows (by the uniqueness of the Hausdorff moment problem) that $U$ is uniformly distributed, as desired.
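The moment identity $\mathbb{E}U^n = 1/(n+1)$ above is also easy to confirm by simulation. A sketch (assuming NumPy), taking $X, Z$ iid $\mathrm{Exp}(1)$ so that $U = \mathbb{P}(X\leq Z\mid Z) = F(Z) = 1-e^{-Z}$:

```python
import numpy as np

rng = np.random.default_rng(3)
m = 500_000

# With X, Z iid Exp(1): U = P(X <= Z | Z) = F(Z) = 1 - exp(-Z).
z = rng.exponential(1.0, m)
u = 1.0 - np.exp(-z)

for n in range(1, 6):
    print(n, (u**n).mean(), 1 / (n + 1))  # E[U^n] matches 1/(n+1)
```

These are exactly the moments of $\mathrm{U}[0,1]$, in line with the Hausdorff moment argument.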

pre-kidney
  • 30,884
1

For a proof of this problem when $F_X(x)$ is strictly increasing, refer to JimmyK4542's answer. Let's assume $F_X(x)$ is just non-decreasing (there are intervals such as $[a,b]$ where $F_X(x') = c$ for $x'\in[a,b]$). We define $G(y)$ similar to what Henry's comment suggests: $$ G(y)=\inf\{x:F_X(x)\gt y\}$$ Now substituting this expression in what Jimmy has written will give us: $$ F_Y(y) = \Pr[Y \le y] = \Pr[F_X(X) \le y] = \Pr[X \le G(y)] = F_X(G(y))= y \label{eq:I}\tag{I}$$

We need to show that:

  1. $F_X(x)\le y \rightarrow x \le G(y)$
  2. $F_X(G(y))=y$

The second argument is easier to prove. We have the following expression almost directly from the definitions (using the right-continuity of $F_X$): $$ F_X(G(y))= F_X(\inf\{t:F_X(t)\gt y\})= y$$ Now for the first argument, we can still use what $G(y)=\inf\{\cdots\}$ implies; if $F_X(x)\le y$, then $x\le \inf\{t:F_X(t)\gt y\}$; hence $x\le G(y)$.

With the two arguments proved and a substitution in \ref{eq:I}, we have proved the main argument.
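As a numerical illustration that strict monotonicity is not needed (only continuity), here is a sketch (assuming NumPy) with $X$ uniform on $[0,1]\cup[2,3]$, whose CDF is flat on $[1,2]$, yet $F(X)$ still comes out uniform:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 100_000

# X uniform on [0,1] ∪ [2,3]: the CDF equals 1/2 on all of [1,2],
# so F is continuous but not strictly increasing.
half = rng.uniform(0.0, 1.0, n)
shift = rng.integers(0, 2, n) * 2  # 0 or 2 with equal probability
x = half + shift

def F(x):
    # CDF: x/2 on [0,1], constant 1/2 on [1,2], 1/2 + (x-2)/2 on [2,3].
    x = np.asarray(x, dtype=float)
    return np.clip(x, 0, 1) / 2 + np.clip(x - 2, 0, 1) / 2

y = F(x)
t = np.linspace(0.05, 0.95, 19)
ecdf = np.array([(y <= ti).mean() for ti in t])
print(np.abs(ecdf - t).max())  # small: F(X) is still U[0,1]
```

The flat stretch of $F$ corresponds to a gap in the support of $X$, which $X$ never visits, so it causes no atom in $F(X)$.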

Mahyar
  • 65
  • 1
    I personally believe this problem is a severe case for abuse of notation, and a bad professor's problem. – Mahyar Feb 19 '22 at 01:46
  • The first condition you claimed that needs to be shown is insufficient. You need to show implication in both directions - which is also true. @mahyar – G.Bar Aug 31 '24 at 08:40
1

Fix $y\in(0,1)$ and let $a:= \max\{b \mid F(b) = y\}$.

$F$ is continuous, so the pre-image of $y$ is non-empty and closed; it is also bounded (since $F(x)\to 0$ as $x\to-\infty$ and $F(x)\to 1$ as $x\to\infty$), so its $\max$ exists. Also, $F(a) = y$.

Note:

  • $F$ is non-decreasing, so $X\le a\Rightarrow F(X)\le F(a)$.

  • But since $a$ is the $\max$ of the pre-image of $F(a)$, we have $F(X)\le F(a)\Rightarrow X\le a$.

  • Thus, $X\le a\Leftrightarrow F(X)\le F(a)$.

So: $\mathbb P(Y\le y)=\mathbb P(F(X)\le y)=\mathbb P(F(X)\le F(a)) = \mathbb P(X\le a)=F(a)=y$

*That is for $y\in(0,1)$. The cases $y=0$ and $y=1$ are easy to investigate separately. (They are a separate problem, since the pre-image of $0$ or $1$ can be empty, e.g. for the standard normal c.d.f. $F$.)