
I saw in some notes the following "criterion" for independence of two random variables.

Let $X$ and $Y$ be real-valued random variables defined on the same space. $X$ and $Y$ are statistically independent if and only if for any two functions $g$ and $h$ the following holds true $$ \mathbb{E}\left\{ h(X)g(Y)\right\} = \mathbb{E}\left\{ h(X)\right\}\mathbb{E}\left\{ g(Y)\right\}. $$

(1) Regarding the "any two functions" requirement, what does it mean?

(2) This criterion is helpful for showing that two random variables are dependent. I'm curious whether there is an example of using it to show that two random variables are independent?

user91011

2 Answers

  1. "Any two functions" means Borel measurable functions (we need to integrate random variables) for which the integrals make sense (for example bounded functions). It's enough to do the test among $g$ and $h$ continuous bounded functions. Indeed, we can approximate pointwise the characteristic function of a closed set by a sequence of continuous bounded functions, hence if $F_1$ and $F_2$ are closed, we have $\mu\{(X,Y)\in F_1\times F_2\}=\mu\{X\in F_1\}\mu\{Y\in F_2\}$. Then we can extend this identity to $B_1$ and $B_2$ arbitrary Borel subsets.

  2. Actually, it seems that we use the direction "$X$ independent of $Y$ implies $E[g(X)h(Y)]=E[g(X)]E[h(Y)]$ for $g,h$ measurable (bounded) functions" more often than the converse. Indeed, for the latter it suffices to check the equality for $g(x)=e^{isx}$ and $h(y)=e^{ity}$ with $s,t\in\mathbb R$ fixed but arbitrary (see the example below).
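
As one standard illustration of the second point: suppose $(X,Y)$ is jointly Gaussian with zero means and $\operatorname{Cov}(X,Y)=0$. Then the joint characteristic function factors,
$$E\left[e^{isX+itY}\right]=\exp\!\Big(-\tfrac12\big(s^2\sigma_X^2+2st\operatorname{Cov}(X,Y)+t^2\sigma_Y^2\big)\Big)=\exp\!\big(-\tfrac12 s^2\sigma_X^2\big)\,\exp\!\big(-\tfrac12 t^2\sigma_Y^2\big)=E\left[e^{isX}\right]E\left[e^{itY}\right]$$
for all $s,t\in\mathbb R$, which is exactly the test above, so $X$ and $Y$ are independent. (Joint Gaussianity matters here: uncorrelated Gaussian marginals without joint Gaussianity need not be independent.)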

Davide Giraudo

Here's a short (non-rigorous) way to understand this. Two random variables $X$ and $Y$ are defined to be independent if their joint cumulative distribution function (CDF) factors into the product of their marginal CDFs for all $x,y$:

$$F_{XY}(x,y)=F_X(x)F_Y(y)\tag{1}$$ where the standard definitions are $$ \begin{align}\tag{2} &F_{XY}(x,y)=P(-\infty<X\le x,\,-\infty<Y\le y)=\int^x_{-\infty}\int^y_{-\infty}p_{XY}(u,v)\,dv\,du \\ &F_{X}(x)=P(-\infty<X\le x,\,-\infty <Y<\infty)=\int^x_{-\infty}p_{X}(u)\,du\\ &p_X(x)=\int^{\infty}_{-\infty}p_{XY}(x,y)\,dy \end{align} $$ and similarly for $F_Y(y)$ and $p_Y(y)$. $F_X,F_Y$ are the marginal CDFs while $p_X,p_Y$ are the marginal probability density functions (PDFs).

Furthermore, this carries over to the densities (when they exist), i.e. $X$ and $Y$ are independent iff, for all $x,y$,

$$p_{X,Y}(x,y)=p_X(x)p_Y(y)\tag{3}$$
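
For a quick illustration with the density that reappears below: if $(X,Y)$ is uniform on the unit square $[0,1]^2$, then
$$p_{XY}(x,y)=\mathbf 1_{[0,1]}(x)\,\mathbf 1_{[0,1]}(y)=p_X(x)\,p_Y(y),$$
so $(3)$ holds and $X$ and $Y$ are independent.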


Claim. $X$ and $Y$ are independent iff, for all $h(X)$ and $g(Y)$ with $k(X,Y)=h(X)g(Y)$, $$\mathbb{E}[k]=\mathbb{E}[h]\,\mathbb{E}[g].\tag{4}$$

Ans. When $X$ and $Y$ are independent, this follows directly from $(3)$. For the converse, $$ \begin{align}\tag{5} 0&=\mathbb{E}[k(X,Y)]-\mathbb{E}[h(X)]\,\mathbb{E}[g(Y)]\\ &=\int\!\!\int k(x,y)\, p_{XY}(x,y)\,dx\,dy-\int h(x)\,p_X(x)\,dx \int g(y)\,p_Y(y)\,dy\\ &=\int\!\!\int k(x,y)\,\big(p_{XY}(x,y)-p_X(x)\,p_Y(y)\big)\,dx\, dy \end{align} $$

where the limits have been suppressed for clarity and the last line uses $k(x,y)=h(x)g(y)$ to combine the two single integrals into a double integral. For the last equation to hold for an arbitrary kernel $k$, the term in brackets must vanish for all $x,y$, which implies $(3)$.

(1) Regarding the "any two functions" requirement, what does it mean?

Note that the above follows only if $k$ is allowed to be arbitrary.

(2) This criterion is helpful for showing that two random variables are dependent. I'm curious whether there is an example of using it to show that two random variables are independent?

For that, a representation of all possible functions $k(x,y)$ is needed. One (partial) way to do this is a Taylor expansion (similar to using the exponentials in Davide's answer), which reduces the check to (limits suppressed)

$$\int\int p_{XY}(x,y) x^i y^j dx dy \overset{?}{=} \int p_X(x) x^i dx\int p_Y(y) y^j dy\tag{6}$$

for all positive integers $i,j$. It is indeed satisfied for, say, the uniform density $p_{X,Y}(x,y)=1$ on the unit square $[0,1]^2$.
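
Carrying out the check in that case, both sides of $(6)$ agree:
$$\int_0^1\!\!\int_0^1 x^i y^j \,dx\,dy=\frac{1}{i+1}\cdot\frac{1}{j+1}=\int_0^1 x^i\,dx\int_0^1 y^j\,dy,$$
so the moments factor for every $i,j$, consistent with the independence of two uniform variables on $[0,1]$.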

lineage