I'm reading Larry Wasserman's "All of Statistics", and I've come across a definition I can't "unpack".
Specifically, the text defines $f(x, y)$ to be a PDF for the random variables $(X, Y)$ if:
- $f(x, y) \geq 0$ for all $(x, y)$,
- $\int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty} f(x, y)\,dx\,dy = 1$, and
- for any set $A \subset \mathbb{R}\times\mathbb{R}$, $\mathbb{P}((X, Y) \in A) = \iint_{A} f(x, y)\,dx\,dy$.
Intuitively it all makes sense, but what exactly is $\mathbb{P}((X, Y) \in A)$?
Earlier in the text, $\mathbb{P}(X=x)$ is defined to be $\mathbb{P}(X^{-1}(x))$, and this makes sense, since the pre-image $X^{-1}(x)$ is a subset of the sample space, which is where $\mathbb{P}$ is defined.
This definition can be trivially extended to relational operators, e.g. $<$, $>$, $\leq$, and even to set membership, $\mathbb{P}(X \in A)$, as long as $A$ is a subset of $\mathbb{R}$.
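To be explicit about the one-variable definition I'm working from, here it is written out. The symbol $\Omega$ for the sample space and the view of $X$ as a map $X : \Omega \to \mathbb{R}$ are my own notation for this sketch and may not match the book's wording exactly:

$$\mathbb{P}(X \in A) = \mathbb{P}\big(X^{-1}(A)\big) = \mathbb{P}\big(\{\omega \in \Omega : X(\omega) \in A\}\big), \qquad A \subset \mathbb{R}.$$

The case $\mathbb{P}(X = x)$ is then just $A = \{x\}$, and $\mathbb{P}(X \leq x)$ is $A = (-\infty, x]$.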
I'm struggling to see how exactly this definition can be extended to $\mathbb{P}((X, Y) \in A)$. The text is not helpful in that regard.