Suppose I have $n$ i.i.d. Gaussian random variables $X_1,\ldots,X_n$ and I compute the empirical average $Y=\frac1n \sum_{i=1}^n X_i$. Given an $n$-dimensional vector $\mathbf{x}=(x_1,\ldots,x_n)$, is it correct to say that the conditional density is $P_{Y|X^n=\mathbf{x}}(y)= \mathbb{1}_{\left\{y=\frac1n\sum_{i=1}^n x_i\right\}}$, or would such an object be ill-defined?
1 Answer
Let $\mathscr{G}\subseteq \mathscr{F}$ be a sub-$\sigma$-algebra and let $X_1,...,X_n$ be $\mathscr{G}$-measurable random variables. Then $E[g(X_1,...,X_n)|\mathscr{G}]=g(X_1,...,X_n)$ for every Borel measurable $g:\mathbb{R}^n\to \mathbb{R}$ such that $E[|g(X_1,...,X_n)|]<\infty$. This is because $g(X_1,...,X_n)$ is a $\mathscr{G}$-measurable, integrable random variable, so its conditional expectation exists and it can be taken out of the conditional expectation.

Now let $g(x_1,...,x_n)=\mathbf{1}_{B}(n^{-1}\sum_{k\leq n}x_k)$ for $B \in \mathscr{B}(\mathbb{R})$. Since $g$ is bounded, the integrability condition holds, and we have $$P\bigg(\frac{1}{n}\sum_{k\leq n}X_k\in B\bigg|\mathscr{G}\bigg)=E\bigg[\mathbf{1}_{B}\bigg(\frac{1}{n}\sum_{k\leq n}X_k\bigg)\bigg|\mathscr{G}\bigg]=\mathbf{1}_{B}\bigg(\frac{1}{n}\sum_{k\leq n}X_k\bigg)$$

Now take $\mathscr{G}=\sigma(X_1,...,X_n)$. For $x=(x_1,...,x_n) \in \mathbb{R}^n$, (a version of) the conditional probability given $(X_1,...,X_n)=x$ is $$P\bigg(\frac{1}{n}\sum_{k\leq n}X_k\in B\bigg|(X_1,...,X_n)=x\bigg)=\mathbf{1}_{B}\bigg(\frac{1}{n}\sum_{k\leq n}x_k\bigg)=:\nu(B)$$

This is a probability measure on $(\mathbb{R},\mathscr{B}(\mathbb{R}))$: $$\nu(B)=\begin{cases}1&\frac{1}{n}\sum_{k\leq n}x_k\in B\\ 0&\textrm{otherwise} \end{cases}$$

It is called the Dirac measure at $\frac{1}{n}\sum_{k\leq n}x_k$. It does not have a density with respect to Lebesgue measure: it assigns mass $1$ to the singleton $\{\frac{1}{n}\sum_{k\leq n}x_k\}$, which has Lebesgue measure zero, so the integral of any density over that set would be $0$. The conditional distribution is therefore well-defined as a measure, but the indicator in the question is not a density of it.
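As a small numerical illustration of $\nu$, here is a sketch in Python that evaluates $\nu$ on half-open intervals $(a,b]$ for a fixed vector $x$. The function names `sample_mean` and `nu` and the example vector are assumptions made for this illustration, not part of the argument above; exact rational arithmetic is used so that the endpoint comparisons are not affected by rounding.

```python
from fractions import Fraction


def sample_mean(x):
    """Exact mean of a nonempty sequence of integers or rationals."""
    return sum(Fraction(v) for v in x) / len(x)


def nu(x, a, b):
    """Dirac measure at mean(x), evaluated on the interval (a, b]."""
    m = sample_mean(x)
    return 1 if a < m <= b else 0


if __name__ == "__main__":
    x = [1, 2, 6]  # hypothetical observed vector, mean 3
    print(nu(x, 2, 3))  # interval (2, 3] contains the mean
    print(nu(x, 3, 4))  # interval (3, 4] does not contain the mean

    # Shrinking intervals around the mean: the length goes to 0,
    # but the mass assigned by nu stays equal to 1.
    for k in range(1, 5):
        eps = Fraction(1, 10**k)
        print(float(2 * eps), nu(x, 3 - eps, 3 + eps))
```

The last loop is the point: intervals of arbitrarily small length around the mean still carry all the mass, which is the behaviour a Lebesgue density cannot reproduce.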
Thanks a lot for taking the time to write it down formally, it is really helpful! – user1868607 Oct 28 '22 at 10:58
@user1868607 you're welcome! – Snoop Oct 28 '22 at 11:05