
This question follows my previous one here, which is about the optimal classifier $g^*$ in the case where $X$ follows a normal distribution.


Let $X, Y$ be random variables such that

  • $X$ follows a normal distribution.

  • $Y$ takes values in $\{-1,1\}$.

In measure-theoretic probability theory, $\mathbb P [ Y = 1 | X = x] := \mathbb E[\mathbf{1}_{\{Y=1\}} | \mathbf{1}_{\{X=x\}}]$ and $\mathbb P [ X = x] := \mathbb E[\mathbf{1}_{\{X=x\}}]$. Here $\mathbf{1}_{\{Y=1\}}$ and $\mathbf{1}_{\{X=x\}}$ are both integrable random variables. Then $\mathbb P [ Y = 1 | X = x]$ is well-defined even if $\mathbb P [ X = x] = 0$.


I would like to ask: is it possible that $\mathbb P [ Y = 1 \mid X = x] > 0$ while $\mathbb P [ X = x] = 0$?

Thank you so much for your clarification!

Akira
  • Isn't $\mathbb{P}[X=x]=0$ for every $x$ if $X$ is normally distributed? – uniquesolution Sep 10 '20 at 21:30
  • @uniquesolution I think $\mathbb{P}[X=x]=0$. That's why I chose an $X$ that follows a normal distribution. – Akira Sep 10 '20 at 21:33
  • If $X$ and $Y$ are independent, then $P(Y=1|X=x)=P(Y=1)$. So if $Y$ is supported on $\{-1,1\}$ then $P(Y=1|X=x)>0$. –  Sep 10 '20 at 21:38
  • Your definition of $P(Y = 1 | X = x)$ is incorrect. Conditioning on the sigma algebra generated by the measure zero event $\{X = x\}$ is the same as taking expectation. Your best guess for $Y$ given that a measure zero event happened is just the expectation. – Elle Najt Sep 10 '20 at 22:29
  • @LorenzoNajt If $\mathbb P [ Y = 1 | X = x] := \mathbb E[\mathbf{1}_{\{Y=1\}} \mid \mathbf{1}_{\{X=x\}}]$ is not correct, please elaborate on the correct definition of $\mathbb P [ Y = 1 | X = x]$. – Akira Sep 11 '20 at 05:26
  • @LAD I did that pretty extensively in your other question... if there's a specific question about that I'm happy to try to answer. Do you know how to condition on a sigma algebra? – Elle Najt Sep 11 '20 at 05:35
  • @LorenzoNajt I'm sorry, but I seem to be overloaded :( – Akira Sep 11 '20 at 05:36
  • @LAD No problem. Maybe this will help: https://en.wikipedia.org/wiki/Regular_conditional_probability – Elle Najt Sep 11 '20 at 05:38
  • Thank you so much for your dedicated help @LorenzoNajt. – Akira Sep 11 '20 at 05:38
  • @LAD No problem. That Wikipedia page seems like it might be more confusing than helpful. I think it would be easiest to follow bullet 1 in my answer on the other page, and come back to this question when you've learned how to condition on a sigma algebra. – Elle Najt Sep 11 '20 at 06:38
  • The correct interpretation of "$\mathsf{P}(Y=1\mid X=x)$" is a (Borel) function $f(x)$ s.t. $$ f(X(\omega))=\mathsf{P}(Y=1\mid X)(\omega). $$ –  Sep 11 '20 at 09:48
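
To make the interpretation in the last comment concrete, here is a small worked instance (my own illustration, not from the thread). Suppose $Y$ is determined by $X$ via $Y = 1$ if $X > 0$ and $Y = -1$ otherwise. Then $f(x) = \mathbf{1}_{(0,\infty)}(x)$ is a version of $\mathbb P[Y = 1 \mid X = x]$: indeed $f(X) = \mathbf{1}_{\{X > 0\}} = \mathbf{1}_{\{Y = 1\}}$ is $\sigma(X)$-measurable and trivially satisfies the defining property of conditional expectation. So $\mathbb P[Y = 1 \mid X = x] = 1 > 0$ for every $x > 0$, even though $\mathbb P[X = x] = 0$.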

2 Answers


Here's an example to help you understand what's going on.

Suppose that $10$ people enter an elevator with a capacity of $2000$ lbs. Let $X$ denote the combined weight of all $10$ people in the elevator. (I chose this example because weight is a random variable that is classically modeled as normally distributed.) Now define an indicator random variable $Y$ such that $Y=1 \iff$ the capacity is exceeded and $Y=0$ otherwise. Then $$P(Y=1|X=2200)=1.$$
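
One way to see this numerically is to condition on a narrow window around $x$ rather than on the null event $\{X = x\}$ itself. Below is a minimal Monte Carlo sketch of the elevator example; the per-person weight distribution $N(200, 40^2)$ is a made-up assumption, chosen only so that totals near $2200$ lbs are sampled often enough:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical model: each of the 10 riders weighs N(200, 40^2) lbs,
# so the combined weight X is approximately N(2000, 126.5^2).
n_sims, n_people, capacity = 1_000_000, 10, 2000.0
weights = rng.normal(200.0, 40.0, size=(n_sims, n_people))
X = weights.sum(axis=1)
Y = (X > capacity).astype(int)  # Y = 1 iff the capacity is exceeded

# P(X = 2200) = 0, so estimate P(Y = 1 | X ~ 2200) on a narrow window.
window = np.abs(X - 2200.0) < 10.0
print(Y[window].mean())  # prints 1.0: every total near 2200 exceeds 2000
```

The window average is exactly $1$ here because $Y$ is a deterministic function of $X$: every combined weight near $2200$ lbs exceeds the $2000$ lbs capacity.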

  • This result is impossible and very counter-intuitive in classical probability in which we define $P(A \mid B)=\frac{P(A \cap B)}{P(B)}$. – Akira Sep 10 '20 at 21:49
  • @LAD, I think the more fundamental relationship is $P(A \cap B) = P(A|B) \times P(B)$, i.e., when you divide by $P(B)$ you've already made the assumption that $P(B) \neq 0$. However, $P(A \cap B) = P(A|B) \times P(B)$ would still hold true in Matthew Holder's case, wouldn't it? – Kartik Sep 10 '20 at 21:57
  • That's the formula you use if you assume $P(B)\neq 0$. That formula simply doesn't work if $P(B)=0$.

    If you have a discrete random variable $Y$ and a continuous random variable $X$ and you seek to compute $P(Y=1|X=x)$, you need some information on the joint density of $(X,Y)$, which is usually denoted by $f_{XY}$. Then $$P(Y=1|X=x)=f_{Y|X=x}(1|x)=\frac{f_{XY}(x,1)}{\sum_{y}f_{XY}(x,y)}$$ (a numerical sketch of this formula appears just after these comments).

    –  Sep 10 '20 at 22:00
  • It's kind of like this. We all know and love the fact that $\int x^ndx=\frac{x^{n+1}}{n+1}+C$. But this formula doesn't work if $n=-1$ so you need to look elsewhere to get $\int x^{-1}dx$. This doesn't mean $\int x^{-1}dx$ can't be computed; it simply means you need to use some other tools in your toolbox. –  Sep 10 '20 at 22:03
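
As a hedged numerical sketch of the $f_{XY}$ formula in the comment above: the mixture model $P(Y=1) = 0.3$, $X \mid Y = 1 \sim N(1,1)$, $X \mid Y = 0 \sim N(0,1)$ is my own made-up assumption, chosen only to make the computation concrete.

```python
from scipy.stats import norm

# Made-up mixture model: P(Y=1) = 0.3, X|Y=1 ~ N(1,1), X|Y=0 ~ N(0,1).
p = 0.3

def p_y1_given_x(x):
    """f_XY(x, 1) / sum_y f_XY(x, y): the conditional probability as a
    function of x, well-defined even though P(X = x) = 0 for every x."""
    joint1 = p * norm.pdf(x, loc=1.0)        # f_XY(x, 1)
    joint0 = (1 - p) * norm.pdf(x, loc=0.0)  # f_XY(x, 0)
    return joint1 / (joint1 + joint0)

print(p_y1_given_x(0.5))  # strictly positive, yet P(X = 0.5) = 0
```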

Yes. Let $X,Y$ be independent. Using the fact that conditioning on a sigma-field independent of the random variable leaves its expectation unchanged, applied here to the sigma-field generated by $X$, we get: $$E[\mathbf{1}_{\{Y=1\}}\mid\sigma(X)] = E[\mathbf{1}_{\{Y=1\}}] = P(Y=1)$$
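
A quick simulation sketch of this (the value $P(Y = 1) = 0.7$ and the window width are my own illustrative choices): when $X$ and $Y$ are independent, a window estimate of $P(Y = 1 \mid X \approx x)$ does not depend on $x$ and matches $P(Y = 1)$, even though $P(X = x) = 0$ at every point.

```python
import numpy as np

rng = np.random.default_rng(1)

# X ~ N(0,1) independent of Y, with P(Y = 1) = 0.7 (illustrative numbers).
n = 1_000_000
X = rng.normal(size=n)
Y = rng.choice([-1, 1], size=n, p=[0.3, 0.7])

# Estimate P(Y = 1 | X ~ x) by averaging on narrow windows around x.
for x in (-1.0, 0.0, 2.0):
    window = np.abs(X - x) < 0.05
    print(x, (Y[window] == 1).mean())  # each ~ 0.7 = P(Y = 1)
```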

Winger 14
  • I'm sorry for this duplicate comment. This result is impossible and very counter-intuitive in classical probability in which we define $P(A \mid B)=\frac{P(A \cap B)}{P(B)}$. – Akira Sep 10 '20 at 21:53
  • I am not sure I understand what is counter-intuitive... Your formula would yield $\frac{0}{0}$, which is undefined. Using sigma-fields helps us give meaning to that expression. In general, we need the joint density to compute $P(Y|X)$ anyway! – Winger 14 Sep 10 '20 at 21:59
  • And in this specific example, we get $P(A \cap B) = P(A) P(B)$ by independence and can then simplify the formula – Winger 14 Sep 10 '20 at 22:01