
When my teacher introduced the chain rule I found it quite easy, but when I try to prove something based on it I get confused about which forms of the rule are allowed. For example, I can't understand why I can write:

$$ p(x,y\mid z)=p(y\mid z)p(x\mid y,z) $$

I cannot see how one arrives at this equation from the general rule. Can you please help me think correctly about this rule?


I found this post useful for my question:

Is order of variables important in probability chain rule

Moj
  • 313
  • 2
    What is $p$? And what are $x,y$, and $z$? – martini Nov 04 '12 at 11:08
  • 1
    Hi. See http://math.stackexchange.com/questions/200967/i-am-confused-about-bayes-rule-in-mcmc/201088#201088 – Stéphane Laurent Nov 04 '12 at 11:09
  • @StéphaneLaurent Thanks for the reply. Reading your post, I have one question. There you defined the general rule for more than 2 RVs. When I follow your definition for the second case in the question, I come up with p(x|z,y)p(z|y), which is different from p(z|x,y)p(x|y). My problem in the first step is how these two are equivalent. – Moj Nov 04 '12 at 11:36
  • @Moj $p(x,z|y)=p(x|z,y)p(z|y)=p(z|x,y)p(x|y)$ These are the two possible decompositions of the conditional joint distribution $p(x,z|y)$ of $(x,z)$ given $y$. In the first decomposition you choose to condition $x$ given $z$ whereas in the second decomposition you condition $z$ given $x$. – Stéphane Laurent Nov 04 '12 at 11:49
  • Great! This was my problem! I wanted to make sure whether changing the decomposition is possible. Thanks – Moj Nov 04 '12 at 12:19

1 Answer


$$p(x,y|z) = \frac{p(x,y,z)}{p(z)} = \frac{p(x|y,z)p(y,z)}{p(z)} = p(x|y,z)p(y|z)$$

In the first step we use the definition of conditional probability. In the second step we apply the same definition to the numerator, factoring the joint probability $p(x,y,z)$ into a conditional $p(x|y,z)$ and a joint $p(y,z)$. Finally, dividing $p(y,z)$ by $p(z)$ is once again the definition of conditional probability: it gives $p(y|z)$, and we obtain the result.

Another way of looking at it is that you can just ignore any variable that sits on the right side of the conditioning bar in every term. Once $z$ is dropped, the expression is just the usual chain rule for two variables:

$$p(x,y) = p(x|y)p(y)$$

You simply condition all of these probabilities on $z$ and you get your original formula.
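If it helps to see this numerically, here is a minimal sketch that checks the identity on a made-up joint distribution. Everything in it is an assumption for illustration: the three variables are taken to be binary, the joint table is random, and the helper names `prob`, `cond`, and `max_chain_rule_gap` are hypothetical. It compares $p(x,y|z)$ with both decompositions, $p(x|y,z)p(y|z)$ and $p(y|x,z)p(x|z)$.

```python
import itertools
import random

random.seed(0)  # arbitrary seed, only so the run is reproducible

# Hypothetical joint distribution p(x, y, z) over three binary variables.
# The +0.1 keeps every entry strictly positive, so no conditional divides by zero.
outcomes = list(itertools.product([0, 1], repeat=3))
weights = [random.random() + 0.1 for _ in outcomes]
total = sum(weights)
joint = {o: w / total for o, w in zip(outcomes, weights)}


def prob(**fixed):
    """Marginal probability of the given assignments, e.g. prob(y=1, z=0)."""
    return sum(
        p
        for (x, y, z), p in joint.items()
        if all({"x": x, "y": y, "z": z}[name] == value
               for name, value in fixed.items())
    )


def cond(event, given):
    """Conditional probability p(event | given) from the definition."""
    return prob(**event, **given) / prob(**given)


def max_chain_rule_gap():
    """Largest deviation between p(x,y|z) and its two decompositions."""
    gap = 0.0
    for x, y, z in outcomes:
        lhs = cond({"x": x, "y": y}, {"z": z})
        # Condition x on y first: p(x|y,z) p(y|z)
        rhs1 = cond({"x": x}, {"y": y, "z": z}) * cond({"y": y}, {"z": z})
        # Condition y on x first: p(y|x,z) p(x|z)
        rhs2 = cond({"y": y}, {"x": x, "z": z}) * cond({"x": x}, {"z": z})
        gap = max(gap, abs(lhs - rhs1), abs(lhs - rhs2))
    return gap


print(max_chain_rule_gap())
```

The printed gap should be no larger than floating-point rounding error, for either order of decomposition. This is only a sanity check on one table, not a proof; the derivation above is what justifies the identity in general.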

Andrei
  • 39,869
  • Thanks, but I am still confused about how many different equivalent forms exist for this equation. For example:

    p(x|y,z)= p(z|x,y)

    Is it true?

    – Moj Nov 04 '12 at 11:26
  • Obviously not. p(x|y,z) = p(x,y,z)/p(y,z) = p(z|x,y)p(x,y)/p(y,z) – pedrosorio Nov 04 '12 at 12:01
  • Right, my problem was just the different ordering of these conditional dependencies! Thanks – Moj Nov 04 '12 at 12:21
  • Of course I accepted it! The second trick especially was really helpful! Thanks again – Moj Nov 04 '12 at 12:25
  • @Moj: Please see http://meta.math.stackexchange.com/questions/3286/how-do-i-accept-an-answer/3287#3287 and http://meta.math.stackexchange.com/questions/3399/why-should-we-accept-answers. – joriki Nov 04 '12 at 14:04
  • @pedrosorio Yes, I tried to do that but I still don't have enough reputation – Moj Nov 04 '12 at 14:48
  • @Moj I don't think you need any reputation to accept correct answers. – pedrosorio Nov 04 '12 at 15:01
  • I tried the upvote at the upper left of your answer, but I'm getting this message about reputation! – Moj Nov 04 '12 at 15:15
  • @Moj you need reputation to upvote, but you don't need it to accept the answer, just click the check sign under the down arrow. – pedrosorio Nov 04 '12 at 15:17
  • Sorry, I didn't pay attention to that :D – Moj Nov 04 '12 at 15:24