3

Consider a partially observable Markov decision process (POMDP), see here for a complete definition.

My question is in relation to the conditional observation probabilities (denoted by $O(o|s',a)$ in the above link). This represents the probability of seeing observation $o$ if the current state is $s'$ and the action was $a$. Why is the conditional probability defined in this way?

Is there any issue with defining the conditional observation probability in terms of the previous state and action, $O(o|s,a)$, or in terms of the current state, action, and previous state as $O(o|s',a,s)$?

jonem
  • 381
  • 1
  • 8

0 Answers0