While going through one of the answers, I came across the following notation:

$P( y_i =1 \mid \mathbf{x}_i ; \theta)$

I am not sure how it should be interpreted or how it should be read.

The exact usage can be found here:

Now logistic regression says that the probability that the class variable $y_i = 1$, for $i = 1, 2, \ldots, m$, can be modelled as follows:

$$ P( y_i =1 \mid \mathbf{x}_i ; \theta) = h_{\theta}(\mathbf{x}_i) = \dfrac{1}{1+e^{(- \theta^T \mathbf{x}_i)}} $$

so $y_i = 1$ with probability $h_{\theta}(\mathbf{x}_i)$ and $y_i=0$ with probability $1-h_{\theta}(\mathbf{x}_i)$.

1 Answer

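The vertical bar denotes conditional probability, whereas the semicolon sets off fixed parameters. The expression can be read as "the probability that $y_i = 1$ given $\mathbf{x}_i$, parameterized by $\theta$". In your case, the notation treats $\mathbf{x}_i$ as a random variable that is being conditioned on, whereas $\theta$ is a fixed (non-random) parameter of the model. If this were a Bayesian setting, $\theta$ and $\mathbf{x}_i$ would be separated by a comma instead of the semicolon, as in $P( y_i =1 \mid \mathbf{x}_i , \theta)$, because $\theta$ would itself be a random variable with a given prior distribution.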
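
To make the distinction concrete, here is a minimal Python sketch of the logistic model from the question. The function name `h` and the numeric values of `theta` and `x_i` are made up for illustration; they are not from the linked answer. The point is only that `theta` is held fixed as an ordinary argument, while `x_i` plays the role of the observed value being conditioned on.

```python
import math

def h(theta, x):
    """Return P(y = 1 | x; theta) = 1 / (1 + exp(-theta^T x))."""
    # Inner product theta^T x, written out for plain Python lists.
    z = sum(t * xj for t, xj in zip(theta, x))
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical values, chosen only for illustration.
theta = [0.5, -1.0]   # fixed parameter vector (not random)
x_i = [2.0, 1.0]      # observed features for observation i

p1 = h(theta, x_i)    # P(y_i = 1 | x_i; theta)
p0 = 1.0 - p1         # P(y_i = 0 | x_i; theta)
print(p1, p0)
```

Changing `theta` gives a different member of the same family of conditional distributions, which is what the semicolon signals. In a Bayesian treatment you would instead place a prior on `theta` and condition on it like any other random variable.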

gvegayon