Probability notation semicolon and vertical line

Question

While going through one of the answers, I came across the following notation:

$P( y_i =1 \mid \mathbf{x}_i ; \theta)$

I am not sure how to interpret it or how it should be read.

The exact usage can be found here

Now logistic regression says that the probability that the class variable $y_i = 1$, for $i = 1, 2, \dots, m$, can be modelled as follows:

$$ P( y_i =1 \mid \mathbf{x}_i ; \theta) = h_{\theta}(\mathbf{x}_i) = \dfrac{1}{1+e^{(- \theta^T \mathbf{x}_i)}} $$

so $y_i = 1$ with probability $h_{\theta}(\mathbf{x}_i)$ and $y_i=0$ with probability $1-h_{\theta}(\mathbf{x}_i)$.

Welcome to MSE. Please edit and use MathJax to properly format math expressions. — Lee David Chung Lin, Feb 18 '20 at 09:01
It is just saying this is the conditional probability that $y_i=1$ given the parameter $\mathbf \theta$ and the values of the independent variables in $\mathbf x_i$ — Henry, Feb 18 '20 at 09:11

score 0 · Answer 1 · answered Jul 26 '24 at 19:18

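The vertical bar denotes conditioning: the random variables to its right are the ones being conditioned on. The semicolon separates those from fixed parameters. In your case, the notation treats $\mathbf{x}_i$ as a random variable whose value is given, whereas $\theta$ is a fixed, non-random parameter that indexes the model. So the expression reads "the probability that $y_i = 1$ given $\mathbf{x}_i$, parameterized by $\theta$". In a Bayesian setting, $\theta$ would itself be a random variable with a prior distribution, so it would be separated from $\mathbf{x}_i$ by a comma instead of a semicolon: $P(y_i = 1 \mid \mathbf{x}_i, \theta)$.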
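If it helps to see the distinction in code, here is a minimal sketch of the logistic model from the question. The function name `h` mirrors $h_{\theta}$, but the numeric values of `theta` and `x` are made up purely for illustration. The point is only that `theta` is a fixed setting of the model, while `x` is the observed value being conditioned on.

```python
import math

def h(theta, x):
    """Logistic model: P(y = 1 | x; theta) = 1 / (1 + exp(-theta^T x))."""
    z = sum(t * xi for t, xi in zip(theta, x))  # inner product theta^T x
    return 1.0 / (1.0 + math.exp(-z))

# theta: fixed, non-random parameter vector (illustrative values, not from the question)
theta = [0.5, -1.0]
# x: one observed feature vector that we condition on (also illustrative)
x = [2.0, 1.0]

p1 = h(theta, x)   # P(y = 1 | x; theta)
p0 = 1.0 - p1      # P(y = 0 | x; theta)

print(p1, p0)
```

With these particular numbers $\theta^T \mathbf{x} = 0$, so both probabilities come out as 0.5. Changing `theta` gives a different member of the same family of conditional distributions; no probability is ever assigned to `theta` itself, which is what the semicolon signals.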
