37

The PDF describes the probability of a random variable taking on a given value:

$f(x)=P(X=x)$

My question is whether this value can become greater than $1$?

Quote from wikipedia:

"Unlike a probability, a probability density function can take on values greater than one; for example, the uniform distribution on the interval $[0, \frac12]$ has probability density $f(x) = 2$ for $0 \leq x \leq \frac12$ and $f(x) = 0$ elsewhere."

This wasn't clear to me, unfortunately. The question has been asked and answered here before, but the answers used the same example. Would anyone be able to explain it in a simple manner (using a real-life example, etc.)?

Original question:

"$X$ is a continuous random variable with probability density function $f$. Answer with either True or False.

  • $f(x)$ can never exceed $1$."

Thank you!

EDIT: Resolved.

Newman

6 Answers

61

Discrete and continuous random variables are not defined the same way. The human mind is used to discrete random variables (example: for a fair coin, with $-1$ if the coin shows tails and $+1$ if it shows heads, we have $f(-1)=f(1)=\frac12$ and $f(x)=0$ elsewhere). The probabilities of the outcomes of a discrete random variable are nonnegative and must sum to 1, so each of them is at most 1.

For a continuous random variable, the necessary condition is that $\int_{\mathbb{R}} f(x)\,dx=1$ (together with $f \geq 0$). Since an integral behaves differently from a sum, it's possible that $f(x)>1$ on a small interval (but the total length of the set where $f(x)>1$ must be less than 1, otherwise the integral would exceed 1).
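As a rough numerical check of this condition, here is a minimal sketch using the uniform distribution on $[0, \frac12]$ from the question. The helper names `uniform_pdf` and `midpoint_integral` are made up for illustration; the point is only that the density has height 2 while the total area stays at 1.

```python
def uniform_pdf(x, a=0.0, b=0.5):
    """Density of the uniform distribution on [a, b] (illustrative helper)."""
    return 1.0 / (b - a) if a <= x <= b else 0.0


def midpoint_integral(f, lo, hi, n=100_000):
    """Approximate the integral of f over [lo, hi] with the midpoint rule."""
    h = (hi - lo) / n
    return sum(f(lo + (i + 0.5) * h) for i in range(n)) * h


# Height of the density at a point inside [0, 1/2]: greater than 1.
height = uniform_pdf(0.25)

# Area under the density over a range that covers the whole support: about 1.
total_area = midpoint_integral(uniform_pdf, -1.0, 1.0)

print(height, total_area)
```

The height exceeds 1 only because the support has length $\frac12$; the area is what has to equal 1.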

The definition of $\mathbb{P}(X=x)$ is not $\mathbb{P}(X=x)=f(x)$ but rather $\mathbb{P}(X=x)=\mathbb{P}(X\leq x)-\mathbb{P}(X<x)=F(x)-F(x^-)$, where $F$ is the cumulative distribution function and $F(x^-)$ is its left limit at $x$. For a discrete random variable, $F(x^-)\neq F(x)$ at every value $x$ the variable can take, so $\mathbb{P}(X=x)>0$ there. However, for a continuous random variable, $F(x^-)=F(x)$ (because $F$ is continuous), so $\mathbb{P}(X=x)=0$. For example, the probability of picking exactly $\frac12$ when choosing a number uniformly at random between 0 and 1 is zero.

In summary, for continuous random variables $\mathbb{P}(X=x)\not= f(x)$.
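To make the "probability of exactly $\frac12$ is zero" statement concrete, here is a small sketch for a number chosen uniformly on $[0, 1]$. The function names are hypothetical; the code only evaluates $F(x+\varepsilon) - F(x-\varepsilon)$ for shrinking $\varepsilon$.

```python
def uniform_cdf(x):
    """CDF of the uniform distribution on [0, 1]."""
    return min(max(x, 0.0), 1.0)


def prob_near(x, eps):
    """P(x - eps < X <= x + eps), computed as a difference of CDF values."""
    return uniform_cdf(x + eps) - uniform_cdf(x - eps)


# Probability of landing within eps of 1/2, for eps = 0.1, 0.01, ..., 1e-6.
probs = [prob_near(0.5, 10.0 ** -k) for k in range(1, 7)]

for k, p in enumerate(probs, start=1):
    print(f"eps = 1e-{k}: P = {p:.2e}")
```

Each probability is about $2\varepsilon$, so it shrinks towards 0 as the window closes in on the single point $\frac12$, even though the density there is 1.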

H. Potter
14

Your conception of a probability density function is wrong.

You are mixing it up with probability mass function.

If $f$ is a PDF then $f(x)$ is not a probability and does not have the restriction that it cannot exceed $1$.

Arya McCarthy
drhab
4

Probability density functions are not probabilities, but if $f(x)$ is a probability density function, then $P=\int_{x_0}^{x_1} f(x)\,dx$ is a probability (namely $P(x_0 \leq X \leq x_1)$), and thus $\int_{x_0}^{x_1} f(x)\,dx \leq 1$ for all $x_0,x_1$ with $x_0\leq x_1$.
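A quick sketch of this with a concrete density: a normal distribution with mean 0 and standard deviation 0.1 (chosen here purely as an illustration, it is not part of the original answer). Its density peaks at roughly 4, yet every interval probability stays between 0 and 1.

```python
import math


def normal_pdf(x, mu=0.0, sigma=0.1):
    """Density of a normal distribution (mu and sigma are illustrative)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def normal_cdf(x, mu=0.0, sigma=0.1):
    """CDF of the same normal distribution, via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def interval_prob(x0, x1):
    """P(x0 <= X <= x1), i.e. the integral of the density from x0 to x1."""
    return normal_cdf(x1) - normal_cdf(x0)


peak = normal_pdf(0.0)                   # density value, larger than 1
p_one_sigma = interval_prob(-0.1, 0.1)   # a probability, between 0 and 1
p_everything = interval_prob(-10.0, 10.0)

print(peak, p_one_sigma, p_everything)
```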

Wouter
3

To add to the already good existing answers: this is easy to understand by way of the definitions.

For a discrete random variable, the probability mass function (pmf), denoted $p_{_X}(x)$, gives the probability that $X$ takes the discrete value $x$; this is always between 0 and 1.

For a continuous random variable, the probability density function (pdf), denoted $f_{_X}(x)$, is a nonnegative function describing how densely the probability of $X$ is concentrated around the value $x$; it is not a probability! But when we integrate it over the support of $X$, the result must be 1.

I guess the confusion usually arises because we assign a probability mass function to discrete random variables and a probability density function to their continuous counterparts, and then think that both are probabilities, when one is and the other is not. Another source of confusion is an abuse of notation that I have seen many times: $p_{_X}(x) = f_{_X}(x) = \mathbb{P} (X=x)$. You may think that the two equalities hold for both discrete and continuous random variables, which is not true. They hold only in the discrete case.

Therefore, avoid abusing the notation: strictly use $p_{_X}(x)=\mathbb{P} (X=x)\in[0,1]$ for discrete random variables and strictly use $f_{_X}(x)\ge 0$ for continuous random variables.

holala
2

Here's an intuition:

Probability Density exists in the continuous space. Probability Mass exists in the discrete space.

The PDF $f(x)$ is the derivative of the CDF $F(x)$: $$ f(x) = \frac{dF(x)}{dx} $$

Thus, for a given range $x \in (x_1, x_2]$, we can think of the density as the change in cumulative probability per unit length when moving from $x_1$ to $x_2$, i.e. $f(x)_{(x_1, x_2]} = \frac{F(x_2) - F(x_1)}{x_2 - x_1}$. This is the average density over the range; wherever $F$ is differentiable, it tends to $f(x_1)$ as $x_2 \to x_1$.

Or, "How much will my probability of $X \in (0, x_1]$ increase if I include $(x_1, x_2]$ in my range, normalised by the size of the range $|(x_1, x_2]| = x_2 - x_1$?"

You can now imagine that if there is a highly dense range $(0, \frac{1}{2}]$ containing all of the probability (i.e. $F(1/2) - F(0) = 1$), then its density would be 2.

$$ f(x)_{(0, \frac{1}{2}]} = \frac{F(1/2) - F(0)}{1/2 - 0} = 2$$
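The same ratio can be evaluated numerically. This is a minimal sketch assuming the CDF $F(x) = 2x$ on $[0, \frac12]$ (0 below, 1 above), which is the CDF of the uniform distribution on that interval; `average_density` is a hypothetical helper name.

```python
def cdf(x):
    """CDF of the uniform distribution on [0, 1/2]: 2x clipped to [0, 1]."""
    return min(max(2.0 * x, 0.0), 1.0)


def average_density(F, x1, x2):
    """Change in cumulative probability per unit length over (x1, x2]."""
    return (F(x2) - F(x1)) / (x2 - x1)


whole = average_density(cdf, 0.0, 0.5)    # the whole dense range
part = average_density(cdf, 0.2, 0.3)     # a sub-range inside it
outside = average_density(cdf, 0.5, 1.0)  # a range that carries no probability

print(whole, part, outside)
```

Inside $(0, \frac12]$ the ratio is 2 (up to floating-point rounding) no matter which sub-range is used, and outside it is 0.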

1

In continuous probability distributions, probabilities are associated with intervals rather than individual points. The probability density function (PDF) describes how probability is spread over those intervals: its value is a probability per unit length, not the probability of any single value. So while the height of the PDF at a particular point might be greater than one, this doesn't directly represent a probability.

The key concept is that probabilities are defined over intervals, and the area under the curve between two points represents the probability of the variable falling between them. The integral of the probability density function over its entire range must equal one, ensuring that the total probability is accounted for.

In simpler terms, the height of the curve at a given point doesn't provide a direct probability; instead, it's the collective area under the curve within an interval that corresponds to the probability of the variable falling within that interval.
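Since the question asks for a real-life example, here is a hedged sketch with a made-up scenario: a waiting time modelled as exponentially distributed with a mean of 12 minutes (the model and the numbers are assumptions for illustration only). Measured in hours, the density at 0 is 5; measured in minutes, it is about 0.083. The probability of waiting at most 6 minutes is the same either way.

```python
import math


def exp_pdf(t, rate):
    """Density of an exponential waiting time with the given rate."""
    return rate * math.exp(-rate * t) if t >= 0 else 0.0


def exp_cdf(t, rate):
    """P(T <= t) for an exponential waiting time with the given rate."""
    return 1.0 - math.exp(-rate * t) if t >= 0 else 0.0


# Hypothetical numbers: mean wait of 12 minutes = 0.2 hours.
rate_per_hour = 5.0
rate_per_minute = rate_per_hour / 60.0

density_hours = exp_pdf(0.0, rate_per_hour)      # per hour, exceeds 1
density_minutes = exp_pdf(0.0, rate_per_minute)  # per minute, below 1

p_hours = exp_cdf(0.1, rate_per_hour)      # wait at most 0.1 hours
p_minutes = exp_cdf(6.0, rate_per_minute)  # wait at most 6 minutes

print(density_hours, density_minutes, p_hours, p_minutes)
```

The height of the curve changes with the unit of measurement, so it cannot be a probability; the area over the interval does not change.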