I have a bit of trouble distinguishing the following concepts:

  • probability measure
  • probability function (with special cases probability mass function and probability density function)
  • probability distribution

Are some of these interchangeable? Which of these are defined with respect to probability spaces and which with respect to random variables?

Marc

2 Answers

The difference between the terms "probability measure" and "probability distribution" is more a difference in connotation than a difference between the things the terms refer to. It's about the way the terms are used.

A probability distribution or a probability measure is a function assigning probabilities to measurable subsets of some set.

When the term "probability distribution" is used, the set is often $\mathbb R$ or $\mathbb R^n$ or $\{0,1,2,3,\ldots\}$ or some other very familiar set, and the actual values of members of that set are of interest. For example, one may speak of the temperature on December 15th in Chicago over the aeons, or the income of a randomly chosen member of the population, or the particular partition of the set of animals captured and tagged, where two animals are in the same part in the partition if they are of the same species.

When the term "probability measure" is used, often nobody cares just what the set $\Omega$ is, to whose subsets probabilities are assigned, and nobody cares about the nature of the members or which member is randomly chosen on any particular occasion. But one may care about the values of some function $X$ whose domain is $\Omega$, and about the resulting probability distribution of $X$.

"Probability mass function", on the other hand, is precisely defined. A probability mass function $f$ assigns a probability to each subset containing just one point of some specified set $S$, and we always have $\sum_{s\in S} f(s)=1$. The resulting probability distribution on $S$ is a discrete distribution. Discrete distributions are precisely those that can be defined in this way by a probability mass function.
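
To make the step from a mass function to a distribution concrete, here is a minimal Python sketch. The fair die and the helper name `prob` are assumptions chosen purely for illustration; they are not part of the definitions above.

```python
from fractions import Fraction

# Hypothetical example: a fair six-sided die, so S = {1, ..., 6}.
pmf = {s: Fraction(1, 6) for s in range(1, 7)}

def prob(event, pmf):
    """Probability of a subset of S, obtained by summing the PMF over its points."""
    return sum((pmf[s] for s in event if s in pmf), Fraction(0))

# The masses sum to 1 over the whole set S.
total = sum(pmf.values())

# The PMF only gives probabilities of single points; the distribution
# extends this to every subset, e.g. the event "the roll is even".
p_even = prob({2, 4, 6}, pmf)
```

The dictionary plays the role of $f$, while `prob` plays the role of the discrete distribution it determines.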

"Probability density function" is also precisely defined. A probability density function $f$ on a set $S$ is a function that specifies the probabilities assigned to measurable subsets $A$ of $S$ as follows: $$ \Pr(A) = \int_A f\,d\mu $$ where $\mu$ is a "measure", a function assigning non-negative numbers to measurable subsets of $S$ in a way that is "additive" (i.e. $\mu\left(A_1\cup A_2\cup A_3\cup\cdots\right) = \mu(A_1)+\mu(A_2)+\mu(A_3)+\cdots$ if every two $A_i,A_j$ with $i\ne j$ are mutually exclusive). The measure $\mu$ need not be a probability measure; for example, one could have $\mu(S)=\infty\ne 1$. For example, the function $$ f(x) = \begin{cases} e^{-x} & \text{if }x>0, \\ 0 & \text{if }x\le 0, \end{cases} $$ is a probability density on $\mathbb R$, where the underlying measure is one for which the measure of every interval $(a,b)$ is its length $b-a$.
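
As a rough numerical illustration of $\Pr(A)=\int_A f\,d\mu$ for the exponential density above, with $\mu$ taken to be length on the real line, one might approximate the integral by a midpoint Riemann sum. The function names and the grid size `n` are assumptions of this sketch, and the sum is only an approximation of the integral.

```python
import math

def f(x):
    """Density from the example above: e^{-x} for x > 0, and 0 otherwise."""
    return math.exp(-x) if x > 0 else 0.0

def prob_interval(a, b, density, n=100_000):
    """Approximate Pr((a, b)), the integral of the density over (a, b),
    with a midpoint Riemann sum on n equal subintervals."""
    width = (b - a) / n
    return sum(density(a + (i + 0.5) * width) for i in range(n)) * width

p_0_1 = prob_interval(0.0, 1.0, f)      # exact value is 1 - e^{-1}
p_total = prob_interval(-5.0, 50.0, f)  # close to 1; the tail beyond 50 is negligible
```

Note that `f(x)` itself is not a probability (here it happens to stay below 1, but a density can exceed 1); only its integrals over sets are.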

AnthonyC
  • The measure is not determined by the pdf. The standard normal density $x\mapsto\dfrac1{\sqrt{2\pi}} e^{-x^2/2}$ is also a probability density with respect to the SAME measure. Every time you see an expression like $\displaystyle\int_a^b f(x)\,dx$, you're talking about integrating with respect to that measure. – Michael Hardy Dec 18 '14 at 22:05
  • One instance is when the measure of a set is simply the number of members of the set, and in that case a probability density is the same thing as a probability mass function. – Michael Hardy Dec 19 '14 at 01:56
  • Sorry to dig up such an old post... You've clearly distinguished between (PMF and PDF) and (P.Measure and P.Function). But what about the third dotpoint: Probability distribution? Are you implying that probability distribution and functions are the same thing? And so would I be correct in saying PMF/PDF's are special cases of probability measures/distributions/functions? – Mistakamikaze Sep 12 '21 at 04:23
  • @Mistakamikaze : A probability distribution is a function that assigns a probability to each measurable set: $A \mapsto \Pr(A).$ – Michael Hardy Sep 12 '21 at 17:42
  • Sorry I mixed up P.Function and P.Distribution in my comment you just replied to. So my question is... is a P.Function an interchangeable term with P.Distribution? And so am I correct in saying PMF/PDF's are special cases of probability measures/distributions/functions? – Mistakamikaze Sep 14 '21 at 05:37
  • @Mistakamikaze : A p.m.f. or a p.d.f. completely determines a probability distribution, but it is not the same thing. If $Z\sim\operatorname N(0,1),$ then $z\mapsto\dfrac 1 {\sqrt{2\pi\,\,}}\, e^{-z^2/2}$ is the probability density function with respect to Lebesgue measure, and $$\displaystyle A\mapsto \int\limits_A \dfrac 1 {\sqrt{2\pi\,\,}}\, e^{-z^2/2}\, dz\tag 1$$ is the probability distribution. But one may also say $$ \dfrac 1 {\sqrt{2\pi\,\,}}\, e^{-z^2/2}\, dz\tag 2$$ (with "$dz$") is the probability distribution, where $dz$ is Lebesgue measure. In effect $(1)$ is the same as $(2).$ – Michael Hardy Sep 16 '21 at 15:29
  • Just to reiterate: probability measures and probability distributions are the same thing? Thanks everyone! – stats_noob Feb 11 '22 at 07:04

This is my 2 cents, though I'm not an expert:

"Probability Measure" is used in a more precise, mathematically theoretical context. In 1933 Kolmogorov laid down some mathematical constructs to help better understand and handle probabilities from a mathematically rigorous point of view. In a nutshell, he defined a "Probability Space", which consists of a set of outcomes (the sample space), a $\sigma$-algebra (or $\sigma$-field) on that set (roughly, the collection of subsets of the original set to which probabilities are assigned), and a measure which maps each of these subsets to a number in $[0,1]$ that measures it. This became the standard way of understanding probability. The framework is important because once you start thinking about probability the way mathematicians do, you encounter all kinds of edge cases and problems, which the framework can help you define or avoid.

So, I would say that people who use "Probability Measure" are either involved with deep probability issues, or are simply more math oriented by their education.

Note that a "Probability Space" precedes a "Random Variable" (also known as a "Measurable Function"), which is defined to be a function from the original space to a measurable space, often the real numbers. I'm not sure, but I think the main point here is that this allows us to use more "number-oriented" math than "space-oriented" math. We map the "space" into numbers, and now we can work with it more easily. (There's nothing to prevent us from starting with a "number space", e.g. $\mathbb R$, and defining the identity mapping as the Random Variable; but many outcomes are not intrinsically numbers - think of Heads or Tails, and the mapping of them into the numbers 0 or 1.)
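
Here is a small Python sketch of that idea, under assumptions of my own choosing: the space is two tosses of a fair coin, and the random variable `X` counts heads. It shows how a measure on the original space induces a distribution on numbers.

```python
from fractions import Fraction

# Hypothetical probability space: two tosses of a fair coin.
omega = ["HH", "HT", "TH", "TT"]                # the outcomes
measure = {w: Fraction(1, 4) for w in omega}    # probability of each outcome

def X(w):
    """Random variable: the number of heads in outcome w (maps outcomes to numbers)."""
    return w.count("H")

# Distribution of X: push the measure on omega forward through X,
# i.e. Pr(X = x) is the total measure of the outcomes w with X(w) = x.
distribution = {}
for w in omega:
    distribution[X(w)] = distribution.get(X(w), Fraction(0)) + measure[w]
```

After this step one can forget `omega` entirely and work only with `distribution`, which lives on the numbers 0, 1 and 2.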

Once we are in the realm of numbers (the real line $\mathbb R$), we can define Probability Functions to help us characterize the behavior of these fantastic probability beasts. The main function is called the "Cumulative Distribution Function" (CDF), $F(x)=\Pr(X\le x)$ - it exists for every real-valued random variable on any valid probability space, and it completely defines the behavior of the beast (unlike, say, the mean or the variance of a random variable: you can have different probability beasts with the same mean, the same variance, or even both). It keeps track of how the probability measure is distributed across the real line.
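
To illustrate the point about the mean and variance not pinning down the beast, here is a sketch with two made-up discrete distributions `a` and `b` (both are assumptions for the example). They share mean 0 and variance 1 but have different CDFs.

```python
from fractions import Fraction

def cdf(pmf, x):
    """F(x) = Pr(X <= x) for a discrete distribution given by its PMF."""
    return sum((p for value, p in pmf.items() if value <= x), Fraction(0))

def mean(pmf):
    return sum(value * p for value, p in pmf.items())

def variance(pmf):
    m = mean(pmf)
    return sum((value - m) ** 2 * p for value, p in pmf.items())

# Two hypothetical discrete distributions, each with mean 0 and variance 1.
a = {-1: Fraction(1, 2), 1: Fraction(1, 2)}
b = {-2: Fraction(1, 8), 0: Fraction(3, 4), 2: Fraction(1, 8)}

same_moments = mean(a) == mean(b) and variance(a) == variance(b)
different_cdfs = cdf(a, 0) != cdf(b, 0)   # 1/2 versus 7/8
```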

If the distribution of the random variable is continuous (strictly speaking, absolutely continuous), you will also have a Probability Density Function (PDF); if it's discrete, you will have a Probability Mass Function (PMF). If it's mixed, it's complicated.
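
For a feel of the mixed case, consider a hypothetical variable (my own example, not a standard named distribution) that equals 0 with probability 1/2 and otherwise follows an Exponential(1) distribution. Its CDF is still perfectly well defined, even though neither a PMF alone nor a PDF alone describes it.

```python
import math

def mixed_cdf(x):
    """CDF of a hypothetical mixed variable: 0 with probability 1/2,
    otherwise Exponential(1)."""
    if x < 0:
        return 0.0
    return 0.5 + 0.5 * (1.0 - math.exp(-x))

# The jump at 0 is a point mass, so a density with respect to length cannot capture it.
jump_at_zero = mixed_cdf(0.0) - mixed_cdf(-1e-12)

# The rest of the probability is spread continuously, so a PMF cannot capture it.
continuous_part = mixed_cdf(1.0) - mixed_cdf(0.0)
```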

I think "Probability Distribution" might mean any of these things, but I think most often it is used in a less mathematically precise way, as a sort of umbrella term: it can refer to the distribution of measure on the original space, or to the distribution of measure on the real line, characterized by the CDF or the PDF/PMF.

Usually, if there's no need to go deep into the math, people will stay on the level of "probability function" or "probability distribution", though some will venture into the realm of "probability measure" with no real justification other than the need to be absolutely mathematically precise.

  • Your answer gave me insight into why more mathematically oriented inferential statistics books begin with the CDF, while more applied ones emphasize the PMF and PDF. – Mihai Aug 22 '24 at 11:25