
A probability space is defined as a triple $(\Omega, \sigma, P)$, where $\Omega$ is a set, $\sigma$ is a sigma-algebra on this set, and $P$ is a probability measure. A random variable $f$ is a measurable function from a probability space to a measurable space $(\mathcal{X}, \Sigma)$. A random variable $f:\Omega \to \mathcal{X}$ turns $(\mathcal{X}, \Sigma)$ into the probability space $(\mathcal{X}, \Sigma, f_{*}P)$, where $f_{*}P$ is the pushforward of $P$ by $f$. This probability space is the distribution of the random variable $f$. On the other hand, every probability space is the distribution of a random variable over itself (via the identity map). So every probability space is the distribution of some random variable, and the distribution of every random variable is a probability space. Why, then, do we need two different concepts? Yes, several random variables can be defined on the same probability space. But measurable functions of random variables are also random variables, and we can take several measurable functions of the same random variable. So what is the conceptual difference between the two types of objects?
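To make the pushforward concrete, here is the simplest case (a fair coin, purely for illustration): take $\Omega = \{H, T\}$ with $P(\{H\}) = P(\{T\}) = \tfrac{1}{2}$ and $f(H) = 1$, $f(T) = 0$. Then $$f_{*}P(\{1\}) = P(f^{-1}(\{1\})) = P(\{H\}) = \tfrac{1}{2},$$ so $f_{*}P$ is the Bernoulli$(1/2)$ distribution on $(\{0,1\}, 2^{\{0,1\}})$.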

  • A probability space can never be a random variable and a random variable can never be a probability space. – Kavi Rama Murthy Apr 29 '25 at 23:14
  • @geetha290krm, I corrected the sloppy use of language. I meant distributions of random variables. – Daigaku no Baku Apr 29 '25 at 23:16
  • @Mittens, it is not even close. I can't even decide how to react to this, because my question has literally nothing in common with that one apart from the fact that both are about probability distributions. I literally provided the formal definition of probability distribution in the question body. – Daigaku no Baku Apr 30 '25 at 00:48

1 Answer


First of all, saying that the probability space $(\mathcal{X}, \Sigma, f_\ast P)$ is the distribution of the random variable (RV) $f$ is a slight abuse of language. Usually (at least to the best of my knowledge), only the pushforward measure $f_\ast P$ itself is called the distribution of the RV $f$.

As you said, one can look at multiple RVs $\{f_j \, : \, j \in J \}$ on a given probability space $(\Omega, \sigma, P)$. And, yes, as you said, one can form new RVs from them by composing with measurable functions $g_{\alpha} : (\mathcal{X}, \Sigma) \rightarrow (\mathcal{E}_{\alpha}, \Xi_{\alpha})$ (with $\alpha$ ranging over some index set), i.e. $H_{j,\alpha} = g_\alpha \circ f_j$, or simply $H_{j, \alpha} = g_\alpha(f_j)$ (to use a common notation in probability theory).
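As a toy illustration (a setup I am choosing here, not one from the question): let $f_1$ be a fair die roll with values in $\mathcal{X} = \{1, \dots, 6\}$ and take $g(x) = x \bmod 2$. Then $$H = g \circ f_1 = \mathbf{1}\{f_1 \text{ is odd}\}$$ is a new $\{0,1\}$-valued RV on the same probability space.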

However, it should be pointed out that these two situations are not the same! The RVs of the form $H_{1,\alpha} = g_{\alpha} \circ f_{1}$ are fully determined by the stochastic behavior of $f_1$. In fact, in a sense they are as far from being (stochastically) independent of $f_1$ as possible. One way to formulate this is the following: the $\sigma$-algebra $\sigma(H_{1,\alpha})$ generated by the composed RV $H_{1, \alpha}$ is always a sub-$\sigma$-algebra of the $\sigma$-algebra $\sigma(f_1)$ generated by $f_1$ (recall: $\sigma(f) \subseteq \sigma$ is the smallest sub-$\sigma$-algebra of $\sigma$ on $\Omega$ for which $f : \Omega \rightarrow \mathcal{X}$ is measurable). This holds simply because for any measurable subset $A \subseteq \mathcal{E}_{\alpha}$ the preimage satisfies $H_{1, \alpha}^{-1}(A) = f_{1}^{-1}(B) \in \sigma(f_1)$, where $B := g_{\alpha}^{-1}(A)$.
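In the die illustration above, $\sigma(f_1)$ is generated by the six atoms $\{f_1 = i\}$, $i = 1, \dots, 6$, whereas $$\sigma(g \circ f_1) = \{\emptyset,\ \{f_1 \in \{1,3,5\}\},\ \{f_1 \in \{2,4,6\}\},\ \Omega\}$$ is a strictly smaller sub-$\sigma$-algebra: knowing $f_1$ completely determines $g(f_1)$, but not conversely.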

On the other hand, the RVs $f_j$ are, in general, not stochastically dependent (in the above sense) on any other $f_{k}$. In other words, in general neither $\sigma(f_j) \subseteq \sigma(f_k)$ nor $\sigma(f_j) \supseteq \sigma(f_k)$ holds for $j \neq k$. This cannot be achieved by considering only a single $f = f_{0}$ and then building the other $f_j$ by composing with functions $g_j : \mathcal{X} \rightarrow \mathcal{X}$, i.e. setting $f_j = g_j(f_0)$. (I got the impression that this is something you had in mind. Sorry if that was not the case and this is already clear to you :) ).
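A minimal illustration: on $\Omega = \{0,1\}^2$ with the uniform measure, let $f_1(\omega_1, \omega_2) = \omega_1$ and $f_2(\omega_1, \omega_2) = \omega_2$ be the coordinate projections. Then $\sigma(f_1)$ is generated by the event $\{\omega_1 = 1\}$ and $\sigma(f_2)$ by $\{\omega_2 = 1\}$; neither contains the other, and indeed $f_1$ and $f_2$ are independent. In particular, $f_2$ cannot be written as $g(f_1)$ for any measurable $g$, since that would force $\sigma(f_2) \subseteq \sigma(f_1)$.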

In general, one thinks of the probability space $(\Omega, \sigma, P)$ as an "anonymous" background space on which one considers the various "interesting" RVs $f_j$, potentially with different ranges $\mathcal{X}_j$, i.e. $f_j : \Omega \rightarrow (\mathcal{X}_j, \Sigma_j)$. Then each associated distribution $(f_j)_{\ast}P$ only contains a "part" of the full stochastic information encoded in the measure $P$. In particular, one can model various degrees of mutual (partial) dependence and (partial) independence between the various $f_j$. These "partial" dependencies between the $f_j$ are typically expressed via conditional expectations $\mathbf{E}[f_j \vert \sigma(f_k : k \in J')]$, where $J' \subseteq J$ is a subset of the index set.
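As a simple instance of such partial dependence: if $X$ and $Y$ are independent fair coin flips ($\{0,1\}$-valued) on a common space, then $$\mathbf{E}[X + Y \mid \sigma(X)] = X + \tfrac{1}{2},$$ which is neither a constant (as it would be if $X+Y$ were independent of $X$) nor equal to $X+Y$ (as it would be if $X+Y$ were $\sigma(X)$-measurable).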

However, in view of the "necessity" of considering a general probability space $(\Omega, \sigma, P)$ raised in your question, there is at least a partial answer. If one considers a fixed family $\{ f_j : \Omega \rightarrow \mathcal{X}_j\}$ of RVs, then one can replace the "abstract" probability space $(\Omega, \sigma, P)$ with the induced probability space on the product of all the $(\mathcal{X}_j, \Sigma_j)$. That is, one considers $$\tilde{\Omega} := \prod_{j\in J} \mathcal{X}_j \quad \text{with $\sigma$-algebra } \quad \tilde{\sigma} := \bigotimes_{j\in J} \Sigma_j .$$ Here $\bigotimes_{j\in J} \Sigma_j$ denotes the product $\sigma$-algebra generated by cylinder sets of the form $A_j \times \prod_{k\neq j} \mathcal{X}_k$ for any $j \in J$ and measurable $A_j \in \Sigma_j$. On this space, one can take the pushforward measure with respect to the combined map $$ F := (f_j)_{j\in J} : (\Omega, \sigma) \rightarrow \tilde{\Omega} = \prod_{j\in J} \mathcal{X}_j \qquad\text{and define}\qquad \tilde{P} := F_\ast P .$$ Then the new probability space $(\tilde{\Omega}, \tilde{\sigma}, \tilde{P})$ contains the same stochastic information about the family $\{f_j\}$ as the original (possibly "larger") probability space $(\Omega, \sigma, P)$.
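In the two-coin illustration from above, this canonical construction gives $\tilde{\Omega} = \{0,1\} \times \{0,1\}$ with $\tilde{\sigma} = 2^{\tilde{\Omega}}$ and $\tilde{P} = F_{\ast}P$ the uniform measure on the four points; the coordinate projections on $\tilde{\Omega}$ then have the same joint distribution as $(f_1, f_2)$, so nothing about the family is lost by passing to $(\tilde{\Omega}, \tilde{\sigma}, \tilde{P})$.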

crydiso