Understanding the Relationship Between Measure Theory and Probability Theory?

Question

I was at a restaurant with some of my friends who study pure math. While at the restaurant, they were having a discussion about the relationship between Measure Theory and Probability Theory. I tried to follow along with the discussion but I unfortunately could not.

Prior to the conversation, here is what I already knew:

I already loosely know what Probability Theory is - from the undergraduate level courses I took in university, I learned about different concepts from Probability Theory in a very applied way. This included learning about different mathematical properties of Probability Distribution Functions and Random Variables.
I am less familiar with Measure Theory. Based on some readings I have done, it seems like Measure Theory is involved with assigning "quantities" to "subsets of a set". For example, if you flip 2 coins - the sample space is HH, HT, TH, TT. In this case, a "Measure" refers to a probability - and by using a special function called a "Probability Measure", we assign probabilities to each element in the sample space.
The mathematician Andrey Kolmogorov created a set of Mathematical Axioms within Probability Theory. For any given "experiment" (i.e. Measure Space) - there must be a sample space, an event space (i.e. an "event" corresponds to a given subset of the sample space) and a probability measure which assigns probabilities to each event within the sample space. Kolmogorov's Axioms tell us that negative probabilities are not possible, the probability of at least one event occurring is 1 and the sum of the probabilities for all (disjoint) events is 1. These Axioms then allow us to derive important rules that can be used to further analyze and interpret probabilities, e.g. P(A U B) = P(A) + P(B) - P(A & B)
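
As a sketch of how a rule like this follows from the axioms (writing $A \cap B$ for "A & B", $\sqcup$ for a union of disjoint sets, and assuming only that probabilities add over disjoint events), split the sets into disjoint pieces:

$$A \cup B = A \sqcup (B \setminus A), \qquad B = (A \cap B) \sqcup (B \setminus A).$$

Additivity then gives

$$P(A \cup B) = P(A) + P(B \setminus A), \qquad P(B) = P(A \cap B) + P(B \setminus A),$$

and subtracting the second equation from the first yields $P(A \cup B) = P(A) + P(B) - P(A \cap B)$.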

Now, here is what I did not understand in the conversation:

Supposedly Kolmogorov's Axioms were so important that they revolutionized the field of Probability Theory. Kolmogorov was the first to explicitly describe the relationship between Probability Theory and Measure Theory.
But why exactly were these Axioms so important? How exactly did the relationship between Measure Theory and Probability Theory revolutionize Probability Theory?
If I understand things correctly, it seems like the field of Probability Theory made significant progress before Kolmogorov was even born. For example, the Normal Distribution was defined by Gauss - far before the birth of Kolmogorov. Likewise, important results in Probability Theory such as Chebyshev's Inequality and Markov's Inequality were also established before Kolmogorov. Thus, if the relationship between Measure Theory and Probability Theory is so important - how were these results possible when this relationship had not yet been defined?
In other words: What "things" could not have been done prior to defining this relationship between Probability Theory and Measure Theory? And what "things" could now be done after defining this relationship between Probability Theory and Measure Theory?
To summarize - why is the relationship between Measure Theory and Probability Theory important?

Can someone please help me understand these points?

Thanks!

Maybe https://hsm.stackexchange.com/ is a better place for this question? At any rate, discoveries in mathematics can predate axiomatization. And the relevance of measure theory to probability was clear to (say) E. Borel several decades before K's axiomatization. — kimchi lover, Aug 06 '23 at 17:27
"If I understand things correctly, it seems like the field of Probability Theory made significant progress before Kolmogorov was even born." Formal axiomatization of a subject indicates that it has reached a level of maturity where people know enough about it to know what axioms to choose. Therefore, it comes very late in the development of the subject, not at the beginning. People were studying geometry long before Euclid; real numbers long before Dedekind, etc. — Ted, Aug 06 '23 at 19:54
compare: https://math.stackexchange.com/questions/3805307/problems-with-any-non-kolmogorovian-frequentistic-subjective-etc-approaches/3810060#3810060 — Nap D. Lover, May 27 '24 at 22:08
and also maybe: https://math.stackexchange.com/questions/3546899/couldnt-the-third-axiom-of-probability-be-a-theorem-instead/3547159#3547159 — Nap D. Lover, May 27 '24 at 22:09

score 5 · Answer 1 · answered Aug 06 '23 at 17:51

I think Durrett is a great reference for understanding the connection here. The organization of his textbook neatly lays out the relationship:

Chapter 1 starts with the pure measure theory side of probability. A probability space $(\Omega, \mathcal{F}, \Bbb{P})$ is a measure space with the property that the measure of the entire space is finite and equal to 1: $\Bbb{P}(\Omega) = 1$. From there he defines random variables (i.e. measurable functions $X: \Omega \to S$, where the codomain space $S$ is often $\Bbb{R}^d$ with the Lebesgue measure) and expectation (i.e. the integrals $\Bbb{E}X := \int X \, d\Bbb{P}$ of these random variables, when they exist), with a review of the basic measure theory results (e.g. Fatou's lemma) needed to proceed.
My favorite line in this entire book kicks off Chapter 2 on Laws of Large Numbers: "Measure theory ends and probability begins with the definition of independence." Durrett walks us through the definitions of independent events, independent $\sigma$-algebras, and independent r.v.'s, and looks at classic results such as weak & strong LLN, as well as modes of convergence of random variables and random series.

So essentially, probability is looking at a specific type of measure space (a finite one, whose measure is normalized to equal $1$), but the nomenclature is tailored to the applications: we speak of "events" instead of "measurable sets", "random variables" instead of "measurable functions", etc. and we also take an interest in independence relationships between events/r.v.'s/etc. which may not apply to more general measure spaces (particularly infinite ones).
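
To make the dictionary concrete, here is a minimal sketch on a finite sample space, where every subset is measurable and the integral reduces to a sum. The two-coin space, the helper names `prob` and `expectation`, and the choice of events are all illustrative assumptions, not anything taken from Durrett.

```python
from fractions import Fraction
from itertools import product

# Sample space Omega for two fair coin flips (an illustrative choice).
omega = ["".join(p) for p in product("HT", repeat=2)]

# The measure is specified on single outcomes and extended to events by additivity.
weights = {w: Fraction(1, 4) for w in omega}

def prob(event):
    """P(event): the measure of a subset of omega."""
    return sum((weights[w] for w in event), Fraction(0))

def expectation(X):
    """E[X] = integral of X dP, which is a finite sum on this space."""
    return sum((X(w) * weights[w] for w in omega), Fraction(0))

def num_heads(w):
    """A random variable: a function from omega to the reals."""
    return w.count("H")

# Events are just subsets of omega.
first_is_heads = {w for w in omega if w[0] == "H"}
second_is_heads = {w for w in omega if w[1] == "H"}
at_least_one_head = {w for w in omega if "H" in w}

def independent(A, B):
    """Independence of two events: P(A and B) == P(A) * P(B)."""
    return prob(A & B) == prob(A) * prob(B)

total_mass = prob(set(omega))          # normalization: P(Omega)
mean_heads = expectation(num_heads)    # E[number of heads]
```

Here `independent(first_is_heads, second_is_heads)` holds while `independent(first_is_heads, at_least_one_head)` does not, which is the kind of structure that has no counterpart in a general measure space.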

Nice :) +1 -- great explanation of why probability theory, despite being a subset of measure theory, gains its own identity by the specific restrictions it makes upon measure theory and its focus on things like independence (i.e., product measures etc) — , Aug 06 '23 at 21:37

score 3 · Answer 2 · answered Aug 06 '23 at 21:35

I agree that you may want to ask this on the history of math and science site. But, I can see it fitting here (added two tags to show this isn't the usual "help me solve/prove something" post)

It looks like you are actually concerned with two issues:

1. Why did probability need measure theory?
2. What differentiates measure-theoretic probability theory from measure theory?

@Rivers McForge gave a nice answer to (2) above as to how measure-theoretic probability relates to measure theory as a whole: it is restricted to studying measure spaces $\left(\Omega, \mathcal{F},\mu\right)$ whose measure, $\mu$, satisfies Kolmogorov's Axioms.

Note that this axiomatic approach leverages measure theory to provide a rigorous definition of the elements of a measure space, then adds particulars (thereby ruling out large classes of measures that don't satisfy the axioms).

The advantage here is that we trade generality for the ability to prove more locally "powerful" theorems, since we are able to leverage Kolmogorov's axioms to prove more about a smaller class of measures.

Related question: What distinguishes measure theory and probability theory?

I'll focus on the first question:

Why did probability need measure theory?

This has been asked on this site before, so I'll link in the requisite posts here: why measure theory for probability?

There is also a nice little video where a mathematician (post-doc at the time) explains this exact question that we all go through when we see measure-theoretic probability: https://www.youtube.com/watch?v=RjPXfUT7Odo

For me, the takeaway when I tried to answer this for myself was:

Measure theory is an elegant and very useful mathematical language to express probability theory -- in a way that would be very clumsy or imprecise otherwise (e.g., defining what constitutes a consistent event space).

However, this isn't merely aesthetic -- the beauty of measure theory is that you get a few very powerful theoretical tools:

Lebesgue Integration: It allows you to sensibly integrate a much wider array of functions than the Riemann integral we all learn in Calc 101.
The Dominated Convergence Theorem, which lets you interchange limits and expectations when proving convergence results for random variables.
The ability to generalize beyond continuous and discrete sample spaces to all kinds of others (e.g., mixed spaces, function spaces, and more esoteric things that cannot nicely map to things on the Real number line).
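
To make the first and third points concrete, here is a small worked example (the specific function and distribution are illustrative choices of mine, not taken from the linked posts). The indicator of the rationals on $[0,1]$ has no Riemann integral, but its Lebesgue integral is immediate, because $\Bbb{Q} \cap [0,1]$ is countable and so has Lebesgue measure $\lambda$ equal to zero:

$$\int_{[0,1]} \mathbf{1}_{\Bbb{Q}} \, d\lambda = \lambda(\Bbb{Q} \cap [0,1]) = 0.$$

Similarly, let $X$ equal $0$ with probability $\tfrac12$ and otherwise be uniform on $[0,1]$. Its law $\mu = \tfrac12 \delta_0 + \tfrac12 \lambda|_{[0,1]}$ has neither a probability mass function nor a density, yet the expectation is still a single integral:

$$\Bbb{E}X = \int x \, d\mu(x) = \tfrac12 \cdot 0 + \tfrac12 \int_0^1 x \, dx = \tfrac14.$$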

My 2 cents :)
