I am taking a course on Communication Systems (from an engineering point of view). While I'm usually very interested in the formal mathematics, this time I would like to avoid it, since I don't have a good background in probability theory; also, many of my colleagues have little to no interest in formal mathematics at all, and I would like to understand this concept in a way that is easy to pass on to them.
So, apparently, to understand the meaning of ergodicity, one needs to know what the ensemble average and the time average of a random process are. After reading this answer on Math.SE and three related Wikipedia entries (Ergodic Process, Ergodicity, Stationary ergodic process), this is what I understand:
A random process is like a random variable, but its outcomes are "waveforms" (a.k.a. functions) instead of numbers.
The ensemble average is the average of the outcomes of the random process at each instant of time, and is therefore itself a function (waveform). A given random process will have one ensemble average (one function).
As opposed to the ensemble average, a random process can have many (possibly infinitely many) time averages, since every outcome of the random process (i.e., every waveform) has its own time average, which is the average value of that waveform. That is, given an outcome $x(t)$, its time average is given by
$$\lim_{T \to \infty} \dfrac{1}{T} \int_{-\frac{T}{2}}^{\frac{T}{2}}x(t)dt$$
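To make the limit concrete, here is a minimal numerical sketch (my own, using a made-up waveform $x(t) = 2 + \sin(t)$, whose time average is 2) that approximates the integral by averaging samples over a large window $[-T/2, T/2]$:

```python
import numpy as np

def time_average(x, T, n=200_000):
    """Approximate (1/T) * integral of x(t) dt over [-T/2, T/2]
    by averaging n uniformly spaced samples of the waveform."""
    t = np.linspace(-T / 2, T / 2, n)
    return x(t).mean()

# Hypothetical outcome waveform: constant 2 plus a sinusoid.
x = lambda t: 2.0 + np.sin(t)
print(time_average(x, T=1000.0))  # close to 2.0: the sinusoid averages out
```

As $T$ grows, the oscillating part contributes less and less, and the result settles on the waveform's DC value.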
After reading those Wikipedia pages, it seems that an ergodic process is a process that satisfies "the time average is equal to the ensemble average". But which time average? To my understanding there are many time averages. Which one? All of them? Their mean? At least one of them? Or something else?
Also, I would like to take a closer look at the following examples, found in the linked Wikipedia pages:
Example 1.
Suppose that we have two coins: one coin is fair and the other has two heads. We choose (at random) one of the coins, and then perform a sequence of independent tosses of our selected coin. Let X[n] denote the outcome of the nth toss, with 1 for heads and 0 for tails. Then the ensemble average is ½ (½ + 1) = ¾; yet the long-term average is ½ for the fair coin and 1 for the two-headed coin. Hence, this random process is not ergodic in mean.
Is this correct? (Of course, my doubt here is a consequence of the fact that I don't know the answer to my question above.)
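To check my reading of this example, a small simulation sketch (my own; coin choice and toss counts are arbitrary) of the two-coin process, computing the time average of several independent realizations:

```python
import random

random.seed(0)  # reproducible runs

def sample_time_average(n_tosses=100_000):
    """One realization: pick a coin once, then toss it n_tosses times.
    Returns the long-run fraction of heads (the time average)."""
    p_heads = 0.5 if random.random() < 0.5 else 1.0  # fair or two-headed
    heads = sum(1 for _ in range(n_tosses) if random.random() < p_heads)
    return heads / n_tosses

averages = [sample_time_average() for _ in range(20)]
# Each realization's time average lands near 0.5 or near 1.0,
# never near the ensemble average 0.75.
print(averages)
```

The coin is chosen once per realization, so each waveform's time average converges to that coin's own mean, and no realization ever averages to ¾.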
Example 2.
Ergodicity is where the ensemble average equals the time average. Each resistor has thermal noise associated with it and it depends on the temperature. Take N resistors (N should be very large) and plot the voltage across those resistors for a long period. For each resistor you will have a waveform. Calculate the average value of that waveform. This gives you the time average. You should also note that you have N waveforms as we have N resistors. These N plots are known as an ensemble. Now take a particular instant of time in all those plots and find the average value of the voltage. That gives you the ensemble average for each plot. If both ensemble average and time average are the same then it is ergodic.
It says "take a particular instant of time in all those plots". Does any instant work? Or rather, do I have to take all of them (one at a time)? Taking only one instant of time doesn't seem right; shouldn't I be taking some sort of limit to infinity? Also, it refers to "the time average" as if there were only one, but to my understanding there are N different time averages here, since there are N waveforms.
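To see what the two kinds of average look like side by side, here is a sketch (my own; independent zero-mean Gaussian samples are a stand-in for thermal noise, and N and M are arbitrary) that builds an ensemble of N waveforms sampled at M instants:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical thermal-noise model: N independent zero-mean Gaussian
# waveforms (one per resistor), each sampled at M time instants.
N, M = 1_000, 10_000
waveforms = rng.normal(loc=0.0, scale=1.0, size=(N, M))

time_averages = waveforms.mean(axis=1)      # one per resistor: N values
ensemble_averages = waveforms.mean(axis=0)  # one per instant:  M values

print(time_averages[:3])
print(ensemble_averages[:3])
```

For this stationary model every time average and every ensemble average converges to the same value (0), which matches my reading that there are N time averages (one per waveform) and, in principle, one ensemble average per instant.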