In my beginners' statistics course we've just been introduced to the CLT, which was stated as: the distribution of sample means tends to the normal distribution as the sample size $n$ increases to infinity.
But what if your population is finite (i.e. of size $N$), so that your max sample size can only be of size $N \ll \infty$? Will such a distribution (which must be that of nearly all practical statistical surveys) not follow the CLT?
My best attempt at thinking this through goes like this: suppose I draw samples of size 1 from my population of size $N$, without replacement, and plot the 'mean' of each sample (which is just the single value itself) until every member has been sampled and plotted exactly once. I would then of course have replicated the population distribution exactly.
Suppose then I repeat the experiment, increasing my sample size each repetition, until my sample is of size $N$. At that point I take a single sample and plot its mean, which by definition is the same as the population mean $\mu$.
So here, as my sample size has increased, my distribution of sample means hasn't tended to the Normal. It has become ever thinner, with flatter tails and a taller peak, and has ended up as more of a hyper-idealised version of the Normal: a single value at the population mean.
Clearly then, for finite populations (if I've understood the idea behind the CLT correctly, which is a big if) the CLT does not apply; rather, in these practical cases, does the sample mean distribution approach something only approximately Normal? Is it the case then that the CLT is more a theoretical concept that applies to infinitely large populations, from which sample sizes can tend to infinity?
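To make the thought experiment concrete, here is a rough sketch of it in Python. The population (200 values from an exponential distribution), the helper names `sample_mean_spread` and `theoretical_spread`, and the number of repetitions are all made up for illustration. The comparison column uses the standard finite population correction, $\operatorname{sd}(\bar{x}) = \frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}}$, which is a textbook result and not something from my course notes.

```python
import random
import statistics

def sample_mean_spread(population, n, reps=2000, seed=0):
    """Std. dev. of the means of `reps` samples of size n, drawn WITHOUT replacement."""
    rng = random.Random(seed)
    means = [statistics.fmean(rng.sample(population, n)) for _ in range(reps)]
    return statistics.pstdev(means)

def theoretical_spread(population, n):
    """sigma / sqrt(n), multiplied by the finite population correction sqrt((N - n) / (N - 1))."""
    N = len(population)
    sigma = statistics.pstdev(population)  # population std. dev. (divisor N)
    return sigma / n ** 0.5 * ((N - n) / (N - 1)) ** 0.5

# Hypothetical skewed population of size N = 200 (an assumption for illustration).
rng = random.Random(42)
population = [rng.expovariate(1.0) for _ in range(200)]

# Columns: sample size, simulated spread of the sample means, theoretical spread.
for n in (1, 10, 50, 150, 200):
    print(n, round(sample_mean_spread(population, n), 4),
          round(theoretical_spread(population, n), 4))
```

If this is right, the spread of the sample means should start near the population standard deviation at $n = 1$ and shrink to zero at $n = N$, which matches the "single value at the population mean" picture above.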
Further to this, I've read that for the CLT to apply, the random variables drawn from your population have to be i.i.d. If I'm using SRS without replacement on a finite population, does that mean the variables aren't i.i.d. anymore, and thus the CLT would not apply for this reason either? If the population were infinite though and I used SRSWOR, would the r.v.'s then be i.i.d., thus meaning the CLT would apply?
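As a small check on the i.i.d. worry, the sketch below enumerates every equally likely ordered pair of draws from a tiny made-up population and computes the exact correlation between the first and second draw, with and without replacement. The population values and the function name `correlation_of_two_draws` are hypothetical. Correlation is only one symptom of dependence, so this illustrates the issue and proves nothing in general.

```python
from itertools import permutations, product
import statistics

def correlation_of_two_draws(population, replace):
    """Exact correlation between draw 1 and draw 2, by enumerating all ordered pairs."""
    if replace:
        pairs = list(product(population, repeat=2))   # the same member may be drawn twice
    else:
        pairs = list(permutations(population, 2))     # a member cannot be drawn twice
    xs = [a for a, _ in pairs]
    ys = [b for _, b in pairs]
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = statistics.fmean([(a - mx) * (b - my) for a, b in pairs])
    return cov / (statistics.pstdev(xs) * statistics.pstdev(ys))

population = [2, 3, 5, 8, 13]  # hypothetical population, N = 5

# With replacement the draws are independent, so the correlation is about 0.
print(correlation_of_two_draws(population, replace=True))
# Without replacement the correlation is about -1/(N - 1), i.e. -0.25 here.
print(correlation_of_two_draws(population, replace=False))
```

Since the without-replacement correlation is $-1/(N-1)$, it fades towards zero as $N$ grows, which seems consistent with the guess that SRSWOR from a very large population behaves almost like i.i.d. sampling.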
I appreciate all your insight on this; I'm very new to statistics, so I apologise if a lot of this is pretty basic and if my thoughts were way off. Thanks for any help you can lend, really appreciate it.
On your 2nd point, why doesn't it matter if your pop is finite if you use SRS? If I had a pop of size $N$, and I took samples of size $N$, the sample means would each equal the population mean. Their distribution would thus be uniform, not normal? Many thanks.
– TheRealPaulMcCartney Dec 09 '17 at 18:21