As far as I understand it, when estimating the population mean from a sample without knowing the population standard deviation $\sigma$, we cant use the $Z$-test. According to the Central Limit Theorem, the sampling distribution of sample means is a normal distribution with mean${}=\mu$ and variance${} = \sigma / \sqrt n$. But the sample standard deviation $s = \frac {\sum(x_i - \mu)^2}{n} $ underestimates the true population parameter $\sigma$ (i.e. $s < \sigma$). For that reason, we cannot apply Central Limit Theorem using the sample standard deviation directly.
But we use Bessel's correction precisely for this reason! Ww write $s = \frac {\sum(x_i - \mu)^2}{n-1} $ so that now $s ≈ \sigma$. My question is, after applying Bessel Correction, why cant we use Z test directly to estimate population mean $\mu$ ?
The $T$-distribution is a bit flatter than the $Z$-distribution. This essentially says the fact that the population variance is a bit more than the variance we estimated from the sample. But did we not already take that into consideration by applying Bessel correction?
Now another question arises in this context. From the Central Limit Theorem, the sampling distribution is essentially a Normal distribution and nothing else. Never any other distribution, and certainly NOT a $T$-distribution. Just because we may fail to estimate the variance does not mean you should change the sampling distribution of sample means from a $Z$-distribution to a $T$-distribution. If instead of the $Z$-distribution, you would have taken some other Normal distribution with variance "a bit more than 1", at least that would have made sense. A $T$-Distribution looks like a normal distribution, but is NOT a $Z$-distribution with variance increased by some amount. Just because there may be some uncertainty in determination of the $\sigma$, why are you assuming that, some entirely other distribution should better approximate the distribution of sample means?
Note: I have already looked into the following answers and they do not satisfactorily answer my question
Estimating population SD when calculating t-statistic
When the population variance is unknown, we should use t-distribution.