Suppose I have a finite-dimensional, real random variable $X$ (not necessarily one-dimensional). For convenience, let's say it's $d$-dimensional. Let $\Sigma$ denote the covariance of $X$, i.e.

$$\Sigma = \mathbb{E}\left[\left(X-\mathbb{E}\left[X\right]\right)\left(X-\mathbb{E}\left[X\right]\right)^T\right]$$

Let $\{X_k\}_{k\in\mathbb{N}}$ be a sequence of i.i.d. random variables with the same distribution as $X$. Fix $n\in \mathbb{N}$ (and let it be "large"). Let $\widehat{\Sigma}$ be the sample covariance, i.e.

$$ \widehat{\Sigma} = \frac{1}{n-1}\sum_{k=1}^{n}(X_k-\widehat{X})(X_k-\widehat{X})^T $$

where $\widehat{X}$ is the sample mean, i.e.

$$ \widehat{X} = \frac{1}{n}\sum_{k=1}^{n}X_k $$

It is a known fact that $\widehat{\Sigma}$ is an unbiased estimator for $\Sigma$, i.e.

$$\mathbb{E}\left[\widehat{\Sigma}\right]= {\Sigma}$$

It can also be shown (and I'm not going to do it) that there exists a constant, positive semi-definite $d\times d$ matrix $\Delta$ such that

$$\lim_{n\to \infty} n \mathbb{E}\left(\left\|\widehat{\Sigma} - \Sigma\right\|^2_F \right) = \mathrm{tr}\left(\Delta\right)$$

which, clearly, implies that

$$\lim_{n\to \infty} \sqrt{n} \sqrt{\mathbb{E}\left(\left\|\widehat{\Sigma} - \Sigma\right\|^2_F\right)} = \sqrt{\mathrm{tr}\left(\Delta\right)}$$
(Note: $\left\| \cdot \right\|_F$ is the Frobenius norm.) Here is the crux of my question:
Q1: Can I swap the square root and expectation operator in this very contrived scenario? That is, is $$\sqrt{\mathbb{E}\left(\left\|\widehat{\Sigma} - \Sigma\right\|^2_F\right)} = \mathbb{E}\left(\sqrt{\left\|\widehat{\Sigma} - \Sigma\right\|^2_F}\right) = \mathbb{E}\left({\left\|\widehat{\Sigma} - \Sigma\right\|_F}\right) $$
true? Note: I have some limited experimental evidence that suggests this could be true for multivariate normal distributions, but obviously that's not enough to prove that this has to be true (even for multivariate normal distributions).
Q2a: If Q1 is not true, what extra assumptions do I need to add to the setup above for the commuting between square root and expectation to hold?
Q2b: If Q1 is true, are there any generalizations for when the commuting between square root and expectation might hold?
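
For concreteness, the kind of numerical experiment mentioned under Q1 could be sketched roughly as follows. This is only an illustrative sketch, not the exact experiment: the function name `compare_root_and_mean`, the choice of a zero-mean multivariate normal with a randomly generated covariance, and the values of `d`, `n`, `trials`, and `seed` are all assumptions made for the example. It estimates both sides of the equation in Q1 by Monte Carlo so they can be compared.

```python
import numpy as np


def compare_root_and_mean(d=3, n=500, trials=2000, seed=0):
    """Monte Carlo estimates of sqrt(E||S_hat - S||_F^2) and E||S_hat - S||_F.

    All parameters are illustrative assumptions, not part of the question.
    """
    rng = np.random.default_rng(seed)

    # Assumed "true" covariance: a random symmetric positive-definite matrix.
    a = rng.standard_normal((d, d))
    sigma = a @ a.T + d * np.eye(d)
    mean = np.zeros(d)

    errs = np.empty(trials)
    for t in range(trials):
        # n i.i.d. draws, one per row, shape (n, d).
        x = rng.multivariate_normal(mean, sigma, size=n)
        # np.cov normalizes by n - 1 by default, matching the sample covariance above.
        sigma_hat = np.cov(x, rowvar=False)
        errs[t] = np.linalg.norm(sigma_hat - sigma, ord="fro")

    root_of_mean_square = np.sqrt(np.mean(errs ** 2))  # left-hand side in Q1
    mean_of_root = np.mean(errs)                       # right-hand side in Q1
    return root_of_mean_square, mean_of_root


if __name__ == "__main__":
    lhs, rhs = compare_root_and_mean()
    print("sqrt(E||.||_F^2) estimate:", lhs)
    print("E||.||_F estimate:        ", rhs)
    print("ratio rhs / lhs:          ", rhs / lhs)
```

Running this for several values of `n` and `d`, and looking at the printed ratio, gives a rough sense of how close the two quantities are in the Gaussian case; it is of course not a proof either way.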