I am self-studying empirical process theory. I have encountered the covering number $N(\delta,\mathcal{G},P)$, as well as the empirical version $N(\delta,\mathcal{G},P_n)$. It seems intuitive to expect some kind of convergence: $$ N(\delta,\mathcal{G},P_n)\rightarrow N(\delta,\mathcal{G},P) $$ Yet, I have no idea how to prove this. Can such a result be shown? Or are there counterexamples?
Definitions
Covering number: Let $P$ be a probability measure on the Borel $\sigma$-algebra over $\mathbb{R}$. For $p\in[1,\infty)$ let $L^p(P)$ be the set of Borel-measurable mappings $f:\mathbb{R}\rightarrow\mathbb{R}$ for which $\int_\mathbb{R} |f|^p dP<\infty$, with norm $\|f\|_p=\left(\int_\mathbb{R} |f|^p dP\right)^{1/p}$. Let $\mathcal{G}$ be a totally bounded subset of $L^p(P)$. For $\delta>0$, the covering number of $\mathcal{G}$ is the smallest $N\in\mathbb{N}$ such that there exists a subset $G\subset \mathcal{G}$ with $N$ elements and the following property: for any $g\in\mathcal{G}$ there exists an $h\in G$ such that $\|g-h\|_p<\delta$. This number is denoted by $N(\delta,\mathcal{G},P)$.
Empirical measure: Let $P$ be as above. Let $\{X_n\}_{n\in\mathbb{N}}$ be a sequence of independent $P$-distributed random variables. If $\delta_{X_i}$ denotes the Dirac measure at $X_i$, the empirical measure $P_n$ is defined as: $$ P_n:\mathcal{B}(\mathbb{R})\rightarrow[0,1],\quad E\mapsto \frac{1}{n}\sum_{i=1}^n\delta_{X_i}(E) $$
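To make the empirical version concrete: I am assuming (this is my reading of the notation, not something stated in a reference I can quote) that $N(\delta,\mathcal{G},P_n)$ is the covering number from the definition above with $P$ replaced by $P_n$, so that distances are measured in $L^p(P_n)$. Since integrating against $P_n$ is just averaging over the sample, this distance is $$ \|g-h\|_{L^p(P_n)}=\left(\int_\mathbb{R}|g-h|^p\,dP_n\right)^{1/p}=\left(\frac{1}{n}\sum_{i=1}^n|g(X_i)-h(X_i)|^p\right)^{1/p} $$ Under this reading, $N(\delta,\mathcal{G},P_n)$ depends on $X_1,\dots,X_n$ and is therefore random, so the convergence I am asking about would have to be understood in a probabilistic sense, for example almost surely or in probability.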