This is a question I've always found very natural to think about (and which has honestly bothered me for a long time), but the proofs in the literature (Schoenberg's and Cavaretta's) are rather burdensome: they give explicit constants, at the cost of introducing complications and extensive computations. The other proofs I've seen only go up to $k = 2$ and don't treat the general case in any detail.
If we're willing to indulge in some more advanced techniques, we can obtain a quick proof, albeit without an explicit constant.
Let $\varphi \in C^{\infty}_c(0, 1)$, and observe that integrating by parts $k$ times (all boundary terms vanish, since $\varphi$ is compactly supported) gives $$\int_{[0, 1]} f_n \cdot D^k \varphi = (- 1)^k \int_{[0, 1]} D^k f_n \cdot \varphi,$$ where $D = d / d x$, $D^k = d^k / d x^k$.
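As a quick numerical sanity check of this identity (my own illustration; the specific integrand and bump are arbitrary choices), take $k = 2$ with the polynomial bump $\varphi(x) = (x - x^2)^4$, which vanishes to high enough order at the endpoints for two integrations by parts:

```python
import numpy as np

# Verify  int_0^1 f . D^2(phi) = (-1)^2 int_0^1 D^2(f) . phi  for
# f(x) = sin(pi x) and phi(x) = (x - x^2)^4, whose second derivative
# is computed by hand below.

h = 1e-4
x = np.arange(0.0, 1.0 + h, h)
u = x - x**2
phi = u**4                                    # phi = phi' = 0 at 0 and 1
phi2 = 12 * u**2 * (1 - 2 * x)**2 - 8 * u**3  # phi'' (hand computation)

f = np.sin(np.pi * x)                         # stand-in for f_n
f2 = -np.pi**2 * np.sin(np.pi * x)            # D^2 f

lhs = np.sum(f * phi2) * h                    # int f . D^2(phi)
rhs = np.sum(f2 * phi) * h                    # (-1)^2 int D^2(f) . phi
print(lhs, rhs)                               # the two values agree
```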
By hypothesis, $f_n \to f$ and $f_n^{(k)} \to g$, both uniformly on $[0, 1]$. Passing to the limit on both sides, we obtain $$\int_{[0, 1]} f \cdot D^k \varphi = (- 1)^k \int_{[0, 1]} g \cdot \varphi.$$
Now fix $0 < \epsilon_0 < \frac{1}{2}$ and let $0 < \epsilon \ll \epsilon_0$. Take $\phi$ to be a nonnegative smooth function with unit mass, compactly supported in $(- \frac{1}{2}, \frac{1}{2})$, and set $\phi_{\epsilon}(t) = \epsilon^{-1} \phi(t / \epsilon)$. When $x \in [\epsilon_0, 1 - \epsilon_0]$, the function $y \mapsto \phi_{\epsilon}(x - y)$ is an admissible test function: it is smooth and compactly supported in $(0, 1)$. Since $D_x^k \, \phi_{\epsilon}(x - y) = (- 1)^k D_y^k \, \phi_{\epsilon}(x - y)$, differentiating under the integral sign and applying the identity above gives, on this subinterval, $$\frac{d^k}{d x^k} (f * \phi_{\epsilon})(x) = (g * \phi_{\epsilon})(x).$$
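One can see this identity concretely (a numerical illustration for $k = 2$ with a known $g = D^2 f$, on grids of my own choosing; not part of the proof):

```python
import numpy as np

# Mollify f = sin(pi x) with a standard bump at scale eps and check that
# (f * phi_eps)'' = g * phi_eps on [eps_0, 1 - eps_0], where g = f''.

h = 1e-3
x = np.arange(-1.0, 2.0, h)            # enlarged interval so the convolution
f = np.sin(np.pi * x)                  # is clean on [0, 1]
g = -np.pi**2 * np.sin(np.pi * x)      # g = D^2 f for this f

eps = 0.05
s = np.linspace(-0.5, 0.5, round(eps / h) + 1)       # odd length: centered kernel
phi = np.exp(-1.0 / np.maximum(0.25 - s**2, 1e-12))  # bump on (-1/2, 1/2)
phi /= phi.sum() * h                                 # unit mass at scale eps

smooth = np.convolve(f, phi, mode="same") * h        # f * phi_eps
d2 = np.gradient(np.gradient(smooth, h), h)          # numerical D^2
target = np.convolve(g, phi, mode="same") * h        # g * phi_eps

interior = (x > 0.2) & (x < 0.8)                     # eps_0 = 0.2, say
err = np.max(np.abs(d2[interior] - target[interior]))
print(err)                                           # small (finite differences)
```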
We then integrate $k$ times, starting from the midpoint of the interval. By the Cauchy formula for $k$ repeated integrations, we obtain $$(f * \phi_{\epsilon})(x) = p_{\epsilon}(x) + \frac{1}{(k - 1)!} \int_{1/2}^{x} (x - y)^{k - 1} (g * \phi_{\epsilon})(y) \, dy$$ for all $x \in [\epsilon_0, 1 - \epsilon_0]$, where $p_{\epsilon}$ is a polynomial of degree $\leq k - 1$.
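The Cauchy formula itself is easy to sanity-check numerically (an illustration with $f = \exp$, $k = 3$, and base point $\frac{1}{2}$, standing in for the smooth function $f * \phi_{\epsilon}$; the choices are mine):

```python
import numpy as np
from math import exp, factorial

# Check  f(x) = p(x) + 1/(k-1)! * int_a^x (x - y)^(k-1) f^(k)(y) dy,
# where p is the degree <= k-1 Taylor polynomial of f at a.
# Here f = exp (so f^(k) = exp), k = 3, a = 1/2, evaluated at x = 0.9.

k, a, xval = 3, 0.5, 0.9
p = sum(exp(a) * (xval - a)**j / factorial(j) for j in range(k))

y = np.linspace(a, xval, 100001)
h = y[1] - y[0]
integrand = (xval - y)**(k - 1) * np.exp(y)
integral = (integrand.sum() - 0.5 * (integrand[0] + integrand[-1])) * h
integral /= factorial(k - 1)

print(p + integral, exp(xval))   # the two numbers agree
```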
We want to do the obvious thing and let $\epsilon \to 0$. To see that this is allowed, note that for $x \in [\epsilon_0, 1 - \epsilon_0]$, $$|p_{\epsilon}(x) - p_{\epsilon'}(x)| \leq |(f * \phi_{\epsilon})(x) - (f * \phi_{\epsilon'})(x)| + \frac{1}{(k - 1)!} \int_{\epsilon_0}^{1 - \epsilon_0} |(g * \phi_{\epsilon})(y) - (g * \phi_{\epsilon'})(y)| \, dy,$$ using $|x - y|^{k - 1} \leq 1$. Since $f$ is uniformly continuous, $f * \phi_{\epsilon} \to f$ uniformly, so the first term tends to zero uniformly in $x$ as $\epsilon, \epsilon' \to 0$. The same applies to $g$, which makes the integrand, and hence the second term, vanish in the limit.
However, the space of complex polynomials of degree $\leq k - 1$ on $[\frac{1}{4}, \frac{3}{4}]$ (say, taking $\epsilon_0 < \frac{1}{4}$) is a finite-dimensional normed space, and it is well known that all norms on such a space are equivalent. So the $p_{\epsilon}$ are Cauchy in the uniform norm, hence also in the $\ell^1$ norm of coefficients, and each of the $k$ coefficients of $p_{\epsilon}$ converges to a definite value. In the limit we therefore recover a polynomial $p$ of degree $\leq k - 1$, defined on the entire interval.
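Concretely, the coefficients of a polynomial of degree $\leq k - 1$ are a bounded linear function of its values at any $k$ distinct points of $[\frac{1}{4}, \frac{3}{4}]$, which is one way to see the equivalence in action (a small illustration with $k = 4$ and points of my own choosing):

```python
import numpy as np

# Coefficients of p (degree <= k-1) from its values at k distinct points:
# a fixed invertible linear map, so uniform convergence of the p_eps on
# [1/4, 3/4] forces convergence of every coefficient.

k = 4
pts = np.linspace(0.25, 0.75, k)            # any k distinct points work
V = np.vander(pts, k, increasing=True)      # V @ coeffs = values at pts
M = np.linalg.inv(V)                        # coeffs = M @ values

rng = np.random.default_rng(0)
coeffs = rng.normal(size=k)                 # a random p of degree <= k-1
recovered = M @ (V @ coeffs)                # recover coeffs from point values
print(np.max(np.abs(recovered - coeffs)))   # ~ machine precision
```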
Thus we have $$f(x) = p(x) + \frac{1}{(k - 1)!} \int_{1/2}^{x} (x - y)^{k - 1} g(y) \, dy$$ for all $x \in [\epsilon_0, 1 - \epsilon_0]$. As $\epsilon_0$ was arbitrary and polynomials are rigid, this equality holds everywhere in $(0, 1)$ with the same polynomial $p$, and then on $[0, 1]$ by continuity. Differentiating the representation $k$ times shows that $f \in C^k[0, 1]$ with $D^k f = g$, which is exactly the statement that the graph of $T : D(T) \to X$ is closed in $X \times X$.
From this, we easily recover that $C^k[0, 1]$ with the graph norm $\|f\|_{C^0[0, 1]} + \|D^k f\|_{C^0[0, 1]}$ is complete, and then the closed graph theorem (applied to the identity map onto $C^k[0, 1]$ with its usual norm) yields $$\|f\| + \|D^k f\| \simeq \sum_{i = 0}^{k} \|D^i f\|.$$
Since this inequality on $[0, 1]$ is translation-invariant, we can apply it on every unit interval $[a, a + 1]$ and take the supremum over $a \in \mathbb{R}$, patching it together over the whole real line. So if $f$ is $k$-times continuously differentiable and $f, f^{(k)}$ are bounded on $\mathbb{R}$, then every intermediate derivative $f^{(r)}$ is also uniformly bounded, and obeys $$\sup_{x \in \mathbb{R}} |f^{(r)}(x)| \lesssim_k \sup_{x \in \mathbb{R}} |f(x)| + \sup_{x \in \mathbb{R}} |f^{(k)}(x)|.$$
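As a sanity check of the $k = 2$ case (my own numerical illustration; here I use the classical form on the line, $\sup|f'|^2 \leq 2 \sup|f| \sup|f''|$, due to Hadamard):

```python
import numpy as np

# Check  sup|f'|^2 <= 2 sup|f| sup|f''|  for one concrete f on the line,
# with derivatives taken by finite differences. This f decays rapidly,
# so restricting to [-8, 8] loses nothing at this precision.

h = 1e-4
x = np.arange(-8.0, 8.0, h)
f = np.exp(-x**2) * np.sin(5 * x)
f1 = np.gradient(f, h)
f2 = np.gradient(f1, h)

s0, s1, s2 = (float(np.max(np.abs(a))) for a in (f, f1, f2))
print(s1**2, 2 * s0 * s2)   # the left-hand side is the smaller one
```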
Finally, introducing the rescaling $x \mapsto \lambda x$ and optimizing over $\lambda$, we obtain the optimal dependence on the norms, $$\|f^{(r)}\| \lesssim_k \|f\|^{1 - r / k} \|f^{(k)}\|^{r / k}$$ (up to a constant that the above-mentioned authors compute sharply), and the multidimensional case of the inequality follows by restricting to lines and applying the one-dimensional result to directional derivatives.
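For completeness, the rescaling step can be spelled out (nothing here beyond the additive inequality above):

```latex
% Apply the additive bound to f_lambda(x) := f(lambda x), using
% \|f_\lambda^{(j)}\|_\infty = \lambda^j \|f^{(j)}\|_\infty:
\lambda^{r} \|f^{(r)}\| \le C_k \left( \|f\| + \lambda^{k} \|f^{(k)}\| \right),
\qquad \text{i.e.} \qquad
\|f^{(r)}\| \le C_k \left( \lambda^{-r} \|f\| + \lambda^{k - r} \|f^{(k)}\| \right).
% If f^{(k)} \equiv 0, then f is a bounded polynomial on the line, hence
% constant, and the inequality is trivial. Otherwise choose
\lambda = \left( \|f\| / \|f^{(k)}\| \right)^{1/k}
% to balance the two terms; each becomes \|f\|^{1 - r/k} \|f^{(k)}\|^{r/k}, so
\|f^{(r)}\| \le 2 C_k \, \|f\|^{1 - r/k} \|f^{(k)}\|^{r/k}.
```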