Interestingly, Maxime Bôcher (Introduction to Higher Algebra, 1907, pp.297-299) used exactly this result to prove that every nonsingular matrix has a square root. Although I rediscovered this result by considering the square roots of Jordan blocks, Jordan form is not needed in the proof of this result.
Bôcher's argument can be easily generalized to prove the following statement.
- When $n$ is a positive integer, $f$ is a polynomial over an algebraically closed field of characteristic $0$ and $f(0)\ne0$, there exists a polynomial $g$ such that $fg+x$ is a perfect $n$-th power.
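(Consequently, for every positive integer $n$, every nonsingular square matrix $A$ over an algebraically closed field of characteristic zero has a matrix $n$-th root that can be expressed as a polynomial in $A$. To see this, take $f$ to be the characteristic polynomial of $A$: then $f(0)=\pm\det A\ne0$, and if $fg+x=\chi^n$ for some polynomial $\chi$, the Cayley-Hamilton theorem gives $f(A)=0$ and hence $\chi(A)^n=A$. If $A$ is symmetric, its matrix $n$-th root can also be taken to be symmetric, because a polynomial in a symmetric matrix is symmetric.)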
Here is a sketch of his proof. Let
$$f(x)=c\prod_{i=1}^m(x-\lambda_i)^{a_i}$$
where the $\lambda_i$s are distinct and nonzero (because $f(0)\ne0$). For each $i$, let $h_i(x)=c\prod_{j\neq i}(x-\lambda_j)^{a_j}$, the polynomial obtained from $f$ by omitting the factor $(x-\lambda_i)^{a_i}$. Bôcher's idea is to consider a polynomial of the form
$$\chi=\sum_{i=1}^mq_ih_i$$
where $\deg q_i\leq a_i-1$. He wanted to show that the $q_i$s can be suitably chosen so that $\left(\chi(x)\right)^n-x$ is divisible by $f$.
Since $(x-\lambda_i)^{a_i}$ divides $h_j$ whenever $i\ne j$, we have
$$\chi^n-x \equiv q_i^nh_i^n-x \mod (x - \lambda_i)^{a_i}.$$
So, if each $q_i$ is chosen so that $q_i^nh_i^n-x$ is divisible by $(x - \lambda_i)^{a_i}$, then $\chi^n-x$ is divisible by every $(x - \lambda_i)^{a_i}$ and hence, because these factors are pairwise coprime, also by $f$.
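As a small illustration (my own example, not taken from Bôcher's text), take $n=2$ and $f(x)=(x-1)(x-4)$, so that $h_1=x-4$ and $h_2=x-1$. Both roots are simple, so the $q_i$ are constants, and the divisibility condition at $\lambda_i$ just says $q_i^2h_i(\lambda_i)^2=\lambda_i$. Choosing the square roots $1$ and $2$ of the two roots of $f$ gives $q_1=1/h_1(1)=-\tfrac13$ and $q_2=2/h_2(4)=\tfrac23$, hence

$$\chi(x)=-\tfrac13(x-4)+\tfrac23(x-1)=\frac{x+2}{3},\qquad \chi(x)^2-x=\frac{x^2-5x+4}{9}=\frac{(x-1)(x-4)}{9}.$$

So $fg+x=\chi^2$ with $g=\tfrac19$. Other choices of square roots give other valid $\chi$.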
It remains to choose an appropriate $q_i$. Define
$$p_i=q_i^nh_i^n-x.$$
Then, since the field has characteristic $0$, $p_i$ is divisible by $(x - \lambda_i)^{a_i}$ if and only if $0=p_i(\lambda_i)=p_i'(\lambda_i)=\cdots=p_i^{(a_i-1)}(\lambda_i)$. If we take
$$q_i(x)=\sum_{r=0}^{a_i-1}c_r(x-\lambda_i)^r,$$
the system of equations reduces to
$$\begin{align}0=p_i(\lambda_i)&=c_0^nh_i(\lambda_i)^n-\lambda_i,\tag{1}\\ 0=p_i^{(r)}(\lambda_i)&=nr!c_0^{n-1}c_rh_i(\lambda_i)^n+d_r\quad(\text{for } 1\le r<a_i),\tag{2}\end{align}$$
where $d_r$ is a constant that depends only on $c_0,c_1,\ldots,c_{r-1}$ (and on $h_i$), but not on $c_s$ for any $s\ge r$.
Since the underlying field is algebraically closed and $h_i(\lambda_i)\ne0$, we may take $c_0=\lambda_i^{1/n}/h_i(\lambda_i)$ in equation $(1)$, where $\lambda_i^{1/n}$ is any $n$-th root of $\lambda_i$. As $\lambda_i$ is nonzero, so is $c_0$, and since the characteristic is $0$ we have $nr!\ne0$; therefore $nr!c_0^{n-1}h_i(\lambda_i)^n\ne0$ in equation $(2)$. Hence we may determine $c_1,c_2,\ldots,c_{a_i-1}$ from $(2)$ successively and obtain $q_i$.
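To see the successive determination of the $c_r$ at work, here is a worked example of my own (not from Bôcher), with $n=2$ and $f(x)=(x-1)^3$. There is a single root $\lambda_1=1$ with $a_1=3$, and $h_1=1$. Write $t=x-1$ and $q_1=c_0+c_1t+c_2t^2$, so that $p_1=q_1^2-x$. The three conditions are

$$\begin{aligned}0=p_1(1)&=c_0^2-1,\\ 0=p_1'(1)&=2c_0c_1-1,\\ 0=p_1''(1)&=4c_0c_2+2c_1^2.\end{aligned}$$

Taking $c_0=1$, the second equation gives $c_1=\tfrac12$ and the third then gives $c_2=-\tfrac18$; here $d_1=-1$ and $d_2=2c_1^2$ indeed involve only earlier coefficients. Therefore

$$\chi(x)=1+\tfrac12(x-1)-\tfrac18(x-1)^2,\qquad \chi(x)^2-x=\frac{(x-1)^3(x-9)}{64},$$

that is, $fg+x=\chi^2$ with $g=\tfrac{1}{64}(x-9)$. In particular, any matrix $A$ with $(A-I)^3=0$ has the square root $I+\tfrac12(A-I)-\tfrac18(A-I)^2$.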
(I guess the proof in Shengtong Zhang's answer is essentially the same, but he has used something called "Hensel lifting" that I have never heard of and so I don't really understand his answer.)