The Fisher information only has a precise meaning when you are dealing with a normally distributed random variable. In that case, the log-likelihood function is parabolic, and the Fisher information equals the curvature (the negative second derivative of the log-likelihood) at the MLE. It turns out mathematically that this curvature is the inverse of the variance of the associated normal random variable.
This is what guides the intuition surrounding Fisher information, even though it will only hold approximately for non-normal variables (although, subject to some technical conditions, it will usually be asymptotically true). Its inverse also serves as a lower bound on the variance of any unbiased estimator (the Cramér–Rao bound); see here.
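As a rough illustration of that asymptotic claim, here is a minimal Monte Carlo sketch (assuming NumPy, and using an exponential model purely as a hypothetical non-normal example): for large $n$, the sampling variance of the MLE of the rate approaches the inverse of the total Fisher information, $\lambda^2/n$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical non-normal example: X ~ Exponential(rate), where the per-
# observation Fisher information is I(rate) = 1 / rate**2, so the asymptotic
# variance of the MLE is rate**2 / n.  (Parameter values are arbitrary.)
rate, n, reps = 2.0, 500, 20_000

samples = rng.exponential(scale=1.0 / rate, size=(reps, n))
mle = 1.0 / samples.mean(axis=1)          # MLE of the rate is 1 / sample mean

print("Monte Carlo variance of the MLE:", mle.var())
print("1 / (n * I(rate)) = rate^2 / n: ", rate**2 / n)
```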
Demonstration of the relationship between $I(\mu)$ and $\sigma^2$ for a Gaussian likelihood for the mean
Let $f(x;\theta) := f(x;\mu,\sigma) = \frac{1}{\sqrt{2\pi}\sigma}e^{-\frac{(x-\mu)^2}{2\sigma^2}}$
Take the logarithm of this:
$$\log(f) = -\log\sqrt{2\pi} - \log\sigma -\frac{(x-\mu)^2}{2\sigma^2}$$
We are looking at the likelihood for the mean ($\mu$) given a sample of data ($\mathbf{x}$), so we treat the above as a function of $\mu$. For an i.i.d. sample $\mathbf{x} = (x_1, \dots, x_n)$, the log-likelihood for $\mu$ is the sum of the per-observation log-densities:
$$L(\mu|\mathbf{x},\sigma) =-n\log\sqrt{2\pi} - n\log\sigma -\frac{1}{2\sigma^2}\sum\limits_{x_i \in \mathbf{x}}(x_i-\mu)^2$$
This function is quadratic in $\mu$, so its second derivative $\frac{d^2L}{d\mu^2}$ is a constant. Specifically:
$$\frac{dL}{d\mu} =\frac{1}{\sigma^2}\sum\limits_{x_i \in \mathbf{x}}(x_i-\mu) \implies \frac{d^2L}{d\mu^2} = \frac{-n}{\sigma^2} = \textrm{constant} $$
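A quick numerical sanity check (a minimal sketch assuming NumPy; the parameter values and sample are arbitrary) shows that the finite-difference curvature of the log-likelihood is the same at any $\mu$ and matches $-n/\sigma^2$:

```python
import numpy as np

rng = np.random.default_rng(1)

# Arbitrary, assumed values for the check.
mu_true, sigma, n = 3.0, 2.0, 50
x = rng.normal(mu_true, sigma, size=n)

def log_lik(mu):
    """Gaussian log-likelihood in mu (sigma treated as known)."""
    return (-n * np.log(np.sqrt(2 * np.pi)) - n * np.log(sigma)
            - np.sum((x - mu) ** 2) / (2 * sigma**2))

# Central second difference of L at two different points: because L is
# quadratic in mu, the curvature is the same everywhere.
h = 1e-3
for mu0 in (0.0, x.mean()):
    d2 = (log_lik(mu0 + h) - 2 * log_lik(mu0) + log_lik(mu0 - h)) / h**2
    print(f"curvature at mu = {mu0:.3f}: {d2:.4f}")

print("-n / sigma^2 =", -n / sigma**2)
```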
Therefore,
$$-E_{\mu}\left(\frac{-n}{\sigma^2}\right) = \frac{n}{\sigma^2} = I(\mu)$$
This is the Fisher information about $\mu$. On the other hand, the MLE of $\mu$ is the sample mean $\hat\mu = \bar{x}$, whose standard error satisfies:
$$ se(\hat \mu) = \frac{\sigma}{\sqrt{n}} \implies I(\mu) = \frac{1}{se(\hat \mu)^2} = \frac{1}{\sigma^2_{\hat \mu}}$$
Therefore, in this Gaussian case the Fisher information is exactly the inverse of the variance of the MLE.
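To close the loop, a short simulation (again just a sketch, assuming NumPy and arbitrary parameter values) confirms that the sampling variance of $\hat\mu = \bar{x}$ matches $1/I(\mu) = \sigma^2/n$:

```python
import numpy as np

rng = np.random.default_rng(2)

# Arbitrary, assumed values: true mean, known sigma, sample size, replicates.
mu, sigma, n, reps = 3.0, 2.0, 50, 100_000

# The MLE of mu is the sample mean of each simulated sample.
mle = rng.normal(mu, sigma, size=(reps, n)).mean(axis=1)

print("Monte Carlo variance of the MLE:", mle.var())
print("1 / I(mu) = sigma^2 / n:        ", sigma**2 / n)
```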