The physical interpretation of the fractional-order derivative is perhaps best conveyed by the Grünwald-Letnikov definition. Without loss of generality, consider the autonomous dynamical system:
$$\dot{x}=f(x)$$
We can model this dynamical system in discrete time, using a small step $\delta$:
$$\frac{x(t)-x(t-\delta)}{\delta}\approx f\left(x(t-\delta)\right)$$
Or, in the traditional form, replacing $t-\delta$ with the step index $k$ and writing $F(x)=x+\delta f(x)$:
$$x(k+1)=F(x(k))$$
What does this equation say? The evident fact about this model is that the future state of the system is a function of the current state only.
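As a concrete illustration, here is a minimal Python sketch of this memoryless update. The choice $f(x)=-x$, the step size, and the function names are my own assumptions for the example; $F(x)=x+\delta f(x)$ is simply what you get by solving the difference quotient above for $x(t)$.

```python
import math

def euler_step(f, x, delta):
    """One update x(k+1) = F(x(k)), with F(x) = x + delta * f(x)."""
    return x + delta * f(x)

def simulate(f, x0, delta, steps):
    """Iterate the update; each new state depends only on the latest one."""
    xs = [x0]
    for _ in range(steps):
        xs.append(euler_step(f, xs[-1], delta))
    return xs

# Assumed example system: x' = -x, x(0) = 1, integrated up to t = 1.
f = lambda x: -x
xs = simulate(f, 1.0, 0.001, 1000)
print("discrete model:", xs[-1], " exact exp(-1):", math.exp(-1.0))
```

Note that `simulate` only ever reads `xs[-1]`: the rest of the trajectory is stored for plotting, not used by the dynamics.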
Now consider how to generalize the definition of the derivative. We know that:
$$\mathcal D^1 x(t)= x'(t)=\lim_{h\to 0}{\frac{x(t)-x(t-h)}{h}}$$
By induction and using the Leibniz rule (for integer $n$, $\binom{n}{i}=0$ whenever $i>n$, so the sum below is in fact finite):
$$\mathcal D^n x(t)= x^{(n)}(t)=\lim_{h\to 0}{\frac{1}{h^n}\sum_{i=0}^{\infty}(-1)^i \binom{n}{i} x(t-ih)}$$
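A quick numerical sanity check of this formula, as a sketch only: the limit is replaced by a small fixed step `h` (an assumption of mine, as are the test function $\sin$ and the function name), so the result is accurate only to roughly first order in `h`.

```python
import math

def integer_derivative(x, t, n, h=1e-3):
    """Approximate the n-th derivative of x at t by the backward-difference sum
    (1 / h**n) * sum_{i=0}^{n} (-1)**i * C(n, i) * x(t - i*h)."""
    total = 0.0
    for i in range(n + 1):  # terms with i > n vanish because C(n, i) = 0
        total += (-1) ** i * math.comb(n, i) * x(t - i * h)
    return total / h ** n

# Assumed test case: x(t) = sin(t), whose second derivative is -sin(t).
approx = integer_derivative(math.sin, 1.0, 2)
exact = -math.sin(1.0)
print("approximation:", approx, " exact:", exact)
```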
Grünwald and Letnikov generalized this definition to fractional derivatives by replacing $n$ with $\alpha\in\mathbb R_+$ and defining:
$$\binom{\alpha}{i}=\frac{\Gamma(\alpha+1)}{\Gamma(\alpha-i+1)\,i!}$$
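In practice these coefficients are easier to compute with the recurrence $\binom{\alpha}{i}=\binom{\alpha}{i-1}\,\frac{\alpha-i+1}{i}$, starting from $\binom{\alpha}{0}=1$, than with the Gamma function directly. The sketch below (function names are mine, not from any library) compares the two for $\alpha=0.5$.

```python
import math

def gen_binom(alpha, i):
    """Generalized binomial coefficient via the recurrence
    C(alpha, j) = C(alpha, j-1) * (alpha - j + 1) / j, with C(alpha, 0) = 1."""
    c = 1.0
    for j in range(1, i + 1):
        c *= (alpha - j + 1) / j
    return c

def gen_binom_gamma(alpha, i):
    """The same coefficient from the Gamma-function formula.
    Only usable when alpha - i + 1 is not a non-positive integer (a pole of Gamma)."""
    return math.gamma(alpha + 1) / (math.gamma(alpha - i + 1) * math.factorial(i))

for i in range(6):
    print(i, gen_binom(0.5, i), gen_binom_gamma(0.5, i))
```

One reason to prefer the recurrence: for integer $\alpha$ and $i>\alpha$ the Gamma formula hits a pole (Python's `math.gamma` raises `ValueError` there), whereas the recurrence simply returns the limiting value $0$.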
Back to the story. Now consider the fractional-order system:
$$\mathcal D^{\alpha} x(t)=f(x)$$
Dropping the limit and using a small step $\delta$, as in the integer-order case, we can say that:
$${\frac{1}{\delta^{\alpha}}\sum_{i=0}^{\infty}(-1)^i \binom{\alpha}{i}x(t-i\delta)}\approx f\left(x(t-\delta)\right)$$
Hence, isolating the $i=0$ and $i=1$ terms (with $\binom{\alpha}{0}=1$ and $\binom{\alpha}{1}=\alpha$):
$$x(t)=\delta^{\alpha}f\left(x(t-\delta)\right)+\alpha x(t-\delta)- \sum_{i=2}^{\infty}(-1)^i \binom{\alpha}{i}x(t-i\delta)$$
As with the integer-order system, one can convert the above equation to a standard form by writing $F(x,\alpha)=\delta^{\alpha}f(x)+\alpha x$ and shifting the summation index by one:
$$x(k+1)=F(x(k),\alpha)+\sum_{i=1}^{\infty}(-1)^i \binom{\alpha}{i+1} x(k-i)$$
Note the difference between the above equation and the integer-order model. The next state of the system is a function not only of the current state but also of all previous states (some call this the memory of the system). Since $\binom{\alpha}{i+1}$ shrinks in magnitude as $i$ grows (asymptotically like $i^{-(\alpha+1)}$), the effect of this memory fades in time.
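To make the memory term tangible, here is a rough Python sketch of the recursion $x(t)=\delta^{\alpha}f(x(t-\delta))+\alpha x(t-\delta)-\sum_{i\ge 2}(-1)^i\binom{\alpha}{i}x(t-i\delta)$. Several things here are my own assumptions rather than part of the argument above: the example $f(x)=-x$, the function names, and in particular the truncation of the infinite sum at the start of the simulation (i.e. treating $x$ as zero before $t=0$). Take it as an illustration, not a validated solver.

```python
def gl_weights(alpha, n):
    """w[i] = (-1)**i * C(alpha, i) for i = 0..n, using w[i] = w[i-1] * (i - 1 - alpha) / i."""
    w = [1.0]
    for i in range(1, n + 1):
        w.append(w[-1] * (i - 1 - alpha) / i)
    return w

def simulate_fractional(f, x0, alpha, delta, steps):
    """Iterate the Grunwald-Letnikov recursion; the history is assumed zero before t = 0."""
    w = gl_weights(alpha, steps)
    xs = [x0]
    for k in range(steps):
        # Memory term: every state older than the current one, weighted by w[i], i >= 2.
        memory = sum(w[i] * xs[k + 1 - i] for i in range(2, k + 2))
        xs.append(delta ** alpha * f(xs[k]) + alpha * xs[k] - memory)
    return xs

f = lambda x: -x  # assumed example system
xs_frac = simulate_fractional(f, 1.0, 0.5, 0.01, 200)  # alpha = 0.5: full memory
xs_int = simulate_fractional(f, 1.0, 1.0, 0.01, 200)   # alpha = 1: memory weights vanish
print("alpha = 0.5:", xs_frac[-1], " alpha = 1.0:", xs_int[-1])
print("first memory weights for alpha = 0.5:", gl_weights(0.5, 6)[2:])
```

For $\alpha=1$ all weights with $i\ge 2$ are zero, so the loop reduces to the memoryless update $x(k+1)=x(k)+\delta f(x(k))$. For non-integer $\alpha$ each step touches the whole stored trajectory, which is why the cost per step grows with the length of the history.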
TL;DR
According to the Grünwald-Letnikov definition, one may conclude that systems with memory effects can be better modeled by fractional derivatives. There are already plenty of papers taking this approach in areas such as viscoelasticity, diffusion, and fluid dynamics, and the number is growing fast.