Imagine you have a function like this: $$f(x) = p_3(p_2(p_1(x))).$$
Now imagine that you find differences $\Delta_0, \Delta_1$ such that $p_1(x \oplus \Delta_0) = p_1(x) \oplus \Delta_1$ with probability $2^{-n_1}$ (taken over a uniformly random input $x$), a difference $\Delta_2$ such that $p_2(x \oplus \Delta_1) = p_2(x) \oplus \Delta_2$ with probability $2^{-n_2}$, and a difference $\Delta_3$ such that $p_3(x \oplus \Delta_2) = p_3(x) \oplus \Delta_3$ with probability $2^{-n_3}$. Then:
- $\Delta_0 \xrightarrow{f} \Delta_3$ is a differential for $f$ (with probability at least $2^{-n_1} \cdot 2^{-n_2} \cdot 2^{-n_3} = 2^{-(n_1 + n_2 + n_3)}$ if we can assume independence);
- $\Delta_0 \xrightarrow{p_1} \Delta_1 \xrightarrow{p_2} \Delta_2 \xrightarrow{p_3} \Delta_3$ is a differential trail, also known as a differential characteristic, for $f$ (with probability $2^{-(n_1 + n_2 + n_3)}$ if we can assume independence).
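So, a differential trail contains not only the input and output differences of your function, but also all of the intermediate differences (and their respective probabilities). You can also view a differential as the collection of all the differential trails that share the same input and output differences: under the independence assumption, its probability is the sum of the probabilities of those trails, which is why a single trail only gives a lower bound.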
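To make the distinction concrete, here is a minimal Python sketch on a 4-bit toy cipher. Everything in it is an assumption for illustration rather than part of the question: the same 4-bit permutation `SBOX` is reused for $p_1$, $p_2$ and $p_3$ (the values are those of the PRESENT S-box, but any permutation would do), and independent, uniformly random round keys `k1`, `k2` are XORed between the layers, which is one standard way of making the independence assumption hold exactly on average over the keys. The helper names `ddt`, `differential_count` and `trail_counts` are hypothetical.

```python
from itertools import product

N = 16  # size of the 4-bit toy state space

# Assumption: one 4-bit permutation reused for p1, p2 and p3.
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]


def ddt(p):
    """table[a][b] = number of inputs x with p[x ^ a] ^ p[x] == b."""
    table = [[0] * N for _ in range(N)]
    for a, x in product(range(N), repeat=2):
        table[a][p[x ^ a] ^ p[x]] += 1
    return table


def f(x, k1, k2):
    """Keyed toy version of f = p3(p2(p1(x))); the round keys are an assumption."""
    return SBOX[SBOX[SBOX[x] ^ k1] ^ k2]


def differential_count(d0, d3):
    """Brute force: number of triples (x, k1, k2) with f(x ^ d0) ^ f(x) == d3."""
    return sum(
        f(x ^ d0, k1, k2) ^ f(x, k1, k2) == d3
        for x, k1, k2 in product(range(N), repeat=3)
    )


def trail_counts(d0, d3):
    """Map each possible trail (d1, d2) to the product of its three DDT entries."""
    t = ddt(SBOX)
    return {
        (d1, d2): t[d0][d1] * t[d1][d2] * t[d2][d3]
        for d1, d2 in product(range(N), repeat=2)
        if t[d0][d1] and t[d1][d2] and t[d2][d3]
    }


d0, d3 = 0x1, 0x3  # arbitrary input and output differences
trails = trail_counts(d0, d3)
total = differential_count(d0, d3)
best = max(trails.values(), default=0)

# Dividing a count by N**3 turns it into a probability.
assert sum(trails.values()) == total  # the differential is the sum of its trails
assert best <= total                  # one trail is only a lower bound
print(f"differential: {total}/{N**3}, best single trail: {best}/{N**3}")
```

Each entry of `trails` is one differential trail $\Delta_0 \to \Delta_1 \to \Delta_2 \to \Delta_3$ with its own probability, while `total` counts the differential $\Delta_0 \to \Delta_3$ without caring which intermediate differences occurred. Without the random round keys (that is, for the fixed $f$ above), the product of the per-layer probabilities is only an estimate, and the exact values can deviate from it.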