This is because a tangential diffeomorphism is involved.
In general for any family of immersions $\phi : M\times [0,T)\to \mathbb R^{n+1}$, the MCF equation is $\partial_t \phi=H\nu$. But one can also consider
$$\tag{1} (\partial_t \tilde \phi )^\perp = H\nu, \text{ or } \langle \partial_t \tilde \phi, \nu \rangle = H. $$
The difference is that $\partial_t \tilde \phi$ might have a non-zero tangential part $X = (\partial_t \tilde \phi)^\top$. However, one can argue as follows: let $Y$ be the time-dependent vector field on $M$ such that $\tilde \phi_* Y = X$. Consider the one-parameter family of diffeomorphisms of $M$ generated by $-Y$:
$$\tag{2} y: M \times [0,T) \to M, \ \ \frac{\partial y}{\partial t} (x, t) = -Y(y(x, t), t), \ \ y(x, 0) = x.$$
Then $\tilde \phi$ satisfies (1) if and only if the composition
$$ \phi (x, t) = \tilde \phi(y(x, t), t)$$
satisfies the usual MCF.
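To check the sign in (2), here is a sketch of the chain-rule computation, assuming $y$ exists and is smooth, and using that $H$ and $\nu$ are geometric (so $H\nu$ for $\phi$ at $x$ agrees with $H\nu$ for $\tilde\phi$ at $y(x,t)$). With everything on the right evaluated at $(y(x,t), t)$:
$$ \partial_t \phi (x, t) = \partial_t \tilde\phi + \tilde\phi_* \left( \frac{\partial y}{\partial t}(x,t) \right) = (H\nu + X) - \tilde\phi_* Y = H\nu + X - X = H\nu. $$
Conversely, composing with a time-dependent diffeomorphism of $M$ only adds a tangential term to the velocity, so the normal component $\langle \partial_t \tilde\phi, \nu\rangle$ is unchanged and (1) holds for $\tilde\phi$ whenever $\phi$ satisfies the usual MCF.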
Going back to the graphical case: we are really solving (1), which becomes
$$ \frac{\partial _t f }{ \sqrt{1+ |\nabla f|^2}} = H\Rightarrow \partial _t f = \sqrt{1+|\nabla f|^2} \operatorname{div} \left(\frac{\nabla f}{\sqrt{1+|\nabla f|^2}}\right)
$$
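For concreteness, here is how the left-hand side arises, under the assumption (not spelled out above) that the graph is parametrized by $\tilde\phi(x,t) = (x, f(x,t))$ with upward unit normal $\nu$:
$$ \partial_t \tilde\phi = (0, \partial_t f), \qquad \nu = \frac{(-\nabla f, 1)}{\sqrt{1+|\nabla f|^2}}, \qquad \langle \partial_t \tilde\phi, \nu\rangle = \frac{\partial_t f}{\sqrt{1+|\nabla f|^2}}. $$
With the same assumptions, the tangential part that (1) ignores can be written down explicitly:
$$ X = \partial_t \tilde\phi - \langle \partial_t \tilde\phi, \nu\rangle \nu = \frac{\partial_t f}{1+|\nabla f|^2} \left(\nabla f, |\nabla f|^2\right) = \tilde\phi_* Y, \qquad Y = \frac{\partial_t f}{1+|\nabla f|^2} \nabla f. $$
So $X$ vanishes only where $\partial_t f = 0$ or $\nabla f = 0$, which is why the graphical equation is in general not the usual MCF $\partial_t \phi = H\nu$ but only its normal-component version (1).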
Remark The trick of using (1) instead of the usual MCF is called the DeTurck trick and originated in Ricci flow. You can see it explained here or on p.548 here. (a related question)
Remark It seems that whether or not (1) is really equivalent to the MCF depends on whether the ODE (2) is solvable. When $M$ is compact this can always be done. In the non-compact case, there might be some reason why (2) can always be solved, but I cannot find it in the literature (as you can see in Ecker-Huisken's paper, they did not even bother to explain). In fact, a doubt is raised in this MSE post