Yes, as mentioned in the comments, the energy integral is useful here. Let $u$ be defined as in the question.
Define
$$E(t)=\int_{U}{u(t,x)}^2~\mathrm d^m x$$
Note $E$ is bounded below by $0$.
Now, observe
$$\dot E(t)=\int_U 2 ~u(t,x)~\partial_tu(t,x)~\mathrm d^m x \\ =2\int_U (u ~\Delta u)(t,x)\mathrm d^mx \\ =2\int_U \big(u ~\nabla\cdot( \nabla u)\big)(t,x)\mathrm d^mx$$
Recall the generalized integration by parts:
$$\int_U \phi~\nabla\cdot v~\mathrm d\mu^m=\int_{\partial U}n\cdot \phi v~\mathrm d\mu^{m-1}-\int_{U}v\cdot \nabla\phi~\mathrm d\mu^m$$
Taking in our case $\phi=u$ and $v=\nabla u$, we get
$$\dot E(t)=2\int_U \big(u ~\nabla\cdot( \nabla u)\big)(t,x)\mathrm d^mx \\ =2\int_{\partial U} \big( n\cdot (u\nabla u)\big)(t,x)\mathrm d^{m-1} x-2\int_U |\nabla u|^2(t,x)\mathrm d^m x$$
The first integral is zero due to the assumptions on the boundary data of $u$, and therefore we obtain
$$\dot E(t)=-2\int_U|\nabla u|^2(t,x)\mathrm d^m x$$
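Since $u$ vanishes on $\partial U$, Poincaré's inequality provides a constant $C>0$ with ${\Vert\nabla u(t,\cdot)\Vert_2}^2\geq C{\Vert u(t,\cdot)\Vert_2}^2$, which implies
$$\dot E(t)=-2\int_U |\nabla u|^2\mathrm d^m x=-2{\left\Vert\nabla u(t,\cdot)\right\Vert_2}^2\leq -2C {\Vert u(t,\cdot)\Vert_2}^2$$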
Since ${\Vert u(t,\cdot)\Vert_2}^2=E$, we have
$$\dot E\leq -2C E$$
Hence, by Gronwall's inequality, we get $E(t)\leq \mathrm e^{-2Ct}E(0)$, which implies $E\to 0$ and therefore ${\Vert u(t,\cdot)\Vert_2}\to 0$ as $t\to\infty$.
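As a sanity check of the energy estimate, here is a minimal Python sketch for one concrete case that is my own choice and not part of the question: $U=(0,1)$ in one space dimension, where $u(t,x)=\sum_k b_k\,\mathrm e^{-(k\pi)^2t}\sin(k\pi x)$ solves the heat equation with zero boundary data and the Poincaré constant can be taken as $C=\pi^2$. The coefficients `COEFFS` and the helper names `energy`, `dirichlet` and `report` are hypothetical, chosen only for illustration.

```python
import math

# Hypothetical example data (not from the question): mode k -> amplitude b_k of
# u(t, x) = sum_k b_k * exp(-(k*pi)^2 * t) * sin(k*pi*x) on U = (0, 1).
COEFFS = {1: 1.0, 2: -0.5, 5: 0.25}

# Assumed Poincare constant on (0, 1) for functions vanishing at both endpoints.
C = math.pi ** 2


def energy(t):
    """E(t) = int_0^1 u^2 dx, using orthogonality and int_0^1 sin^2(k*pi*x) dx = 1/2."""
    return sum(0.5 * b * b * math.exp(-2 * (k * math.pi) ** 2 * t)
               for k, b in COEFFS.items())


def dirichlet(t):
    """int_0^1 |u_x|^2 dx, using orthogonality and int_0^1 cos^2(k*pi*x) dx = 1/2."""
    return sum(0.5 * (k * math.pi * b) ** 2 * math.exp(-2 * (k * math.pi) ** 2 * t)
               for k, b in COEFFS.items())


def report(times=(0.0, 0.01, 0.05, 0.1, 0.5)):
    """Rows (t, E(t), Gronwall bound exp(-2*C*t)*E(0), dE/dt = -2*dirichlet(t))."""
    e0 = energy(0.0)
    return [(t, energy(t), math.exp(-2 * C * t) * e0, -2 * dirichlet(t))
            for t in times]


if __name__ == "__main__":
    for t, e, bound, de in report():
        print(f"t={t:5.2f}  E={e:.3e}  bound={bound:.3e}  dE/dt={de:.3e}")
```

In this example the lowest mode saturates both inequalities, so the bound $\mathrm e^{-2Ct}E(0)$ is approached from below as the higher modes die out.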
So, we have shown that $\Vert u(t,\cdot)\Vert_2\to 0$, but this is not enough to show that $\Vert u(t,\cdot)\Vert_\infty\to 0$, as desired in the question. However, this is rectified using the strong maximum principle.
Define the parabolic cylinder and its boundary
$$U(T):=(0,T]\times U \\ \Gamma(T)=\overline{U(T)}\setminus U(T)=(\{0\}\times \bar U)\cup ([0,T]\times\partial U)$$
THEOREM: Strong maximum principle for the heat equation.
Assume $u$ is a classical solution of the heat equation in $U(T)$. Then, (i)
$$\max_{\overline{U(T)}}u=\max_{\Gamma(T)}u$$
Furthermore (ii), if $U$ is connected and there exists a point $(t_0,x_0)\in U(T)$ such that $u(t_0,x_0)=\max_{\overline{U(T)}}u$, then $u$ is constant in $\overline{U(t_0)}$.
(For proof: See page 55 of Evans PDE book.)
The (ii) statement means that, if ever $u$ assumes its maximum inside $U$ at any positive time, then $u$ must be constant.
In our case, we know that $u=0$ on $\partial U$. That means that, aside from the trivial case $u\equiv 0$, we know that
$$\operatorname{argmax}_{\overline{U(T)}}|u|\in \{0\}\times U$$
I.e., it must occur in the open domain $U$ at $t=0$. However, statement (ii) of the above theorem (applied to both $u$ and $-u$) now tells us that, aside from the trivial solution $u\equiv 0$, for all times $t>0$,
$$\sup_U |u(t,\cdot)| < \sup_U |u(0,\cdot)|$$
Because otherwise, $u$ would be constant in the domain $U(t)$, and we know the only constant solution satisfying our BCs is the zero solution. So, we have shown that the function
$$M(t)=\sup_{U}|u(t,\cdot)|$$
is a decreasing function (repeat the argument with any later time taken as the initial time) bounded below by zero, and hence $M\to M_\infty\in\mathbb R_{\geq 0}$.
Showing $M_\infty=0$ is a little more difficult. Essentially, the only way for $\Vert u(t,\cdot)\Vert _2\to 0$ (as already shown) while maintaining $\Vert u(t,\cdot)\Vert_\infty \to M_\infty >0$ would be for the family of functions $u(t,\cdot)$ to "cluster" around some finite collection of points $\{x_0,x_1,\dots,x_{N-1}\}$, i.e. $u(t,x)\to 0$ as $t\to\infty$ for all $x\in U\setminus \{x_0,\dots,x_{N-1}\}$ but with $u(t,x)\to L\leq M_\infty$ for all $x\in \{x_0,\dots,x_{N-1}\}$.
However, such "clustering" is impossible, as it would violate the smoothing properties of the heat equation. To make the proof easier, assume that only one "cluster point" $x_0$ exists, satisfying $|u(t,x_0)|\to M_\infty>0$. Since we already know that $u(t,x)\to 0$ as $t\to\infty$ $\forall x\neq x_0$ (claim $(\star)$: proof needed!), this means that, given $t$ large enough and $\epsilon$ small enough, we can make the difference quotient
$$\left|\frac{u(t,x_0+\epsilon\upsilon)-u(t,x_0)}{\epsilon}\right| \\ (\upsilon \in \mathbb R^m , |\upsilon|=1)$$
Arbitrarily large. But, by the mean value theorem, we know that $\exists x^*$ on the line segment connecting $x_0$ and $x_0+\epsilon\upsilon$ such that
$$\upsilon \cdot \nabla u(t,x^*)=\frac{u(t,x_0+\epsilon\upsilon)-u(t,x_0)}{\epsilon}$$
Which implies, by the Cauchy-Schwarz inequality with $|\upsilon|=1$,
$$|\nabla u(t,x^*)|\geq|\upsilon \cdot \nabla u(t,x^*)|=\left|\frac{u(t,x_0+\epsilon\upsilon)-u(t,x_0)}{\epsilon}\right|$$
And since we know we can make the RHS arbitrarily large, that means we can make $|\nabla u(t,x^*)|$ arbitrarily large as long as we choose a point $x^*$ close enough to $x_0$ and a time $t$ large enough. However, we know from theorem 9 on page 61 of Evans PDE that $\exists c\in\mathbb R_+$ such that
$$\max_{C(t,x;r/2)}|\nabla u|\leq \frac{c}{r^{m+3}}\Vert u \Vert_{L^1\big(C(t,x;r)\big)}$$
Where $C(t,x;r)$ is the cylinder
$$C(t,x;r)=\{(s,y)\in\mathbb R_{\geq 0}\times\mathbb R^m : |x-y|\leq r ~\text{and}~t-r^2\leq s\leq t\}$$
But, since the cylinder has finite measure, $L^2$ convergence implies $L^1$ convergence, so we know that $\Vert u \Vert_{L^1\big(C(t,x;r)\big)}\to 0$ as $t\to\infty$, and thus our ability to make $|\nabla u|$ arbitrarily large would contradict this gradient estimate. Therefore, it is not possible for the family of functions $u(t,\cdot)$ to "cluster" around a point $x_0$, and so the only possible value for $M_\infty$ is $0$; in other words,
$$\boxed{\Vert u(t,\cdot)\Vert_\infty\to 0~~\text{as}~t\to\infty}$$
As desired. $\blacksquare$.
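To illustrate the conclusion numerically, here is a small Python sketch, again for an example of my own choosing rather than anything in the question: $U=(0,1)$ with a finite Fourier sine series as the solution. The coefficients `COEFFS`, the grid size `n` and the helper names `u`, `M` and `sup_norms` are hypothetical. The supremum is only approximated on a grid, so the printed values slightly underestimate the true $M(t)$.

```python
import math

# Hypothetical example data (not from the question): mode k -> amplitude b_k.
COEFFS = {1: 1.0, 3: 0.8, 7: -0.6}


def u(t, x):
    """u(t, x) = sum_k b_k * exp(-(k*pi)^2 * t) * sin(k*pi*x); zero at x = 0 and x = 1."""
    return sum(b * math.exp(-(k * math.pi) ** 2 * t) * math.sin(k * math.pi * x)
               for k, b in COEFFS.items())


def M(t, n=1000):
    """Grid approximation of M(t) = sup_x |u(t, x)| using n + 1 equally spaced points."""
    return max(abs(u(t, i / n)) for i in range(n + 1))


def sup_norms(times):
    """Evaluate the approximate sup norm at each requested time."""
    return [M(t) for t in times]


if __name__ == "__main__":
    for t in (0.0, 0.01, 0.1, 0.5, 1.0):
        print(f"t={t:4.2f}  M(t)={M(t):.3e}")
```

For this example the triangle inequality also gives the explicit bound $M(t)\leq \mathrm e^{-\pi^2 t}\sum_k|b_k|$, which is consistent with the decay to zero proved above.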
Addendum: Details.
The key point is to show that $\Vert u \Vert_{L^1\big(C(t,x;r)\big)}\to 0$ as $t\to\infty$. We already know that $E(t)={\Vert u(t,\cdot) \Vert_2}^2\leq \mathrm e^{-2Ct}E(0)$ from Gronwall's inequality. This means that, for any $0<T<t$, we have
$$\sqrt{\int_{t-T}^{t}E(t')\,\mathrm dt'}=\Vert u \Vert_{L^2\big(U(t)\setminus U(t-T)\big)}\leq\sqrt{T\,E(0)}\,\mathrm e^{-C(t-T)}\to 0 ~~\text{as}~t\to\infty$$
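Now, whenever the closed ball of radius $\sqrt{T}$ about $x$ is contained in $U$, we have $C(t,x;\sqrt{T})\subset \big(U(t)\setminus U(t-T)\big)$ up to a set of measure zero, which implies $\Vert u \Vert_{L^2\big(C(t,x;\sqrt{T})\big)}\to 0$ as $t\to\infty$ for all such $T$. Since the cylinder has finite measure, the Cauchy-Schwarz inequality shows that $L^2$ convergence implies $L^1$ convergence, so we get $\Vert u \Vert_{L^1\big(C(t,x;r)\big)}\to 0 $ as $t\to\infty$ for any $x\in U$ and suitable $r>0$.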