I don't know if this is what you're looking for or if the following answer is useful to you, but let me write it here anyways.
I'm no expert on the Weierstrass transforms (or any transforms for that matter) but the Gaussian $$ W_t(y) = \frac{1}{(4\pi t)^{n/2}} e^{- \frac{|x|^2}{4t} } $$ is actually the convolution kernel of the so-called heat semigroup.
It is a standard result that $\hat{W}_t(\xi) = e^{-t|\xi|^2}$, i.e. that the Fourier transform of a Gaussian is once again a Gaussian (I don't have a reference for that result on hand at the moment, but I can give one if needed).
Since the Fourier transform maps convolution to multiplication, we obtain that $ (W_s * W_t)(x) = W_{s+t}(x)$; this is where the name semigroup comes from.
The reason I'm typing this is the following:
If we differentiate the Fourier transform of $W_t$ with respect to $t$ we arrive at $\partial_t \hat{W}_t(\xi) = -|\xi|^2 \hat{W}_t(\xi)$. However, the multiplication with $-|\xi|^2$ in Fourier space corresponds to taking the Laplacian (the second derivative in 1 dimension, otherwise it is the trace of the Hessian matrix), that is, $\partial_ t W_t(x) = \Delta W_t(x)$ (this can also be checked by a brute force calculation).
This now bears a striking resemblance to the ordinary differential equation $y'(t) = Ay(t)$, the solution to which is of the Form $y(t) = e^{tA}y(0)$. So, pretending that $\Delta$ is actually not some differential operator but rather a scalar, we would actually have the expression $W_t(x) = e^{t\Delta} \delta_0(x)$, which is also what you obtained.
To be a little more rigorous, the family of convolution operators $H_t : u \mapsto W_t * u$, defined on a suitable space of functions, say $L^p(\mathbb{R}^n)$ for $1 \leq p < \infty$ or $C_0(\mathbb{R}^n)$, furnishes a semi-group of continuous linear operators that is furthermore strongly continuous, meaning that, for any fixed function $u$, $t \mapsto W_t * u$ is continuous as a (Banach space valued) function. Also, $\lim\limits_{t \to 0^+} W_t * u = u$ in the appropriate topology also holds (this basically means that $W_0 = \delta_0$, or that $W_t$ is an approximate convolution identity), i.e. $\{ H_t \}_{t \geq 0}$ is what is known as a strongly continuous $C_0$-semigroup.
Now for the interesting part: If for a function $u$ the limit $$ L := \lim\limits_{t \to 0^+} \frac{W_t * u - u}{t} $$ exists in the relevant function space, then it turns out that $L = \Delta u$. This is why the heat semigroup is also sometimes written as $e^{t\Delta}$, or, in your case $e^{tD^2}$.
To maybe get to the point:
The Weierstrass transform (or, alternatively, the Heat semi-group), can be interpreted as a device to give meaning to the expression $e^{tD^2}$ ($e^{t\Delta}$) and to also extend it. For example, for a merely continuous (and bounded) function, the formal expression $e^{D^2}$ as a power series in $D$ surely does not make any sense, the Weierstrass transform, however, certainly does. Also, the two different expressions coincide if the function is, say analytic (analyticity should also be necessary I think, as otherwise the series that results from applying $e^{D^2}$ probably doesn't converge).
In general, if $e^{D^2}$ exists for an analytic function $u$, i.e. the resulting series converges, it would make sense that the same holds true for $W[u]$, I can't really think of a proof for that fact, though.
Also, since you've already used the fact that $f(x-y) = e^{-yD}f(x)$ in your answer, I'll note that this can also be made rigorous by the method of semigroups, as $D$ actually generates the translation semigroup.
Again, I'm sorry if this answer is useless to you but I hope that it might shed some light on the nature of formal expressions like $e^{D^2}$.
edit: Since this is too long for a comment, let me edit my answer. This is partially in response to Steven's comment.
Having taken another look, I think a sufficient condition is something like $u$ having an integrable (distributional) Fourier transform such that the sequence $$ \mu_k := \int_{\mathbb{R}} |\xi|^{2k} |\hat{u}(\xi)| d\xi $$ satisfies $\left( \frac{\mu_k}{k!} \right)_{k \in \mathbb{N}} \in \ell^1(\mathbb{N}) $. You also obtain $e^{tD^2}u = W_t * u$ by the Fourier inversion theorem, as, if the sequence $\mu_k$ has the desired property, the dominated convergence theorem applies, yielding $$ e^{tD^2}u(x) = \int_{\mathbb{R}} e^{-t|\xi|^2} \hat{u}(\xi) e^{ix\xi}d\xi = (W_t * u)(x). $$
This condition implies analyticity (differentiation in coordinate space corresponds to multiplication by $\xi$ in Fourier space).
Functions whose Fourier transform is compactly supported ($u(t) := \frac{\sin t}{t}$, for example) satisfy it, which are restrictions of entire functions to $\mathbb{R}$ (i.e. analytic) which grow at most exponentially at infinity (Paley Wiener theorem), suggesting that the coefficients in the series expansion should also be of order $\frac{1}{n!}$, as outlined in my comment.
However, if we take $u$ to be a Gaussian for example, i.e. $\hat{u}(\xi) = e^{-|\xi|^2}$, then $\frac{\mu_k}{k!} = \frac{\Gamma(k+\frac{1}{2})}{\Gamma(k+1)}$ up to a positive constant which is of order $k^{-\frac{1}{2}}$ as $k \to \infty$, so the Gaussian does not satisfy the conditions on the Fourier transform. The identity $e^{tD^2}u(x) = \sum_{n=0} u^{2n}(x) \frac{t^n}{n!}$ should still hold, as calculated in the original post.
So the condition I've given is only sufficient, not necessary. Similarly, the coefficients $a_k$ in the series expansion of $u$ being of order $\frac{1}{k!}$ is also sufficient, but not necessary either, as the Gaussian once again shows (here, $a_k =\Gamma(k/2+1)$ if $k$ is even and $a_k = 0$ if $k>0$ and $k$ is odd).