There isn't really an issue with taking derivatives of stochastic processes like $W$, so long as you interpret the resulting process appropriately. Even the usual white noise process "$\xi = \frac{dW}{dt}$" should really be interpreted as a generalized stochastic process, that is, the realizations of $\xi$ are generalized functions. This is because, as you state, realizations of $W$ are almost surely nowhere differentiable. However, they do have derivatives "in the sense of distributions", that is, generalized derivatives, and this is one way to attack the problem (the Itô calculus/stochastic differential $dW_t$ approach is another). If you have never seen the theory of generalized functions (called "distributions" elsewhere, but in probability that word has another meaning), the following will probably not make too much sense to you, but this is how I work with these things. Gel'fand and Vilenkin ("Generalized Functions, Vol. 4") is the classic reference for this approach, but there are probably better modern references.
To define a generalized stochastic process $\eta$, you fix a space of test functions, usually the smooth, compactly supported functions $\mathcal{D} = C_0^\infty$. Then a generalized stochastic process $\eta(\omega)$ is a random element of $\mathcal{D}^\prime$ (a map $\eta:\Omega\rightarrow\mathcal{D}^\prime$, where $(\Omega,\mathcal{F},\mathbb{P})$ is a probability space). A much more convenient way to say this is that, given any test function $\varphi\in\mathcal{D}$, we have that
$$
X_\varphi = \langle \eta,\varphi\rangle
$$ is an ordinary real random variable. The bracket notation is intended to "look like" an inner product, i.e. you can think of $\langle \eta,\varphi\rangle = \int \eta(x)\varphi(x)dx$, though this isn't really correct because $\eta$ is "not a function".
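If the bracket is unfamiliar, the canonical example (a standard fact, not specific to this problem) is the Dirac delta: it is the element of $\mathcal{D}^\prime$ defined by
$$
\langle \delta,\varphi\rangle = \varphi(0)
$$
a perfectly good linear functional on $\mathcal{D}$, even though no honest function $\delta(x)$ satisfies $\int\delta(x)\varphi(x)dx = \varphi(0)$.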
The mean and covariance are then defined as
$$
\langle\mathbb{E}[\eta],\varphi\rangle = \mathbb{E}[\langle\eta,\varphi\rangle] = \mathbb{E}[X_\varphi]
$$ and
$$
\operatorname{Cov}(\varphi,\psi) = \mathbb{E}[X_\varphi X_\psi]
$$ (written for a mean-zero process, which is all we need below; in general you would subtract $\mathbb{E}[X_\varphi]\mathbb{E}[X_\psi]$). From this, you can extract the covariance operator via
$$
\mathbb{E}[X_\varphi X_\psi] = \langle \mathcal{C}\varphi,\psi\rangle
$$ This formula is difficult to parse until you work some examples; we'll see in a second how this works.
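As a warm-up (the classical case, before any derivatives appear): if $\eta$ is an ordinary mean-zero process with covariance kernel $k(s,t) = \mathbb{E}[\eta(s)\eta(t)]$, then Fubini gives
$$
\mathbb{E}[X_\varphi X_\psi] = \int\int k(s,t)\varphi(s)\psi(t)\, ds\, dt = \langle \mathcal{C}\varphi,\psi\rangle, \qquad (\mathcal{C}\varphi)(t) = \int k(s,t)\varphi(s)\, ds
$$
so $\mathcal{C}$ is just the integral operator whose kernel is the covariance function.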
Returning to your original question: suppose we want to define $\dot{W}$ using this approach. Well, in the theory of generalized functions, we have the definition
$$
X_\varphi = \langle \dot{W},\varphi\rangle = - \langle W,\dot{\varphi}\rangle
$$The negative sign comes from "integration by parts"; this is just the definition of the distributional derivative. Now, because $W$ is (almost surely) continuous and $\dot{\varphi}$ is smooth, we can use honest integrals instead of "abstract brackets" (to be careful: $W$ lives on $[0,\infty)$, so take the test functions to be supported in $(0,\infty)$; the integrals written over $(-\infty,\infty)$ below are then really over $[0,\infty)$):
$$
X_\varphi(\omega) = -\int_{-\infty}^\infty W(t,\omega)\dot{\varphi}(t) dt
$$ Thus (interchanging the expectation and the integral requires a moment of justification, e.g. Fubini):
$$
\mathbb{E}[X_\varphi(\omega)] = -\int_{-\infty}^\infty \mathbb{E}[W(t,\omega)] \dot{\varphi}(t) dt = 0
$$ and
$$
\mathbb{E}[X_\varphi(\omega)X_\psi(\omega)] = \int_{-\infty}^\infty\int_{-\infty}^\infty \mathbb{E}[W(s,\omega)W(t,\omega)] \dot{\varphi}(s)\dot{\psi}(t) dsdt = \int_{-\infty}^\infty\int_{-\infty}^\infty \min(s,t) \dot{\varphi}(s)\dot{\psi}(t) dsdt
$$ To see how this results in "$k(s,t) = \delta(s-t)$" covariance, you do a bit of calculus. Write $\min(s,t) = \int_0^\infty \mathbf{1}_{[u<s]}\mathbf{1}_{[u<t]}\, du$, swap the order of integration (Fubini again), and use $\int_u^\infty \dot{\varphi}(s)\, ds = -\varphi(u)$, which holds because $\varphi$ is smooth and compactly supported so the boundary terms vanish (likewise for $\psi$). You see that
$$
\int_{-\infty}^\infty\int_{-\infty}^\infty \min(s,t)\, \dot{\varphi}(s)\dot{\psi}(t)\, ds\, dt = \int_0^\infty \left(\int_u^\infty \dot{\varphi}(s)\, ds\right)\left(\int_u^\infty \dot{\psi}(t)\, dt\right) du = \int_{-\infty}^\infty \varphi(t)\psi(t)\, dt
$$ Thus we have written
$$
\mathbb{E}[X_\varphi X_\psi] = \langle \mathcal{C}\varphi,\psi\rangle
$$where $\mathcal{C}$ is the identity operator, that is, the (formal) convolution operator with kernel $\delta(s-t)$; this recovers the usual statement that white noise has covariance $k(s,t) = \delta(s-t)$.
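If you like, you can sanity-check this numerically. Below is a minimal Monte Carlo sketch (the grid size, path count, and bump test functions are my own illustrative choices, not part of the theory): it simulates Wiener paths, computes $X_\varphi = -\int W\dot{\varphi}\, dt$ by quadrature, and compares the empirical $\mathbb{E}[X_\varphi X_\psi]$ with $\int\varphi\psi\, dt$.

```python
import numpy as np

# Hedged numerical sketch: check E[X_phi X_psi] ~ ∫ phi psi dt, where
# X_phi = -∫ W(t) phi'(t) dt and W is a Wiener path on [0, 1].
rng = np.random.default_rng(0)
n, n_paths = 1000, 8000
t = np.linspace(0.0, 1.0, n)
dt = t[1] - t[0]

def bump(t, a, b):
    """Smooth bump exp(-1/(1-u^2)) supported on (a, b)."""
    u = 2.0 * (t - a) / (b - a) - 1.0
    out = np.zeros_like(t)
    inside = np.abs(u) < 1.0
    out[inside] = np.exp(-1.0 / (1.0 - u[inside] ** 2))
    return out

phi, psi = bump(t, 0.1, 0.6), bump(t, 0.3, 0.9)
dphi, dpsi = np.gradient(phi, dt), np.gradient(psi, dt)

# Wiener paths: W(0) = 0, independent N(0, dt) increments.
dW = rng.standard_normal((n_paths, n - 1)) * np.sqrt(dt)
W = np.hstack([np.zeros((n_paths, 1)), np.cumsum(dW, axis=1)])

X_phi = -(W * dphi).sum(axis=1) * dt   # -∫ W phi' dt, one value per path
X_psi = -(W * dpsi).sum(axis=1) * dt

print("Monte Carlo E[X_phi X_psi]:", (X_phi * X_psi).mean())
print("quadrature  ∫ phi psi dt:  ", (phi * psi).sum() * dt)
```

Up to discretization and Monte Carlo error, the two printed numbers should agree.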
If you want to do the same thing but with $\ddot{W}$, you would start with the definition of the generalized ("distributional") second derivative:
$$
\langle\ddot{W},\varphi\rangle = \langle W,\ddot{\varphi}\rangle
$$ You can then work through the same process to see that
$$
\langle\mathbb{E}[\ddot{W}],\varphi\rangle = \langle\mathbb{E}[W],\ddot{\varphi} \rangle = 0
$$ and
$$
\mathbb{E}[X_\varphi X_\psi] = \int_{-\infty}^\infty\int_{-\infty}^\infty \min(s,t)\, \ddot{\varphi}(s)\ddot{\psi}(t)\, ds\, dt = \int_{-\infty}^\infty \dot{\varphi}(t)\dot{\psi}(t)\, dt = -\int_{-\infty}^\infty \ddot{\varphi}(t)\psi(t)\, dt = \langle \mathcal{C}\varphi,\psi\rangle
$$ (the middle equality is the same $\min(s,t)$ trick as before, now using $\int_u^\infty \ddot{\varphi}(s)\, ds = -\dot{\varphi}(u)$, followed by one more integration by parts). Thus the covariance operator is the negative second derivative, $\mathcal{C} = -\frac{d^2}{dt^2}$, i.e. the covariance kernel function is $-\ddot{\delta}(s-t)$.
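The same sort of numerical sanity check works here (again a sketch with my own illustrative choices; the polynomial bump below is used because it is compactly supported with a continuous second derivative): now $X_\varphi = \int W\ddot{\varphi}\, dt$, and the target is $-\int \ddot{\varphi}\psi\, dt$.

```python
import numpy as np

# Hedged numerical sketch for the second derivative: check
# E[X_phi X_psi] ~ -∫ phi'' psi dt, where X_phi = ∫ W(t) phi''(t) dt.
rng = np.random.default_rng(1)
n, n_paths = 1000, 8000
t = np.linspace(0.0, 1.0, n)
dt = t[1] - t[0]

def poly_bump(t, a, b):
    """Bump (1-u^2)^4 supported on (a, b); C^3, so phi'' is continuous."""
    u = 2.0 * (t - a) / (b - a) - 1.0
    return np.where(np.abs(u) < 1.0, (1.0 - u ** 2) ** 4, 0.0)

phi, psi = poly_bump(t, 0.1, 0.7), poly_bump(t, 0.3, 0.9)
ddphi = np.gradient(np.gradient(phi, dt), dt)   # phi'' on the grid
ddpsi = np.gradient(np.gradient(psi, dt), dt)

dW = rng.standard_normal((n_paths, n - 1)) * np.sqrt(dt)
W = np.hstack([np.zeros((n_paths, 1)), np.cumsum(dW, axis=1)])

X_phi = (W * ddphi).sum(axis=1) * dt            # ∫ W phi'' dt per path
X_psi = (W * ddpsi).sum(axis=1) * dt

print("Monte Carlo E[X_phi X_psi]:", (X_phi * X_psi).mean())
print("quadrature -∫ phi'' psi dt:", -(ddphi * psi).sum() * dt)
```

The intermediate identity $\int\dot{\varphi}\dot{\psi}\, dt = -\int\ddot{\varphi}\psi\, dt$ can be checked the same way.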
Additional note: in response to a good comment, how do we know that the processes $\dot{W}$ and $\ddot{W}$ are Gaussian? First, a generalized Gaussian random process $\eta$ is one for which any random vector formed by testing against $N$ functions is (multivariate) Gaussian, i.e. if
$$
X_{\varphi_1:\varphi_N} = [\langle \eta,\varphi_1\rangle,\ldots,\langle \eta,\varphi_N\rangle ]^t \in \mathbb{R}^N
$$ then $\eta$ is Gaussian if and only if $X_{\varphi_1:\varphi_N}$ is Gaussian for every choice of $(\varphi_1,\ldots,\varphi_N)\in \mathcal{D}^N$. With this definition, it is easy to show that if $W$ is a classical Gaussian random process - say one with almost surely continuous paths such as the Wiener process - then $W$ is also a generalized Gaussian random process.
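To sketch why (this is where the a.s. continuity is used): each pairing is a limit of Riemann sums,
$$
\langle W,\varphi\rangle = \int W(t)\varphi(t)\, dt = \lim_{n\rightarrow\infty}\sum_{k=1}^n W(t_k)\varphi(t_k)\,\Delta t_k,
$$
and each Riemann sum is a linear combination of the jointly Gaussian variables $W(t_1),\ldots,W(t_n)$, hence Gaussian. The same argument applies jointly to any $(\varphi_1,\ldots,\varphi_N)$, and almost-sure limits of Gaussian vectors are Gaussian, so every $X_{\varphi_1:\varphi_N}$ is Gaussian.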
Then, Gaussianity of the (generalized) derivatives of $W$ follows from the definitions
$$
\langle\dot{W},\varphi\rangle := - \langle W,\dot{\varphi}\rangle\\
\langle\ddot{W},\varphi\rangle := \langle W,\ddot{\varphi}\rangle
$$ Since $W$ is a generalized Gaussian random process, $\dot{W}$ and $\ddot{W}$ are as well.
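To spell that out: for any $\varphi_1,\ldots,\varphi_N\in\mathcal{D}$,
$$
[\langle \dot{W},\varphi_1\rangle,\ldots,\langle \dot{W},\varphi_N\rangle]^t = -[\langle W,\dot{\varphi}_1\rangle,\ldots,\langle W,\dot{\varphi}_N\rangle]^t,
$$
which is (up to a sign, which preserves Gaussianity) a test vector of $W$ against the test functions $\dot{\varphi}_i\in\mathcal{D}$, hence multivariate Gaussian; the $\ddot{W}$ case is identical with $\ddot{\varphi}_i$ and no sign flip.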