I am a physicist, and I usually work with Langevin equations of the form
\begin{equation} \frac{dx}{dt} = f(x,t) +g(x,t)\xi(t), \end{equation}
where $\xi(t)$ is what we call a "random Langevin force". In the case of white noise, this force is delta-correlated in time, such that
\begin{equation} \mathbf{E}\left[\xi(t)\xi(t')\right] = \delta(t-t'). \end{equation}
The Langevin equation can also be written as the Ito equation
\begin{equation} dx = f(x,t)dt+g(x,t)dW(t), \end{equation}
where the increment of the Wiener process is $dW(t)=\xi(t)dt$. Formally, this means that
\begin{equation} \xi(t) = \frac{dW(t)}{dt}, \end{equation}
even though, technically, the Wiener process is not differentiable.

This is where things get confusing for me. I know that $\mathbf{E}[dW(t)] = 0$ and $\mathrm{Var}[dW(t)] = dt$, but how exactly does it follow that $\xi(t)$ has a delta autocorrelation? Is there a mathematical way to get to this result?
\begin{equation} \mathbf{E}\left[\xi(t)\xi(t')\right] = \mathbf{E}\left[\frac{dW(t)}{dt}\frac{dW(t')}{dt'}\right] = \delta(t-t')\ ? \end{equation}
I have looked through the literature and I can't see how this is obtained. It seems to be a detail that is routinely glossed over, especially in physics.
If $\xi(t)$ really is the rate of change of the Wiener process $W(t)$, then surely there must be a mathematical way to derive the result $\mathbf{E}\left[\xi(t)\xi(t')\right] = \delta(t-t')$ from the properties of $W(t)$ alone, right?
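
To make concrete what I mean by a derivation, here is a sketch of the kind of argument I am after. It assumes only that increments of $W(t)$ over disjoint intervals are independent with zero mean, and that $\mathrm{Var}[W(t+\Delta)-W(t)] = \Delta$. The finite-difference force $\xi_\Delta(t)$ and the step $\Delta > 0$ are notation introduced here purely for illustration. Define
\begin{equation} \xi_\Delta(t) = \frac{W(t+\Delta)-W(t)}{\Delta}. \end{equation}
Under the assumptions above, the covariance of two increments equals the length of the overlap of the intervals $[t,t+\Delta]$ and $[t',t'+\Delta]$, which is $\max(0,\Delta-|t-t'|)$, so that
\begin{equation} \mathbf{E}\left[\xi_\Delta(t)\xi_\Delta(t')\right] = \frac{1}{\Delta}\max\left(0,\,1-\frac{|t-t'|}{\Delta}\right). \end{equation}
This is a triangle in $t-t'$ of width $2\Delta$ and height $1/\Delta$, so its area is $1$ for every $\Delta$. As $\Delta \to 0$ it would therefore seem to tend to $\delta(t-t')$, but only in the distributional sense, that is, when integrated against a smooth test function. If this sketch is correct, is it the precise meaning of the delta correlation?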