What kind of approximation is used in deriving Runge-Kutta methods?

Question

I've been trying to get a better understanding of how Runge-Kutta methods are derived by reading the explanations found for example in this and this answers. I am, however, a bit confused as to what kind of approximation exactly we are using for the integral.

Preliminaries and context

Following the notation of this answer, consider an initial value problem $x'(t)=f(t,x(t))$.

Given a step size $h$, the naive way to compute $x(t+h)$ would be to Taylor-expand the integral, $\int_t^{t+h}g\simeq hg(t)+\tfrac{h^2}{2}g'(t)$, with $g$ the function $t\mapsto f(t,x(t))$, thus obtaining: $$x(t+h)= x(t)+\int_t^{t+h}\!d\tau f(\tau,x(\tau)) \simeq x(t)+hf(t,x(t))+\tfrac{h^2}{2}[\underbrace{f_t(t,x(t))+f_x(t,x(t))f(t,x(t))}_{\equiv f_t+f_xf}].$$ I reckon we do not do this, however, because we do not want an expression with $f_t$ or $f_x$. Fine. We then go and try a different approach, which is to write $$\int_t^{t+h}\!d\tau f(\tau,x(\tau))\simeq h\sum_{i=1}^N \omega_i f(t+\nu_i h,x(t+\nu_i h)),$$ for some yet to be determined coefficients $\nu_i$ and $\omega_i$. This still looks fine, as it seems that we are simply going for a Newton-Cotes approximation of the integral. However, if we were actually doing a Newton-Cotes approximation, the $\omega_i$ coefficients would be independent of $f$, and determined only by the way we decided to partition the interval $(t,t+h)$ (that is, by the coefficients $\nu_i$).

So we are not doing Newton-Cotes, I guess because that would require us to know $x(t+\nu_i h)$, which we still don't know. Ok. We are instead trying a different kind of approximation for the integral, which consists in writing $$x(t+h)-x(t)=\int_t^{t+h}\!d\tau f(\tau,x(\tau))\simeq\sum_i \omega_i K_i,$$ with \begin{align} &K_1\equiv h f(t,x(t)), \\ &K_2\equiv h f(t+\alpha h,x(t)+\beta_1 K_1), \\ &K_3\equiv h f(t+\alpha'h,x(t)+\beta_1' K_1+\beta_2' K_2), \end{align} and so on. We then Taylor-expand the $x(t+h)-x(t)$ term on the LHS and the $K_i$ on the RHS, and find the parameters for which the two sides agree up to the desired order in $h$.
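
As a hedged illustration of this matching step, here is the two-stage case written out. It is only a sketch, assuming $f$ is smooth enough to Taylor-expand and abbreviating $f$, $f_t$, $f_x$ for their values at $(t,x(t))$. Expanding $K_2$ to second order in $h$ gives $$K_2 = h f + h^2\left(\alpha f_t+\beta_1 f_x f\right)+O(h^3),$$ while the exact increment is $$x(t+h)-x(t)=h f+\tfrac{h^2}{2}\left(f_t+f_x f\right)+O(h^3).$$ Comparing $\omega_1 K_1+\omega_2 K_2$ with this expression term by term yields the conditions \begin{align} &\omega_1+\omega_2=1, \\ &\omega_2\,\alpha=\tfrac12, \\ &\omega_2\,\beta_1=\tfrac12. \end{align} These are three equations for four unknowns, so a one-parameter family of solutions remains. For instance, $\omega_1=0$, $\omega_2=1$, $\alpha=\beta_1=\tfrac12$ is the explicit midpoint rule, and $\omega_1=\omega_2=\tfrac12$, $\alpha=\beta_1=1$ is the explicit trapezoidal rule.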

Actual question

I do not understand what sort of approximation this is. Why use this specific kind of structure for the $K_i$? Is there any intuition or justification behind this choice, apart from the mere fact that it works?

Lutz Lehmann · Answer 1 · 2019-05-31T09:29:16.690

It is easy to check that you get from the Euler method to a second order method by taking the slope at the midpoint as in the explicit midpoint or modified Euler method $$y(x+Δx)=y(x)+f(x+\tfrac12Δx, y(x)+\tfrac12f(x,y(x))Δx)Δx$$ or by combining two slopes as in the explicit trapezoidal or improved Euler or Heun's 2nd order method $$y(x+Δx)=y(x)+\tfrac12f(x,y(x))Δx+\tfrac12f(x+Δx, y(x)+f(x,y(x))Δx)Δx.$$
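
The "easy to check" claim can also be checked numerically. The following is a minimal sketch, not part of the original answer: the test problem $y'=x+y$, $y(0)=1$ (exact solution $y=2e^x-x-1$), the step counts, and all function names are assumptions chosen for illustration. It estimates the observed order of convergence by halving the step size.

```python
import math

def euler_step(f, x, y, dx):
    # One explicit Euler step.
    return y + f(x, y) * dx

def midpoint_step(f, x, y, dx):
    # Slope evaluated at the midpoint, which is reached by half an Euler step.
    return y + f(x + 0.5 * dx, y + 0.5 * f(x, y) * dx) * dx

def heun_step(f, x, y, dx):
    # Average of the slope at the start and the slope at the Euler-predicted end.
    k1 = f(x, y)
    k2 = f(x + dx, y + k1 * dx)
    return y + 0.5 * (k1 + k2) * dx

def integrate(step, f, x0, y0, x_end, n):
    # Apply n equal steps of the given one-step method from x0 to x_end.
    dx = (x_end - x0) / n
    y = y0
    for i in range(n):
        y = step(f, x0 + i * dx, y, dx)
    return y

def observed_order(step, f, y0, exact_end, n=100):
    # log2 of the error ratio when the step size is halved.
    err_coarse = abs(integrate(step, f, 0.0, y0, 1.0, n) - exact_end)
    err_fine = abs(integrate(step, f, 0.0, y0, 1.0, 2 * n) - exact_end)
    return math.log2(err_coarse / err_fine)

# Assumed test problem: y' = x + y, y(0) = 1, so y(1) = 2e - 2.
f = lambda x, y: x + y
exact_end = 2.0 * math.e - 2.0

order_euler = observed_order(euler_step, f, 1.0, exact_end)
order_midpoint = observed_order(midpoint_step, f, 1.0, exact_end)
order_heun = observed_order(heun_step, f, 1.0, exact_end)
print(order_euler, order_midpoint, order_heun)
```

For this linear test problem the two second-order methods happen to produce the same numbers; a nonlinear $f$ would be needed to see them differ.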

Heun in 1900 generalized the structural elements of these two methods by computing the step increment as a linear combination of "rough" increments $$ y(x+Δx)=y(x)+Δy,\\Δy=\sum_{\nu=1}^n\alpha_\nu\, Δ^0_νy $$ where the increments $Δ^0_νy$ for each term are computed in a series of nested Euler steps $$ Δ^0_ν y=f(x+ε^1_ν\,Δx,y(x)+ε^1_ν\,Δ^1_ν y)Δx,\\ Δ^1_ν y=f(x+ε^2_ν\,Δx,y(x)+ε^2_ν\,Δ^2_ν y)Δx,\\...,\\Δ^{m_\nu}_νy=f(x,y(x)) Δx. $$

The original 3rd order Runge method (1895) also had this "comb" structure. The "comb" structure makes the Taylor analysis for the order conditions relatively easy, at the cost of more terms in the methods found. Heun was also motivated by the high order Gaussian quadrature rules and computed examples that are based on some of them.

Kutta (1901) observed that this "comb" structure is in general wasteful in terms of evaluations of $f$ and proposed to use all values of $f$ previously computed in the step $x\to x+Δx$ in the computation of the next of the increments $Δ_ν y$ for the final linear combination $Δ y=\sum α_ν\,Δ_ν y$. This ends up being the structure that we know today as the explicit Runge-Kutta method. Then, in exploring the conditions for 4th order methods, he considered examples that reduce to the Simpson quadrature rule when applied to an $f$ that does not depend on $y$. (See also my answer here for exact citations.)
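
To make the Simpson connection concrete, here is a small sketch using the classical fourth-order method as a familiar example. The text above does not say which specific examples are meant, so this choice, the integrand $\cos x$, and the function names are all assumptions made for illustration. When $f$ ignores $y$, the two middle stages coincide and the step increment equals Simpson's rule on $[x, x+Δx]$.

```python
import math

def rk4_step(f, x, y, dx):
    # Classical 4th order Runge-Kutta step; each stage reuses the previous slope.
    k1 = f(x, y)
    k2 = f(x + 0.5 * dx, y + 0.5 * dx * k1)
    k3 = f(x + 0.5 * dx, y + 0.5 * dx * k2)
    k4 = f(x + dx, y + dx * k3)
    return y + dx * (k1 + 2 * k2 + 2 * k3 + k4) / 6

def simpson(g, a, b):
    # Simpson's rule on the single interval [a, b].
    return (b - a) / 6 * (g(a) + 4 * g(0.5 * (a + b)) + g(b))

g = math.cos                 # assumed integrand
f = lambda x, y: g(x)        # right-hand side that does not depend on y

x0, dx = 0.3, 0.2
rk4_increment = rk4_step(f, x0, 0.0, dx)   # starting from y = 0, so this is the increment
simpson_value = simpson(g, x0, x0 + dx)
print(rk4_increment, simpson_value)
```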

thanks! I don't fully understand the first formula you write though. If I apply Euler's method to get $y(x+\Delta x)$ from $y(x+\Delta x/2)$ the expression I get is a bit different than yours: $y(x+\Delta x)=y(x+\Delta x/2)+f(x+\Delta x/2,y(x+\Delta x/2))\Delta x/2$, and applying Euler again I would get $y+\Delta x/2 f + \Delta x/2 f(x+\Delta x/2,y+\Delta x/2 f)$ — glS, May 31 '19 at 08:58
This is the explicit midpoint method using that $y(x+Δx)=y(x)+y'(x+\tfrac12Δx)Δx+O(Δx^3)$. The implicit midpoint method, to compare, is a combination of an implicit and explicit Euler step with half the step size. — Lutz Lehmann, May 31 '19 at 09:04
ah, that makes sense, thanks. What about the second formula? Where does that come from? Might there be a typo in there? It looks like there is a $\Delta x^2$ term in the second $f$ — glS, May 31 '19 at 09:16
Thanks. I first wrote the $Δx$ in front, then changed the order of factors to look like the original sources. The trapezium formula uses that $y'(x+\tfrac12Δx)=\tfrac12(y'(x)+y'(x+Δx))+O(Δx^2)$ so that the method still has local truncation error $O(Δx^3)$ and thus global error order $2$. — Lutz Lehmann, May 31 '19 at 09:24
