Numerical Differential Equations

Table of Contents

Notation

  • $L$ almost always referes to the Lipschitz constant
  • $f \big( t, y(t) \big)$ refers to LHS of a first-order system
  • $y_i$ denotes a "sampled" value of $y(t)$ at $t_i$, using our approximation to $y(t)$
  • $\Phi_{t_0, h}$ refers to a flow-map, with $\Phi_h$ assuming $t_0 = 0$ (as we can always do)
  • $\Psi_{t_0, h}$ refers to the discrete approx. of the flow-map
  • $D$ refers to the spatial domain, i.e. $x : y(t) = x$ for some $t$
  • $\mathbb{P}_s$ denote the space of real polynomials of degree $\le s$
  • $c_i$ denotes the i-th abscissa point

Definitions

Continuous mechanics

A system of differential equations for which $f(t, y)$ can be written as a function of $y$ only is an autonomous-differential-equation (often referred to as continuous mechanics):

\begin{equation*}
\frac{dy}{dt} = f(y), \quad y, f \in \mathbb{R}^d
\end{equation*}

Euler's method

We start by taking the Taylor expansion of the solution $y(t)$ of the system:

\begin{equation*}
\begin{split}
  y(t_0 + \Delta) &= y(t_0) + \Delta t y'(t_0) + \frac{\Delta t^2}{2} y''(\tau) \\
  \implies y(t_0 + \Delta t) &= y(t_0) + \Delta t y'(t_0) \\
  &= y(t_0) + \Delta t f \big( t_0, y(t_0) \big)
\end{split}
\end{equation*}

for some $\tau \in [t_0, t_0 + \Delta t]$.

We have the ODE

\begin{equation*}
\frac{dy}{dt} = f(t, y)
\end{equation*}

and the steps are defined as

\begin{equation*}
y_{n + 1} = y_n + h f(t_n, y_n)
\end{equation*}

for

\begin{equation*}
n = 0, 1, \dots, N - 1, \quad N h = T - t_0
\end{equation*}

where $y_n$ denotes our approximation for $y(t_n)$.

What we're doing in Euler's method is taking the linear approximation to $y(t)$ at each point $t_i$, but in an interative manner, using our approximation for $y$, at the previous step $t_{i - 1}$ to evaluate the point $t_{i - 1}$.

This is because we can evaluate $y'(t)$, since $y'(t) = f \big( t, y(t) \big)$ but we of course DON'T know $y(t)$.

y = @(t, y0) (y0 * exp(t) ./ (1 - y0 + y0*exp(t))
y0 = 0.2; T=0:0.5:5; plot(T, y(T, t0));

Butcher table

In matrix notation:

\begin{array*}{c|c}
  c & A \\
  \hline
  & b^T
\end{array*}
1

or elementwise:

\begin{array*}{c|ccc}
c_1 & a_{11} & \dots & a_{1s} \\
\vdots & \vdots & \dots & \vdots  \\
c_s & a_{s1} & \dots & a_{ss} \\
\hline
& b_1 & \dots & b_s
\end{array*}
1

Example of RK method with 4 stages:

\begin{array*}{c|cccc}
0 \\
1/2 & 1 / 2 \\
1/2 & 0 & 1/2 \\
1 & 0 & 0 & 1 \\
\hline
& 1 / 6 & 1/3 & 1/3 & 1/6
\end{array*}
1

Flow map

Consider the IVP:

\begin{equation*}
\frac{dy}{dt} = f(t, y), \quad y(a) = y_0
\end{equation*}

which is assumed to have a unique solution on $[a, b]$.

The flow map $\Phi$ is then an approximation to a specific solution (a trajectory), i.e. starting at some $t_0$ such that $y(t_0) = a$ and $t \le b$, which we write

\begin{equation*}
y(t_0 + h; t_0, y_0) = \Phi_{t_0, h}(y_0)
\end{equation*}

Observe that it's NOT a discrete approximation, but is continuous.

\begin{equation*}
\begin{split}
\log (z) &= \log r e^{i \theta} = \ln (|z|) + i \arg(z) = \{ \ln(|z|) + i \theta \mid \theta \in \arg(z) \} \\
&= \{ \ln(|z|) + i \text{Arg}(z) + 2 \pi i k \mid k \in Z \} \\
&= \{ \ln r + i \theta + i 2 \pi k \mid k \in Z  \}
\end{split}
\end{equation*}

Adjoint sensitivity analysis

The (local) sensitivity of the solution ot a parameter is defined by how much the solution would change by changes in the parameter, i.e. the sensitivity of the i-th independent variable to the j-th parameter is $\pdv{u_i}{p_j}$.

Sensitivity analysis serves two major purposes:

  1. Diagnostics useful to understand how the solution will change wrt. parameters
  2. Provides a cheap way to compute the gradient of the solution which can be used in parameter estimation and other optimization tasks

Theorems

Lipschitz convergence

Suppose a given sequence of nonnegative numbers satisfies

\begin{equation*}
v_{n + 1} \le A v_n + B
\end{equation*}

where $A, B$ are nonnegative numbers $A > 1$, then, for $n = 0, 1, \dots$

\begin{equation*}
v_n \le A^n v_0 + \frac{A^n - 1}{A - 1} B
\end{equation*}

Local Existence / Uniqueness of Solutions

Suppose $f: \mathbb{R} \times \mathbb{R}^d \to \mathbb{R}^d$ is:

  • continuous
  • has continuous partial derivatives wrt. all components of the dependent variable $y$ in a neighborhood of the point $(t_0, y_0)$

Then there is an interval $I = (t_0 - \delta, t_0 + \deta)$ and unique function $y$ which is continuously differentiable on $I$ s.t. it satisfies

\begin{equation*}
\frac{dy}{dt} = f(t, y), \quad y(t_0) y_0
\end{equation*}

Methods

Euler method

Trapezoidal rule

The Trapezoidal rule computes the average change on the interval $[t_n, t_{n + 1}]$, allowing more accurate approximation to the integral, and the update rule is defined as follows:

\begin{equation*}
y_{n + 1} = y_n = \frac{h}{2} \Big[ f(t_n, y_n) + f(t_{n + 1}, y_{n + 1}) \Big]
\end{equation*}

Heun-2 (implicit approx. to Trapezoidal rule)

The Heun-2 update rule is an implicit approximation to Trapezoidal rule, where we've replaced the explicit computation of the endpoint $f(t_{n + 1}, y_{n + 1})$ which depend on $y_{n + 1}$, with a function of the current estimate $y_n$:

\begin{equation*}
y_{n + 1} = y_n + \frac{h}{2} \Big[ f(t_n, y_n) + f \big(t_n + h, y_n + h f(t_n, y_n) \big) \Big]
\end{equation*}

Convergence of one-step methods

Notation

  • $i, j = 1, \dots, s$
  • $b_i := \int_{0}^{1} \ell_i(x) \ dx$
  • $a_{ij} := \int_{0}^{c_i} \ell_j(x) \ dx$
  • $Y_i = y_n + h \sum_{j} a_{ij} F_i$ denotes the stage values, s.t. $F_i = f(Y_i)$

Approximation

In this chapter we focus on the autonomous differential equation

\begin{equation*}
y'(t) = f \big( y(t) \big)
\end{equation*}

but the non-autonomous differential equation

\begin{equation*}
y'(t) = f \big( t, y(t) \big)
\end{equation*}

can be treated using similar methods.

The global error after $n$ time steps is the difference between the discrete approx. and the exact solution, i.e.

\begin{equation*}
e_n := y_n - y(t_n)
\end{equation*}

A method is said to be convergent if, for every $T$,

\begin{equation*}
\underset{h \to 0}{\lim} \max_{n = 0,1, \dots, N} \norm{e_n} = 0 \quad \text{where} \quad h = T / N
\end{equation*}

We have the Taylor expansion given by

\begin{equation*}
y(t + \tau) = y(t) + \tau y'(t) + \frac{\tau^2}{2}y''(t) + \dots
\end{equation*}

for some "perturbation" $\tau > 0$ about $t$.

For a scalar function, assuming $y(t)$ is $\nu$ times continuously differentiable (i.e. $y(t) \in \mathcal{C}^\nu$), Taylor's theorem states that

\begin{equation*}
\exists t^* \in [t, t + \tau] : \quad y(t + \tau) = \sum_{i=0}^{\nu - 1} \frac{\tau^i}{i!} \frac{d^i y}{dt^i}(t) + \frac{\tau^\nu}{\nu!} \frac{d^\nu}{dt^\nu} (t^*)
\end{equation*}

Thus, the norm of the last (remainder) term is bounded on $[t, t + \tau]$, and we have

\begin{equation*}
\norm{\frac{\tau^\nu}{\nu!} \frac{d^\nu y}{dt^\nu}(t)}\le C \tau^\nu
\end{equation*}

for some constant $C$, and thus we write

\begin{equation*}
y(t + \tau) = \sum_{i=0}^{\nu - 1} \frac{\tau^i}{i!} \frac{d^i y}{dt^i}(t) + \mathcal{O}(\tau^v)
\end{equation*}

in general.

For a vector function ($d > 1$), Taylor's theorem still holds componentwise, but in general the mean value will be attained at a different $t^*$ for each component.

Local error of a numerical method is the difference between the (continuous) flow-map $\Phi$ and its (discrete) approx. $\Psi$:

\begin{equation*}
le(y, h) = \Psi_h(y) - \Phi_h(y)
\end{equation*}

Which measures how much error is introduced in a single timestep of size $h$.

A method is said to be consistent if and only if the local error satisfies

\begin{equation*}
\norm{le(y, h)} \le C h^{p + 1}
\end{equation*}

where $C$ is a constant that depends on $y(t)$ and its derivatives, and $p \ge 1$.

A method is said to be stable if and only if it satisfies an h-dependent Lipschits condition on $D$ (the spatial domain)

\begin{equation*}
\norm{\Psi_h(u) - \Psi_h(v)} \le (1 + h \hat{L}) \norm{u - v}, \quad \forall u, v \in D
\end{equation*}

where $\hat{L}$ is not necessarily the Lipschitz constant for the vector field.

Given a differential equation and a generalized one-step method $\Psi_h$ which is consistent and stable, the global error satisfies

\begin{equation*}
\max_{n=0, \dots, N} \norm{e_n} = \mathcal{O}(h^p)
\end{equation*}

Let the error be

\begin{equation*}
e_n = y_n - y(t_n)
\end{equation*}

so

\begin{equation*}
e_{n + 1} = y_{n + 1} - y(t_{n + 1}) = \Psi_h(y_n) - \Phi_h \big( y(t_n) \big)
\end{equation*}

Then, adding and subtracting by $\Psi_h \big( y(t_n) \big)$,

\begin{equation*}
\begin{split}
  \norm{e_{n + 1}} &= \norm{\Psi_h(y_n) - \Psi_h \big( y(t_n) \big) + \Psi_h \big( y(t_n) \big) - \Phi_h \big( y(t_n) \big)} \\
  & \le \norm{\Psi_h(y_n) - \Psi_h \big( y(t_n) \big)} + \norm{\Psi_h \big( y(t_n) \big) - \Phi_h \big( y(t_n) \big)}
\end{split}
\end{equation*}

Then, using the fact that the method is consistent and stable, we write

\begin{equation*}
\begin{split}
  \norm{e_{n + 1}} & \le (1 + h \hat{L}) \norm{y_n - y(t_n)} + C h^{p + 1} \\
  &= (1 + h \hat{L}) \norm{e_n} + C h^{p + 1}
\end{split}
\end{equation*}

Using this theorem, yields

\begin{equation*}
\norm{e_n} \le \frac{Ch^{p + 1}}{\kappa - 1}(\kappa^n - 1) + \kappa^n \norm{e_0}
\end{equation*}

where $\kappa = 1 + h \hat{L}$. Finally, since $1 + h \hat{L} \le e^{h \hat{L}}$ and therefore $\kappa^n \le e^{T \hat{L}}$, if we assume the initial condition is exact ($e_0 = 0$), we get the uniform bound

\begin{equation*}
\norm{e_n} \le h^p \frac{C}{\hat{L}} \big( e^{T \hat{L}} - 1 \big)
\end{equation*}

which proves convergence at order $p$.

A really good way of estimating the order of convergence $p$ for the maximum global error wrt. $h$, is to plot $e$ vs. $h$ on a log-log plot since

\begin{equation*}
\log e = \log C + p \log h + \mathcal{O}(h)
\end{equation*}

Thus, $p$ is the slope of $\log e$.

problem-2loglog.png

Figure 1: Observe that for smaller values of $h$ we have $p = 1$, i.e. $e$ decreases wrt. $h$, while for larger values of $h$ we have $p < 1$, i.e. decrease in $e$ is not even on the order of 1.

Consider a compact domain $D \subset \mathbb{R}^d$ and suppose $f$ is a smooth vector field on $D$ and has a Lipschitz constant $L$ on $D$ (since $f$ is smooth, we can take $L = \max_D \norm{\frac{\partial f}{\partial y}}$).

Then, since

\begin{equation*}
\begin{split}
  \norm{\Phi_h(y) - \Phi_h(z)} &= \norm{ y + hf(y) - \big(z + hf(z) \big)} \\
  & \le \norm{y - z} + h \norm{f(y) - f(z)} \\
  & \le \norm{y - z} + h L \norm{y - z} \\
  &= (1 + hL) \norm{y - z}
\end{split}
\end{equation*}

the numerical flow map is Lipschitz with $\hat{L} = L$.

The exact solution satisfies

\begin{equation*}
\Phi_h(y) = y + h \frac{dy}{dt} + \frac{h^2}{2} \frac{d^2 y}{dt^2} + \mathcal{O}(h^3)
\end{equation*}

Therefore the local error is

\begin{equation*}
le(y, h) = y +  h f(y) - \Bigg[ y + hf(y) + \frac{h^2}{2} \frac{d^2 y }{dt^2} + \mathcal{O}(h^3) \Bigg] = \mathcal{O}(h^2)
\end{equation*}

and we can apply the convergence theorem for one-step methods with $C = \max_D \norm{\frac{d^2 y}{dt^2}}$ to show that Euler's method is convergent with order $p = 1$.

Construction of more general one-step methods

Usually one aims to approximate the integral between two steps:

\begin{equation*}
y(t + h) - y(t) = \int_{t}^{t + h} f \big( y(\tau) \big) \ d \tau
\end{equation*}

A numerical method to approximate a definite integral of one independent variable is known as numerical quadrature rule.

Example: Trapezoidal rule

The approximation rule becomes:

\begin{equation*}
\int_{t}^{t + h} f \big( y(\tau) \big) \ d \tau \approx \frac{h}{2} \bigg( f \big( y(t) \big) + f \big( y( t + h) \big) \bigg)
\end{equation*}

which results in the trapezoidal rule numerical method:

\begin{equation*}
y_{n + 1} = y_n + h \big( f (y_n) + f(y_{n + 1} \big) / 2
\end{equation*}

This is an implicit method, since we need to solve for $y_{n + 1}$ to obtain the step.

Polynomial interpolation

Given a set of abscissa points $c_1 < \dots < c_s, c_i \in \mathbb{R}$ and the corresponding data $g_1, \dots, g_s$, there exists a unique polynomial $P(x) \in \mathbb{P}_{s - 1}$ satisfying

\begin{equation*}
P(c_i) = g_i, \quad i = 1, \dots, s
\end{equation*}

called the interpolating polynomial.

The Lagrange interpolating polynomials $\ell_i, i = 1, \dots, s$ for a set of abscissae are defined by

\begin{equation*}
\ell_i(x) = \prod_{j =1, j \ne i}^s \frac{x - c_j}{c_i - c_j}
\end{equation*}

Observe that $\ell_i(x)$ then has the property

\begin{equation*}
\ell_i(x) =
\begin{cases}
  1 & \text{if } x = c_i \\
  0 & \text{if } x = c_j, \quad \forall j \ne i \\
  \frac{x - c_j}{c_i - c_j} & \text{otherwise}
\end{cases}
\end{equation*}

Thus, the set of Lagrange interpolating polynomials form a basis for $\mathbb{P}_s$. In this basis, the interpolating polynomial $P(x)$ assumes the simple form

\begin{equation*}
P(x) = \sum_{i=1}^{s} g_i \ell_i(x)
\end{equation*}

We also know from Stone-Weierstrass Theorem that the polynomials are dense in $\mathcal{C}(X)$ for some compact metric space $X$ (e.g. $\mathbb{R}$).

Also, if it's not 100% clear why $\ell_i$ form a basis, simply observe that if $x = c_i$ then

\begin{equation*}
\begin{split}
  \ell_i(c_i) &= \prod_{j =1, j \ne i}^s \frac{x - c_j}{c_i - c_j} \\
  &= \frac{c_i - c_1}{c_i - c_1} \frac{c_1 - c_2}{c_i - c_2} \cdots \frac{c_i - c_{i - 1}}{c_ - c_{i - 1}} \frac{c_i - c_{i + 1}}{c_i - c_{i + 1}} \cdots \frac{c_i - c_s}{c_i - c_s} \\
  &= 1
\end{split}
\end{equation*}

and if $i \ne j$, then we'll multiply by a factor of $c_i - c_i = 0$, hence

\begin{equation*}
\ell_j(c_i) = 0
\end{equation*}

as wanted. Thus we have $s = \dim \mathbb{P}_s$ lin. indep. elements, i.e. a basis.

Numerical quadrature

We approximate the definite integral of $g$ based on the interval $[0, 1]$ by exactly integrating the interpolating polynomial of order $s - 1$ based on $s$ points $0 < c_1 < \dots < c_s \le 1$.

The points $c_i$ are known as quadrature points.

We make the following observation:

\begin{equation*}
g(x) \approx P(x) = \sum_{i = 1}^{s} \ell_i(x) g_i
\end{equation*}

We're interested in approximating the integral of $g(t)$ on the interval $[t_0, t_0 + h]$, thus

\begin{equation*}
\int_{t_0}^{t_0 + h} g(x) \ dx = h \int_{0}^{1} g(t_0 + hx) \ dx
\end{equation*}

using the change of variables $t(x) = t_0 + hx$. With the polynomial approx. we then get

\begin{equation*}
\begin{split}
\int_{t_0}^{t_0 + h} g(x) \ dx &= h \int_{0}^{1} g(t_0 + hx) \ dx \\
& \approx \int_{0}^{1} P(x) \ dx \\
& = h \sum_{i=1}^{s} g_i \int_{0}^{1} \ell_i(x) \ dx \\
&= h \sum_{i=1}^{s} b_i g_i
\end{split}
\end{equation*}

where we've let

\begin{equation*}
b_i = \int_{0}^{1} \ell_i(x) \ dx
\end{equation*}

The integral-approximation above is called a quadrature formula.

By construction, a quadrature formula using $s$ distinct abscissa points will exactly integrate any polynomial in $\mathbb{P}_{s - 1}$ (remember that $b_i$ are integrals).

We can do better; we say that quadrature rule has order $p$ if it exactly integrates any polynomial $\mathbb{P}_{p - 1}$. It can be shown that $p \ge s$ always hold, and, for optimal choice of $c_i$, we have $p = 2s$.

That is, if we want to exactly integrate some polynomial of order $p$, then we only need half as many quadrature points $c_i$!

import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

x = np.arange(10)
y = x + 6 * x ** 2 + 0.4 * x ** 3 - 0.01 * x ** 5


def polynomial_fit(abscissae, y):
    s = abscissae.shape[0]
    zeros = np.zeros((s, s - 1))

    for i in range(s):
        if i == 0:
            zeros[i] = x[i + 1:]
        elif i == s - 1:
            zeros[i] = x[:i]
        else:
            zeros[i] = np.hstack((x[:i], x[i + 1:]))

    denominators = np.prod(abscissae.reshape(-1, 1) - zeros, axis=1) ** (-1)

    def p(x):
        return np.dot(np.prod(x - zeros, axis=1) * denominators, y)

    return p

p = polynomial_fit(x, y)

xs = np.linspace(0, 10, 100)
ys = [p(xs[i]) for i in range(xs.shape[0])]
plt.plot(xs, ys)
plt.suptitle("Polynomial fit")
plt.title("$x + 6x^2 + 0.4x^3 - 0.01x^5$")
_ = plt.scatter(x, y, c="orange")

polynomial_fit.png

One-step collocation methods

Let $u(t) \in \mathbb{R}^d$ be the collocation polynomial of degree $s$, satisfying

\begin{equation*}
\begin{split}
  u(t_0) &= y_0 \\
  u'(t_0 + c_i h) &= f \big( u(t_0 + c_i h) \big), \quad i = 1, \dots, s
\end{split}
\end{equation*}

Then we attempt to approximate the gradient as a polynomial, letting $x = \frac{t - t_0}{h}$:

\begin{equation*}
u'(t) = \sum_{i=1}^{s} F_i \ell_i \bigg( \frac{t - t_0}{h} \bigg)
\end{equation*}

where we've let

\begin{equation*}
F_i = u'(t_0 + c_i h)
\end{equation*}

Then by the Fundamental Theorem of Calculus, and making the substitution $x = \frac{t - t_0}{h}$:

\begin{equation*}
u(t_0 + c_i h) - u(t_0) = \int_{t_0}^{t_0 + c_i h} u'(t) \ dt =  \int_{0}^{c_i} u'(x) h \ dx
\end{equation*}

And, using our Lagrange polynomial approximation:

\begin{equation*}
u'(t) = \sum_{i = 1}^{s} F_i \ell_i(x)
\end{equation*}

we get

\begin{equation*}
u(t_0 + c_i h) - u(t_0) = \int_{0}^{c_i} h \sum_{j=1}^{s} F_j \ell_j (x) \ dx
\end{equation*}

Aaand finally, $u(t_0) = y_0$ as stated earlier, hence

\begin{equation*}
u(t_0 + c_i h) = y_0 + h \sum_{j=1}^{s} F_j \int_{0}^{c_i} \ell_j(x) \ dx
\end{equation*}

for all $i = 1, \dots, s$. And since $F_i = u(t_0 + c_i h)$, we get

\begin{equation*}
F_i = y_0 + h \sum_{j = 1}^{s} F_j \int_{0}^{c_i} \ell_j(x) \ dx
\end{equation*}

which gives us a system of (non-linear) equations we need to solve simultaneously.

And therefore, our next step $y_{n + 1} \approx y(t_0 + h)$ is given by

\begin{equation*}
\begin{split}
  u(t_0 + h) & = u(t_0) + \int_{t_0}^{t_0 + h} u'(t) \ dt \\
  &= y_0 + h \int_{0}^{1} u'(x) \ dx \\
  &= y_0 + h \sum_{i = 1}^{s} \int_{0}^{1} F_i \ell_i(x) \ dx \\
  &= y_0 + h \sum_{i = 1}^{s} b_i F_i
\end{split}
\end{equation*}

where we've let

\begin{equation*}
b_i = \int_{0}^{1} \ell_i(x) \ dx
\end{equation*}

Thus,

\begin{equation*}
y_{n + 1} = y_n + h \sum_{i = 1}^{s} b_i F_i
\end{equation*}

Let $u(t) \in \mathbb{R}^d$ be the collocation polynomial of degree $s$, satisfying

\begin{equation*}
\begin{split}
  u(t_0) &= y_0 \\
  u'(t_0 + c_i h) &= f \big( u(t_0 + c_i h) \big), \quad i = 1, \dots, s
\end{split}
\end{equation*}

and let $F_i$ be the values of the (as of yet undetermined) interpolating polynomial at the nodes:

\begin{equation*}
F_i := u'(t_0 + c_i h), \quad i = 1, \dots, s
\end{equation*}

Then we attempt to approximate the gradient as a polynomial, letting $x = \frac{t - t_0}{h}$:

\begin{equation*}
u'(t) = \sum_{i=1}^{s} F_i \ell_i \bigg( \frac{t - t_0}{h} \bigg)
\end{equation*}

The following is something I was not quite getting to grips with, at first.

Now we make the following observation:

\begin{equation*}
u(t_0 + c_i h) - u(t_0) = \int_{t_0}^{t_0 + c_i h} u'(t) \ dt
\end{equation*}

by the Fundamental Theorem of Calculus. Thus, making the substitution $x = \frac{t - t_0}{h}$, gives

\begin{equation*}
u(t_0 + c_i h) - u(t_0) = \int_{0}^{c_i} u'(x) h \ dx
\end{equation*}

And, using our Lagrange polynomial approximation:

\begin{equation*}
u'(t) = \sum_{i = 1}^{s} F_i \ell_i(x)
\end{equation*}

we get

\begin{equation*}
u(t_0 + c_i h) - u(t_0) = \int_{0}^{c_i} h \sum_{j=1}^{s} F_j \ell_j (x) \ dx
\end{equation*}

Aaand finally, $u(t_0) = y_0$ as stated earlier, hence

\begin{equation*}
u(t_0 + c_i h) = y_0 + h \sum_{j=1}^{s} F_j \int_{0}^{c_i} \ell_j(x) \ dx
\end{equation*}

for all $i = 1, \dots, s$.

Let us denote

\begin{equation*}
a_{ij} := \int_{0}^{c_i} \ell_j(x) \ dx, \quad b_i := \int_{0}^{1} \ell_i(x) \ dx, \quad i, j = 1, \dots, s
\end{equation*}

Then, a collocation method is given by

\begin{equation*}
\begin{split}
  F_i &= f \Big( y_n + h \sum_{j=1}^{s} a_{ij} F_j \Big), \quad i = 1, \dots, s \\
  y_{n + 1} &= y_n + \sum_{i=1}^{s} b_i F_i
\end{split}
\end{equation*}

where one first solves the coupled sd-dimensional nonlinear system $F_i$, and the update $y_{n + 1}$ explicitly.

Important: here we're considering $f(y)$, rather than the standard $f(t, y)$, but you can do exactly the same but by plugging in $f(t_0 + c_ih, u(t_0 + c_i h))$ in the condition for $u'$ instead of what we just did.

Comparing to interpolating polynomial, $u(t_0 + c_ih)$ becomes the $g_i$ sort of.

Using the collocation methods we obtain a piecewise continuous solution, or rather, a continuous approximation of the solution $u(t)$ on each interval $[t_n, t_{n + 1}]$.

In terms of order of accuracy, the optimal choice is attained by using so-called Guass-Legendre collocation methods and placement of nodes at the roots of a shifted Legendre polynomial.

Runge-Kutta Methods

A natural generalization of collocation methods is obtained by:

  • allow coefficients $c_i$, $b_j$ and $a_{ij}$ to take on arbitrary values (not necessarily related to the quadrature rules)
  • no longer assume $c_i$ to be distinct

We then get the class of Runge-Kutta methods!

We introduce the stage values $Y_i$, ending up with what's called the Runge-Kutta methods:

\begin{equation*}
\begin{split}
  Y_i &= y_n + h \sum_{j=1}^{s} a_{ij} f(Y_j), \quad i = 1, \dots, s \\
  y_{n + 1} &= y_n + h \sum_{i=1}^{s} b_i f(Y_i)
\end{split}
\end{equation*}

where we can view $Y_i$ as the intermediate values of the solution at $y$ at time $t_n + c_i h$.

For which we use the following terminology:

  • $s$ the number of stages of the Runge-Kutta method
  • $b_i$ are the weights
  • $a_{ij}$ are the internal coefficients

Euler's method and trapezoidal rule are Runge-Kutta methods, where

\begin{equation*}
s = 1, \ b_1 = 1, \ a_{11} = 0, \ c_1 = 0
\end{equation*}

gives us the Euler's method.

If the matrix $A = (a_{ij})$ is strictly lower-triangular, then the method is explicit.

Otherwise, the method is implicit, as the we might have "circular" dependency between two stages $Y_i$ and $Y_j$, $i \ne j$.

Examples of Runge-Kutta methods

Runge-Kutta method with 4 stages - "THE Runge-Kutta method"
\begin{equation*}
\begin{split}
  Y_1 &= y_n \\
  Y_2 &= y_n + \frac{h}{2} f(Y_1) \\
  Y_3 &= y_n + \frac{h}{2} f(Y_2) \\
  Y_4 &= y_n + hf(Y_3) \\
  y_{n + 1} &= y_n + h \bigg( \frac{1}{6} f(Y_1) + \frac{1}{3} f(Y_2) + \frac{1}{3}f(Y_3) + \frac{1}{6} f(Y_4) \bigg)
\end{split}
\end{equation*}

Which often are represented schematically in a Butcher table.

Accuracy of Runge-Kutta methods

Notation
Stuff
  • In general, we cannot have an exact expression for the local error $\text{le}(y; h)$
  • Can approximate this by computing Taylor expansion wrt. $h$

Consider the case of continuous mechanics, the derivates can be related directly to the solution itself by using the differential equation:

\begin{equation*}
\begin{split}
  y' &= f \\
  y'' &= f' f \\
  y''' &= f''(f, f) + f'f'f \\
  & \vdots
\end{split}
\end{equation*}

Then, we can write the flow map as

\begin{equation*}
\Phi_h(y) = y + hf + \frac{h^2}{2} f' f + \frac{h^3}{6} [f''(f, f) + f' ff] + \mathcal{O}(h^4)
\end{equation*}
Order conditions for Runge-Kutta methods

With

\begin{equation*}
z' = \sum_{i = 1}^{s} b_i F_i + h \sum_{i = 1}^{s} b_i F_i'  
\end{equation*}

we have

\begin{equation*}
z'(0) = \sum_{i=1}^{s} b_i F_i(0) = \sum_{i=1}^{s} b_j f \big( Y_i(0) \big) = \bigg( \sum_{i=1}^{s} b_i \bigg) f(y)
\end{equation*}

For the method to have order of $p = 1$, we must have

\begin{equation*}
\sum_{i=1}^{s} b_i = 1
\end{equation*}

since we will then have

\begin{equation*}
z'(0) = f(y)
\end{equation*}

and thus, the method will be of order $p = 1$. This is the first what is termed as order conditions.

Then, to get the second order condition, we simply compute the second order derivative of $z(h)$:

\begin{equation*}
\begin{split}
  z'' &= \sum_{i=1}^{s} b_i F_i' + \sum_{i=1}^{s}b_i F_i' + h \sum_{i=1}^{s} b_i F_i'' \\
  &= 2 \sum_{i=1}^{s} b_i F_i' + h \sum_{i=1}^{s} b_i F_i''
\end{split}
\end{equation*}

Substituting in the expressions for $F_i$ and $Y_i$, we obtain

Variable stepsize

Notation

  • $\hat{\Phi}_{t, h}$ and $\hat{\Phi}_{t, h}'$ are two different methods for numerical approximations to the flow map
  • $\hat{\Phi}_{t, h}$ is order $p$ (local error $h^{p + 1}$)
  • $\hat{\Phi}_{t, h}'$ is order $p + 1$ (local error $h^{p + 2}$)
  • $\ell = L(t_n, y_n, h) \approx C(t_n, y_n) h^{p + 1}$

Stuff

Difference in an estimate of the local error by two different methods

\begin{equation*}
L(t, y, h) = | \hat{\Phi}_{t, h}(y) - \hat{\Phi}_{t, h}'(y) |
\end{equation*}

And we assume the local error satisfies

\begin{equation*}
L(t, y , h) = C(t, y) h^{p + 1} + \mathcal{O}(h^{p + 2})
\end{equation*}

i.e. that $h^{p + 1}$ is the dominating factor.

Error control

Assumptions:

  • $\hat{\Phi}_{t, h}$ is a convergent one-step method
  • Desire local error at each step less than a given $\tau$
  • $L(t, y, h)$ estimates the local error at a given point $(t, y)$ when we take a step size of $h$

Idea:

  • Estimate the local error with $L$
  • If $L > \tau$ then decrease $h$ and redo step
  • If $L < \tau$ then keep step; may want to increase $h$

Question: how much do we decrease / increase $h$?

Suggestion: raise/lower timestep by a factor $\rho = \bigg( \frac{\tau}{\ell} \bigg)^{1 / (p + 1)}$

  • Can be inefficient in the case where we need to do this guessing multiple times, can instead do:

    \begin{equation*}
\rho = \gamma \bigg( \frac{\tau}{\ell} \bigg)^{1 / (p + 1)}
\end{equation*}

    for some $\gamma \in [0, 1]$, e.g. $\gamma = 0.9$, i.e. $\gamma$ is a decaying factor for the "guess" of the stepsize.

Stability of Runge-Kutta Methods

Notation

  • $\mathcal{F} = \{ y \in \mathbb{R}^d : f(y) = 0 \}$ denotes the set of fixed points of the flow map $\Phi_{t, h}$
  • $\mathcal{F}_h = \{ y \in \mathbb{R}^d : \Psi_h(y) = y \}$ denotes the set of fixed points for the numerical estimation of the flow map
  • $\rho(K) = \max_{\lambda} \text{Eigenvalues}(K)$ denotes the spectral radius of a matrix $K$
  • $R(\mu) = \frac{P(\mu)}{Q(\mu)}$ is the ratio between two polynomials $P(\mu)$ and $Q(\mu)$ called the stability function

Stuff

An equilibrium point $y^*$ of the scalar differential equation

\begin{equation*}
\frac{dy}{dt} = f(y)
\end{equation*}

is a point for which

\begin{equation*}
f(y^*) = 0
\end{equation*}

Fixed points which are NOT equilibrium points are called extraneous fixed points.

These are points such that

\begin{equation*}
y_{n + 1} = y_n \quad \text{and} \quad f(y) \ne 0
\end{equation*}

Stability of Equilibrium points

Suppose that $f$ in

\begin{equation*}
\frac{dy}{dt} = f(y), \quad y, f \in \mathbb{R}^d
\end{equation*}

is $C^2$, i.e. $f \in C^2$, and has equilibrium point $y^*$.

If the eigenvalues of

\begin{equation*}
J = f'(y^*)
\end{equation*}

all lie strictly in the left complex half-plane, then the equilibrium point $y^*$ is asymptotically stable.

If $J$ has any eigenvalue in the right complex half-plane, then $y^*$ is an unstable point.

Stability of Fixed points of maps

We say a numerical method is A-stable if the stability region includes the entire left half-plane, i.e. $\{ z \in \mathbb{C} : \text{Re}(z) < 0 \}$ is a contained in the stability region.

A A-stable numerical method has the property that it's stable at the origin.

I was wondering why $\text{Re}(z) < 0$? I believe it's because we're looking at solutions which are exponential, i.e. $\exp \big( R(\lambda h) \big)$, hence if $\text{Re}\big(R(\lambda h) \big) < 0$, then the exponential will have modulus smaller than 1:

\begin{equation*}
\Big| \exp \Big( \text{Re}(z) + i \text{Im}(z) \Big) \Big| = \exp\big(\text{Re}(z)\big) < 1
\end{equation*}

if $\text{Re}(z) < 0$, for $z \in \mathbb{C}$.

We say a numerical method is L-stable if it has the properties:

  1. it's A-stable
  2. $R(\mu) \to 0$ as $\mu \to \infty$

Stability of Numerical Methods

Stability functions
  • Scalar case

    Consider the scalar first-order diff. eqn

    \begin{equation*}
y' = \lambda y
\end{equation*}

    When applied to this diff. eqn., many methods, including all explicit Runge-Kutta methods, have the form

    \begin{equation*}
y_{n + 1} = P(\lambda) y_n
\end{equation*}

    for some polynomial $P(x)$.

    More generally, all RK methods have the form

    \begin{equation*}
y_{n + 1} = R(h \lambda) y_n
\end{equation*}

    where $R(\mu)$ is rational polynomial, i.e.

    \begin{equation*}
R(\mu) = \frac{P(\mu)}{Q(\mu)}
\end{equation*}

    Then we call $R$ the stability function of the RK method.

    \begin{equation*}
Y_i = y_n + \mu \sum_{j=1}^{s} a_{ij} Y_j, \quad i = 1, \dots, s
\end{equation*}

    $\mathbf{1} \in \mathbb{R}^s$, then we can write this sum in a matrix form:

    \begin{equation*}
\mathbf{Y} = y_n \mathbf{1} + \mu A \mathbf{Y}
\end{equation*}

    which has the solution

    \begin{equation*}
\mathbf{Y} = y_n \mathbf{1} + \Big( I - \mu A \Big)^{-1}
\end{equation*}

    and therefore, finally, we can write

    \begin{equation*}
y_{n + 1} = y_n + \mu \mathbf{b}^T \mathbf{Y} = y_n + \mu \mathbf{b}^T \Big( y_n \mathbf{1} + \big(  I - \mu A \big)^{-1} \Big)
\end{equation*}
    \begin{equation*}
y_{n + 1} = y_n (1 + \mu \mathbf{b}^T \mathbf{1}) + \mu \mathbf{b}^T \big( I - \mu A \big)^{-1}
\end{equation*}

    which gives us the stability function $R$ for RK:

    \begin{equation*}
y_{n + 1} = R(\mu) y_n , \quad R(\mu) = 1 + \mu \mathbf{b}^T \big( 1 - \mu A \big)^{-1} \mathbf{1}
\end{equation*}

Stability of Numerical Methods: Linear case

Linear system of ODEs

\begin{equation*}
\mathbf{y}' = B \mathbf{y}
\end{equation*}

where $B_{d \times d}$ is diagonalisable. Can write solution as

\begin{equation*}
\mathbf{y}_n = \alpha_1^{(n)} \mathbf{u}_1 + \alpha_2^{(n)} \mathbf{u}_2 + \dots + \alpha_d^{(n)} \mathbf{u}_d
\end{equation*}
\begin{equation*}
\mathbf{y}' = B \mathbf{y}
\end{equation*}

where $B$ is a diagonalisable matrix.

Further, let an RK method be given with stability function $R$. The origin is stable for numerical method is stable if and only if

\begin{equation*}
|R(\mu)| \le 1, \quad \forall \mu = h \lambda, \quad \lambda \in \sigma(B)
\end{equation*}

Linear Multistep Methods

Notation

  • $f_n = f(t_n, y_n)$
  • $z = h \lambda$ where $y' = \lambda y$

Definitions

A linear k-step method is defined as

\begin{equation*}
\sum_{j=0}^{k} \alpha_j y_{n + j} = h \sum_{j=0}^{k} \beta_j f \big( y_{n + j} \big)
\end{equation*}

where $\alpha_k \ne 0$ and either $\alpha_0 \ne 0$ or $\beta_0 \ne 0$.

The coefficients are usually normalized such that either

\begin{equation*}
\alpha_k = 1 \quad \text{or} \quad \sum_{j}^{} \beta_j = 1
\end{equation*}

In implementation is's assumed that the values $y_{n + j}, j = 0, \dots, k-1$ are already computed, so that $y_{n + k}$ is the only unknown in the formula.

If $\beta_k$ is non-zero the method is implicit, otherwise it's explicit.

Examples

$\theta$ method - generalization of one-step methods

The $\theta$ multistep method is of the form

\begin{equation*}
y_{n + 1} - y_n = h (1 - \theta) f(y_n) + h \theta f(y_{n + 1})
\end{equation*}

which is a linear k-step method with coefficients $\alpha_0 = -1$, $\alpha_1 = 1$, $\beta_0 = 1 - \theta$ and $\beta_1 = \theta$.

Examples:

Leapfrog

Leapfrog is an explicit two-step method ($k = 2$) given by

\begin{equation*}
y_{n + 2} - y_n = 2 h f(y_{n + 1})
\end{equation*}

i.e. $\alpha_0 = -1$, $\alpha_1 = 0$, $\alpha_2 = 1$ and $\beta_1 = 2$.

Adams methods

The class of Adams methods have

  • $\alpha_k = 1$
  • $\alpha_{k - 1} = -1$
  • $\alpha_j = 0$ for $j < k - 1$

If we also have the additional $\beta_k = 0$, we have explicit Adams methods, known as Adams-Bashforth methods, e.g.

\begin{equation*}
\begin{split}
  y_{n + 1} - y_n &= h f_n \\
  y_{n + 2} - y_{n + 1} &= h \bigg( \frac{3}{2} f_{n + 1} - \frac{1}{2}f_n \bigg) \\
  y_{n + 3} - y_{n + 2} = h \bigg( \frac{23}{12} f_{n + 2} - \frac{4}{3} f_{n + 1} + \frac{5}{12} f_n \bigg)
\end{split}
\end{equation*}

where $f_n = f(y_n)$.

Adams-Moulton methods are implicit, i.e. $\beta_k \ne 0$.

Backward differentiation formulae (BDF)

The Backward differentiation formulae (BDF) are a class of linear multistep methods satisfying

\begin{equation*}
\beta_j = 0, \quad j < k
\end{equation*}

and generalizing backward Euler.

E.g. the two-step method (BDF-2) is

\begin{equation*}
y_{n + 2} - \frac{4}{3} y_{n + 1} + \frac{1}{3} y_n = h \frac{2}{3} f(y_{n + 2})
\end{equation*}

Order of accuracy

Associated with the linear multistep method are the polynomials

\begin{equation*}
\rho(\zeta) = \sum_{j=0}^{k} \alpha_j \zeta^j, \qquad \sigma(\zeta) = \sum_{j=0}^{k} \beta_j \zeta^j
\end{equation*}

Let $y(t)$ be the exact solution at times $y(t_{n + j})$ for $j = 0, \dots, k$.

Then, subsituting $y(t)$ into the linear k-step method expression

\begin{equation*}
r_n := \sum_{j=0}^{k} \alpha_j y(t_{n + j}) - h \sum_{j=0}^{k} \beta_j y'(t_{n + j})
\end{equation*}

(which is actually the residual accumulated in the $(n + k - 1)$ th step, but for notational convenience we will denote it $r_n$).

A linear multistep method has order of consistency $p$ if (with $r_n$ being the residual)

\begin{equation*}
r_n = \mathcal{O}(h^{p + 1})
\end{equation*}

for all sufficiently smooth $f$, or (it can be shown) equivalently have the following properties:

  • Coeffiicents $\alpha_j$ and $\beta_j$ satisfy

    \begin{equation*}
\sum_{j=0}^{k} \alpha_j = 0 \quad \text{and} \quad \sum_{j=0}^{k} \alpha_j j^i = j \sum_{j=0}^{k}   \beta_j j^{i - 1}, \quad \text{for} i = 1, \dots, p
\end{equation*}
  • The polynomials $\rho(\zeta)$ and $\sigma(\zeta)$ satisfy

    \begin{equation*}
\rho(e^z) - z \sigma(e^z) = \mathcal{O}(z^{p + 1})  
\end{equation*}
  • The polynomials $\rho(\zeta)$ and $\sigma(\zeta)$ satisfy

    \begin{equation*}
\frac{\rho(z)}{\log z} - \sigma(z) = \mathcal{O} \Big( (z-1)^p \Big)  
\end{equation*}

Unlike the single-step methods, e.g. Euler method, consistency is insufficient to ensure convergence!

Convergence

A linear multistep method is said to satisfy the root condition, if all roots $\zeta$ of

\begin{equation*}
\rho(\zeta) = 0
\end{equation*}

lie in the unit disc, i.e. $|\zeta| \le 1$, and any root on the boundary, i.e. $|\zeta| = 1$, has algebraic multiplicity one.

Otherwise, the modulus of the solution grows in time.

Suppose a linear multistep method is quipped with a starting procedure satisfying

\begin{equation*}
\lim_{h \to 0} y_j = y(t_0 + jh), \quad j = 1, \dots, k - 1
\end{equation*}

Then the method converges to the exact solution of

\begin{equation*}
\frac{dy}{dt} = f(t, y), \quad y(t_0) = y_0, \quad t \in [t_0, t_0 + T]
\end{equation*}

on a fixed interval as $h \to 0$ if and only if it has order of accuracy $p \ge 1$ and satisfies the root condition.

Weak Stability and the Strong Root Condition

A convergent multistep method has to be:

  1. Consistent: so $\rho(1) = 0$ and $\rho'(1) = \sigma(1)$ where $\rho$ and $\sigma$ are the characteristic polynomials
  2. Initialized with a convergent starting procedure: i.e. a method to compute $y_1, y_2, \dots y_{k - 1}$ for the first multistep iteration, to get started, satisfying

    \begin{equation*}
\lim_{h \to 0} | y_j - y(t_0 + hj)| = 0, \quad j = 1, 2, \dots, k - 1   
\end{equation*}
  3. Zero-stable: i.e. it satisfies the root condition; the roots of $\rho(z)$ are in the unit disc and simple if on the boundary.

If they're not simple, we get

\begin{equation*}
C_i \zeta_i^n + D_i n \zeta_i^n \to \infty
\end{equation*}

as we iterate, which is a problem!

We refer to a multistep method that is zero-stable but which has some extraneous roots on the boundary of the unit disk as a weakly stable multistep method.

Same as root condition, but all roots which are not $\zeta = 1$ are within the unit disk.

Asymptotic Stability

The stability region $\mathcal{S}$ of a linear multistep method is the set of all points $z$ such that all roots $\zeta$ of the polynomial equation

\begin{equation*}
\rho(\zeta) - z \sigma(\zeta) = 0
\end{equation*}

lie on the unit disc $|\zeta| \le 1$, and those roots with modulues one are simple.

On the boundary of the stability region $\mathcal{S}$, precisely one root has modulus one, say $\zeta = e^{i \theta}$. Therefore an explicit representation for the boundary of $\mathcal{S}$ is:

\begin{equation*}
\partial \mathcal{S} = \left\{ z = \frac{\rho(e^{i \theta})}{\sigma(e^{i \theta})} : \theta \in [- \pi, \pi] \right\}
\end{equation*}

This comes from applying the multistep method to the diff. eqn. (Dahlquist test equation)

\begin{equation*}
y' = \lambda y
\end{equation*}

Then

\begin{equation*}
\sum_{j=0}^{k} \alpha_j y_{n + j} = h \sum_{j=0}^{k} \beta_j f(y_{n + j}) = h \lambda \sum_{j=0}^{k} \beta_j y_{n + j}
\end{equation*}

Letting $z = h \lambda$, we get

\begin{equation*}
\sum_{j=0}^{k} ( \alpha_j - z \beta_j) y_{n + j} = 0
\end{equation*}

For any $z$, this is a recurrence relation / linear difference with characteristic polynomial

\begin{equation*}
\sum_{j=0}^{k} \big( \alpha_j z \beta_j \big) \zeta^j = 0 = \rho(\zeta) -  z \sigma(\zeta) \implies z = \frac{\rho(\zeta)}{\sigma(\zeta)}
\end{equation*}

The stability region is $z \in \mathbb{C}$ s.t. all roots of $\rho(\zeta) - z \sigma(\zeta)$ lie in the unit disc and those on the boundary are simply, which is just what we stated above.

Geometric Integration

Focused on constructing methods for specific problems to account for properties of the system, e.g. symmetry.

Notation

First integrals

For some special functions $I(y) : \mathbb{R}^d \to \mathbb{R}$ we may find that $I$ is constant along every solution to an /autonomous differential equation

\begin{equation*}
\frac{dy}{dt} = f(y), \quad y \in \mathbb{R}^d
\end{equation*}

Such a function $I$ is called a first integral.

This closely related to the conservation we in Hamiltonian physics.

Let

\begin{equation*}
\frac{dy}{dt} = f(y)
\end{equation*}

have a quadratic first integral. A Runge-Kutta method preserves this quadratic first integral if

\begin{equation*}
b_i b_j - b_i a_{ij} - b_j a_{j i} = 0
\end{equation*}

Splitting methods

  • Useful for constructing methods with specific properties

Suppose we wish to solve the differential equation

\begin{equation*}
\frac{dz}{dt} = f(z) = f_1(z) + f_2(z)
\end{equation*}

where each of the two differential equations

\begin{equation*}
\begin{split}
  \frac{dz}{dt} = f_1(z) \quad \text{and} \quad \frac{dz}{dt} = f_2(z)
\end{split}
\end{equation*}

happen to be completely integrable (i.e. analytically solvable).

Then, any point in the phase space, the vector field can be written as two compontents $f_1$ and $f_2$. We first step forward along one tangent, and then step forward in the second, each time for a small timestep $h$.

That is, we start from some point $z_n$ and solve first the IVP

\begin{equation*}
\frac{dz}{dt} = f_1(z), \quad z(0) = z_n
\end{equation*}

for time $h$, which takes us to the point $z^* = z_1(h; z_n)$ where $z_1$ represents the exact solution of the differential equations on vector field $f_1$.

Next, we start from $z^*$ and solve the IVP

\begin{equation*}
\frac{dz}{dt} = f_2(z), \qquad z(0) = z^*
\end{equation*}

for time $h$, taking us to the point $z_{n + 1} = z_2(h; z^*)$.

Denoting the flow maps of the vector fields $f_1$ and $f_2$ as $\Phi_{t, f_1}$ and $\Phi_{t, f_2}$ we have just computed

\begin{equation*}
z_{n + 1} = \Big( \Phi_{h, f_2} \circ \Phi_{h, f_1} \Big)(z_n) = \Big( \Phi_{h, f_1} \circ \Phi_{h, f_2} \Big)(z_n)
\end{equation*}

Such a method is called a splitting method.

The splitting method $\Phi_{h, f_2} \circ \Phi_{h, f_1}$ where $f = f_1 + f_2$, has local error

\begin{equation*}
\text{le}(y; h) = \frac{h^2}{2} \acomm{f_1}{f_2} + \mathcal{O}(h^3)
\end{equation*}

where $\acomm{f}{g}$ are the commutator of the vector fields $f$ and $g$, which is another vector field

\begin{equation*}
\acomm{f}{g} = \frac{\partial f}{\partial y} g - \frac{\partial g}{\partial y} f
\end{equation*}

More generally

The splitting method $\Phi_{h, f_2} \circ \Phi_{h, f_1}$ where $f = f_1 + f_2$, has local error

\begin{equation*}
\text{le}(y; h) = \frac{h^2}{2} \acomm{f_1}{f_2} + \frac{h^3}{12} \big( \acomm{f_2}{\acomm{f_2}{f_1}} - \acomm{f_1}{\acomm{f_2}{f_1}} \big) + \dots 
\end{equation*}

where $\acomm{f}{g}$ are the commutator of the vector fields $f$ and $g$, which is another vector field

\begin{equation*}
\acomm{f}{g} = \frac{\partial f}{\partial y} g - \frac{\partial g}{\partial y} f
\end{equation*}

If the fields commute, i.e. $\acomm{f_1}{f_2} = 0$, then the splitting is exact.