Stochastic Differential Equations

Books

  • Handbook of Stochastic Methods
  • Øksendal
  • Probability with Martingales

Overview

Notation

  • Correlation between two points of a random field or a random process:

    \begin{equation*}
\mathbb{E} \Big( X(t) \otimes X(s) \Big)
\end{equation*}

    to allow the possibility of an infinite number of points. In the discrete case this simply corresponds to the covariance matrix.

  • $\mathscr{B}$ denotes the Borel σ-algebra
  • Partition $P = \left\{ t_0 = 0 < t_1 < \dots < t_n = T \right\}$
  • We write

    \begin{equation*}
\left| P \right| = \sup_{k} \left| t_{k + 1} - t_k \right|
\end{equation*}
  • Whenever we write stochastic integrals as Riemann sums, e.g.

    \begin{equation*}
\int_{0}^{T} W^2 \dd{W} = \lim_{\left| P \right| \to 0} \sum_{k=0}^{n - 1} W^2(t_k) \big( W(t_{k + 1}) - W(t_k) \big)
\end{equation*}

    the limit is to be understood in the mean-square sense (a numerical check is sketched after this list), i.e.

    \begin{equation*}
\mathbb{E} \Bigg[ \left| \int_{0}^{T} W^2 \dd{W} - \sum_{k=0}^{n - 1} W^2(t_k) \big( W(t_{k + 1}) - W(t_k) \big) \right|^2 \Bigg] \to 0 \quad \text{as} \quad \left| P \right| \to 0
\end{equation*}
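As a numerical sanity check of this convention (a sketch, not from the notes themselves; all parameter values are illustrative): Itô's lemma, used later in these notes, gives $\dd{(W^3)} = 3 W^2 \dd{W} + 3 W \dd{t}$, so $\int_0^T W^2 \dd{W} = \frac{1}{3} W_T^3 - \int_0^T W_t \dd{t}$, and the left-endpoint Riemann sums above should converge to this value in mean square as $\left| P \right| \to 0$.

#+begin_src python
import numpy as np

rng = np.random.default_rng(0)
T, n_fine, n_paths = 1.0, 2**12, 1000
dt = T / n_fine

# Fine-grid Brownian increments and paths (one row per independent path).
dW = np.sqrt(dt) * rng.standard_normal((n_paths, n_fine))
W = np.concatenate([np.zeros((n_paths, 1)), np.cumsum(dW, axis=1)], axis=1)

# Reference value per path: int_0^T W^2 dW = W_T^3 / 3 - int_0^T W_t dt.
reference = W[:, -1]**3 / 3 - np.sum(W[:, :-1], axis=1) * dt

for step in (64, 16, 4, 1):
    Wc = W[:, ::step]                                # coarser partition of the same paths
    incr = np.diff(Wc, axis=1)
    riemann = np.sum(Wc[:, :-1]**2 * incr, axis=1)   # left-endpoint Riemann sums
    mse = np.mean((riemann - reference)**2)
    print(f"|P| = {step * dt:.2e}   mean-square error = {mse:.3e}")
#+end_src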

Definitions

A stochastic process $X_t \in L^2$ is called either

  • second-order stationary
  • wide-sense stationary
  • weakly stationary

if

  • the first moment $\mathbb{E}[X_t]$ is constant
  • covariance function depends only on the difference $t - s$

That is,

\begin{equation*}
\mathbb{E}[X_t] = \mu, \quad \mathbb{E} \Big[ \big( X_t - \mu \big) \big( X_s - \mu \big) \Big] = C(t - s)
\end{equation*}

A stochastic process is called (strictly) stationary if all FDDs are invariant under time translation, i.e. for all $k \in \mathbb{N}$, for all times $t_i \in T$, and $\{ \Gamma_i \}_{i = 1}^k \subset \mathcal{B}$,

\begin{equation*}
\mathbb{P} \big( X_{t_1} \in \Gamma_1, \dots, X_{t_k} \in \Gamma_k \big) = \mathbb{P} \big( X_{s + t_1} \in \Gamma_1, \dots, X_{s + t_k} \in \Gamma_k \big)
\end{equation*}

for $s > 0$ such that $s + t_i \in T$, for every $i = 1, \dots, k$.

The autocorrelation function of a second-order stationary process enables us to associate a timescale to $X_t$, the correlation time $\tau_{\text{cor}}$:

\begin{equation*}
\tau_{\text{cor}} = \frac{1}{C(0)} \int_{0}^{\infty} C(\tau) \ d \tau = \frac{1}{\mathbb{E}[X_0^2]} \int_{0}^{\infty} \mathbb{E}[X_{\tau} X_0] \ d \tau
\end{equation*}

Martingale continuous processes

Let

  • $\{ \mathscr{F}_t \}_{t \in [0, T]}$ be a filtration defined on the probability space $\big( \Omega, \mathscr{F}, \mu \big)$
  • $\{ X_t \}_{t \in [0, T]}$ be adapted to $\mathscr{F}_t$, i.e. $X_t$ is measurable wrt. $\mathscr{F}_t$, with $X_t \in L^1(0, T)$.

We say $X_t$ is an $\mathscr{F}_t$ martingale if

\begin{equation*}
\mathbb{E} \big[ X_t \mid \mathscr{F}_s \big] = X_s, \quad \forall t \ge s
\end{equation*}

Gaussian process

A 1D continuous-time Gaussian process is a stochastic process taking values in $\mathbb{R}$ for which all finite-dimensional distributions are Gaussian.

That is, for every finite dimensional vector

\begin{equation*}
\big( X_{t_1}, X_{t_2}, \dots, X_{t_k} \big) \sim \mathcal{N} \big( \boldsymbol{\mu}_k, \mathbf{K}_k \big)
\end{equation*}

for some symmetric non-negative definite matrix $\mathbf{K}_k$, for all $k \in \mathbb{N}$ and $t_1, t_2, \dots, t_k \in \mathbb{R}$.

Theorems

Bochner's Theorem

Let $C(t)$ be a continuous positive definite function.

Then there exists a unique nonnegative finite measure $\rho$ on $\mathbb{R}$ with $\rho(\mathbb{R}) = C(0)$ such that

\begin{equation*}
C(t) = \int_{\mathbb{R}} e^{i \omega t } \rho \big( d \omega \big), \quad \forall t \in \mathbb{R}
\end{equation*}

i.e. $\rho$ is the Fourier transform of the function $C(t)$.

Let $X_t$ be a second-order stationary process with autocorrelation function $C(t)$ whose Fourier transform is $\rho(d \omega)$.

The measure $\rho(d \omega)$ is called the spectral measure of the process $X_t$.

If the spectral measure is absolutely continuous wrt. the Lebesgue measure on $\mathbb{R}$ with density $S(\omega)$, i.e.

\begin{equation*}
\rho(d \omega) = S(\omega) \ d \omega
\end{equation*}

then the Fourier transform $S(\omega)$ of the covariance function is called the spectral density of the process:

\begin{equation*}
S(\omega) = \frac{1}{2 \pi} \int_{-\infty}^{\infty} e^{-it \omega} C(t) \ dt
\end{equation*}

Ito vs. Stratonovich

  • Purely mathematical viewpoint: both Ito and Stratonovich calculi are correct
  • Ito SDE is appropriate when continuous approximation of a discrete system is concerned
  • Stratonovich SDE is appropriate when the idealization of a smooth real noise process is concerned

Benefits of Itô stochastic integral:

  • It is a martingale and satisfies the Itô isometry (see Properties of Itô stochastic integral below)
  • The integrand is evaluated at the left endpoint of each subinterval, so it does not "look into the future" of the driving noise

Benefits of Stratonovich stochastic integral:

  • Leads to the standard Newton-Leibniz chain rule, in contrast to Itô integral which requires correction
  • SDEs driven by noise with nonzero correlation time converge to the Stratonovich SDE, in the limit as the correlation time tends to 0

Practical considerations

  • Rule of thumb:
    • White noise is regarded as a short-correlation-time approximation of a coloured noise → Stratonovich integral is natural
      • "Expected" since the standard chain rule should work for smooth noise with finite correlation time

Equivalence

Suppose $X(t)$ solves the following Stratonovich SDE

\begin{equation*}
\dd{X} = b(X, t) \dd{t} + \sigma(X, t) \circ \dd{W}
\end{equation*}

then $X(t)$ solves the Itô SDE:

\begin{equation*}
\dd{X} = \bigg( b(X, t) + \frac{1}{2} \sigma(X,t) \partial_X \sigma(X, t) \bigg) \dd{t} + \sigma(X, t) \dd{W}
\end{equation*}

Suppose $X(t)$ solves the Itô SDE:

\begin{equation*}
\dd{X} = b(X, t) \dd{t} + \sigma(X, t) \dd{W}
\end{equation*}

then $X(t)$ solves the Stratonovich SDE:

\begin{equation*}
\dd{X} = \bigg( b(X, t) - \frac{1}{2} \sigma(X, t) \partial_X \sigma(X, t) \bigg) \dd{t} + \sigma(X, t) \circ \dd{W}
\end{equation*}

That is, writing the correction as $c(X, t) := \frac{1}{2} \sigma(X, t) \partial_X \sigma(X, t)$ (the symbol $\tilde{\sigma}$ is reserved for the Stratonovich diffusion coefficient in the derivation below),

  • Stratonovich → Itô: $+ c(X, t) \dd{t}$
  • Itô → Stratonovich: $- c(X, t) \dd{t}$

In the multidimensional case,

\begin{equation*}
\begin{split}
  \dd{X_i} = b_i \dd{t} + \sum_{j}^{} \sigma_{ij} \circ \dd{W_j} \quad & \iff \quad \dd{X_i} = \bigg( b_i + \frac{1}{2} \sum_{k, j}^{} \sigma_{kj} \partial_{x_k} \sigma_{ij} \bigg) \dd{t} + \sum_{j}^{} \sigma_{ij} \dd{W_j} \\
  \dd{X_i} = b_i \dd{t} + \sum_{j}^{} \sigma_{ij} \dd{W_j} \quad & \iff \quad \dd{X_i} = \bigg( b_i - \frac{1}{2} \sum_{k, j}^{} \sigma_{kj} \partial_{x_k} \sigma_{ij} \bigg) \dd{t} + \sum_{j}^{} \sigma_{ij} \circ \dd{W_j}
\end{split}
\end{equation*}
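Before deriving this, a quick numerical check of the one-dimensional rule (a sketch under the assumption $b = 0$, $\sigma(X) = \sigma_0 X$; the scheme choices and parameter values are illustrative, not from the notes): the Stratonovich SDE $\dd{X} = \sigma_0 X \circ \dd{W}$ is integrated with the stochastic Heun scheme (which converges to the Stratonovich solution), the equivalent Itô SDE $\dd{X} = \frac{1}{2}\sigma_0^2 X \dd{t} + \sigma_0 X \dd{W}$ with Euler-Maruyama, both driven by the same Brownian increments; both should track the exact solution $X_0 e^{\sigma_0 W_t}$.

#+begin_src python
import numpy as np

rng = np.random.default_rng(1)
sigma0, X0, T, n = 0.5, 1.0, 1.0, 2000
dt = T / n
dW = np.sqrt(dt) * rng.standard_normal(n)

x_strat, x_ito = X0, X0
for k in range(n):
    # Stochastic Heun (predictor-corrector) for dX = sigma0 * X o dW
    pred = x_strat + sigma0 * x_strat * dW[k]
    x_strat = x_strat + 0.5 * sigma0 * (x_strat + pred) * dW[k]
    # Euler-Maruyama for the equivalent Ito SDE dX = 0.5*sigma0^2*X dt + sigma0*X dW
    x_ito = x_ito + 0.5 * sigma0**2 * x_ito * dt + sigma0 * x_ito * dW[k]

print("Heun (Stratonovich)   :", x_strat)
print("EM (Ito + correction) :", x_ito)
print("exact X0*exp(s0*W_T)  :", X0 * np.exp(sigma0 * dW.sum()))
#+end_src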

To show this we consider $X$ satisfying the two SDEs

\begin{equation*}
\dd{X} = b \dd{t} + \sigma \dd{W} \quad \text{and} \quad \dd{X} = \tilde{b} \dd{t} + \tilde{\sigma} \circ \dd{W}
\end{equation*}

i.e. satisfying the stochastic integral equations

\begin{equation*}
X(t) = X(0) + \int b \dd{t} + \int \sigma \dd{W} \quad \text{and} \quad X(t) = X(0) + \int \tilde{b} \dd{t} + \int \tilde{\sigma} \circ \dd{W}
\end{equation*}

We have

\begin{equation*}
  \int \tilde{\sigma} \circ \dd{W} = \lim_{\left| P \right| \to 0} \sum_{k=0}^{n - 1} \tilde{\sigma} \bigg( \frac{X(t_k) + X(t_{k + 1})}{2}, \ t_k \bigg) \big( W(t_{k + 1}) - W(t_k) \big)
\end{equation*}

Equivalently, we can write

\begin{equation*}
\tilde{\sigma} \bigg( \frac{X(t_k) + X(t_{k + 1})}{2}, \ t_k \bigg) = \tilde{\sigma} \bigg(X_{t_k} + \frac{X(t_{k + 1}) - X(t_k)}{2}, \ t_k \bigg)
\end{equation*}

Assuming that $\tilde{\sigma}$ is smooth, we can Taylor expand about $X_{t_k}$ with increment $\frac{X_{t_{k + 1}} - X_{t_k}}{2}$:

\begin{equation*}
\tilde{\sigma} \bigg(X_{t_k} + \frac{X(t_{k + 1}) - X(t_k)}{2}, \ t_k \bigg) = \tilde{\sigma} \big( X_{t_k} \big) + \Big( \partial_X \tilde{\sigma} \big( X_{t_k} \big) \Big) \frac{X_{t_{k + 1}} - X_{t_k}}{2} + \dots
\end{equation*}

Substituting this back into the Riemann series expression for the Stratonovich integral, we have

\begin{equation*}
\begin{split}
  \int \tilde{\sigma} \circ \dd{W} &= \lim_{\left| P \right| \to 0} \sum_{k=0}^{n - 1} \tilde{\sigma} \bigg( \frac{X(t_k) + X(t_{k + 1})}{2}, \ t_k \bigg) \big( W(t_{k + 1}) - W(t_k) \big) \\
  &= \lim_{\left| P \right| \to 0} \sum_{k=0}^{n - 1} \tilde{\sigma} \big( X(t_k), t_k \big) \big( W(t_{k + 1}) - W(t_k) \big) + \frac{1}{2} \Big( \partial_X \tilde{\sigma} \big( X(t_k) \big) \Big) \big( X(t_{k + 1}) - X(t_k) \big) \big( W(t_{k + 1}) - W(t_k) \big) + \mathcal{O} \big( \left| P \right|^{3 / 2} \big)
\end{split}
\end{equation*}

Using the fact that $X(t)$ satisfies the Itô integral, we have

\begin{equation*}
X(t_{k + 1}) - X(t_k) = b \big( X(t_k), t_k \big) \big( t_{k + 1} - t_k \big) + \sigma \big( X(t_k), t_k \big) \big( W(t_{k + 1}) - W(t_k) \big) + \mathcal{O} \big( \left| P \right|^{3 / 2} \big)
\end{equation*}

and that

\begin{equation*}
\lim_{\left| P \right| \to 0} \sum_{k=0}^{n - 1} G(t_k) \big( W(t_{k + 1}) - W(t_k) \big)^2 = \int_0^T G(t) \dd{t}
\end{equation*}

we get

\begin{equation*}
\int_{0}^{T} \tilde{\sigma} (X, t) \circ \dd{W(t)} = \frac{1}{2} \int_{0}^{T} \sigma \big( X, t \big) \partial_X \tilde{\sigma} \big( X, t \big) \dd{t} + \int_{0}^{T} \tilde{\sigma} \big( X, t \big) \dd{W(t)}
\end{equation*}

Thus, we have the identity

\begin{equation*}
\int_{0}^{T} \tilde{b} \big( X, t \big) \dd{t} + \int_{0}^{T} \tilde{\sigma} \big( X, t \big) \circ \dd{W(t)} = \int_{0}^{T} \bigg( \tilde{b} \big( X, t \big) + \frac{1}{2} \sigma \big( X, t \big) \partial_X \tilde{\sigma} \big( X, t \big) \bigg) \dd{t} + \int_{0}^{T} \tilde{\sigma} \big( X, t \big) \dd{W(t)}
\end{equation*}

Matching the coefficients with the Itô integral satisfied by $X(t)$, we get

\begin{equation*}
b = \tilde{b} + \frac{1}{2} \tilde{\sigma} \partial_X \tilde{\sigma} \quad \text{and} \quad \sigma = \tilde{\sigma}
\end{equation*}

which gives us the conversion rules between the Stratonovich and Itô formulation!

Important to note, though: here we have assumed that $\sigma = \tilde{\sigma}$ is smooth, i.e. infinitely differentiable and therefore locally Lipschitz, so the argument above only covers this case.

I don't know if one can relax the smoothness constraint of $\sigma$ and still obtain conversion rules between the two formulations of the SDEs.

Stratonovich satisfies chain rule proven using conversion

For some function $u(X, t)$, one can show that the Stratonovich formulation satisfies the standard chain rule by taking the Stratonovich SDE, converting it to Itô form, applying Itô's formula, and then converting back to a Stratonovich SDE!

Variations of Brownian motion

Ornstein-Uhlenbeck process

Consider a mean-zero second-order stationary process with correlation function

\begin{equation*}
C(t) = C(0) e^{- \alpha |t|}, \qquad \alpha > 0
\end{equation*}

We will write $C(0) = \frac{D}{\alpha}$, where $D > 0$.

The spectral density of this process is

\begin{equation*}
\begin{split}
  S(\omega) &= \frac{1}{2 \pi} \frac{D}{\alpha} \int_{-\infty}^{ \infty} e^{- i \omega t} e^{- \alpha |t|} \ dt \\
  &= \frac{1}{2 \pi} \frac{D}{\alpha} \Bigg( \int_{-\infty}^{0} e^{- i \omega t} e^{\alpha t} \ dt + \int_{0}^{\infty} e^{- i \omega t} e^{- \alpha t} \ dt \Bigg) \\
  &= \frac{1}{2 \pi} \frac{D}{\alpha} \Bigg( \frac{1}{- i \omega + \alpha} + \frac{1}{i \omega + \alpha} \Bigg) \\
  &= \frac{D}{\pi} \frac{1}{\omega^2 + \alpha^2}
\end{split}
\end{equation*}

This $S(\omega)$ is called a Cauchy or Lorentz distribution.

The correlation time is then

\begin{equation*}
\tau_{\text{cor}} = \int_{0}^{\infty} e^{-\alpha t} \ dt =  \alpha^{-1}
\end{equation*}

A real-valued Gaussian stationary process defined on $\mathbb{R}$ with correlation function as given above is called an Ornstein-Uhlenbeck process.

Look here to see the derivation of the Ornstein-Uhlenbeck process from its Markov semigroup generator.
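Since the Ornstein-Uhlenbeck process is Gaussian and stationary with $C(t) = \frac{D}{\alpha} e^{-\alpha |t|}$, it can be sampled exactly on a time grid with the AR(1) recursion $X_{n+1} = e^{-\alpha \Delta t} X_n + \sqrt{\frac{D}{\alpha}\big(1 - e^{-2\alpha\Delta t}\big)} \, \xi_n$, $\xi_n \sim \mathcal{N}(0, 1)$, $X_0 \sim \mathcal{N}(0, D / \alpha)$. The sketch below (parameter values are illustrative, not from the notes) samples a long path and compares the empirical autocovariance with $\frac{D}{\alpha} e^{-\alpha \tau}$.

#+begin_src python
import numpy as np

rng = np.random.default_rng(2)
alpha, D = 1.0, 0.5
dt, n = 0.01, 200_000

a = np.exp(-alpha * dt)
s = np.sqrt(D / alpha * (1.0 - a**2))       # conditional std of the exact transition
noise = rng.standard_normal(n - 1)

x = np.empty(n)
x[0] = rng.normal(0.0, np.sqrt(D / alpha))  # start in the stationary distribution
for k in range(n - 1):
    x[k + 1] = a * x[k] + s * noise[k]

# Empirical autocovariance at a few lags vs. (D/alpha) * exp(-alpha * tau)
for lag in (0, 10, 50, 100, 200):
    emp = np.mean(x[:n - lag] * x[lag:])
    exact = D / alpha * np.exp(-alpha * lag * dt)
    print(f"tau = {lag * dt:5.2f}   empirical {emp:7.4f}   exact {exact:7.4f}")
#+end_src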

Fractional Brownian Motion

A (normalized) fractional Brownian motion $W_t^H$, $t \ge 0$, with Hurst parameter $H \in (0, 1)$ is a centered Gaussian process with continuous sample paths whose covariance is given by

\begin{equation*}
\mathbb{E} \Big[ W_t^H W_s^H \Big] = \frac{1}{2} \Big( s^{2 H } + t^{2 H} - |t - s|^{2H} \Big)
\end{equation*}

Hence, the Hurst parameter controls:

  • the correlations between the increments of fractional Brownian motion
  • the regularity of the paths: they become smoother as $H$ increases.

A fractional Brownian motion has the following properties

  1. When $H = \frac{1}{2}$, then $W_t^{1 / 2}$ becomes standard Brownian motion.
  2. We have

    \begin{equation*}
W_0^H = 0, \quad \mathbb{E} \Big[ \big(W_t^H\big)^2 \Big] = \left| t \right|^{2H}
\end{equation*}
  3. It has stationary increments, and

    \begin{equation*}
\mathbb{E} \Big[ \big( W_t^H - W_s^H \big)^2 \Big] = \left| t- s \right|^{2H}
\end{equation*}
  4. It has the following self-similarity property:

    \begin{equation*}
\Big( W_{\alpha t}^H, \ t \ge 0 \Big) = \Big( \alpha^H W_t^H, \ t \ge 0 \Big), \quad \alpha \ge 0
\end{equation*}

    where the equivalence is in law.
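One direct, if $\mathcal{O}(n^3)$, way to sample fractional Brownian motion on a grid is to build the covariance matrix from the formula above and take its Cholesky factor. A minimal sketch (function and parameter names are illustrative; efficient methods such as circulant embedding exist but are not shown):

#+begin_src python
import numpy as np

def sample_fbm(H, n=500, T=1.0, rng=None):
    """Sample W^H at t_i = i*T/n, i = 1..n, via a Cholesky factor of the covariance."""
    rng = rng or np.random.default_rng()
    t = np.linspace(T / n, T, n)
    s, tt = np.meshgrid(t, t)
    # E[W^H_t W^H_s] = 0.5 * (s^{2H} + t^{2H} - |t - s|^{2H})
    cov = 0.5 * (s**(2 * H) + tt**(2 * H) - np.abs(tt - s)**(2 * H))
    L = np.linalg.cholesky(cov + 1e-12 * np.eye(n))   # small jitter for numerical safety
    return t, L @ rng.standard_normal(n)

t, path = sample_fbm(H=0.7, rng=np.random.default_rng(3))
print(path[:5])
#+end_src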

Karhunen-Loève Expansion

Notation

  • $f \in L^2(\mathscr{D})$ where $\mathscr{D} \subseteq \mathbb{R}^d$
  • $\{ e_n \}_{n = 1}^\infty$ be an orthonormal basis in $L^2(\mathscr{D})$
  • $T = [0, 1]$
  • $R(t, s) = \mathbb{E}[X_t X_s]$

Stuff

Let $\mathscr{D} = [0, 1]$.

Suppose

\begin{equation*}
X_t(\omega) = \sum_{n=1}^{\infty} \xi_n(\omega) e_n(t), \quad t \in [0, 1]
\end{equation*}

We assume $\{ \xi_n \}_{n = 1}^\infty$ are orthogonal or independent, and

\begin{equation*}
\mathbb{E} \big[ \xi_n \xi_m \big] = \lambda_n \delta_{nm}
\end{equation*}

for some positive numbers $\{ \lambda_n \}_{n = 1}^\infty$.

Then

\begin{equation*}
\begin{split}
  R(t, s) &= \mathbb{E}[X_t X_s] \\
  &= \mathbb{E} \bigg( \sum_{k=1}^{\infty} \sum_{m=1}^{\infty} \xi_k e_k(t) \xi_m e_m(s) \bigg) \\
  &= \sum_{k=1}^{\infty} \sum_{m=1}^{\infty} \mathbb{E}[\xi_k \xi_m] e_k(t) e_m(s) \\
  &= \sum_{k=1}^{\infty} \lambda_k e_k(t) e_k(s)
\end{split}
\end{equation*}

using $\mathbb{E}[\xi_k \xi_m] = \lambda_k \delta_{km}$ in the last step.

Hence, for the expansion of $X_t(\omega)$ above, we need $R(t, s) = \sum_{k=1}^{\infty} \lambda_k e_k(t) e_k(s)$ to be valid!

The above expression for $R(t, s)$ also implies

\begin{equation*}
\int_{0}^{1} R(t, s) e_n(s) \ ds = \lambda_n e_n(t)
\end{equation*}

Hence, we also need the set $\{ \big( \lambda_n, e_n(t) \big) \}_{n = 1}^\infty$ to be a set of eigenvalues and eigenfunctions of the integral operator whose kernel is the correlation function of $X_t$, i.e. we need to study the operator

\begin{equation*}
\mathscr{R} f := \int_{0}^{1}R(t, s) f(s) \ ds
\end{equation*}

which we will now consider as an operator on $L^2 \big( [0, 1] \big)$.

It's easy to see that $\mathscr{R}$ is self-adjoint and nonnegative in $L^2 \big( (0, 1) \big)$:

\begin{equation*}
\left\langle \mathscr{R}f, h \right\rangle = \left\langle f, \mathscr{R} h \right\rangle \quad \text{and} \quad \left\langle \mathscr{R} f, f \right\rangle \ge 0, \quad \forall f, h \in L^2 \big( (0, 1) \big)
\end{equation*}

Furthermore, it is a compact operator, i.e. if $\{ \phi_n \}_{n = 1}^\infty$ is a bounded sequence on $L^2 \big( (0, 1) \big)$, then $\{ \mathscr{R} \phi_n \}_{n = 1}^\infty$ has a convergent subsequence.

Spectral theorem for compact self-adjoint operators can be used to deduce that $\mathscr{R}$ has a countable sequence of eigenvalues tending to $0$.

Furthermore, for every $f \in L^2 \big( (0, 1) \big)$, we can write

\begin{equation*}
f = f_0 + \sum_{n=1}^{\infty} f_n e_n(t)
\end{equation*}

where $\mathscr{R} f_0 = 0$ and $\{ e_n(t) \}$ are the eigenfunctions of the operator $\mathscr{R}$ corresponding to the non-zero eigenvalues and where the convergence is in $L^2$, i.e. we can "project" $f$ onto the subspace spanned by eigenfunctions of $\mathscr{R}$.

Let $\{ X_t, t \in [0, 1] \}$ be an $L^2$ process with zero mean and continuous correlation function $R(t, s)$.

Let $\{ (\lambda_n, e_n(t)) \}_{n = 1}^\infty$ be the eigenvalues and eigenfunctions of the operator $\mathscr{R}$ defined

\begin{equation*}
\mathscr{R} f = \int_{0}^{1} R(t, s) f(s) \ ds
\end{equation*}

Then

\begin{equation*}
X_t = \sum_{n=1}^{\infty} \xi_n e_n(t), \quad t \in [0, 1]
\end{equation*}

where

\begin{equation*}
\begin{split}
  \xi_n &= \int_{0}^{1} X_t e_n(t) \ dt \\
  \mathbb{E} \big[ \xi_n \big] &= 0 \\
  \mathbb{E} \big[ \xi_n \xi_m \big] &= \lambda_n \delta_{n m}
\end{split}
\end{equation*}

The series converges in $L^2$ to $X$, uniformly in $t$!

Karhunen-Loève expansion of Brownian motion

  • Correlation function of Brownian motion is $R(t, s) = \min \left\{ t, s \right\}$.
  • Eigenvalue problem $\mathscr{R} \psi_n = \lambda_n \psi_n$ becomes

    \begin{equation*}
\int_{0}^{1} \min \left\{ t, s \right\} \psi_n(s) \ ds = \lambda_n \psi_n(t)
\end{equation*}
  • Assume $\lambda_n > 0$ (since $\lambda_n = 0$ would imply $\psi_n(t) = 0$)
  • Evaluating the eigenvalue equation at $t = 0$ gives the initial condition $\psi_n(0) = 0$
  • Can rewrite eigenvalue problem

    \begin{equation*}
\int_{0}^{t} s \psi_n(s) \ ds + t \int_{t}^{1} \psi_n(s) \ ds = \lambda_n \psi_n(t)
\end{equation*}
  • Differentiate once

    \begin{equation*}
\frac{d}{dt} \bigg( \int_{0}^{t} s \psi_n(s) \ ds \bigg) + \int_{t}^{1} \psi_n(s) \ ds + t \frac{d}{dt} \bigg( \int_{t}^{1} \psi_n(s) \ ds \bigg) = \lambda_n \psi_n'(t)
\end{equation*}

    using FTC, we have

    \begin{equation*}
t \psi_n(t) + \int_{t}^{1} \psi_n(s) \ ds - t \psi_n(t) = \lambda_n \psi_n'(t)
\end{equation*}

    hence

    \begin{equation*}
\int_{t}^{1} \psi_n(s) \ ds = \lambda_n \psi_n'(t)
\end{equation*}
  • Obtain second BC by observing $\psi_n'(1) = 0$ (since LHS in the above is clearly 0)
  • Second differentiation

    \begin{equation*}
- \psi_n(t) = \lambda_n \psi_n''(t)
\end{equation*}
  • Thus, the eigenvalues and eigenfunctions of the integral operator whose kernel is the covariance function of Brownian motion can be obtained as solutions of the Sturm-Liouville problem

    \begin{equation*}
- \psi_n(t) = \lambda_n \psi_n''(t), \quad \psi_n(0) = \psi_n'(1) = 0
\end{equation*}
  • Eigenvalues and (normalized) eigenfunctions are then given by

    \begin{equation*}
\psi_n(t) = \sqrt{2} \sin \bigg( \frac{1}{2} \big( 2n - 1 \big) \pi t \bigg), \quad \lambda_n = \bigg( \frac{2}{(2n - 1) \pi} \bigg)^2
\end{equation*}
  • Karhunen-Loève expansion of Brownian motion on $[0, 1]$ is then (a truncated-series sampler is sketched after this list)

    \begin{equation*}
W_t = \sqrt{2} \sum_{n=1}^{\infty} \xi_n \frac{2}{(2n - 1) \pi} \sin \bigg( \frac{1}{2} (2n - 1) \pi t \bigg)
\end{equation*}
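A minimal sketch of the truncated series (assuming $\xi_n \sim \mathcal{N}(0, 1)$ i.i.d., as in the expansion above; names and parameter values are illustrative): the empirical variance of the truncated sum at a few times should approach $\mathbb{E}[W_t^2] = t$ as the number of retained modes grows.

#+begin_src python
import numpy as np

rng = np.random.default_rng(4)

def kl_brownian_paths(n_terms, t, n_paths, rng):
    """Approximate Brownian paths from the truncated Karhunen-Loeve series."""
    n = np.arange(1, n_terms + 1)
    coeff = np.sqrt(2.0) * 2.0 / ((2 * n - 1) * np.pi)                  # sqrt(2) * sqrt(lambda_n)
    modes = np.sin(0.5 * (2 * n[:, None] - 1) * np.pi * t[None, :])     # sin((2n-1) pi t / 2)
    xi = rng.standard_normal((n_paths, n_terms))                        # xi_n ~ N(0, 1), i.i.d.
    return (xi * coeff) @ modes

t = np.array([0.25, 0.5, 1.0])
for N in (5, 50, 500):
    paths = kl_brownian_paths(N, t, 20_000, rng)
    print(f"N = {N:3d}   empirical Var[W_t] = {paths.var(axis=0).round(3)}   exact = {t}")
#+end_src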

Diffusion Processes

Notation

  • $\Gamma$ denotes a Borel set
  • $\big( \Omega, \mathscr{F}, \mu \big)$ denotes a probability space
  • $X = X_t(\omega)$ denotes a stochastic processes with $t \in \mathbb{R}_+$ and state space $\big( \mathbb{R}^d, \mathscr{B} \big)$
  • $\sigma \big( X_t,  \ t \in \mathbb{R}_+ \big)$ denotes the σ-algebra generated by $\{ X_t, \ t \in \mathbb{R}_+ \}$, which is the smallest σ-algebra s.t. $X_t$ is a measurable function (random variable) wrt. it.

Markov Processes and the Chapman-Kolmogorov Equation

We define the σ-algebra generated by $\left\{ X_t, \ t \in \mathbb{R}_+ \right\}$, denoted $\sigma(X_t, t \in \mathbb{R}_+)$, to be the smallest σ-algebra s.t. the family of mappings $\{ X_t, \ t \in \mathbb{R}_+ \}$ is a stochastic process with

  • sample space $\Big( \Omega, \sigma(X_t, \ t \in \mathbb{R}_+) \Big)$
  • state space $\big( \mathbb{R}^d, \mathscr{B} \big)$

Idea: encode all past information about a stochastic process into an appropriate collection of σ-algebras.

Let $\big( \Omega, \mathscr{F}, \mu \big)$ denote a probability space.

Consider stochastic process $X = X_t(\omega)$ with $t \in \mathbb{R}^+$ and state space $\big( \mathbb{R}^d, \mathscr{B} \big)$.

A filtration on $\big( \Omega, \mathscr{F} \big)$ is a nondecreasing family $\{ \mathscr{F}_t, \ t \in \mathbb{R}_+ \}$ of sub-σ-algebras of $\mathscr{F}$:

\begin{equation*}
\mathscr{F}_s \subseteq \mathscr{F}_t \subseteq \mathscr{F} \quad \text{for } s \le t
\end{equation*}

We set

\begin{equation*}
\mathscr{F}_{\infty} = \sigma \big( \bigcup_{t \in T}^{} \mathscr{F}_t \big)
\end{equation*}

Note that $\mathscr{F}_{\infty} \ne \mathscr{F}$ is a possibility.

The filtration generated by or natural filtration of the stochastic process $X_t$ is

\begin{equation*}
\mathscr{F}_t^X := \sigma(X_s; \ s \le t)
\end{equation*}

A filtration $\mathscr{F}_t^X$ is generated by events of the form

\begin{equation*}
\left\{ \omega \mid X_{t_1} \in \Gamma_1, X_{t_2} \in \Gamma_2, \dots, X_{t_n} \in \Gamma_n \right\}
\end{equation*}

with $0 \le t_1 \le t_2 \le \dots \le t_n \le t$ and $\Gamma_i \in \mathscr{B}(\mathbb{R}^d)$.

Let $X_t$ be a stochastic process defined on a probability space $\big( \Omega, \mathscr{F}, \mu \big)$ with values in $\mathbb{R}^d$, and let $\mathscr{F}_t^X$ be the filtration generated by $\{ X_t; \ t \in \mathbb{R}_+ \}$.

Then $\{ X_t; \  t \in \mathbb{R}_+ \}$ is a Markov process if

\begin{equation*}
\mathbb{P}(X_t \in \Gamma \mid \mathscr{F}_s^X) = \mathbb{P}(X_t \in \Gamma \mid X_s)
\end{equation*}

for all $t,s \in T$ with $t \ge s$, and $\Gamma \in \mathscr{B}(\mathbb{R}^d)$.

Equivalently, it's a Markov process if

\begin{equation*}
\mathbb{P}(X_t \in \Gamma \mid X_{t_1}, X_{t_2} , \dots , X_{t_n}) = \mathbb{P} (X_t \in \Gamma \mid X_{t_n})
\end{equation*}

for $n \ge 1$ and $0 \le t_1 \le t_2 \le \dots \le t_n \le t$ with $\Gamma \in \mathscr{B}(E)$.

Chapman-Kolmogorov Equation

The transition function $P(\Gamma, t \mid x, s)$ for fixed $t, x, s$ is a probability measure on $\mathbb{R}^d$ with

\begin{equation*}
P( \mathbb{R}^d, t \mid x, s) = 1
\end{equation*}

It is $\mathscr{B}(\mathbb{R}^d)$ measurable in $x$, for fixed $t, s, \Gamma$, and satisfies the Chapman-Kolmogorov equation

\begin{equation*}
P(\Gamma, t \mid x, s) = \int_{\mathbb{R}^d}^{} P(\Gamma, t \mid y, u)  \ P(dy, u \mid x, s)
\end{equation*}

for all

  • $x \in \mathbb{R}^d$
  • $\Gamma \in \mathscr{B}(\mathbb{R}^d)$
  • $s, u, t \in \mathbb{R}_+$ with $s \le u \le t$

Assuming that $X_s = x$, we can write

\begin{equation*}
P(\Gamma, t \mid x, s ) = \mathbb{P} \big[ X_t \in \Gamma \mid X_s = x \big]
\end{equation*}

since $\mathbb{P} \big[ X_t \in \Gamma \mid \mathscr{F}_s^X \big] = \mathbb{P} \big[ X_t \in \Gamma \mid X_s \big]$.

In words, the Chapman-Kolmogorov equation tells us that for a Markov process, the transition from $x$ at time $s$ to the set $\Gamma$ at time $t$ can be done in two steps:

  1. The system moves from $x$ to some intermediate point $y$ at an intermediate time $u$
  2. It then moves from $y$ into $\Gamma$ at time $t$; the Chapman-Kolmogorov equation integrates over all possible intermediate points $y$
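A quick numerical check of the Chapman-Kolmogorov equation (a sketch with illustrative names and values), using the Brownian transition density $P(dy, t \mid x, s) = \mathcal{N}(y; x, t - s) \, dy$: integrating the product of the two kernels over the intermediate point $y$ should reproduce the direct kernel.

#+begin_src python
import numpy as np

def gauss(z, mean, var):
    return np.exp(-(z - mean)**2 / (2 * var)) / np.sqrt(2 * np.pi * var)

x0, s, u, t = 0.3, 0.0, 0.7, 1.5     # starting point and times s < u < t
x = 1.1                              # end point at which the densities are compared
y = np.linspace(-12.0, 12.0, 4001)   # quadrature grid for the intermediate point
dy = y[1] - y[0]

# int p(x, t | y, u) p(y, u | x0, s) dy  vs.  p(x, t | x0, s)
two_step = np.sum(gauss(x, y, t - u) * gauss(y, x0, u - s)) * dy
one_step = gauss(x, x0, t - s)
print(two_step, one_step)            # should agree to quadrature accuracy
#+end_src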

Generator of a Markov Process

  • Chapman-Kolmogorov equation suggests that a time-homogeneous Markov process can be described through a semigroup of operators, i.e. a one-parameter family of linear operators with the properties

    \begin{equation*}
P_0 = I, \quad P_{t + s} = P_t \circ P_s \qquad \forall t, s \ge 0
\end{equation*}

Let $P(t, \cdot, \cdot)$ be the transition function of a time-homogeneous Markov process, let $f \in C_b(\mathbb{R}^d)$, and define the operator

\begin{equation*}
\big( P_t f \big)(x) := \mathbb{E} \Big[ f(X_t) \mid X_0 = x \Big] = \int_{\mathbb{R}^d}^{} f(y)  \ P(t, x, dy)
\end{equation*}

Linear operator with

\begin{equation*}
\big( P_0 f \big)(x) = f(x)
\end{equation*}

which means that $P_0 = I$, and

\begin{equation*}
\big( P_{t + s} f \big)(x) = \big( (P_t \circ P_s)f \big)(x)
\end{equation*}

i.e. $P_{t + s} = P_{t} \circ P_s$.

We can study properties of a time-homogeneous Markov process $X_t$ by studying properties of the Markov semigroup $P_t$.

This is an example of a strongly continuous semigroup.

Let $\mathscr{D}(\mathscr{L})$ be set of all $f \in C_b(E)$ such that the limit

\begin{equation*}
\mathscr{L} f := \lim_{t \to 0} \frac{P_t f - f}{t}
\end{equation*}

exists.

The operator $\mathscr{L}: \mathscr{D}(\mathscr{L}) \to C_b(\mathbb{R}^d)$ is called the (infinitesimal) generator of the operator semigroup $P_t$

  • Also referred to as the generator of the Markov process $X_t$

This is an example of a (infinitesimal) generator of a strongly continuous semigroup.

The semigroup property of $P_t$ and the definition of its generator imply that, formally, we can write

\begin{equation*}
P_t = e^{t \mathscr{L}}
\end{equation*}

Furthermore, consider function

\begin{equation*}
u(x, t) = \big( P_t f \big)(x) = \mathbb{E} \Big[ f(X_t) \mid X_0 = x \Big]
\end{equation*}

Compute time-derivative

\begin{equation*}
\begin{split}
  \frac{\partial u}{\partial t} &= \frac{d}{dt} \big( P_t f \big) \\
  &= \frac{d}{dt} \Big( e^{t \mathscr{L}} f \Big) \\
  &= \mathscr{L} \Big( e^{t \mathscr{L}} f \Big) \\
  &= \mathscr{L} P_t f \\
  &= \mathscr{L} u
\end{split}
\end{equation*}

And we also have

\begin{equation*}
u(x, 0) = P_0 f(x) = f(x)
\end{equation*}

Consequently, $u(x, t)$ satisfies the IVP

\begin{equation*}
\begin{split}
  \frac{\partial u}{\partial t} &= \mathscr{L} u \\
  u(x, 0) &= f(x)
\end{split}
\end{equation*}

which defines the backward Kolmogorov equation.

This equation governs the evolution of the expectation of an observable $f \in C_b(\mathbb{R}^d)$.

Example: Brownian motion 1D

  • Transition function for Brownian motion is given by the fundamental solution to the heat equation in 1D
  • Corresponding Markov semigroup is the heat semigroup

    \begin{equation*}
P_t = \exp \bigg( \frac{t}{2} \frac{d^2}{dx^2} \bigg)
\end{equation*}
  • Generator of the 1D Brownian motion is then the 1D Laplacian $\frac{1}{2} \frac{d^2}{dx^2}$
  • The backward Kolmogorov equation is then the heat equation

    \begin{equation*}
\frac{\partial u}{\partial t} = \frac{1}{2} \frac{\partial^2 u}{\partial x^2}
\end{equation*}

Adjoint semigroup

Let $P_t$ be a Markov semigroup, which then acts on $C_b(\mathbb{R}^d)$.

The adjoint semigroup $P_t^*$ acts on probability measures:

\begin{equation*}
\begin{split}
  P_t^* \mu(\Gamma) &= \int_{\mathbb{R}^d}^{} \mathbb{P} \big( X_t \in \Gamma \mid X_0 = x \big) \ d\mu(x) \\
  &= \int_{\mathbb{R}^d}^{} P(t, x, \Gamma) \ d \mu(x)
\end{split}
\end{equation*}

The image of a probability measure $\mu$ under $P_t^*$ is again a probability measure.

The operators $P_t$ and $P_t^*$ are adjoint in the $L^2$ sense:

\begin{equation*}
\int_{\mathbb{R}^d}^{} P_t f(x) \  d\mu(x) = \int_{\mathbb{R}^d}^{} f(x) \ d \big( P_t^* \mu \big)(x)
\end{equation*}

We can write

\begin{equation*}
P_t^* = e^{t \mathscr{L}^*}
\end{equation*}

where $\mathscr{L}^*$ is the $L^2$ adjoint of the generator of the Markov process:

\begin{equation*}
\left\langle h, \mathscr{L} f \right\rangle = \int_{}^{} \mathscr{L} fh \ dx = \int_{}^{} f \mathscr{L}^* h \ dx = \left\langle \mathscr{L}^* h, f \right\rangle
\end{equation*}

Let $X_t$ be a Markov process with Markov semigroup $P_t$ and $X_0 \sim \mu$, and let $P_t^*$ denote the adjoint Markov semigroup.

We define

\begin{equation*}
\mu_t := P_t^* \mu
\end{equation*}

This is the law of the Markov process at time $t$. It satisfies the equation

\begin{equation*}
\frac{\partial \mu_t}{\partial t} = \mathscr{L}^* \mu_t, \quad \mu_0 =  \mu
\end{equation*}

Assuming that the initial distribution $\mu$ and the law of the process $\mu_t$ each have a density wrt. Lebesgue measure, denoted $\rho_0(\cdot)$ and $\rho(\cdot, t)$, respectively, the law becomes

\begin{equation*}
\frac{\partial \rho}{\partial t} = \mathscr{L}^* \rho, \quad \rho(y, 0) = \rho_0(y)
\end{equation*}

which defines the forward Kolmogorov equation.

"Simple" forward Kolmogorov equation
  • Consider SDE

    \begin{equation*}
\dd{X} = b \big( X, t \big) \dd{t} + \sigma  \big( X, t \big) \dd{W}
\end{equation*}
  • Since $X(t)$ is Markovian, its evolution can be characterised by a transition probability $p(x, t \mid y, s)$:

    \begin{equation*}
\mathbb{P} \Big( X(t) \in [x, x + \dd{x}] \mid X(s) = y \Big) = p(x, t \mid y, s) \dd{x}
\end{equation*}
  • Consider $f \in C^2(\mathbb{R})$, then

    \begin{equation*}
\dd{\Big( f \big( X(u) \big) \Big)} = \partial_x f \Big( X(u) \Big) \Big[ b \big( X(u), u \big) \dd{u} + \sigma \big( X(u), u \big) \dd{W} \Big] + \frac{1}{2} \partial_{xx} f \Big( X(u) \Big) \sigma^2 \Big( X(u), u \Big) \dd{u}
\end{equation*}

Ergodic Markov Processes

Using the adjoint Markov semigroup, we can define the invariant measure as a probability measure that is invariant under time evolution of $X_t$, i.e. a fixed point of the semigroup $P_t^*$:

\begin{equation*}
P_t^* \mu = \mu
\end{equation*}

A Markov process is said to be ergodic if and only if there exists a unique invariant measure $\mu$.

We say the process is ergodic wrt. the measure $\mu$.

Furthermore, if we consider a Markov process $X_t$ in $\mathbb{R}^d$ with generator $\mathscr{L}$ and Markov semigroup $P_t$, we say that $X_t$ is ergodic provided that $0$ is a simple eigenvalue of $\mathscr{L}$, i.e.

\begin{equation*}
\mathscr{L} g = 0
\end{equation*}

has only constant solutions.

Thus, we can study the ergodic properties of a Markov process $X_t$ by studying the null space of its generator.

We can then obtain an equation for the invariant measure in terms of the adjoint $\mathscr{L}^*$ of the generator.

Assume that $\mu$ has a density $\rho$ wrt. the Lebesgue measure. Then

\begin{equation*}
\lim_{t \to 0} \frac{P_t^* \mu - \mu}{t} = 0 \iff \mathscr{L}^* \rho = 0
\end{equation*}

by definition of the generator of the adjoint semigroup.

Furthermore, the long-time average of an observable $f$ converges to the equilibrium expectation wrt. the invariant measure

\begin{equation*}
\lim_{T \to \infty} \frac{1}{T} \int_{0}^{T} f(X_s) \ ds = \int_{}^{} f(x) \ \mu(dx)
\end{equation*}

1D Ornstein-Uhlenbeck process and its generator

The 1D Ornstein-Uhlenbeck process is an ergodic Markov process with generator

\begin{equation*}
\mathscr{L} = - \alpha x \frac{d}{dx} + D \frac{d^2}{dx^2}
\end{equation*}

The null space of $\mathscr{L}$ consists of the constants in $x$, hence it is an ergodic Markov process.

In order to find the invariant measure, we need to solve the stationary Fokker-Planck equation:

\begin{equation*}
\mathscr{L}^* \rho = 0, \quad \rho \ge 0, \quad \int_{}^{} \rho(x) \ dx = 1
\end{equation*}

This clearly requires an expression for $\mathscr{L}^*$. Since we have $\mathscr{L}$, we integrate by parts:

\begin{equation*}
\begin{split}
  \int_{\mathbb{R}}^{} \mathscr{L}f \ h \ dx &= \int_{\mathbb{R}}^{} \bigg[ \bigg( - \alpha x \frac{df}{dx} \bigg) h + \bigg( D \frac{d^2 f}{dx^2} \bigg) h \bigg] \ dx \\
  &= \int_{\mathbb{R}}^{} \Big[ f \partial_x \big( \alpha x h \big) + f \big( D \partial_x^2 h \big) \Big] \ dx \\
  &=: \int_{\mathbb{R}}^{} f \mathscr{L}^* h \ dx
\end{split}
\end{equation*}

Thus,

\begin{equation*}
\mathscr{L}^* h := \frac{d}{dx} \big( \alpha x h \big) + D \frac{d^2 h}{dx^2}
\end{equation*}

Solving $\mathscr{L}^* \rho = 0$ with this expression for $\mathscr{L}^*$ and normalizing, we get

\begin{equation*}
\mu(dx) = \sqrt{\frac{\alpha}{2 \pi D}} \exp \bigg( - \frac{\alpha x^2}{2D} \bigg) \ dx
\end{equation*}

which is just a Gaussian measure!

Observe that in the above expression, the density multiplying $dx$ is exactly the $\rho(x)$ that solves $\mathscr{L}^* \rho = 0$.

If $X_0 \sim \mathcal{N}(0, D / \alpha)$ (i.e. distributed according to the invariant measure $\mu$ derived above), then $X_t$ is a mean-zero Gaussian second-order stationary process on $[0, \infty)$ with correlation function

\begin{equation*}
R(t) = \frac{D}{\alpha} e^{- \alpha |t|}
\end{equation*}

and spectral density

\begin{equation*}
S(\omega) = \frac{D}{\pi} \frac{1}{\omega^2 + \alpha^2}
\end{equation*}

as seen before!

Furthermore, Ornstein-Uhlenbeck process is the only real-valued mean-zero Gaussian second-order stationary Markov process with continuous paths defined on $\mathbb{R}$.
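A numerical illustration of ergodicity for this process (a sketch; parameter values are illustrative): the SDE with generator $\mathscr{L} = -\alpha x \frac{d}{dx} + D \frac{d^2}{dx^2}$ is $\dd{X} = -\alpha X \dd{t} + \sqrt{2D} \dd{W}$, and the time average of $f(x) = x^2$ along a single long Euler-Maruyama trajectory should approach the equilibrium value $\int x^2 \, \mu(dx) = D / \alpha$.

#+begin_src python
import numpy as np

rng = np.random.default_rng(5)
alpha, D = 1.0, 0.5
dt, n = 1e-3, 500_000                       # total time T = n * dt = 500
noise = np.sqrt(2 * D * dt) * rng.standard_normal(n)

x, running_sum = 0.0, 0.0
for k in range(n):
    x += -alpha * x * dt + noise[k]         # Euler-Maruyama step
    running_sum += x * x

print("time average of X^2  :", running_sum / n)
print("equilibrium  D/alpha :", D / alpha)
#+end_src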

Diffusion Processes

Notation

  • $o(t)$ denotes terms that vanish faster than $t$, i.e. $o(t) / t \to 0$ as $t \to 0$

Stuff

A Markov process consists of three parts:

  • a drift
  • a random part
  • a jump process

A diffusion process is a Markov process with no jumps.

A Markov process $X_t$ in $\mathbb{R}$ with transition function $P(\Gamma, t \mid x, s)$ is called a diffusion process if the following conditions are satisfied:

  1. (Continuity) For every $x$ and $\varepsilon > 0$,

    \begin{equation*}
\int_{\left| x - y \right| > \varepsilon}^{} P \big( dy, t \mid x, s \big) = o(t - s)
\end{equation*}

    uniformly over $s < t$

  2. ( Drift coefficient ) There exists a function $b(x, s)$ s.t. for every $x$ and every $\varepsilon > 0$,

    \begin{equation*}
\int_{\left| y - x \right| \le \varepsilon}^{} \big( y - x \big) P \big( dy, t \mid x, s \big) = b(x,s) (t - s) + o(t - s)
\end{equation*}

    uniformly over $s < t$.

  3. ( Diffusion coefficient ) There exists a function $\Sigma(x, s)$ s.t. for every $x$ and every $\varepsilon > 0$,

    \begin{equation*}
\int_{\left| y - x \right| \le \varepsilon}^{} \big( y - x \big)^2 P \big( dy, t \mid x, s \big) = \Sigma(x, s) (t -s) + o(t - s)
\end{equation*}

    uniformly over $s < t$.

Important: above we've truncated the domain of integration, since we do not know whether the first and second moments of $X_t$ are finite. If we assume that there exists $\delta > 0$ such that

\begin{equation*}
\lim_{t \to s} \frac{1}{t - s} \int_{\mathbb{R}^d}^{} \left| y - x \right|^{2 + \delta} P(dy, t \mid x, s ) = 0
\end{equation*}

then we can extend integration over all of $\mathbb{R}$ and use expectations in the definition of the drift and the diffusion coefficient , i.e. the drift:

\begin{equation*}
\lim_{t \to s} \mathbb{E} \bigg[ \frac{X_t - X_s}{t - s} \bigg| X_s = x \bigg] = b(x, s)
\end{equation*}

and diffusion coefficient:

\begin{equation*}
\lim_{t \to s} \mathbb{E} \bigg[ \frac{\left| X_t - X_s \right|^2}{t - s} \bigg| X_s = x \bigg] = \Sigma(x, s)
\end{equation*}
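These limits suggest a direct way to estimate $b$ and $\Sigma$ from simulated (or observed) short-time increments: start many trajectories at $X_s = x$ and average the first and second moments of the increment divided by $t - s$. A sketch for the SDE $\dd{X} = -\alpha X \dd{t} + \sqrt{2D} \dd{W}$, for which $b(x) = -\alpha x$ and $\Sigma(x) = 2D$ (names and parameter values are illustrative):

#+begin_src python
import numpy as np

rng = np.random.default_rng(6)
alpha, D = 1.0, 0.5
x, h, n_samples, m = 0.8, 1e-3, 200_000, 20    # many short runs of length h, m substeps each
dt = h / m

X = np.full(n_samples, x)
for _ in range(m):
    X += -alpha * X * dt + np.sqrt(2 * D * dt) * rng.standard_normal(n_samples)

increments = X - x
b_hat = np.mean(increments) / h                # drift estimate (noisier than Sigma)
sigma_hat = np.mean(increments**2) / h         # diffusion-coefficient estimate
print(f"b(x)     estimate {b_hat: .3f}   exact {-alpha * x: .3f}")
print(f"Sigma(x) estimate {sigma_hat: .3f}   exact {2 * D: .3f}")
#+end_src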

Backward Kolmogorov Equation

Let $f \in C_b(\mathbb{R})$, and let

\begin{equation*}
u(x, s) := \mathbb{E} \Big[ f(X_t) \mid X_s = x \Big] = \int_{}^{} f(y) \ P(dy, t \mid x, s)
\end{equation*}

with fixed $t$.

Assume, furthermore, that the functions $b(x, s)$ and $\Sigma(x, s)$ are smooth in both $x$ and $s$.

Then $u(x, s)$ solves the final value problem

\begin{equation*}
- \frac{\partial u}{\partial s} = b(x, s) \frac{\partial u}{\partial x} + \frac{1}{2} \Sigma(x, s) \frac{\partial^2 u}{\partial x^2}, \quad u(x, t) = f(x)
\end{equation*}

for $s \in [0, t]$.

For a proof, see Thm 2.1 in pavliotis2014stochastic. It's a clever use of the Chapman-Kolmogorov equation and Taylor's theorem.

For a time-homogeneous diffusion process, where the drift and the diffusion coefficients are independent of time:

\begin{equation*}
b = b(x), \quad \Sigma = \Sigma(x)
\end{equation*}

we can rewrite the final value problem defined by the backward Kolmogorov equation as an initial value problem.

Let $T = t - s$, and introduce $U(x, T) = u(x, t - s)$. Then,

\begin{equation*}
\frac{\partial U}{\partial T} = b(x) \frac{\partial U}{\partial x} + \frac{1}{2} \Sigma(x) \frac{\partial^2 U}{\partial x^2}, \quad U(x, 0) = f(x)
\end{equation*}

Further, we can let $s = 0$, therefore

\begin{equation*}
\frac{\partial u}{\partial t} = \mathscr{L} u = b(x) \frac{\partial u}{\partial x} + \frac{1}{2} \Sigma(x) \frac{\partial^2 u}{\partial x^2}, \qquad u(x, 0) = f(x)
\end{equation*}

where

\begin{equation*}
u(x, t) = \mathbb{E} \Big[ f(X_t) \mid X_0 = x \Big]
\end{equation*}

is the solution to the IVP.
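A concrete Monte Carlo check (a sketch; names and values are illustrative): for $\dd{X} = -\alpha X \dd{t} + \sqrt{2D} \dd{W}$ we have $b(x) = -\alpha x$, $\Sigma(x) = 2D$, and with $f(x) = x$ the IVP is solved by $u(x, t) = x e^{-\alpha t}$ (indeed $\partial_t u = -\alpha x e^{-\alpha t} = b(x) \partial_x u$ and $\partial_{xx} u = 0$). The sample mean of simulated paths started at $x$ should match.

#+begin_src python
import numpy as np

rng = np.random.default_rng(7)
alpha, D = 1.0, 0.5
x0, t_final = 1.5, 0.7
n_paths, n_steps = 50_000, 350
dt = t_final / n_steps

X = np.full(n_paths, x0)
for _ in range(n_steps):
    X += -alpha * X * dt + np.sqrt(2 * D * dt) * rng.standard_normal(n_paths)

print("Monte Carlo E[X_t | X_0 = x]  :", X.mean())
print("PDE solution x * exp(-alpha t):", x0 * np.exp(-alpha * t_final))
#+end_src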

Forward Kolmogorov Equation

Assume that the conditions of a diffusion process are satisfied, and that the following are smooth functions of $y$ and $t$:

  • $p(y, t \mid \cdot, \cdot)$
  • $b(y, t)$
  • $\Sigma(y, t)$

Then the transition probability density is the solution to the IVP

\begin{equation*}
\frac{\partial p}{\partial t} = - \frac{\partial }{\partial y} \Big( b(t, y) \ p  \Big) + \frac{1}{2} \frac{\partial^2 }{\partial y^2} \Big( \Sigma(t, y) \ p \Big), \qquad p(y, s \mid x, s) = \delta(x - y)
\end{equation*}

For a proof, see Thm 2.2 in pavliotis2014stochastic. It's a clever use of the Chapman-Kolmogorov equation.

Solving SDEs

This section is meant to give an overview over the calculus of SDEs and tips & tricks for solving them, both analytically and numerically. Therefore this section might be echoing other sections quite a bit, but in a more compact manner most useful for performing actual computations.

Notation

Analytically

"Differential calculus"

  • $\big( dW \big)^2 = dt$
  • $\big( dW \big)^n = 0$ for all $n > 2$
  • In the multivariate case, one typically assumes the white-noise components to be independent:

    \begin{equation*}
\dd{W}_i \dd{W}_j = \delta_{ij} \big( \dd{W}_i \big)^2 = \delta_{ij} \dd{t}
\end{equation*}
  • Suppose we have some SDE for $X$, i.e. some expression for $\dd{X}$. Under the change of variables $Y = f(X, t)$, Itô's lemma gives

    \begin{equation*}
\dd{Y} = \frac{\partial Y}{\partial t} \dd{t} + \frac{\partial Y}{\partial X} \dd{X} + \frac{1}{2} \frac{\partial^2 Y}{\partial X^2} \big( \dd{X} \big)^2
\end{equation*}

    where $\big( \dd{X} \big)^2$ is computed by straightforward substitution of the expression for $\dd{X}$ and using the "properties" of $\dd{W}$ above

  • Letting $X = W$ and $u(x) = x^n$ in Itô's lemma we get

    \begin{equation*}
\dd{W^n} = n W^{n - 1} \dd{W} + \frac{n (n - 1)}{2} W^{n - 2} \dd{t}
\end{equation*}
  • Letting $X = W$ and $u(x) = e^{\lambda x - \lambda^2 t / 2}$, then Itô's lemma gets us

    \begin{equation*}
\begin{split}
  d u &= - \frac{\lambda^2}{2} u \dd{t} + \lambda u \dd{W} + \frac{\lambda^2}{2} u \dd{t} \\
  &= \lambda u \dd{W}
\end{split}
\end{equation*}

    Hence $Y = e^{\lambda W - \lambda^2 t / 2}$ satisfies the (driftless) geometric Brownian motion SDE $\dd{Y} = \lambda Y \dd{W}$.

Numerically

Numerical SDEs

Notation

  • $\Delta W_n = W(t_{n + 1}) - W(t_{n})$
  • We will consider an SDE with the exact solution

    \begin{equation*}
X(t_{n + 1}) = X(t_n) + \int_{t_n}^{t_{n + 1}} f \Big( X(s) \Big) \dd{s} + \int_{t_n}^{t_{n + 1}} g \Big( X(s) \Big) \dd{W(s)}
\end{equation*}

Convergence

The strong error is defined

\begin{equation*}
e_{\Delta t}^{\text{strong}} := \sup_{0 \le t_n \le T} \mathbb{E} \big[ \left| X_n - X(t_n) \right| \big]
\end{equation*}

We say a method converges strongly if

\begin{equation*}
e_{\Delta t}^{\text{strong}} \to 0 \quad \text{as} \quad \Delta t \to 0
\end{equation*}

We say the method has strong order $p$ if

\begin{equation*}
e_{\Delta t}^{\text{strong}} \le K \Delta t^p, \quad \forall 0 \le \Delta t \le \Delta t^*
\end{equation*}

Basically, this notion of convergence measures how "accurately" individual paths are followed, i.e. the error at the level of individual realizations.

Methods

Euler-Maruyama

\begin{equation*}
X_{n + 1} = X_n + f(X_n) \Delta t + g(X_n) \Delta W_n
\end{equation*}
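A minimal sketch of the scheme (illustrative names and parameters, not from the notes), applied to geometric Brownian motion $\dd{X} = \mu X \dd{t} + \sigma X \dd{W}$, whose exact solution $X(T) = X_0 \exp\big( (\mu - \tfrac{1}{2}\sigma^2) T + \sigma W_T \big)$ makes the strong error easy to measure; halving $\Delta t$ should roughly shrink the error by a factor $\sqrt{2}$, consistent with strong order $1/2$.

#+begin_src python
import numpy as np

rng = np.random.default_rng(8)
mu, sigma, X0, T = 0.3, 0.8, 1.0, 1.0
n_paths, n_fine = 5000, 2**10
dt_fine = T / n_fine

dW = np.sqrt(dt_fine) * rng.standard_normal((n_paths, n_fine))
exact = X0 * np.exp((mu - 0.5 * sigma**2) * T + sigma * dW.sum(axis=1))

for coarsen in (16, 8, 4, 2, 1):
    dt = coarsen * dt_fine
    # Coarse increments are sums of fine ones, so every dt uses the same Brownian paths.
    dW_c = dW.reshape(n_paths, n_fine // coarsen, coarsen).sum(axis=2)
    X = np.full(n_paths, X0)
    for k in range(n_fine // coarsen):
        X = X + mu * X * dt + sigma * X * dW_c[:, k]      # Euler-Maruyama step
    print(f"dt = {dt:.2e}   strong error E|X_N - X(T)| = {np.mean(np.abs(X - exact)):.4f}")
#+end_src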

Milstein method
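The notes leave this heading empty, so as a hedged reminder: the standard Milstein scheme keeps the next term of the stochastic Taylor expansion (see the section below),

\begin{equation*}
X_{n + 1} = X_n + f(X_n) \Delta t + g(X_n) \Delta W_n + \frac{1}{2} g(X_n) g'(X_n) \Big( \big( \Delta W_n \big)^2 - \Delta t \Big)
\end{equation*}

which raises the strong order from $1/2$ to $1$ for scalar SDEs with smooth coefficients. A sketch for the same geometric Brownian motion as above, where $g(x) = \sigma x$ and $g'(x) = \sigma$ (names and parameters are illustrative):

#+begin_src python
import numpy as np

rng = np.random.default_rng(9)
mu, sigma, X0, T, n = 0.3, 0.8, 1.0, 1.0, 2**8
dt = T / n
dW = np.sqrt(dt) * rng.standard_normal(n)

x_em, x_mil = X0, X0
for k in range(n):
    x_em = x_em + mu * x_em * dt + sigma * x_em * dW[k]
    # Milstein adds 0.5 * g * g' * (dW^2 - dt), with g(x) = sigma * x, g'(x) = sigma.
    x_mil = (x_mil + mu * x_mil * dt + sigma * x_mil * dW[k]
             + 0.5 * sigma**2 * x_mil * (dW[k]**2 - dt))

exact = X0 * np.exp((mu - 0.5 * sigma**2) * T + sigma * dW.sum())
print("Euler-Maruyama:", x_em, "  Milstein:", x_mil, "  exact:", exact)
#+end_src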

Stochastic Taylor Expansion

Deterministic ODEs

  • Consider

    \begin{equation*}
\dv{X}{t} = b \Big( X(t) \Big)
\end{equation*}
  • For some function $f$, we then write

    \begin{equation*}
f \Big( X(t) \Big) - f \Big( X(s) \Big) = \int_{s}^{t} \bigg( \dv{X}{t} \frac{\partial f}{\partial x} \bigg) \Big( X(s') \Big) \dd{s'} = \int_{s}^{t} \Big( \tilde{L}_0 f \Big) \Big( X(s') \Big) \dd{s'}
\end{equation*}

    where

    \begin{equation*}
\tilde{L}_0 f = b(x) \partial_x f
\end{equation*}

    where $\dv{X}{t} = b(X)$ by the ODE above

  • This looks quite a bit like the numerical quadrature rule, dunnit kid?!
    • Because it is!
  • We can then do the same for $\tilde{L}_0 f$:

    \begin{equation*}
\begin{split}
  \big( \tilde{L}_0 f \big) \Big( X(s) \Big) - \big( \tilde{L}_0 f \big) \Big( X(s') \Big) &= \int_{s'}^{s} \frac{d}{ds''} \bigg( \dv{x}{t} \frac{\partial f}{\partial X} \bigg) \dd{s''} \\
  &= \int_{s'}^{s} \big( \tilde{L}_1 f \big) \big( X(s'') \big) \dd{s''}
\end{split}
\end{equation*}

    where

    \begin{equation*}
\big( \tilde{L}_1 f \big) \big( X(t) \big) = \dv{}{t} \bigg( \dv{x}{t} \frac{\partial f}{\partial x} \bigg) = \dv[2]{x}{t} \frac{\partial f}{\partial x} + \bigg( \dv{x}{t} \bigg)^2 \pdv[2]{f}{x}
\end{equation*}
  • We can then substitute this back into the original integral, and so on, to obtain higher and higher order
  • This is basically just performing a Taylor expansion around a point wrt. the stepsize $h$
  • The reason for using integrals rather than the "standard" Taylor expansion is to motivate the stochastic case below, where we cannot properly talk about taking derivatives

Stochastic

  • Idea: extend the "Taylor expansion method" of obtaining higher order numerical methods for ODEs to SDEs
  • Consider

    \begin{equation*}
\dd{X(t)} = b \Big( X(t) \Big) \dd{t} + \sigma \Big( X(t) \Big) \dd{W(t)}
\end{equation*}
  • Satisfies the Itô integral

    \begin{equation*}
X(t + h) - X(t) = \int_{t}^{t + h} b \Big( X(s) \Big) \dd{s} + \int_{t}^{t + h} \sigma \Big( X(s) \Big) \dd{W(s)}
\end{equation*}
  • Can do the same as before for each of the integral terms:
  • Then we can substitute these expressions into our original expression for $X(t + h) - X(t)$:

    \begin{equation*}
\begin{split}
  X(t + h) - X(t) &= \int_{t}^{t + h} \bigg[ b \big( X(t) \big) + \frac{1}{2} \int_{t}^{s} \sigma^2 \pdv[2]{b}{X} \dd{s''} + \int_{t}^{s} \pdv{b}{X} \dd{X(s'')} \bigg] \dd{s} \\
  & \quad + \int_{t}^{t + h} \bigg[ \sigma \Big( X(t) \Big) + \frac{1}{2} \int_{t}^{s} \sigma^2 \pdv[2]{\sigma}{X} \dd{s''} + \int_{t}^{s} \pdv{\sigma}{X} \dd{X(s'')} \bigg] \dd{W(s)}
\end{split}
\end{equation*}
  • Observe that we can bring the terms $b \big( X(t) \big)$ and $\sigma \big( X(t) \big)$ out of the integrals:

    \begin{equation*}
\begin{split}
  X(t + h) - X(t) &= b \big( X(t) \big) \underbrace{\int_{t}^{t + h} \dd{s}}_{= h} +  \int_{t}^{t + h} \bigg[ \frac{1}{2} \int_{t}^{s} \sigma^2 \pdv[2]{b}{X} \dd{s''} + \int_{t}^{s} \pdv{b}{X} \dd{X(s'')} \bigg] \dd{s} \\
  & \quad + \sigma \big( X(t) \big) \underbrace{\int_{t}^{t + h} \dd{W(s)}}_{= W(t + h) - W(t) = \Delta W} \\
  & \quad + \int_{t}^{t + h} \bigg[ \frac{1}{2} \int_{t}^{s} \sigma^2 \pdv[2]{\sigma}{X} \dd{s''} + \int_{t}^{s} \pdv{\sigma}{X} \dd{X(s'')} \bigg] \dd{W(s)} \\
  &= b \big( X(t) \big) h +  \int_{t}^{t + h} \bigg[ \frac{1}{2} \int_{t}^{s} \sigma^2 \pdv[2]{b}{X} \dd{s''} + \int_{t}^{s} \pdv{b}{X} \dd{X(s'')} \bigg] \dd{s} \\
  & \quad + \sigma \big( X(t) \big) \Delta W +  \int_{t}^{t + h} \bigg[ \frac{1}{2} \int_{t}^{s} \sigma^2 \pdv[2]{\sigma}{X} \dd{s''} + \int_{t}^{s} \pdv{\sigma}{X} \dd{X(s'')} \bigg] \dd{W(s)}
\end{split}
\end{equation*}
  • Observe that the two terms that we just brought out of the integrals define the Euler-Maruyama method!!
    • Bloody dope, ain't it?

Connections between PDEs and SDEs

Suppose we have an SDE of the form

\begin{equation*}
\dd{X} = b \big( X, t \big) \dd{t} + \sigma \big( X, t \big) \dd{W}
\end{equation*}

Forward Kolmogorov / Fokker-Plank equation

Then the Fokker-Planck / Forward Kolmogorov equation is given by the following PDE:

\begin{equation*}
\begin{split}
  \partial_t p(x, t \mid y, s) &= - \partial_x \bigg( b(x, t) p(x, t \mid y, s) \bigg) + \frac{1}{2} \partial_{xx} \bigg( \sigma^2(x, t) p(x, t \mid y, s) \bigg) \\
  \text{subject to} \quad p(x, s \mid y, s) &= \delta(x - y)
\end{split}
\end{equation*}
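A distribution-level check of this equation (a sketch; names and values are illustrative): for $\dd{X} = -\alpha X \dd{t} + \sqrt{2D} \dd{W}$ the forward equation is solved by a Gaussian transition density with mean $y e^{-\alpha t}$ and variance $\frac{D}{\alpha}\big( 1 - e^{-2\alpha t} \big)$, so the sample mean and variance of Euler-Maruyama trajectories at time $t$ should match these values.

#+begin_src python
import numpy as np

rng = np.random.default_rng(10)
alpha, D = 1.0, 0.5
y0, t_final = 1.0, 0.5
n_paths, n_steps = 100_000, 250
dt = t_final / n_steps

X = np.full(n_paths, y0)
for _ in range(n_steps):
    X += -alpha * X * dt + np.sqrt(2 * D * dt) * rng.standard_normal(n_paths)

print(f"sample mean {X.mean(): .4f}   exact {y0 * np.exp(-alpha * t_final): .4f}")
print(f"sample var  {X.var(): .4f}   exact {D / alpha * (1 - np.exp(-2 * alpha * t_final)): .4f}")
#+end_src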

Derivation of Forward Kolmogorov

Suppose we have an SDE of the form

\begin{equation*}
\dd{X} = b \big( X, t \big) \dd{t} + \sigma \big( X, t \big) \dd{W}
\end{equation*}

Consider twice differentiable function $f: \mathbb{R} \to \mathbb{R}$, then Itô's formula gives us

\begin{equation*}
\begin{split}
  \dd{f \big( X(t) \big)} &= \partial_x f \big( X(t) \big) \dd{X(t)} + \frac{1}{2} \partial_{xx} f \big( X(t) \big) \big( \dd{X(t)} \big)^2 \\
  &= \partial_x f \big( X(t) \big) \Big[ b \big( X(t), t \big) \dd{t} + \sigma \big( X(t), t \big) \dd{W(t)} \Big] + \frac{1}{2} \sigma^2 \big( X(t), t \big) \partial_{xx} f \big( X(t) \big) \dd{t}
\end{split}
\end{equation*}

which then satisfies the Itô integral

\begin{equation*}
\begin{split}
  f \big( X(t) \big) - f \big( X(s) \big) &= \int_{s}^{t} b \big( X(\tau), \tau \big) \Big( \partial_x f \big( X(\tau) \big) \Big) + \frac{1}{2} \sigma^2 \big( X(\tau), \tau \big) \Big( \partial_{xx} f \big( X(\tau) \big) \Big) \dd{\tau} \\
  & \quad + \int_{s}^{t} \sigma \big( X(\tau), \tau \big) \Big( \partial_x f \big( X(\tau) \big) \Big) \dd{W(\tau)}
\end{split}
\end{equation*}

Taking the (conditional) expectation, the last term vanishes, so the LHS becomes

\begin{equation*}
\begin{split}
  \mathbb{E} \Big[ f \big( X(t) \big) - f \big( X(s) \big) \mid X(s) = x_0 \Big] &= \mathbb{E} \Big[ f \big( X(t) \big) \mid X(s) = x_0 \Big] - f(x_0) \\
  &= \bigg( \int_{}^{} f( x ) p(x, t \mid x_0, s) \dd{x} \bigg) - f(x_0)
\end{split}
\end{equation*}

and RHS

\begin{equation*}
\begin{split}
  & \mathbb{E} \Bigg[ \int_{s}^{t} b \big( X(\tau), \tau \big) \Big( \partial_x f \big( X(\tau) \big) \Big) + \frac{1}{2} \sigma^2 \big( X(\tau), \tau \big) \Big( \partial_{xx} f \big( X(\tau) \big) \Big) \dd{\tau} \ \bigg| \ X(s) = x_0 \Bigg] \\
  = \quad & \int_{}^{} \bigg( \int_{s}^{t} p(x, \tau \mid x_0, s) \bigg[ b \big( x, \tau \big) \Big( \partial_x f(x) \Big) + \frac{1}{2} \sigma^2 \big( x, \tau \big) \Big( \partial_{xx} f (x) \Big) \bigg] \dd{\tau} \bigg) \dd{x}
\end{split}
\end{equation*}

Taking the derivative wrt. $t$, LHS becomes

\begin{equation*}
\int_{}^{} f(x) \partial_t p(x, t \mid x_0, s) \dd{x}
\end{equation*}

and RHS

\begin{equation*}
\int p(x, t \mid x_0, s) \bigg[ b \big( x, t \big) \Big( \partial_x f(x) \Big) + \frac{1}{2} \sigma^2 \big( x, t \big) \Big( \partial_{xx} f (x) \Big) \bigg] \dd{x}
\end{equation*}

The following is actually quite similar to what we do in variational calculus for our variations!

Here we first use the standard "integration by parts to get $f$ rather than derivatives of $f$", and then we make use of the Fundamental Lemma of the Calculus of Variations!

Using integration by parts in the above equation for RHS, we have

\begin{equation*}
\int \bigg[ - \partial_x \Big( p(x, t \mid x_0, s) b(x, t) \Big) + \frac{1}{2} \partial_{xx} \Big( p(x, t \mid x_0, s) \sigma^2(x, t) \Big) \bigg] f(x) \dd{x}
\end{equation*}

where we have assumed that we do not pick up any extra terms (i.e. the functions vanish at the boundaries). Hence,

\begin{equation*}
\int f(x) \Big( \partial_t p(x, t \mid x_0, s) \Big) \dd{x} = \int \bigg[ - \partial_x \Big( p(x, t \mid x_0, s) b(x, t) \Big) + \frac{1}{2} \partial_{xx} \Big( p(x, t \mid x_0, s) \sigma^2(x, t) \Big) \bigg] f(x) \dd{x}
\end{equation*}

Since this holds for all $f \in C^2$, by the Fundamental Lemma of Calculus of Variations, we need

\begin{equation*}
\partial_t p(x, t \mid x_0, s) = - \partial_x \Big( p(x, t \mid x_0, s) b(x, t) \Big) + \frac{1}{2} \partial_{xx} \Big( p(x, t \mid x_0, s)  \sigma^2(x, t) \Big)
\end{equation*}

which is the Fokker-Planck equation, as wanted!

Assumptions

In the derivation above we assumed that the non-integral terms vanished when we performed integration by parts. This is really assuming one of the following:

  • Boundary conditions at $\infty$: $p \to 0$ as $\left| x \right| \to \infty$
  • Absorbing boundary conditions: where we assume that $p = 0$ for $x \in \partial D$ (the boundary of the domain $D$)
  • Reflecting boundary conditions: here the probability flux $J = b p - \frac{1}{2} \partial_x \big( \sigma^2 p \big)$ has vanishing normal component at the boundary. See multidimensional Euler-Lagrange and the surrounding subjects.

Backward equation

And the Backward Kolmogorov equation:

\begin{equation*}
\begin{split}
 \partial_s u(x, s, t) + b(x, s) \partial_x u(x, s, t) + \frac{1}{2} \sigma^2(x, s) \partial_{xx} u(x, s, t) &= 0 \\
 u(x, t, t) &= f(x)
\end{split}
\end{equation*}

where

\begin{equation*}
u(x, s, t) = \big( P_t f \big)(x) = \mathbb{E} \Big[ f \big( X(t) \big) \mid X(s) = x \Big], \quad t \ge s
\end{equation*}

Here we've used $P_t f$ defined as

\begin{equation*}
\big( P_t f \big)(x) = \mathbb{E} \Big[ f \big( X(t) \big) \mid X(s) = x \Big]
\end{equation*}

rather than

\begin{equation*}
\big( P_t f \big)(x) = \mathbb{E} \Big[ f \big( X(t) \big) \mid X(0) = x \Big]
\end{equation*}

as defined before.

I do this to stay somewhat consistent with notes in the course (though there the operator $P_t$ is not mentioned explicitly), but I think using $s = 0$ would simplify things without loss of generality (could just define the equations using $t' = t - s$ and $s' = 0$, and substitute back when finished).

Furthermore, if $b(X, t) = b(X)$ and $\sigma^2(X, t) = \sigma^2(X)$, i.e. time-independent (autonomous system), then

\begin{equation*}
  u(x, s, t) = u(x, 0, t - s) =: u(x, t- s)
\end{equation*}

in which case the backward Kolmogorov equation becomes

\begin{equation*}
\begin{split}
  \partial_t u(x, t) &= b(x) \partial_x u(x, t) + \frac{1}{2} \sigma^2(x) \partial_{xx} u(x, t) \\
  u(x, 0) &= f(x)
\end{split}
\end{equation*}

where we've used the fact that in this case $\partial_s u = - \partial_t u$.

Notation of generator of the Markov process and its adjoint

In the notation of the generator of the Markov process we have

\begin{equation*}
\begin{split}
  \mathcal{L} u &= b(x, t) \pdv{u}{x} + \frac{1}{2} \sigma^2(x, t) \pdv[2]{u}{x} \\   
  \mathcal{L}^* \rho &= - \pdv{}{x} \Big( b(x, t) \rho \Big) + \frac{1}{2} \pdv[2]{}{x} \Big( \sigma^2(x, t) \rho \Big)
\end{split}
\end{equation*}

TODO Introduction to Stochastic Differential Equations

Notation

  • $X_t^x$ indicates the dependence on the initial condition $X(0) = x$.
  • Stopping time

    \begin{equation*}
\tau_D = \inf \left\{ t > 0 : X_t \notin D \right\}
\end{equation*}

Motivation

  • We will consider stochastic differential equations of the form

    \begin{equation*}
d X(t) = \mathbf{b} \Big( t, X(t) \Big) \ dt + \sigma \Big( t, X(t) \Big) \ dW(t), \quad X(0) = x
\end{equation*}

    or, equivalently, componentwise,

    \begin{equation*}
d X_i(t) = b_i \Big( t, X(t) \Big) \ dt + \sum_{j=1}^{m} \sigma_{ij} \Big( t, X(t) \Big) \ dW_j(t), \quad i = 1, \dots, d
\end{equation*}

    which is really just notation for

    \begin{equation*}
X(t) = X(0) + \int_{0}^{t} \mathbf{b} \Big( s, X(s) \Big) \ ds + \int_{0}^{t} \sigma \Big( s, X(s) \Big) \ d W(s)
\end{equation*}
  • Need to define stochastic integral

    \begin{equation*}
I(t) := \int_{0}^{t} h(s) \ d W(s)
\end{equation*}

    for sufficiently large class of functions.

    • Since Brownian motion is not of bounded variation → Riemann-Stieltjes integral cannot be defined in a unique way

Itô and Stratonovich Stochastic Integrals

Let

\begin{equation*}
I(t) = \int_{0}^{t} f(s) \ d W(s)
\end{equation*}

where $W(t)$ is a Brownian motion and $t \in [0, T]$, such that

\begin{equation*}
\mathbb{E} \bigg[ \int_{0}^{T} f(s)^2 \ ds \bigg] < \infty
\end{equation*}

The integrand is a stochastic process whose randomness depends on $W(t)$; in particular, it is adapted to the filtration $\mathscr{F}_t$ generated by the Brownian motion $W(t)$, i.e. $f(t)$ is an $\mathscr{F}_t \text{-measurable}$ function for all $t \in [0, T]$.

Basically means that the integrand depends only on the past history of the Brownian motion wrt. which we are integrating.

We then define the stochastic integral $I(t)$ as the $L^2(\Omega)$ ($\Omega$ is the underlying probability space) limit of the Riemann sum approximation

\begin{equation*}
I(t) := \lim_{n \to \infty} \sum_{i=0}^{n - 1} f(\tau_i) \big( W(t_{i + 1}) - W(t_i) \big)
\end{equation*}

where

  • $t_i = i \Delta t$ where $\Delta t$ is s.t. $n \Delta t = t$
  • $\lambda \in [0, 1]$
  • evaluation points

    \begin{equation*}
\tau_i := \big( 1 - \lambda \big) t_i + \lambda t_{i + 1}, \quad i = 0, \dots, n - 1
\end{equation*}

What we are really saying here is that $I(t)$ (which itself is a random variable, i.e. a measurable function on the probability space $(\Omega, \mathcal{A}, P)$) is the limit of the Riemann sum in a mean-square sense, i.e. converges in $L^2$!

\begin{equation*}
\begin{split}
  & \lim_{n \to \infty} \mathbb{E} \Bigg[ \left| I(t) - \sum_{i=0}^{n - 1} f(\tau_i) \big( W(t_{i + 1}) - W(t_i) \big) \right|^2 \Bigg] \\
  & \quad = \lim_{n \to \infty} \int_{\Omega} \left| I(t) - \sum_{i=0}^{n - 1} f(\tau_i) \big( W(t_{i + 1}) - W(t_i) \big) \right|^2 \dd{P} \\
  & \quad = 0 
\end{split}
\end{equation*}

I found this Stackexchange answer to be very informative regarding the use of Riemann sums to define the stochastic integral.

Let $I(t)$ be a stochastic integral.

The Itô stochastic integral is when $\lambda = 0$, i.e.

\begin{equation*}
I_I(t) := \lim_{n \to \infty} \sum_{i = 0}^{n - 1} f(t_i) \big( W(t_{i + 1}) - W(t_i) \big)
\end{equation*}

Let $I(t)$ be a stochastic integral.

The Stratonovich stochastic integral is when $\lambda = \frac{1}{2}$, i.e.

\begin{equation*}
I_S(t) := \lim_{n \to \infty} \sum_{i = 0}^{n - 1} f \bigg( \frac{1}{2}(t_i + t_{i + 1}) \bigg) \big( W(t_{i + 1}) - W(t_i) \big)
\end{equation*}

We will often use the notation

\begin{equation*}
I_S(t) = \int_{0}^{t} f(s) \circ d W(s)
\end{equation*}

to denote the Stratonovich stochastic integral.

Observe that $I_S$ does not satisfy the martingale property: since $f$ is evaluated at the midpoint of $[t_k, t_{k + 1}]$, the increment $W(t_{k + 1}) - W(t_k)$ is correlated with the integrand!

The Itô integral instead evaluates $f$ at the left endpoint of each subinterval, so the increment $W(t_{k + 1}) - W(t_k)$ is independent of $f(t_k)$; this is what makes the Itô integral a martingale and much easier to work with.
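The difference is easy to see numerically for $f(s) = W(s)$ (a sketch; values are illustrative): the left-endpoint (Itô) sums converge to $\frac{1}{2}\big( W(T)^2 - T \big)$, while the midpoint (Stratonovich) sums converge to $\frac{1}{2} W(T)^2$, so the two definitions differ by $T / 2$ in the limit.

#+begin_src python
import numpy as np

rng = np.random.default_rng(11)
T, n = 1.0, 20_000
half = T / (2 * n)
# Brownian motion on a grid containing both the partition points and the midpoints.
W = np.concatenate([[0.0], np.cumsum(np.sqrt(half) * rng.standard_normal(2 * n))])

left = W[0:2 * n:2]              # W(t_i)
mid = W[1:2 * n:2]               # W((t_i + t_{i+1}) / 2)
incr = W[2::2] - W[0:2 * n:2]    # W(t_{i+1}) - W(t_i)

print("Ito sum           :", np.sum(left * incr))
print("(W_T^2 - T) / 2   :", 0.5 * (W[-1]**2 - T))
print("Stratonovich sum  :", np.sum(mid * incr))
print("W_T^2 / 2         :", 0.5 * W[-1]**2)
#+end_src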

Suppose that there exist $C, \delta > 0$ s.t.

\begin{equation*}
\mathbb{E} \Big[ \big( f(t) - f(s) \big)^2 \Big] \le C \left| t - s \right|^{1 + \delta}, \quad 0 \le s, t \le T
\end{equation*}

Then the Riemann sum approximation for the stochastic integral converges in $L^1(\Omega)$ to the same value for all $\lambda \in [0, 1]$.

From the definition of a stochastic integral, we can make sense of a "noise differential equation", or a stochastic differential equation

\begin{equation*}
\dot{X} = b(X, t) + \sigma(X, t) \xi(t), \quad X(0) = X_0
\end{equation*}

with $\xi(t)$ being white noise.

The solution $X(t)$ then satisfies the integral equation

\begin{equation*}
X(t) = X_0 + \int_{0}^{t} b \big( X(s), s \big) \ ds + \int_{0}^{t} \sigma \big( X(s), s \big) \ dW(s)
\end{equation*}

which motivates the notation

\begin{equation*}
\dot{X} = b(X, t) + \sigma(X, t) \xi(t) \quad \equiv \quad d X = b(X, t) \ dt + \sigma(X, t) \ dW
\end{equation*}

since, in a way,

\begin{equation*}
\xi(t) = \dot{W} = \frac{dW}{dt}
\end{equation*}

(sometimes you will actually see the Brownian motion (this definition for example) defined in this manner, e.g. lototsky2017stochastic)

Properties of Itô stochastic integral

Itô isometry

Let $I(t)$ be a Itô stochastic integral, then

\begin{equation*}
\mathbb{E} \bigg[ \bigg( \int_{0}^{T} f(t) \ d W(t) \bigg)^2 \bigg] = \int_{0}^{T} \mathbb{E} \big[ \left| f(t) \right|^2 \big] \ dt
\end{equation*}

From which it follows that for any square-integrable functions $g, h$

\begin{equation*}
\mathbb{E} \bigg[ \bigg( \int_{0}^{T} h(t) \ d W(t) \int_{0}^{T} g(s) \ d W(s) \bigg) \bigg] = \mathbb{E} \bigg[ \int_{0}^{T} h(t) g(t) \ dt \bigg]
\end{equation*}
\begin{equation*}
\begin{split}
  \mathbb{E} \bigg[ \bigg( \int_{0}^{T} f(t) + h(t) \ d W(t) \bigg)^2 \bigg]
  &= \mathbb{E} \bigg[ \bigg( \int_{0}^{T} f(t) \ d W(t) \bigg)^2 + \bigg( \int_{0}^{T} h(t) \ d W(t) \bigg)^2 \\
  & \qquad \quad + 2 \bigg( \int_{0}^{T} f(t) \ d W(t) \bigg) \bigg( \int_{0}^{T} h(t) \ d W(t) \bigg) \bigg] \\
  &= \int_{0}^{T} \mathbb{E} \big[ \left| f(t) \right|^2 \big] \ dt + \int_{0}^{T} \mathbb{E} \big[ \left| h(t) \right|^2 \big] \ dt \\
  & \qquad \quad + 2 \ \mathbb{E} \bigg[ \bigg( \int_{0}^{T} f(t) \ d W(t) \bigg) \bigg( \int_{0}^{T} h(t) \ d W(t) \bigg) \bigg]
\end{split}
\end{equation*}

but we also know that

\begin{equation*}
\begin{split}
  \mathbb{E} \bigg[ \bigg( \int_{0}^{T} f(t) + h(t) \ d W(t) \bigg)^2 \bigg] &= \int_{0}^{T} \mathbb{E} \big[ \left| f(t) + h(t) \right|^2 \big] \ dt \\
  &= \int_{0}^{T} \mathbb{E} \big[ \left| f(t) \right|^2 \big] \ dt + \int_{0}^{T} \mathbb{E} \big[ \left| h(t) \right|^2 \big] \ dt  \\
  & \qquad + 2 \int_{0}^{T} \mathbb{E} \big[ f(t) h(t) \big] \ dt
\end{split}
\end{equation*}

Comparing with the equation above, we see that

\begin{equation*}
\mathbb{E} \bigg[ \bigg( \int_{0}^{T} f(t) \ d W(t) \bigg) \bigg( \int_{0}^{T} h(t) \ d W(t) \bigg) \bigg] = \mathbb{E} \bigg[ \int_{0}^{T} f(t) h(t) \ dt  \bigg]
\end{equation*}

WHAT HAPPENED TO THE ABSOLUTE VALUE MATE?! Well, for real-valued integrands $\left| f(t) \right|^2 = f(t)^2$ and $\left| f(t) + h(t) \right|^2 = \big( f(t) + h(t) \big)^2$, so expanding the square of the sum gives exactly the cross term $2 f(t) h(t)$ used above.
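A quick Monte Carlo check of the isometry (a sketch; values are illustrative) with the deterministic integrand $f(t) = t$: approximating $\int_0^T t \ d W(t)$ by left-endpoint sums, the isometry predicts $\mathbb{E}\big[ \big( \int_0^T t \ d W(t) \big)^2 \big] = \int_0^T t^2 \ dt = T^3 / 3$.

#+begin_src python
import numpy as np

rng = np.random.default_rng(12)
T, n, n_paths = 1.0, 500, 20_000
dt = T / n
t_left = np.arange(n) * dt                    # left endpoints t_i

dW = np.sqrt(dt) * rng.standard_normal((n_paths, n))
I = dW @ t_left                               # sum_i t_i * (W(t_{i+1}) - W(t_i)) per path

print("E[I^2] (Monte Carlo):", np.mean(I**2))
print("T^3 / 3 (isometry)  :", T**3 / 3)
#+end_src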

Martingale

For Itô stochastic integral we have

\begin{equation*}
\mathbb{E} \bigg[ \int_{0}^{t} f(s) \ d W(s) \bigg] = 0
\end{equation*}

and

\begin{equation*}
\mathbb{E} \bigg[ \int_{0}^{t} f(\ell) \ d W(\ell) \ \bigg| \ \mathscr{F}_s \bigg] = \int_{0}^{s} f(\ell) \ d W(\ell), \quad \forall t \ge s
\end{equation*}

where $\mathscr{F}_s$ denotes the filtration generated by $W(s)$, hence the Itô integral is a martingale.

The quadratic variation of this martingale is

\begin{equation*}
\left\langle I \right\rangle_t = \int_{0}^{t} \big( f(s) \big)^2 \ ds
\end{equation*}
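As a quick example, taking $f(s) = W(s)$ gives $\left\langle I \right\rangle_t = \int_0^t W(s)^2 \ ds$, and taking expectations recovers the Itô isometry, since $\mathbb{E} \big[ I(t)^2 \big] = \mathbb{E} \big[ \left\langle I \right\rangle_t \big]$ for a square-integrable martingale started at zero.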

Solutions of SDEs

  • Consider SDEs of the form

    \begin{equation*}
dX_t = \mathbf{b}(t, X_t) \ dt + \boldsymbol{\sigma}(t, X_t) \ dW_t, \quad X(0) = x
\end{equation*}

    where

    • $\mathbf{b}: [0, T] \times \mathbb{R}^d \to \mathbb{R}^d$
    • $\boldsymbol{\sigma}: [0, T] \times \mathbb{R}^{d} \to \mathbb{R}^{d \times m}$

A process $X_t$ with continuous paths defined on the probability space $\big( \Omega, \mathscr{F}, P \big)$ is called a strong solution to the SDE if:

  1. $X_t$ is a.s. continuous and adapted to the filtration $\mathscr{F}_t$
  2. $\mathbf{b}(\cdot, X) \in L^1 \Big( (0, T); \mathbb{R}^d \Big)$ and $\boldsymbol{\sigma}(\cdot, X) \in L^2 \Big( (0, T); \mathbb{R}^{d \times m} \Big)$ a.s.
  3. For every $t \ge 0$, $X_t$ satisfies the stochastic integral equation

    \begin{equation*}
X_t \overset{a.s.}{=} x + \int_{0}^{t} \mathbf{b}(s, X_s) \ ds + \int_{0}^{t} \boldsymbol{\sigma}(s, X_s) \ d W_s, \quad X(0) = x
\end{equation*}

Let

  • $\mathbf{b}: [0, T] \times \mathbb{R}^d \to \mathbb{R}^d$
  • $\boldsymbol{\sigma}: [0, T] \times \mathbb{R}^{d} \to \mathbb{R}^{d \times m}$

satisfy the following conditions

  1. There exists a positive constant $C$ s.t. for all $x \in \mathbb{R}^d$ and $t \in [0, T]$

    \begin{equation*}
\norm{\mathbf{b}(t, x)} + \norm{\boldsymbol{\sigma}(t, x)}_F \le C \big( 1 + \norm{x} \big)
\end{equation*}
  2. For all $x, y \in \mathbb{R}^d$ and $t \in [0, T]$,

    \begin{equation*}
\norm{\mathbf{b}(t, x) - \mathbf{b}(t, y)} + \norm{\boldsymbol{\sigma}(t, x) - \boldsymbol{\sigma}(t, y)}_F \le C \norm{x - y}
\end{equation*}

Furthermore, suppose that the initial condition $x$ is a random variable independent of the Brownian motion $W_t$ with

\begin{equation*}
\mathbb{E} \big[ \norm{x}^2 \big] < \infty
\end{equation*}

Then the SDE

\begin{equation*}
dX_t = \mathbf{b}(t, X_t) \ dt + \boldsymbol{\sigma}(t, X_t) \ dW_t, \quad X(0) = x
\end{equation*}

has a unique strong solution $X_t$ with

\begin{equation*}
\mathbb{E} \bigg[ \int_{0}^{t} \norm{X_s}^2 \ ds \bigg] < \infty, \quad \forall t \in [0, T]
\end{equation*}

where by unique we mean that any two solutions $X_t$ and $Y_t$ satisfy

\begin{equation*}
X_t \overset{a.s.}{=} Y_t, \quad \forall t \in [0, T]
\end{equation*}
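As a quick sanity check (not from the text), the scalar Ornstein-Uhlenbeck coefficients treated below, $\mathbf{b}(t, x) = - \gamma x$ and $\boldsymbol{\sigma}(t, x) = \sigma$, satisfy both conditions with $C = \max(\gamma, \sigma)$:

\begin{equation*}
\left| - \gamma x \right| + \left| \sigma \right| \le \max(\gamma, \sigma) \big( 1 + \left| x \right| \big), \qquad \left| - \gamma x - (- \gamma y) \right| + 0 = \gamma \left| x - y \right| \le C \left| x - y \right|
\end{equation*}

so the theorem guarantees a unique strong solution; the explicit solution is derived in the Ornstein-Uhlenbeck section below.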

Itô's Formula

Notation

Stuff

Assume that the conditions used in thm:unique-strong-solution hold.

Let $X_t$ be the solution of

\begin{equation*}
d X_t = \mathbf{b}(X_t) \ dt + \boldsymbol{\sigma}(X_t) \ d W_t
\end{equation*}

and let $V \in C^{1, 2} \big( [0, T] \times \mathbb{R}^d \big)$. Then the process $V(t, X_t)$ satisfies

\begin{equation*}
\begin{split}
  V(t, X_t) =& V(0, X_0) + \int_{0}^{t} \frac{\partial V}{\partial s} (s, X_s) \ ds + \int_{0}^{t} \mathscr{L} V(s, X_s) \ ds \\
  & + \int_{0}^{t} \sum_{i=1}^{d} \sum_{j=1}^{m} \frac{\partial V(s, X_s)}{\partial x_i} \sigma_{ij}(X_s) \ d W_j(s)
\end{split}
\end{equation*}

where the generator $\mathscr{L}$ is defined by

\begin{equation*}
\mathscr{L} = \sum_{j=1}^{d} b_j(x) \frac{\partial }{\partial x_j} + \frac{1}{2} \sum_{i, j = 1}^{d} \sum_{k = 1}^{m} \sigma_{ik} \sigma_{jk} \frac{\partial^2 }{\partial x_i \partial x_j}
\end{equation*}

The second-order part of $\mathscr{L}$ comes from the rule for the (independent) components of the Brownian motion,

\begin{equation*}
d W_i(t) \ d W_j(t) = \delta_{ij} \ dt
\end{equation*}

which turns the products of Brownian increments appearing in the expansion of $V(t, X_t)$ into the drift-like term $\frac{1}{2} \sum_{i, j, k} \sigma_{ik} \sigma_{jk} \frac{\partial^2 V}{\partial x_i \partial x_j} \ dt$.

Finally, this can then be written in "differential form":

\begin{equation*}
dV(t, X_t) = \frac{\partial V}{\partial t} \dd{t} + \sum_{i=1}^{d} \frac{\partial V}{\partial x_i} \ d X_i + \frac{1}{2} \sum_{i, j = 1}^{d} \frac{\partial^2 V}{\partial x_i \partial x_j} \ d X_i \ d X_j
\end{equation*}

where expanding $d X_i = b_i \ dt + \sum_k \sigma_{ik} \ d W_k$ and using $d W_i \ d W_j = \delta_{ij} \ dt$, $dt \ d W_i = 0$ and $(dt)^2 = 0$ recovers the integral form above.

In pavliotis2014stochastic it is stated that the proof is very similar to the proof of the validity of the backward Kolmogorov equation for diffusion processes.
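As a sanity check of the formula, take $d = m = 1$, $X_t = W_t$ (i.e. $b = 0$, $\sigma = 1$) and $V(t, x) = x^2$, so that $\mathscr{L} V = \frac{1}{2} \cdot 2 = 1$ and $\partial V / \partial x = 2 x$. Itô's formula then gives

\begin{equation*}
W_t^2 = \int_{0}^{t} 1 \ ds + \int_{0}^{t} 2 W_s \ d W_s \quad \iff \quad \int_{0}^{t} W_s \ d W_s = \frac{1}{2} \big( W_t^2 - t \big)
\end{equation*}

which exhibits the correction term that distinguishes the Itô integral from the ordinary chain rule.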

Feynman-Kac formula

Itô's formula can be used to obtain a probabilistic description of solutions of more general PDEs of parabolic type.

Let $X_t^x$ be a diffusion process with

  • drift $b$
  • diffusion $\Sigma = \boldsymbol{\sigma} \boldsymbol{\sigma}^T$
  • generator $\mathscr{L}$ with $X_0^x = x$

and let

  • $f \in C_0^2(\mathbb{R}^d)$
  • $V \in C(\mathbb{R}^d)$, bounded from below

Then the function

\begin{equation*}
u(x, t) = \mathbb{E} \bigg[ \exp \bigg( - \int_{0}^{t} V(X_s^x) \ ds \bigg) f(X_t^x) \bigg]
\end{equation*}

is the solution to the IVP

\begin{equation*}
\begin{split}
  \frac{\partial u}{\partial t} &= \mathscr{L} u - V u \\
  u(x, 0) &= f(x)
\end{split}
\end{equation*}

The representation $u(x, t)$ is then called the Feynman-Kac formula of the solution to the IVP.

This is useful for the theoretical analysis of IVPs for parabolic PDEs of the form above, and for their numerical solution using Monte Carlo approaches.
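Below is a minimal Monte Carlo sketch of this representation, assuming for illustration the generator $\mathscr{L} = \frac{1}{2} \frac{d^2}{dx^2}$ (so that $X_t^x = x + W_t$); the particular potential $V$ and initial condition $f$ in the example call are made up for the demonstration, not taken from the text.

```python
import numpy as np

def feynman_kac_mc(x, t, V, f, n_steps=500, n_paths=50000, rng=None):
    """Monte Carlo estimate of u(x, t) = E[exp(-int_0^t V(X_s) ds) f(X_t)]
    for the diffusion with generator L = (1/2) d^2/dx^2, i.e. X_s = x + W_s."""
    rng = np.random.default_rng() if rng is None else rng
    dt = t / n_steps
    X = np.full(n_paths, float(x))
    int_V = np.zeros(n_paths)                 # left-point rule for int_0^t V(X_s) ds
    for _ in range(n_steps):
        int_V += V(X) * dt
        X += rng.normal(0.0, np.sqrt(dt), size=n_paths)
    return np.mean(np.exp(-int_V) * f(X))

# Illustrative potential and initial data:
u = feynman_kac_mc(x=0.0, t=0.5, V=lambda x: x**2, f=lambda x: np.exp(-x**2))
```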

Examples of solvable SDEs

Ornstein-Uhlenbeck process

Properties
  • Mean-reverting
  • Additive noise
Stuff
\begin{equation*}
d X_t = - \gamma X_t \ dt + \sigma \ d W_t, \quad \gamma, \sigma > 0
\end{equation*}

We observe that using Itô's formula

\begin{equation*}
\begin{split}
  d \big( e^{\gamma t} X_t \big) &= \bigg( \frac{\partial }{\partial t} e^{\gamma t} X_t \bigg) \ dt + \bigg( \frac{\partial }{\partial X_t} e^{\gamma t} X_t \bigg) \ d X_t + \frac{1}{2} \bigg( \frac{\partial^2 }{\partial X_t^2} e^{\gamma t} X_t \bigg) \big( d X_t \big)^2 \\
  &= \gamma e^{\gamma t} X_t \ dt + e^{\gamma t} \ d X_t + 0 \\
  &= e^{\gamma t} \big( \gamma X_t \ dt + d X_t  \big)
\end{split}
\end{equation*}

which from the Ornstein-Uhlenbeck equation we see that

\begin{equation*}
d \big( e^{\gamma t} X_t \big) = e^{\gamma t} \big( \sigma \ dW_t \big) = \sigma e^{\gamma t} \ d W_t
\end{equation*}

i.e.

\begin{equation*}
\int_{0}^{t} d \big( e^{\gamma s} X_s \big) = \int_{0}^{t} \sigma e^{\gamma s} \ d W_s
\end{equation*}

which gives us

\begin{equation*}
e^{\gamma t} X(t) - X(0) = \sigma \int_{0}^{t} e^{\gamma s} \ d W(s)
\end{equation*}

and thus

\begin{equation*}
X(t) = e^{- \gamma t} X(0) + \sigma \int_{0}^{t} e^{\gamma (s - t)} \ dW(s)
\end{equation*}

with $X(0) = X_0$ assumed to be non-random. This is the solution to the Ornstein-Uhlenbeck process.

Further, we observe the following:

\begin{equation*}
\mathbb{E} \big[ X(t) \big] = X_0 e^{- \gamma t}
\end{equation*}

since the Itô integral has zero mean. For the covariance, take $s \le t$ and write

\begin{equation*}
\begin{split}
  & \mathbb{E} \big[ \big( X(t) - X_0 e^{- \gamma t} \big) \big( X(s) - X_0 e^{- \gamma s} \big) \big] \\
  =\ & \mathbb{E} \bigg[ \bigg( \sigma \int_{0}^{t} e^{\gamma (t' - t)} \ dW(t') \bigg) \bigg( \sigma \int_{0}^{s} e^{\gamma (s' - s)} \ dW(s') \bigg) \bigg] \\
  =\ & \sigma^2 \ \mathbb{E} \bigg[ \bigg( \int_{0}^{s} e^{\gamma (t' - t)} \ dW(t') + \int_{s}^{t} e^{\gamma (t' - t)} \ dW(t') \bigg) \bigg( \int_{0}^{s} e^{\gamma (s' - s)} \ dW(s') \bigg) \bigg]
\end{split}
\end{equation*}

The increments of $W$ on $[s, t]$ are independent of those on $[0, s]$, and both Itô integrals have zero mean, so the cross term vanishes:

\begin{equation*}
\mathbb{E} \bigg[ \bigg( \int_{s}^{t} e^{\gamma (t' - t)} \ dW(t') \bigg) \bigg( \int_{0}^{s} e^{\gamma (s' - s)} \ dW(s') \bigg) \bigg] = 0
\end{equation*}

For the remaining term the Itô isometry (in its polarized form) gives

\begin{equation*}
\sigma^2 \int_{0}^{s} e^{\gamma (s' - t)} e^{\gamma (s' - s)} \ ds' = \sigma^2 \int_{0}^{s} e^{\gamma (2 s' - t - s)} \ ds' = \frac{\sigma^2}{2 \gamma} \Big( e^{- \gamma (t - s)} - e^{- \gamma (t + s)} \Big)
\end{equation*}

using the substitution $\tau = \gamma (2 s' - t - s)$, so that $d \tau = 2 \gamma \ ds'$. Hence

\begin{equation*}
\operatorname{Cov} \big( X(t), X(s) \big) = \frac{\sigma^2}{2 \gamma} \Big( e^{- \gamma \left| t - s \right|} - e^{- \gamma (t + s)} \Big)
\end{equation*}
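The explicit solution also gives an exact way to sample the process on a time grid: over a step $\Delta t$,

\begin{equation*}
X(t + \Delta t) = e^{- \gamma \Delta t} X(t) + \eta, \quad \eta \sim \mathcal{N} \bigg( 0, \ \frac{\sigma^2}{2 \gamma} \big( 1 - e^{- 2 \gamma \Delta t} \big) \bigg)
\end{equation*}

which follows from the Itô isometry applied to the stochastic integral over $[t, t + \Delta t]$. A small Python sketch (the parameter values in the example call are illustrative):

```python
import numpy as np

def simulate_ou(x0, gamma, sigma, T=10.0, n=1000, rng=None):
    """Exact sampling of the OU process on a uniform grid, using the explicit solution."""
    rng = np.random.default_rng() if rng is None else rng
    dt = T / n
    decay = np.exp(-gamma * dt)
    noise_std = sigma * np.sqrt((1.0 - np.exp(-2.0 * gamma * dt)) / (2.0 * gamma))
    x = np.empty(n + 1)
    x[0] = x0
    for k in range(n):
        x[k + 1] = decay * x[k] + noise_std * rng.normal()
    return np.linspace(0.0, T, n + 1), x

t, x = simulate_ou(x0=2.0, gamma=1.0, sigma=0.5)
# E[X(t)] decays like 2 exp(-t); Var[X(t)] approaches sigma^2 / (2 gamma) = 0.125
```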

Langevin equation

Notation
  • $X$ position of particle
  • $V$ velocity of particle
  • $\zeta$ white noise forcing
Definition
\begin{equation*}
\ddot{X} = - \gamma \dot{X} + \sigma \zeta
\end{equation*}
Solution

Observe that the Langevin equation looks very similar to the Ornstein-Uhlenbeck process, but with $\dot{X}$ instead of $X$. We can write this as a system of two SDEs

\begin{equation*}
\begin{cases}
  dX &= V \ dt \\
  d V &= - \gamma V \ dt + \sigma \ dW
\end{cases}
\end{equation*}

The expression for $dV$ is simply an OU process, and since it does not depend on $X$, we can integrate it as we did for the OU process, giving us

\begin{equation*}
V(t) = V_0 e^{- \gamma t} + \sigma \int_{0}^{t} e^{- \gamma(t - s)} \ dW(s)
\end{equation*}

Substituting into our expression for $dX$:

\begin{equation*}
\begin{split}
  \int_{X_0}^{X_t} dX &= \int_{0}^{t} V(s) \ ds \\
  &= \int_{0}^{t} \bigg( V_0 e^{- \gamma s} + \sigma \int_{0}^{s} e^{- \gamma (s - \tau)} \ d W(\tau) \bigg) \ ds \\
  &= \frac{V_0}{\gamma} \big( 1 - e^{- \gamma t} \big) + \sigma \int_{0}^{t} \int_{0}^{s} e^{- \gamma (s - \tau)} \ d W(\tau) \ ds
\end{split}
\end{equation*}

We notice that the double integral here is a "triangular" integral: $\tau$ runs from $0$ to $s$, and $s$ runs from $0$ to $t$. We can therefore interchange the order of integration, letting $s$ run from $\tau$ to $t$ and then $\tau$ from $0$ to $t$. In doing so we get:

\begin{equation*}
\begin{split}
  \int_{0}^{t} \int_{0}^{s} e^{- \gamma (s - \tau)} \dd{W(\tau)} \dd{s} &= \int_{0}^{t} \int_{\tau}^{t} e^{- \gamma(s - \tau)} \dd{s} \dd{W(\tau)} \\
  &= \int_{0}^{t} - \frac{1}{\gamma} \big( e^{- \gamma (t - \tau)} - 1 \big) \dd{W(\tau)} \\
  &= \frac{1}{\gamma} \bigg[ W(t) - \int_{0}^{t} e^{- \gamma (t - \tau)} \dd{W(\tau)} \bigg]
\end{split}
\end{equation*}

Hence the solution is given by

\begin{equation*}
X(t) = X(0) + V(0) \frac{1 - e^{- \gamma t}}{\gamma} - \frac{\sigma}{\gamma} \int_{0}^{t} e^{-\gamma(t - s)} \dd{W(s)} + \frac{\sigma}{\gamma} W(t)
\end{equation*}

Geometric Brownian motion

The Geometric Brownian motion equation is given by

\begin{equation*}
\dd{Y} = \lambda Y \dd{W}, \quad Y(0) = 1
\end{equation*}
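The SDE can be solved with the same logarithm trick used for the random oscillator below: applying Itô's formula to $\log Y$,

\begin{equation*}
\dd{\log Y} = \frac{1}{Y} \dd{Y} - \frac{1}{2} \frac{1}{Y^2} \big( \dd{Y} \big)^2 = \lambda \dd{W} - \frac{1}{2} \lambda^2 \dd{t}
\end{equation*}

so that, using $Y(0) = 1$,

\begin{equation*}
Y(t) = \exp \bigg( \lambda W(t) - \frac{1}{2} \lambda^2 t \bigg)
\end{equation*}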

Brownian bridge

Consider the process $B(t)$ which satisfies the SDE

\begin{equation*}
\dd{B} = \frac{-B}{1 - t} \dd{t} + \dd{W}, \quad B(0) = 0
\end{equation*}

for $0 \le t < 1$.

This is the defining SDE for a Brownian bridge.

Solution

Let

\begin{equation*}
Y = \frac{B}{1 - t}
\end{equation*}

Itô's formula gives

\begin{equation*}
\dd{Y} = \frac{B}{(1 - t)^2} \dd{t} + \frac{1}{1 - t} \dd{B} = \frac{1}{1 - t} \dd{W}
\end{equation*}

which satisfies

\begin{equation*}
Y(t) = Y(0) + \int_{0}^{t} \frac{1}{1 - s} \dd{W(s)}
\end{equation*}

Hence,

\begin{equation*}
B(t) = (1 - t) B(0) + ( 1 - t ) \int_{0}^{t} \frac{1}{1 - s} \dd{W(s)}
\end{equation*}

Using $B(0) = 0$ (and noting that $B(t) \to 0$ as $t \to 1$, consistent with the bridge boundary condition $B(1) = 0$), we have

\begin{equation*}
B(t) = (1 - t) \int_{0}^{t} \frac{1}{1 - s} \dd{W(s)}
\end{equation*}
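From this representation the covariance follows from the (polarized) Itô isometry: for $s \le t < 1$,

\begin{equation*}
\mathbb{E} \big[ B(t) B(s) \big] = (1 - t)(1 - s) \int_{0}^{s} \frac{1}{(1 - u)^2} \dd{u} = (1 - t)(1 - s) \frac{s}{1 - s} = s (1 - t) = \min(t, s) - ts
\end{equation*}

which is the familiar Brownian bridge covariance.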

A random oscillator

The harmonic oscillator ODE $\ddot{x} + \omega^2 x = 0$ can be written as a system of ODEs:

\begin{equation*}
\dot{x} = \omega y, \quad \dot{y} = - \omega x
\end{equation*}

A stochastic version might be

\begin{equation*}
\dd{X} = \alpha Y \dd{W}, \quad \dd{Y} = - \alpha X \dd{W}
\end{equation*}

where $\alpha$ is constant and the deterministic frequency has been replaced by white noise, $\omega \to \alpha \xi(t)$.

Solution

We can solve this by letting

\begin{equation*}
Z = X + i Y
\end{equation*}

so

\begin{equation*}
\dd{Z} = - i \alpha Z \dd{W}
\end{equation*}

Then,

\begin{equation*}
\dd{\log Z} = \frac{1}{Z} \dd{Z} - \frac{1}{2} \frac{1}{Z^2} \big( \dd{Z} \big)^2 = - i \alpha \dd{W} + \frac{1}{2} \alpha^2 \dd{t}
\end{equation*}

Hence,

\begin{equation*}
\begin{split}
  Z(t) &= Z(0) \exp \bigg( \frac{1}{2} \int_0^t \alpha^2 \dd{s} - i \alpha \int_0^t \dd{W} \bigg) \\
  &= Z(0) \exp \bigg( \frac{1}{2} \alpha^2 t - i \alpha W(t) \bigg)
\end{split}
\end{equation*}

and thus

\begin{equation*}
X(t) = \text{Re} \big( Z(t) \big), \quad Y(t) = \text{Im} \big( Z(t) \big)
\end{equation*}

Stochastic Partial Differential Equations (rigorous)

Overview

This subsection consists of notes taken mostly from lototsky2017stochastic. This is quite a rigorous book which I find can be a useful supplement to the more "applied" descriptions of SDEs you'll find in most places.

I have found some of these more "formal" definitions to provide insight:

  • Brownian motion expressed as a sum over basis elements, each term multiplied by white noise
  • Definition of a "Gaussian process", as it is usually called, instead as a "Gaussian field", such that each finite collection of random variables forms a Gaussian vector.

Notation

  • $x = (x_1, \dots, x_d) \in \mathbb{R}^d$
  • $xy = x_1 y_1 + \dots + x_d y_d$
  • $\mathcal{C}(A, B)$ denotes the space of continuous mappings from metric space $A$ to metric space $B$
    • If $B = \mathbb{R}$ then we write $\mathcal{C}(A)$
  • $\mathcal{C}^n$ is the collection of functions with $n$ continuous derivatives
  • $\mathcal{C}^{n + \gamma}(A)$ for $\gamma \in (0, 1)$ and $n \in \mathbb{N}$ is the collection of functions with $n$ continuous derivatives s.t. derivatives of order $n$ are Hölder continuous of order $\gamma$.
  • $\mathcal{C}_0^{\infty}(A)$ is the collection of infinitely differentiable functions with compact support
  • $\mathcal{S}(\mathbb{R}^d)$ denotes the Schwartz space and $\mathcal{S}'(\mathbb{R}^d)$ denotes its dual space (i.e. the space of continuous linear functionals on $\mathcal{S}(\mathbb{R}^d)$)
  • $D^n f(x)$ denotes the collection of partial derivatives of $f$ of order $n$
  • Partial derivatives:

    \begin{equation*}
u_t = \frac{\partial u}{\partial t}, \quad u_{x_i, x_j} = \frac{\partial^2 u}{\partial x_i \partial x_j}  
\end{equation*}

    and

    \begin{equation*}
\dot{v} = \frac{dv}{dt}  
\end{equation*}
  • Laplace operator is denoted by $\boldsymbol{\Delta}$
  • $a_k \sim b_k$ means

    \begin{equation*}
\lim_{k \to \infty} \frac{a_k}{b_k} = c \in (0, \infty)  
\end{equation*}

    and if $c = 1$ we will write $a_k \simeq b_k$

  • $a_k \asymp b_k$ means

    \begin{equation*}
0 < c_1 \le \frac{a_k}{b_k} \le c_2 < \infty  
\end{equation*}

    for all sufficiently large $k$.

  • $\eta \sim \mathcal{N}(m, \sigma^2)$ means that $\eta$ is a Gaussian rv. with mean $m$ and variance $\sigma^2$
  • $du = \dots dw$ or $\dot{u} = \dots \dot{w}$ are equations driven by Wiener process
  • $\mathbb{F} = (\Omega, \mathcal{F}, \left\{ \mathcal{F}_t \right\}, \mathbb{P})$
    • $\Omega$ is the sample space (i.e. underlying space of the measure space)
    • $\mathcal{F} \subseteq 2^{\Omega}$ is the sigma-algebra ($2^{\Omega}$ denotes the power set of $\Omega$)
    • $\{ \mathcal{F}_t \}$ is an increasing, right-continuous family of sub-σ-algebras of $\mathcal{F}$ (often called a filtration)
    • $\mathcal{F}_t$ contains all $\mathbb{P}$-negligible sets, i.e. $\mathcal{F}_t$ contains every subset of $\Omega$ that is contained in an element of $\mathcal{F}$ with $\mathbb{P}$-measure zero (i.e. the filtration is complete)
  • $\xi_k$ denotes independent standard Gaussian random variables
  • $a \wedge b = \min \left\{ a, b \right\}$

Definitions

A filtered probability space is given by

\begin{equation*}
\Big( \Omega, \mathcal{F}, \big( \mathcal{F}_t \big)_{t \ge 0}, \mathbb{P} \Big)
\end{equation*}

where the sigma-algebra $\mathcal{F}_t$ represents the information available up until time $t$.

A random process $X$ is adapted to the filtration $\big( \mathcal{F}_t \big)_{t \ge 0}$ if $X_t$ is $\mathcal{F}_t \text{-measurable}$ for every $t \ge 0$.

This is equivalent to requiring that $\mathcal{F}_t^X \subseteq \mathcal{F}_t$ for all $t \ge 0$.

Martingale

A square-integrable martingale on $\mathbb{F}$ is a process $M = M(t)$ with values in $\mathbb{R}^d$ such that

\begin{equation*}
M(0) = 0, \quad \mathbb{E} \big[ \left| M(t) \right|^2 \big] < \infty
\end{equation*}

and

\begin{equation*}
\mathbb{E} \big[ M(t) \mid \mathcal{F}_s \big] = M(s), \quad \forall t \ge s \ge 0
\end{equation*}

The quadratic variation of a martingale $M$ is the continuous, non-decreasing, real-valued process $\left\langle M \right\rangle$ with $\left\langle M \right\rangle(0) = 0$ such that

\begin{equation*}
\left| M \right|^2 - \left\langle M \right\rangle
\end{equation*}

is a martingale.

A stopping (or Markov) time on $\mathbb{F}$ is a non-negative random variable $\tau$ such that

\begin{equation*}
\left\{ \omega: \tau(\omega) > t \right\} \in \mathcal{F}_t, \quad \forall t \ge 0
\end{equation*}

Introduction

If $\{ m_k(t), k \ge 1 \}$ is an orthonormal basis in $L_2 \big( (0, T) \big)$, then

\begin{equation*}
w(t) = \sum_{k \ge 1}^{} \Big( \int_{0}^{t} m_k(s) \ ds \Big) \xi_k
\end{equation*}

is a standard Brownian motion: a Gaussian process with zero mean and covariance given by

\begin{equation*}
\mathbb{E} [ w(t) w(s) ] = \min(t, s)
\end{equation*}

This definition of a standard Brownian motion does make a fair bit of sense.

It basically says that a Brownian motion can be written as a sum over the (integrated) basis elements of the space, with each term multiplied by an independent Gaussian $\xi_k$.

The derivative of Brownian motion (though it does not exist in the usual sense) is then defined as

\begin{equation*}
\dot{w}(t) = \sum_{k \ge 1}^{} m_k(t) \xi_k
\end{equation*}

While the series certainly diverges, it does define a random generalized function on $L_2 \Big( (0, T) \Big)$ according to the rule

\begin{equation*}
\dot{w}(f) = \sum_{k \ge 1}^{} f_k \xi_k, \quad f_k = \int_{0}^{T} f(t) m_k(t) \ dt
\end{equation*}
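A small numerical illustration of this construction (a sketch; the orthonormal cosine basis $m_k(t) = \sqrt{2 / T} \cos \big( (k - \tfrac{1}{2}) \pi t / T \big)$ on $(0, T)$ is chosen purely for concreteness, and the series is truncated at $N$ terms):

```python
import numpy as np

def brownian_motion_series(t, N=500, T=1.0, rng=None):
    """Approximate w(t) = sum_k (int_0^t m_k(s) ds) xi_k with the cosine basis
    m_k(s) = sqrt(2/T) cos((k - 1/2) pi s / T), truncated after N terms."""
    rng = np.random.default_rng() if rng is None else rng
    xi = rng.normal(size=N)                        # i.i.d. standard Gaussians xi_k
    k = np.arange(1, N + 1)
    # int_0^t m_k(s) ds = sqrt(2 T) sin((k - 1/2) pi t / T) / ((k - 1/2) pi)
    coeffs = np.sqrt(2.0 * T) * np.sin(np.outer(t, (k - 0.5) * np.pi / T)) / ((k - 0.5) * np.pi)
    return coeffs @ xi

t = np.linspace(0.0, 1.0, 201)
w = brownian_motion_series(t)    # one approximate Brownian path on [0, 1]
```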

Consider

  • a collection $\{ w_k = w_k(t), k \ge 1 \}$ of indep. std. Brownian motions, $t \in [0, T]$
  • orthonormal basis $\{ \mathfrak{h}_k(x), k \ge 1, x \in G \}$ in the space $L_2(G)$ with

    \begin{equation*}
G = (0, L)^d = (0, L) \times \cdots \times (0, L)
\end{equation*}

    a d-dimensional hyper-cube.

  • For $x = (x_1, x_2, \dots, x_d)$ define

    \begin{equation*}
h_k(x) = \int_{0}^{x_1} \int_{0}^{x_2} \cdots \int_{0}^{x_d} \mathfrak{h}_k(r_1, \dots, r_d) \ d r_d \dots d r_1
\end{equation*}

Then the process

\begin{equation*}
W(t, x) = \sum_{k \ge 1}^{} h_k(x) w_k(t)
\end{equation*}

is Gaussian,

\begin{equation*}
\mathbb{E}[W(t,x)] = 0, \ \mathbb{E} \big[ W(t, x) W(s, y) \big] = \min(t, s) \prod_{k = 1}^d \min(x_k, y_k)
\end{equation*}

We call this process $W(t,x)$ the Brownian sheet.

From Ex. 1.1.3 b) we have

\begin{equation*}
\xi_k = \int_{0}^{T} m_k(t) \ dw(t), \quad k \ge 1
\end{equation*}

are i.i.d. std. normal. From this we can define

\begin{equation*}
\dot{w}(f) = \int_{0}^{T} f(s) \ dw(s)
\end{equation*}

Writing

\begin{equation*}
\dot{W}(t, x) = \sum_{k \ge 1}^{} \mathfrak{h}_k(x) \dot{w}_k(t)
\end{equation*}

where $\{ \mathfrak{h}_k, k \ge 1 \}$ is an orthonormal basis in $L_2(G)$ and $G \subseteq \mathbb{R}^d$ is an open set.

We call the process $\dot{W}$ the (Gaussian) space-time white noise. It is a random generalized function on $L_2 \big( (0, T) \times G \big)$:

\begin{equation*}
\dot{W}(f) = \sum_{k \ge 1}^{} \int_{0}^{T} \Bigg( \int_{G}^{} f(t,x) \mathfrak{h}_k(x) \ dx \Bigg) \ dw_k(t)
\end{equation*}

Sometimes, an alternative notation is used for $\dot{W}(f)$:

\begin{equation*}
\dot{W}(f) = \int_{0}^{T} \int_{G}^{} f(t,x) \ d W(t, x)
\end{equation*}

Unlike the Brownian sheet, space-time white noise $\dot{W}$ is defined on every domain $G \subseteq \mathbb{R}^d$ and not just on hyper-cubes $(0, L)^d$, as long as we can find an orthonormal basis $\{ \mathfrak{h}_k, k \ge 1 \}$ in $L_2(G)$.

Integrating over $d w(t)$

We often see something like

\begin{equation*}
\int_{0}^{t} f \Big( w(s) \Big) \ dw(s)
\end{equation*}

e.g. in Itô's lemma, but what does this even mean?

Consider the integral above, which we then define as

\begin{equation*}
\int_{0}^{t} f \Big( w(s) \Big) \ d w(s) = \lim_{n \to \infty} \sum_{i=1}^{n} f \big( w(t_{i - 1}) \big) \big( w(t_i) - w(t_{i - 1}) \big)
\end{equation*}

where $0 = t_0 < t_1 < \dots < t_n = t$ is a partition whose mesh goes to zero, just as we would do for a Riemann-Stieltjes integral (except that $f$ is always evaluated at the left endpoint of each subinterval).

Now, observe that in the case of $w(t)$ being Brownian motion, each of these increments is well-defined!

Furthermore, the Riemann sums on the RHS of the above equation converge in the mean-square sense as the mesh of the partition goes to zero.

This then means that the stochastic integral is a random variable, whose samples depend on the individual realizations of the path $w(t)$!
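A short numerical check of this (a sketch, taking $f(w) = w$, for which the limit is known in closed form: $\int_0^T w \ dw = \frac{1}{2} \big( w(T)^2 - T \big)$, a standard consequence of Itô's formula):

```python
import numpy as np

rng = np.random.default_rng(1)
T, n = 1.0, 100000
dt = T / n

dw = rng.normal(0.0, np.sqrt(dt), size=n)
w = np.concatenate([[0.0], np.cumsum(dw)])        # w(t_0), ..., w(t_n)

riemann_sum = np.sum(w[:-1] * (w[1:] - w[:-1]))   # sum f(w(t_{i-1})) (w(t_i) - w(t_{i-1})), f(w) = w
closed_form = 0.5 * (w[-1] ** 2 - T)              # (1/2)(w(T)^2 - T) for the same path
print(riemann_sum, closed_form)                   # agree as the partition is refined
```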

Alternative description of Gaussian white noise

  • Zero-mean Gaussian process $\dot{w} = \dot{w}(t)$ such that

    \begin{equation*}
  \mathbb{E} \big[ \dot{w}(t) \dot{w}(s) \big] = \delta(t - s)
\end{equation*}

    where $\delta$ is the Dirac delta function.

  • Similarly, we have

    \begin{equation*}
\mathbb{E} \big[ \dot{W}(t, x) \dot{W}(s, y) \big] = \delta(t - s) \delta(x - y)  
\end{equation*}
  • To construct noise that is white in time and coloured in space, take a sequence of non-negative numbers $\{ q_k, k \ge 1 \}$ and define

    \begin{equation*}
\dot{W}^Q(t, x) = \sum_{k \ge 1}^{} q_k \mathfrak{h}_k(x) \dot{w}_k(t)  
\end{equation*}

    where $\{ \mathfrak{h}_k, k \ge 1 \}$ is an orthonormal basis in $L_2(G)$.

  • We say this noise is finite-dimensional if

    \begin{equation*}
\exists N \in \mathbb{N} : q_k = 0, \quad \forall k \ge N  
\end{equation*}

Useful Equalities

If $F = F(x)$ is a smooth function and $w = w(t)$ is a standard Brownian motion, then

\begin{equation*}
F \big( w(t) \big) = F(0) + \int_{0}^{t} F' \big( w(s) \big) \ dw(s) + \frac{1}{2} \int_{0}^{t} F'' \big( w(s) \big) \ ds
\end{equation*}

If $w$ is a std. Brownian motion and $f$ an adapted process, then

\begin{equation*}
\mathbb{E} \Bigg[ \bigg( \int_{0}^{T} f(t) \ dw(t) \bigg)^2 \Bigg] = \int_{0}^{T} \mathbb{E} \big[ f^2(t) \big] \ dt
\end{equation*}

The Fourier transform is defined

\begin{equation*}
\hat{f}(y) = \frac{1}{(2 \pi)^{d / 2}} \int_{\mathbb{R}^d}^{} f(x) e^{-ixy} \ dx
\end{equation*}

which is defined on the generalized functions from $\mathcal{S}'(\mathbb{R}^d)$ by

\begin{equation*}
\big( \hat{f}, \varphi \big) = \big( f, \hat{\varphi} \big)
\end{equation*}

for $f \in \mathcal{S}'(\mathbb{R}^d)$ and $\varphi \in \mathcal{S}(\mathbb{R}^d)$. And the inverse Fourier transform is

\begin{equation*}
\check{f}(y) = \frac{1}{(2 \pi)^{d / 2}} \int_{\mathbb{R}^d}^{} f(x) e^{ixy} \ dx = \hat{f}(- y)
\end{equation*}
In particular, the standard Gaussian is invariant under this Fourier transform:

\begin{equation*}
\frac{1}{\sqrt{2 \pi}} \int_{-\infty}^{\infty} e^{- x^2 / 2} e^{- ixy} \ dx = e^{- y^2 / 2}
\end{equation*}

If $\{ \mathfrak{h}_k, k \ge 1 \}$ is an orthonormal basis in a Hilbert space $X$ and $f \in X$, then

\begin{equation*}
\sum_{k \ge 1}^{} \big( f, \mathfrak{h}_k \big)_X^2 = \norm{f}_X^2
\end{equation*}

Plancherel's identity (or isometry of the Fourier transform) says that if $f$ is a smooth function with compact support in $\mathbb{R}^d$ and

\begin{equation*}
\hat{f}(y) = \frac{1}{(2 \pi)^{d / 2}} \int_{\mathbb{R}^d}^{} f(x) e^{- ixy} \ dx
\end{equation*}

then

\begin{equation*}
\int_{\mathbb{R}^d}^{} |f(x)|^2 \ dx = \int_{\mathbb{R}^d}^{} |\hat{f}(y)|^2 \ dy
\end{equation*}

This result is essentially a continuum version of Parseval's identity.

Useful inequalities

Exercises

1.1.1
\begin{equation*}
\mathbb{E} [w(t)w(s)] = \min(t,s)
\end{equation*}

Observe that

\begin{equation*}
w(t) w(s) = \sum_{k, l}^{} \bigg[ \bigg( \int_{0}^{t} m_k(t') \ dt' \bigg) \xi_k \bigg] \bigg[ \bigg( \int_{0}^{s} m_l(s') \ ds' \bigg) \xi_l \bigg]
\end{equation*}

since we are taking the product of the two series. Taking expectations, only the diagonal terms survive because the $\xi_k$ are independent standard normal, so $\mathbb{E}[\xi_k \xi_l] = \delta_{kl}$. From the hint that $\int_0^t m_k(s) \ ds$ is the k-th Fourier coefficient $\big( I_t \big)_k$ of the indicator function of the interval $[0, t]$, we can then use Parseval's identity:

\begin{equation*}
\begin{split}
  \mathbb{E}[w(t) w(s)] &= \sum_{k}^{} \big( I_t \big)_k \big( I_s \big)_k \\
  &= \big( \mathbf{1}_{[0, t]}, \mathbf{1}_{[0, s]} \big)_{L_2((0, T))} \\
  &= \int_{0}^{T} \mathbf{1}_{[0, t]}(r) \, \mathbf{1}_{[0, s]}(r) \ dr \\
  &= \min(t, s)
\end{split}
\end{equation*}

Note that the hint does not single out any particular basis: "Fourier coefficient" here just means the coefficient with respect to the given orthonormal basis $\{ m_k \}$, and Parseval's identity $\sum_k (f, m_k)(g, m_k) = (f, g)$ holds for any orthonormal basis of a Hilbert space (finite or not), so it does not matter which basis we work in.

1.1.2

Same procedure as in 1.1.1, but observing that we can separate $W(t, x)$ into a time-dependent and a position-dependent part, and again using the fact that

\begin{equation*}
\int_{0}^{x_i} \mathfrak{h}_k(x_1, x_2, \dots, r_i, \dots, x_d) \ dr_i
\end{equation*}

is just the Fourier coefficient of the indicator function on the range $[0, x_i]$ in the i-th dimension.

Basic Ideas

Notation

  • $\big( \Omega, \mathcal{F}, \mathbb{P} \big)$ is a probability space
  • $\big( A, \mathcal{A} \big)$ and $\big( B, \mathcal{B} \big)$ are two measurable spaces
  • $X: \big( A \times \Omega, \mathcal{A} \times \mathcal{F} \big) \to \big( B, \mathcal{B} \big)$ is a random function
  • Random process refers to the case when $A, B \subset \mathbb{R}$
  • Random field corresponds to when $A = \mathbb{R}^d$ and $B = \mathbb{R}$.