Variational Calculus

Table of Contents

Sources

  • Most of these notes come from the Variational Calculus course taught by Prof. José Figueroa-O'Farrill, University of Edinburgh

Overview

Notation

  • $\mathcal{C}_{P, Q}$ denotes the space of possible paths (i.e. $C^1$ curves) between points $P$ and $Q$

Introduction

Precise analytical techniques to answer:

  • Shortest path between two given points on a surface
  • Curve between two given points in the plane that yields a surface of revolution of minimum area when revolved around a given axis
  • Curve along which a bead will slide (under the effect of gravity) in the shortest time

Underpins much of modern mathematical physics, via Hamilton's principle of least action

Consider "standard" directional derivative of $f: U \to \mathbb{R}$, with $U \subseteq \mathbb{R}^N$, at $x_0$ along some vector $\mathbf{v} \in \mathbb{R}^N$:

\begin{equation*}
\sum_{i=1}^{N} v^i \frac{\partial f}{\partial x^i} \bigg|_{x_0} = \frac{d}{ds} f (x_0 + s v)|_{s = 0}
\end{equation*}

We say that $x_0$ is a critical point of $f$ if

\begin{equation*}
\frac{d}{ds} f(x_0 + sv)|_{s = 0} = 0, \quad \forall v \in \mathbb{R}^N
\end{equation*}

(since the $\frac{\partial }{\partial x^i}$ form a basis of $T_{x_0} U$, by linear independence this is equivalent to $\frac{\partial f}{\partial x^i}\big|_{x_0} = 0$ for all $i$!)
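The identity above is easy to sanity-check numerically. The sketch below (with an arbitrarily chosen quadratic $f$ as the example) compares $\sum_i v^i \frac{\partial f}{\partial x^i}\big|_{x_0}$ against a finite-difference approximation of $\frac{d}{ds} f(x_0 + sv)|_{s=0}$:

```python
def f(x):
    # assumed example function: f(x, y) = x^2 + 3xy
    return x[0] ** 2 + 3 * x[0] * x[1]

def grad_f(x):
    # analytic partial derivatives of f
    return [2 * x[0] + 3 * x[1], 3 * x[0]]

def directional_derivative(func, x0, v, h=1e-6):
    # central difference of s -> func(x0 + s*v) at s = 0
    xp = [xi + h * vi for xi, vi in zip(x0, v)]
    xm = [xi - h * vi for xi, vi in zip(x0, v)]
    return (func(xp) - func(xm)) / (2 * h)

x0, v = [1.0, 2.0], [0.5, -1.0]
lhs = sum(vi * gi for vi, gi in zip(v, grad_f(x0)))  # sum_i v^i df/dx^i at x0
rhs = directional_derivative(f, x0, v)
assert abs(lhs - rhs) < 1e-6
```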

The fundamental lemma of the calculus of variations

Let $f: [0, 1] \to \mathbb{R}^n$ be a continuous function which satisfies

\begin{equation*}
\int_{0}^{1} \left\langle f(t), h(t) \right\rangle \ dt = 0
\end{equation*}

for all $h \in C^{\infty}\Big([0, 1], \mathbb{R}^n\Big)$ with

\begin{equation*}
h(0) = h(1) = 0
\end{equation*}

Then $f \equiv 0$.

Observe that it suffices to prove the case $n = 1$: the general case then follows componentwise, since integration is a linear operation.

Let $f \in C([0, 1])$ which obeys

\begin{equation*}
\int_{0}^{1} f(t) h(t) \ dt = 0, \quad \forall h \in C^{\infty}([0, 1]), h(0) = h(1) = 0
\end{equation*}

Assume, for the sake of contradiction, that $f \ne 0$, i.e.

\begin{equation*}
\exists t_0 \in [0, 1] : f(t_0) \ne 0
\end{equation*}

Let, w.l.o.g., $f(t_0) > 0$.

Since $f$ is continuous, there is some interval $(a, b) \subset (0, 1)$ such that $t_0 \in (a, b)$ and some $c > 0$ such that

\begin{equation*}
f(t) > c, \quad \forall t \in (a, b)
\end{equation*}

Suppose for the moment that there exists some $h \in C^{\infty}([0, 1])$ such that

  1. $h(t) = 0$ for all $t$ outside $(a, b)$
  2. $h(t) > 0$ for all $t \in (a, b)$, so that $\int_0^1 h(t) \ dt = \int_a^b h(t) \ dt > 0$

Then observe that

\begin{equation*}
\begin{split}
  \int_{0}^{1} f(t) h(t) \ dt &= \int_{a}^{b} f(t) h(t) \ dt \\
  &> c \int_{a}^{b} h(t) \ dt \\
  &> 0
\end{split}
\end{equation*}

This contradicts our assumption that the integral vanishes for all admissible $h$; hence $\nexists t_0 \in (0, 1)$ such that $f(t_0) \ne 0$. Hence, by continuity,

\begin{equation*}
f(t) = 0, \quad \forall t \in [0, 1]
\end{equation*}

Now we just prove that there exists such a function $h(t)$ which satisfies the properties we described earlier.

Let

\begin{equation*}
\theta(t) = 
\begin{cases}
  e^{- 1 / t} & t > 0 \\
  0 & t \le 0
\end{cases}
\end{equation*}

which is a smooth function. Then let

\begin{equation*}
\varphi(t) = \theta(t) \ \theta(1 - t)
\end{equation*}

which is a smooth function, since it's a product of smooth functions.

To make this vanish outside of $(a, b)$, we have

\begin{equation*}
\varphi_{a, b}(t) = \varphi \Bigg( \frac{t - a}{b - a} \Bigg)
\end{equation*}

This function is non-negative, and strictly positive on $(a, b)$; hence,

\begin{equation*}
\int_{a}^{b} \varphi_{a, b}(t) \ dt = (b - a) \int_{0}^{1} \varphi(t) \ dt > 0
\end{equation*}

Hence, letting $h = \varphi_{a, b}$ we obtain a function with the properties required in the proof above!

This concludes the proof of the Fundamental Lemma of the Calculus of Variations.
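The bump-function construction above can be checked numerically; a minimal sketch (the subinterval $(a, b) = (0.3, 0.6)$ is an arbitrary choice):

```python
import math

def theta(t):
    # smooth: exp(-1/t) for t > 0, 0 for t <= 0
    return math.exp(-1.0 / t) if t > 0 else 0.0

def phi(t):
    # smooth bump, positive exactly on (0, 1)
    return theta(t) * theta(1.0 - t)

def phi_ab(t, a, b):
    # rescaled bump, positive exactly on (a, b)
    return phi((t - a) / (b - a))

def integrate(g, lo, hi, n=10000):
    # midpoint rule; accurate enough for this check
    dt = (hi - lo) / n
    return sum(g(lo + (k + 0.5) * dt) for k in range(n)) * dt

a, b = 0.3, 0.6   # arbitrary subinterval of (0, 1)
assert phi_ab(0.1, a, b) == 0.0 and phi_ab(0.9, a, b) == 0.0   # vanishes outside (a, b)
assert phi_ab(0.45, a, b) > 0                                  # positive inside
assert integrate(lambda t: phi_ab(t, a, b), 0.0, 1.0) > 0      # positive integral
```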

General variations

Suppose we want the shortest path in $\mathbb{R}^2$ between a point $P$ and a curve $C$ in $\mathbb{R}^2$.

We assume that $\exists g: \mathbb{R}^2 \to \mathbb{R}$ with $g$ differentiable, such that $C = g^{-1}(0)$.

We require that the varied endpoint stays on the curve, i.e. $x(1) + s \varepsilon(1) \in C$, so that

\begin{equation*}
g \big( x(1) + s \varepsilon(1) \big) = 0, \quad \forall s
\end{equation*}

Then

\begin{equation*}
\begin{split}
  \frac{d}{ds} \bigg|_{s = 0} S[x_s] &= \frac{d}{ds} \bigg|_{s = 0} \int_{0}^{1} \norm{\dot{x}_s} \ dt \\
  &= \int_{0}^{1} \left\langle \dot{\varepsilon}, \frac{\dot{x}}{\norm{\dot{x}}} \right\rangle \ dt \\
  &= \int_{0}^{1} \Bigg( \frac{d}{dt} \left\langle \varepsilon, \frac{\dot{x}}{\norm{\dot{x}}} \right\rangle - \left\langle \varepsilon, \frac{d}{dt} \bigg( \frac{\dot{x}}{\norm{\dot{x}}} \bigg) \right\rangle \Bigg) \ dt \\
  &= \left\langle \varepsilon(1), \frac{\dot{x}(1)}{\norm{\dot{x}(1)}} \right\rangle - \underbrace{\left\langle \varepsilon(0), \frac{\dot{x}(0)}{\norm{\dot{x}(0)}} \right\rangle}_{= 0} - \int_{0}^{1} \left\langle \varepsilon, \frac{d}{dt} \bigg( \frac{\dot{x}}{\norm{\dot{x}}} \bigg) \right\rangle \ dt \\
  &= \left\langle \varepsilon(1), \frac{\dot{x}(1)}{\norm{\dot{x}(1)}} \right\rangle - \int_{0}^{1} \left\langle \varepsilon, \frac{d}{dt} \bigg( \frac{\dot{x}}{\norm{\dot{x}}} \bigg) \right\rangle \ dt
\end{split}
\end{equation*}

where we've used the fact that

\begin{equation*}
\frac{\partial }{\partial s} \frac{\partial }{\partial t} x_s(t) \bigg|_{s = 0} = \frac{\partial }{\partial t} \frac{\partial }{\partial s} x_s(t) \bigg|_{s = 0} = \dot{\varepsilon}
\end{equation*}

We cannot just drop the endpoint-terms anymore, since these are now not necessarily vanishing, as we had in the previous case.

Then, by our earlier assumption, we have

\begin{equation*}
g \Big( x_s(1) \Big) = 0, \quad \text{since} \quad x_s(1) \in C
\end{equation*}

which implies that

\begin{equation*}
\frac{\partial }{\partial s} g \Big( x_s(1) \Big) \Big|_{s = 0} = \left\langle \nabla g \big|_{x(1)}, \varepsilon(1) \right\rangle = 0
\end{equation*}

Hence,

\begin{equation*}
\left\langle \varepsilon(1), \frac{\dot{x}(1)}{\norm{\dot{x}(1)}} \right\rangle - \int_{0}^{1} \left\langle \varepsilon, \frac{d}{dt} \bigg( \frac{\dot{x}}{\norm{\dot{x}}} \bigg) \right\rangle \ dt = 0
\end{equation*}

must hold for all $\varepsilon: [0, 1] \to \mathbb{R}^2$ with $\varepsilon(0) = 0$ and $\varepsilon(1) \perp \nabla g \big|_{x(1)}$

In particular, taking $\varepsilon(1) = 0$, we obtain

\begin{equation*}
\frac{d}{dt} \bigg( \frac{\dot{x}}{\norm{\dot{x}}} \bigg) = 0
\end{equation*}

and

\begin{equation*}
\left\langle \varepsilon(1), \frac{\dot{x}(1)}{\norm{\dot{x}(1)}} \right\rangle = 0
\end{equation*}

i.e. $\dot{x}$ is normal to $C$ at the point where it intersects with $C$.

Euler-Lagrange equations

Notation

Endpoint-fixed variations

Let $\mathcal{C}_{P, Q}$ be the space of $C^1$ curves $x: [0, 1] \to \mathbb{R}^n$ with

\begin{equation*}
x(0) = P \quad \text{and} \quad x(1) = Q
\end{equation*}

The Lagrangian is defined

\begin{equation*}
\begin{split}
  L: \quad & \mathbb{R}^{2n + 1} \to \mathbb{R} \\
  & (x, v, t) \mapsto L(x, v, t)
\end{split}
\end{equation*}

where $x, v \in \mathbb{R}^n$ and $t \in \mathbb{R}$. Let $L$ be "sufficiently" differentiable (usually taken to be smooth in applications).

Then the function $I: \mathcal{C}_{P, Q} \to \mathbb{R}$, called the action, is defined

\begin{equation*}
I[x] = \int_{0}^{1} L \Big( x(t), \dot{x}(t), t \Big) \ dt
\end{equation*}

A path $x$ is a critical point for the action if, for all endpoint-fixed variations $\varepsilon$, we have

\begin{equation*}
\frac{d}{ds} I [x + s \varepsilon] \Big|_{s = 0} = 0
\end{equation*}

Bringing the differentiation into the integral, we have

\begin{equation*}
\begin{split}
  \frac{d}{ds} \Big|_{s = 0} I[x + s \varepsilon] &= \int_{0}^{1} \frac{d}{ds} \Big|_{s = 0} L \Big( x + s \varepsilon, \frac{d}{dt}(x + s \varepsilon), t \Big) \ dt \\
  &= \int_{0}^{1} \frac{d}{ds} \Big|_{s = 0} L \Big( x + s \varepsilon, \dot{x} + s \dot{\varepsilon}, t \Big) \ dt \\
  &= \int_{0}^{1} \bigg( \sum_{i=1}^{n} \frac{\partial L}{\partial x^i} \varepsilon^i + \sum_{i=1}^{n} \frac{\partial L}{\partial \dot{x}^i} \dot{\varepsilon}^i \bigg) \ dt \\
  &= \int_{0}^{1} \sum_{i=1}^{n} \bigg( \frac{\partial L}{\partial x^i} - \frac{d}{dt} \frac{\partial L}{\partial \dot{x}^i} \bigg) \varepsilon^i \ dt
\end{split}
\end{equation*}

where the last step follows by integrating the second term by parts; the boundary term vanishes since $\varepsilon(0) = \varepsilon(1) = 0$.
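As a concrete check of this variational setup, for the free-particle Lagrangian $L = \frac{1}{2} \dot{x}^2$ (an assumed example) the extremal between fixed endpoints is the straight line, and any endpoint-fixed variation increases the (discretised) action; a small numerical sketch:

```python
import math

def action(path, n=2000):
    # discretised action I[x] = int_0^1 (1/2) xdot^2 dt for a scalar path
    dt = 1.0 / n
    total = 0.0
    for k in range(n):
        xdot = (path((k + 1) * dt) - path(k * dt)) / dt
        total += 0.5 * xdot ** 2 * dt
    return total

straight = lambda t: t                    # extremal with x(0) = 0, x(1) = 1
eps = lambda t: math.sin(math.pi * t)     # endpoint-fixed variation

I0 = action(straight)
for s in (0.1, -0.1, 0.3):
    # any endpoint-fixed perturbation strictly increases the action
    assert action(lambda t, s=s: straight(t) + s * eps(t)) > I0
```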

Properties

  1. If $\frac{\partial L}{\partial t} = 0$, then the "energy" given by

    \begin{equation*}
E := \sum_{i=1}^{n} \frac{\partial L}{\partial \dot{x}^i} \dot{x}^i - L
\end{equation*}

    is constant along extremals. To see this, observe that

    \begin{equation*}
\begin{split}
     \frac{d}{dt} L &= \sum_{i=1}^{n} \frac{\partial L}{\partial x^i} \dot{x}^i + \sum_{i=1}^{n} \frac{\partial L}{\partial \dot{x}^i} \ddot{x}^i + \underbrace{\frac{\partial L}{\partial t}}_{= 0} \\
     &= \sum_{i=1}^{n} \frac{\partial L}{\partial x^i} \dot{x}^i + \sum_{i=1}^{n} \frac{\partial L}{\partial \dot{x}^i} \ddot{x}^i \\
     &= \sum_{i=1}^{n} \bigg( \frac{d}{dt} \frac{\partial L}{\partial \dot{x}^i} \bigg) \dot{x}^i + \sum_{i=1}^{n} \frac{\partial L}{\partial \dot{x}^i} \ddot{x}^i \\
     &= \frac{d}{dt} \sum_{i=1}^{n} \frac{\partial L}{\partial \dot{x}^i} \dot{x}^i
\end{split}
\end{equation*}

    where we've used the Euler-Lagrange equation $\frac{d}{dt} \frac{\partial L}{\partial \dot{x}^i} = \frac{\partial L}{\partial x^i}$ in the third equality. Thus,

    \begin{equation*}
\frac{d}{dt} \underbrace{\Bigg( \sum_{i=1}^{n} \frac{\partial L}{\partial \dot{x}^i} \dot{x}^i - L \Bigg)}_{\text{constant along the motion}} = 0
\end{equation*}

    and,

    \begin{equation*}
\frac{\partial L}{\partial t} = 0 \iff L \text{ invariant under translation: } t \mapsto t + s, \ \forall s \in \mathbb{R}
\end{equation*}

    Hence,

    \begin{equation*}
L(x, v, t) = L(x, v, t + s)
\end{equation*}

    i.e. time-translation invariance! This is an instance of Noether's Theorem.

If $\frac{\partial L}{\partial t} = 0$, so that the Lagrangian does not depend explicitly on $t$, then the energy

\begin{equation*}
E := \sum_{i=1}^{n} \dot{x}^i \frac{\partial L}{\partial \dot{x}^i} - L
\end{equation*}

is constant.

This is known as Beltrami's identity.
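For example, for the harmonic-oscillator Lagrangian $L = \frac{1}{2}\dot{x}^2 - \frac{1}{2}x^2$ (an assumed standard example, not from the notes above) the extremal $x(t) = \cos t$ has constant energy $E = \frac{1}{2}$; a quick numerical check:

```python
import math

x = lambda t: math.cos(t)      # extremal of L = (1/2) xdot^2 - (1/2) x^2
xdot = lambda t: -math.sin(t)  # since the EL equation is xddot = -x

def L(x_, v):
    return 0.5 * v ** 2 - 0.5 * x_ ** 2

def energy(t):
    # E = (dL/dxdot) xdot - L = (1/2) xdot^2 + (1/2) x^2
    return xdot(t) * xdot(t) - L(x(t), xdot(t))

E0 = energy(0.0)
for t in (0.5, 1.0, 2.0, 5.0):
    assert abs(energy(t) - E0) < 1e-12   # constant along the extremal
```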

Euler-Lagrange

Let $\mathcal{C}_{P, Q}$ be the space of $C^1$ curves $x: [0, 1] \to \mathbb{R}^n$ with

\begin{equation*}
x(0) = P \quad \text{and} \quad x(1) = Q
\end{equation*}

Let

\begin{equation*}
\begin{split}
  L: \quad & \mathbb{R}^{2n + 1} \to \mathbb{R} \\
  & (x, v, t) \mapsto L(x, v, t)
\end{split}
\end{equation*}

where $x, v \in \mathbb{R}^n$ and $t \in \mathbb{R}$, be sufficiently differentiable (typically smooth in applications) and let us consider the function $I: \mathcal{C}_{P, Q} \to \mathbb{R}$ defined by

\begin{equation*}
I[x] = \int_{0}^{1} L \big( x(t), \dot{x}(t), t \big) \ dt
\end{equation*}

Then the extremals must satisfy the Euler-Lagrange equations:

\begin{equation*}
\frac{\partial L}{\partial x^i} = \frac{d}{dt} \frac{\partial L}{\partial \dot{x}^i}
\end{equation*}

Newtonian mechanics

Notation

  • worldlines refer to the trajectory of a particle:

    \begin{equation*}
\left\{ \big( t, x(t) \big) \mid t \in \mathbb{R} \right\}
\end{equation*}

    with $x: \mathbb{R} \to \mathbb{R}^3$

Galilean relativity

  • Affine transformations (don't assume a basis)
  • Relativity group: group of transformations on the universe preserving whichever structure we've endowed the universe with

The subgroup of affine transformations of $\mathbb{R} \times \mathbb{R}^3$ which leave invariant the time interval between events and the distance between simultaneous events is called the Galilean group.

That is, the Galilean group consists of affine transformations of the form

\begin{equation*}
\begin{pmatrix}
  t \\ \mathbf{x}
\end{pmatrix}
\mapsto
\begin{pmatrix}
  t' \\ \mathbf{x}'
\end{pmatrix}
= 
\begin{pmatrix}
  1 & 0 \\ \mathbf{v} & R
\end{pmatrix}
\begin{pmatrix}
  t \\ \mathbf{x}
\end{pmatrix}
+
\begin{pmatrix}
  s \\ \mathbf{a}
\end{pmatrix}
=
\begin{pmatrix}
  t + s \\ \mathbf{v} t + R \mathbf{x} + \mathbf{a}
\end{pmatrix}
\end{equation*}

These transformations can be written uniquely as a composition of three elementary Galilean transformations:

  1. translations in space and time:

    \begin{equation*}
\begin{pmatrix}
  t \\ \mathbf{x}
\end{pmatrix}
\mapsto
\begin{pmatrix}
  t + s \\ \mathbf{x} + \mathbf{a}
\end{pmatrix}
\end{equation*}
  2. orthogonal transformations in space:

    \begin{equation*}
\begin{pmatrix}
  t \\ \mathbf{x}
\end{pmatrix}
\mapsto
\begin{pmatrix}
  t \\ R \mathbf{x}
\end{pmatrix},
\quad R \in O(3)
\end{equation*}
  3. and Galilean boosts:

    \begin{equation*}
\begin{pmatrix}
  t \\ \mathbf{x}
\end{pmatrix}
\mapsto
\begin{pmatrix}
  t \\ \mathbf{x} + t \mathbf{v}
\end{pmatrix}
\end{equation*}

Observe that if we choose the action

\begin{equation*}
I[x] = \int_{0}^{1} \bigg( \frac{1}{2} m \norm{\dot{x}}^2 - V(x) \bigg) \ dt
\end{equation*}

which has Lagrangian

\begin{equation*}
\mathcal{L}(x, \dot{x}, t) = \frac{1}{2} m \norm{\dot{x}}^2  - V(x)
\end{equation*}

then we observe that the minimizing path $x$ should satisfy

\begin{equation*}
\frac{d}{dt} \frac{\partial \mathcal{L}}{\partial \dot{x}} = \frac{\partial \mathcal{L}}{\partial x}
\end{equation*}

which is

\begin{equation*}
\frac{\partial \mathcal{L}}{\partial x} = - \frac{dV}{dx} = F(x)
\end{equation*}

where $F$ denotes the force. Further,

\begin{equation*}
\frac{\partial \mathcal{L}}{\partial \dot{x}} = m \dot{x}
\end{equation*}

which is the momentum! Then,

\begin{equation*}
\frac{d}{dt} m \dot{x} = - \frac{dV}{dx} \iff m \ddot{x} = F(x)
\end{equation*}

Hence we're left with Newton's 2nd law.

Noether's Theorem

Notation

  • A family $\varphi_s: \mathbb{R}^n \to \mathbb{R}^n$ of $C^2$ maps, depending differentiably on $s$, called a one-parameter group of $C^2$ diffeomorphisms
  • $\bar{x}$ and $\bar{t}$ are defined by

    \begin{equation*}
(x, t) \mapsto \Big( \bar{x}(x, t, s), \bar{t}(x, t, s) \Big)
\end{equation*}

    and

    \begin{equation*}
\bar{x}^j = x^j + \zeta^j(t, x) s + \mathcal{O}(s^2) \quad \text{and} \quad \bar{t} = t + \tau(t, x) s + \mathcal{O}(s^2)
\end{equation*}
  • $\zeta^j = \frac{\partial \bar{x}^j}{\partial s}\big|_{s = 0}$
  • $\tau = \frac{\partial \bar{t}}{\partial s}\big|_{s=0}$

Symmetries and conserved quantities

We've seen the following continuous symmetries so far:

  1. Momentum is conserved:

    \begin{equation*}
\frac{\partial L}{\partial z} = 0 \quad \implies \quad \frac{\partial L}{\partial \dot{z}} = \text{constant}
\end{equation*}
  2. Energy is conserved:

    \begin{equation*}
\frac{\partial L}{\partial t} = 0 \quad \implies \quad E := \sum_{i=1}^{n} \frac{\partial L}{\partial \dot{x}^i} \dot{x}^i - L = \text{constant}
\end{equation*}

We say that $\varphi$ is a symmetry of the Lagrangian $L$ if

\begin{equation*}
L(x, \dot{x}, t) = L(y, \dot{y}, t)
\end{equation*}

where $y = \varphi(x)$ (so that $\dot{y} = D \varphi \big|_{x} \dot{x}$).

Equivalently, one says that $L$ is invariant under $\varphi$.

Let $\varphi_s: \mathbb{R}^n \to \mathbb{R}^n$ be a family of $C^2$ maps, defined for all $s \in \mathbb{R}$ and depending differentiably on $s$.

Moreover, let this family satisfy the following properties:

  1. $\varphi_0(x) = x$ for all $x \in \mathbb{R}^n$
  2. $\varphi_s \circ \varphi_t = \varphi_{s + t}$ for all $s, t \in \mathbb{R}$

Then the family $\{ \varphi_s \}$ is called a one-parameter subgroup of $C^2$ diffeomorphisms on $\mathbb{R}^n$.

Let $I[x] = \int_{0}^{1} L(x, \dot{x}, t) \ dt$ be an action for curves $x: [0, 1] \to \mathbb{R}^n$, and let $L$ be invariant under a one-parameter group of diffeomorphisms $\{ \varphi_s \}$.

Then the Noether charge $N$, defined by

\begin{equation*}
N(x, \dot{x}, t) = \sum_{i=1}^{n} \frac{\partial L}{\partial \dot{x}^i} \frac{\partial \varphi_s^i(x)}{\partial s} \bigg|_{s = 0}
\end{equation*}

is conserved; that is, $\frac{dN}{dt} = 0$ along physical trajectories.

Consider actions defined by Lagrangians $L(x, \dot{x}, t)$ which are invariant under a one-parameter family of diffeomorphisms of $\mathbb{R}^n \times \mathbb{R}$ of the form

\begin{equation*}
(x, t) \mapsto \Big( \bar{x}(x, t, s), \bar{t}(x, t, s) \Big)
\end{equation*}

This means, in particular, that

\begin{equation*}
\bar{x}^j = x^j + \zeta^j(t, x) s + \mathcal{O}(s^2) \quad \text{and} \quad \bar{t} = t + \tau(t, x) s + \mathcal{O}(s^2)
\end{equation*}

where

  • $\zeta^j = \frac{\partial \bar{x}^j}{\partial s}\big|_{s = 0}$
  • $\tau = \frac{\partial \bar{t}}{\partial s}\big|_{s=0}$

Then the Noether charge

\begin{equation*}
N := L \tau + \sum_{k=1}^{n} \frac{\partial L}{\partial \dot{x}^k} (\zeta^k - \dot{x}^k \tau)
\end{equation*}

is conserved along extremals; that is, along curves which obey the Euler-Lagrange equation.
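As a sketch, take the harmonic oscillator again (an assumed example): $L = \frac{1}{2}\dot{x}^2 - \frac{1}{2}x^2$ is invariant under time translation, i.e. $\tau = 1$, $\zeta = 0$, and the Noether charge reduces to $N = L - \dot{x}\, \frac{\partial L}{\partial \dot{x}} = -E$, which is indeed constant along the extremal $x(t) = \cos t$:

```python
import math

# harmonic oscillator: L is invariant under t -> t + s (tau = 1, zeta = 0)
x = lambda t: math.cos(t)
v = lambda t: -math.sin(t)
L = lambda t: 0.5 * v(t) ** 2 - 0.5 * x(t) ** 2

def noether_charge(t, tau=1.0, zeta=0.0):
    # N = L tau + (dL/dxdot)(zeta - xdot tau); here dL/dxdot = xdot
    return L(t) * tau + v(t) * (zeta - v(t) * tau)

N0 = noether_charge(0.0)   # equals -E = -1/2 for this extremal
for t in (0.3, 1.7, 4.0):
    assert abs(noether_charge(t) - N0) < 1e-12
```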

Hamilton's canonical formalism

Notation

  • Considering

    \begin{equation*}
I[x] = \int_{0}^{1} L(x, \dot{x}, t) \ dt
\end{equation*}

    for $C^1$ curves $x: [0, 1] \to \mathbb{R}^n$

  • Differentiable function $\Phi(x, p)$
  • $H$ denotes the Hamiltonian

Canonical form of Euler-Lagrange equation

  • Can convert E-L into equivalent first-order ODE:

    \begin{equation*}
\frac{\partial L}{\partial x^i} = \frac{d}{dt} \frac{\partial L}{\partial \dot{x}^i}
\end{equation*}

    is equivalent to system

    \begin{equation*}
\frac{d x^i}{dt} = v^i \quad \text{and} \quad \frac{d}{dt} \frac{\partial L}{\partial \dot{x}^i} = \frac{\partial L}{\partial x^i}
\end{equation*}

    for the variables $\big( x(t), v(t) \big)$.

  • Consider

    \begin{equation*}
L(x, v) = \frac{1}{2} mv^2 - V(x)
\end{equation*}

    Then letting

    \begin{equation*}
p = \frac{\partial L}{\partial v} = m v   
\end{equation*}

    the above system becomes

    \begin{equation*}
\dot{x}^i = v^i \quad \text{and} \quad \dot{p}^i = - \frac{dV}{d x^i}
\end{equation*}
  • Hamiltonian becomes

    \begin{equation*}
H(x, p) = \frac{p^2}{2m} + V(x)
\end{equation*}

    which has the property

    \begin{equation*}
\frac{dx}{dt} = \frac{\partial H}{\partial p} \quad \text{and} \quad \frac{dp}{dt} = - \frac{\partial H}{\partial x}
\end{equation*}

    known as Hamilton's equations.

  • Observe that

    \begin{equation*}
\frac{d}{dt} 
\begin{pmatrix}
  x \\ p
\end{pmatrix}
= 
\begin{pmatrix}
  0 & 1 \\ -1 & 0
\end{pmatrix}
\begin{pmatrix}
  \frac{\partial H}{\partial x} \\ \frac{\partial H}{\partial p}
\end{pmatrix}
\end{equation*}

    where the coefficient matrix, let's call it $J$, can be thought of as a bilinear form on $\mathbb{R}^2$, which defines a symplectic structure.

  • In the general case, solving the $n$ equations $p_i = \frac{\partial L}{\partial v^i}$ for $v$ is guaranteed by the implicit function theorem whenever the Hessian $\frac{\partial^2 L}{\partial v^i \partial v^j}$ is invertible
    • $L$ is then said to be regular (or non-degenerate)

General case:

  • Hamiltonian

    \begin{equation*}
H(x, p, t) = \sum_{i=1}^{n}  v^i p_i - L(x, v, t)
\end{equation*}
  • Total derivative of $H$ (or, as we recognise, the exterior derivative of a differentiable function)

    \begin{equation*}
\begin{split}
  d H &= d \bigg( \sum_{i=1}^{n} p_i v^i - L \bigg) \\
  &= \sum_{i=1}^{n} \Big( v^i dp_i + p_i d v^i \Big) - \bigg( \frac{\partial L}{\partial t} dt + \sum_{i=1}^{n} \frac{\partial L}{\partial x^i} dx^i + \sum_{i=1}^{n} \frac{\partial L}{\partial v^i} dv^i \bigg) \\
  &= \sum_{i=1}^{n} v^i dp_i - \frac{\partial L}{\partial t} dt - \sum_{i=1}^{n} \frac{\partial L}{\partial x^i} dx^i
\end{split}
\end{equation*}

    where we have used that $p_i = \frac{\partial L}{\partial v^i}$.

    • Give us

      \begin{equation*}
\begin{split}
  \frac{\partial H}{\partial t} &= - \frac{\partial L}{\partial t} \\
  \frac{\partial H}{\partial p_i} &= v^i \\
  \frac{\partial H}{\partial x^i} &= - \frac{\partial L}{\partial x^i}
\end{split}
\end{equation*}
    • First-order version of Euler-Lagrange equations in canonical (or hamiltonian) form:

      \begin{equation*}
\frac{d x^i}{dt} = \frac{\partial H}{\partial p_i} \quad \text{and} \quad \frac{d p_i}{dt} = - \frac{\partial H}{\partial x^i}
\end{equation*}

      which we call Hamilton's equations.

In general, the Hamiltonian $H(x, p, t)$ is given by

\begin{equation*}
H(x, p, t) = \sum_{i=1}^{n}  v^i p_i - L(x, v, t)
\end{equation*}

Taking the total derivative, we're left with

\begin{equation*}
dH = \sum_{i=1}^{n} v^i d p_i - \frac{\partial L}{\partial t} dt - \sum_{i=1}^{n} \frac{\partial L}{\partial x^i} dx^i
\end{equation*}

where we've used $p_i = \frac{\partial L}{\partial v^i}$.

This gives us

\begin{equation*}
\begin{split}
  \frac{\partial H}{\partial t} &= - \frac{\partial L}{\partial t} \\
  \frac{\partial H}{\partial p_i} &= v^i \\
  \frac{\partial H}{\partial x^i} &= - \frac{\partial L}{\partial x^i}
\end{split}
\end{equation*}
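Hamilton's equations $\dot{x}^i = \frac{\partial H}{\partial p_i}$, $\dot{p}_i = -\frac{\partial H}{\partial x^i}$ can also be integrated numerically; the sketch below uses a leapfrog (symplectic) step, with the harmonic potential $V(x) = \frac{1}{2}x^2$ as an assumed example, and checks that $H$ stays approximately constant along the flow:

```python
def leapfrog(x, p, dV, m=1.0, dt=1e-3, steps=5000):
    # integrate dx/dt = p/m, dp/dt = -V'(x) with a kick-drift-kick leapfrog step
    for _ in range(steps):
        p -= 0.5 * dt * dV(x)
        x += dt * p / m
        p -= 0.5 * dt * dV(x)
    return x, p

V = lambda x: 0.5 * x ** 2          # assumed example: harmonic potential
dV = lambda x: x                    # V'(x)
H = lambda x, p: p ** 2 / 2.0 + V(x)

x0, p0 = 1.0, 0.0
x1, p1 = leapfrog(x0, p0, dV)
# H is (approximately) conserved along the numerical flow
assert abs(H(x1, p1) - H(x0, p0)) < 1e-6
```

Symplectic schemes like this one respect the structure defined by $J$ above, which is why the energy error stays bounded instead of drifting.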

Conserved quantity using Poisson brackets

Suppose energy is conserved, i.e. $\frac{\partial H}{\partial t} = 0$, and consider a differentiable function $\Phi(x, p)$:

\begin{equation*}
\begin{split}
  \frac{d \Phi}{dt} &= \sum_{i=1}^{n} \bigg( \frac{\partial \Phi}{\partial x^i} \frac{dx^i}{dt} + \frac{\partial \Phi}{\partial p_i} \frac{d p_i}{dt} \bigg) \\
  &= \sum_{i=1}^{n} \bigg( \frac{\partial \Phi}{\partial x^i} \frac{\partial H}{\partial p_i} - \frac{\partial \Phi}{\partial p_i} \frac{\partial H}{\partial x^i} \bigg) \\
  &= \pb{\Phi}{H}
\end{split}
\end{equation*}

where we have introduced the Poisson bracket

\begin{equation*}
\pb{f}{g} = \sum_{i=1}^{n} \bigg( \frac{\partial f}{\partial x^i} \frac{\partial g}{\partial p_i} - \frac{\partial f}{\partial p_i} \frac{\partial g}{\partial x^i} \bigg)
\end{equation*}

for any two differentiable functions $f, g: \mathbb{R}^{2n} \to \mathbb{R}$ of $(x, p)$.

Hence

\begin{equation*}
\frac{d \Phi}{dt} = 0 \iff \pb{\Phi}{H} = 0
\end{equation*}

If $\Phi$ depends explicitly on $t$, then the same calculation as above shows that

\begin{equation*}
\frac{d\Phi}{dt} = \frac{\partial \Phi}{\partial t} + \pb{\Phi}{H}
\end{equation*}

In this case $\Phi$ could still be conserved if

\begin{equation*}
\frac{\partial \Phi}{\partial t} = - \pb{\Phi}{H}
\end{equation*}

Suppose we are given two conserved quantities $\Phi_1$ and $\Phi_2$, i.e. quantities which Poisson-commute with the Hamiltonian $H$.

Then we can generate new conserved quantities from old using the Jacobi identity:

\begin{equation*}
\pb{\pb{\Phi_1}{\Phi_2}}{H} = \pb{\Phi_1}{\pb{\Phi_2}{H}} - \pb{\Phi_2}{\pb{\Phi_1}{H}}
\end{equation*}

Therefore $\pb{\Phi_1}{\Phi_2}$ is a conserved quantity.
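The Poisson bracket itself is straightforward to approximate with central differences; the sketch below (with an assumed central-potential Hamiltonian in the plane) checks that the angular momentum $L_z = x p_y - y p_x$ Poisson-commutes with $H$:

```python
def poisson_bracket(f, g, x, p, h=1e-5):
    # {f, g} = sum_i (df/dx^i dg/dp_i - df/dp_i dg/dx^i), via central differences
    def d_x(fn, i):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        return (fn(xp, p) - fn(xm, p)) / (2 * h)

    def d_p(fn, i):
        pp, pm = list(p), list(p)
        pp[i] += h
        pm[i] -= h
        return (fn(x, pp) - fn(x, pm)) / (2 * h)

    return sum(d_x(f, i) * d_p(g, i) - d_p(f, i) * d_x(g, i) for i in range(len(x)))

# assumed example: central-potential Hamiltonian in the plane, angular momentum
H = lambda x, p: 0.5 * (p[0] ** 2 + p[1] ** 2) + (x[0] ** 2 + x[1] ** 2)
Lz = lambda x, p: x[0] * p[1] - x[1] * p[0]

x0, p0 = [0.7, -0.2], [0.1, 0.4]
assert abs(poisson_bracket(Lz, H, x0, p0)) < 1e-8   # {L_z, H} = 0: L_z is conserved
```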

Can we associate a symmetry to a conserved charge?

Consider a conserved quantity $\Phi(x, p)$ which satisfies

\begin{equation*}
\pb{\Phi}{H} = 0
\end{equation*}

Then

\begin{equation*}
\pb{x^i}{\Phi} = \frac{\partial \Phi}{\partial p_i} \quad \text{and} \quad \pb{p_i}{\Phi} = - \frac{\partial \Phi}{\partial x^i}
\end{equation*}

defines a vector field on the phase space $\mathbb{R}^{2n}$ which we may integrate to find through every point a solution to the differential equation

\begin{equation*}
\frac{d x^i}{ds} = \pb{x^i}{\Phi} = \frac{\partial \Phi}{\partial p_i} \quad \text{and} \quad \frac{d p_i}{ds} = \pb{p_i}{\Phi} = - \frac{\partial \Phi}{\partial x^i}
\end{equation*}

Then existence and uniqueness of solutions to IVPs gives us, on some open interval $(- \varepsilon, \varepsilon)$, unique solutions $x^i(s)$ and $p_i(s)$ for given initial values $x(0)$ and $p(0)$.

Therefore,

\begin{equation*}
\begin{split}
  \frac{d}{ds} H \Big( x(s), p(s) \Big) &= \sum_{i=1}^{n} \bigg( \frac{\partial H}{\partial x^i} \frac{d x^i}{ds} + \frac{\partial H}{\partial p_i} \frac{d p_i}{ds} \bigg) \\
  &= \sum_{i=1}^{n} \bigg( \frac{\partial H}{\partial x^i} \frac{\partial \Phi}{\partial p_i} + \frac{\partial H}{\partial p_i} \Big( - \frac{\partial \Phi}{\partial x^i} \Big) \bigg) \\
  &= \pb{H}{\Phi} \\
  &= - \pb{\Phi}{H} \\
  &= 0
\end{split}
\end{equation*}

Thus we have a continuous symmetry of Hamilton's equations, which takes solutions to solutions.

That is, these solutions $\big( x(s), p(s) \big)$ which are in some sense generated from $\Phi$ and $H$ contain symmetries!

E.g. the solution to the system of ODEs above could be a linear combination of $\cos(\theta)$ and $\sin(\theta)$, in which case we understand that the "symmetry" generated by $\Phi$ is rotational symmetry.

The caveat is that this may not actually extend to a one-parameter family of diffeomorphisms, as we used earlier in Noether's theorem as the invariant functions or symmetries.

Example:

  • Lagrangian:

    \begin{equation*}
L[x^{\mu}, \dot{x}^{\mu}] = \sqrt{- \eta_{\mu \nu} \dot{x}^{\mu} \dot{x}^{\nu}}
\end{equation*}
  • There exists a symmetry under Lorentz transformations:

    \begin{equation*}
\begin{split}
  \delta x^{\mu} &= \tensor{\Lambda}{^\mu_\nu} x^{\nu} = \tilde{x}^{\mu} \\
  \delta \dot{x}^{\mu} &= \tensor{\Lambda}{^\mu_\nu} \dot{x}^{\nu} = \dot{\tilde{x}}^{\mu}
\end{split}
\end{equation*}

TODO Integrability

A set of functions which Poisson-commute among themselves are said to be in involution.

Liouville's theorem says that if a Hamiltonian $H(x, p)$ on $\mathbb{R}^{2n}$ admits $n$ independent conserved quantities in involution, then there is a canonical transformation to so-called action-angle variables $\big( X, P \big)$ such that Hamilton's equations imply that

\begin{equation*}
\dot{P} = 0, \quad X(t) = X(0) + t P(0)
\end{equation*}

Such a system is said to be integrable.

Constrained problems

Isoperimetric problems

Notation

  • Typically we talk about the constrained optimization problem:

    \begin{equation*}
\begin{split}
  J[y] &= \int_{0}^{1} L(y, y', x) \dd{x} \\
  \text{subject to} \quad I[y] &= \int_{0}^{1} K(y, y', x) \dd{x} = 0 \\
  & y(0) = y_0,  \quad  y(1) = y_1
\end{split}
\end{equation*}

    for $y: [0, 1] \to \mathbb{R}^n$.

Setup

  • Extremise a functional subject to a functional constraint

Consider a closed loop $C$ enclosing some area $D$.

Dido's problem, or the (original) isoperimetric problem, is the problem of maximising the area of $D$ while keeping the length of $C$ constant.

Consider a closed curve $t \mapsto \big( x(t), y(t) \big)$ for $t \in [0, 1]$ with

\begin{equation*}
\big( x(0), y(0) \big) = \big( x(1), y(1) \big)
\end{equation*}

Using Green's theorem, we have

\begin{equation*}
\text{Area}(D) = \int_{D}^{} \dd{A} = \int_{C}^{} \frac{1}{2} \big( x \dd{y} - y \dd{x} \big) = \int_{0}^{1} \frac{1}{2} \big( x \dot{y} - y \dot{x} \big) \dd{t}
\end{equation*}

and length

\begin{equation*}
\text{Length}(C) = \int_{0}^{1} \sqrt{\dot{x}^2 + \dot{y}^2} \dd{t} = \ell
\end{equation*}

The problem is then to extremise the area $\int_{0}^{1} \frac{1}{2} \big( x \dot{y} - y \dot{x} \big) \dd{t}$ subject to the constraint $\text{Length}(C) = \ell$.
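Both functionals are easy to evaluate numerically; the sketch below checks them on a circle of radius $r$ (the known maximiser, stated here as an assumption for illustration), for which $\text{Area} = \text{Length}^2 / 4\pi$:

```python
import math

def area_and_length(curve, dcurve, n=20000):
    # Area = int (1/2)(x y' - y x') dt and Length = int sqrt(x'^2 + y'^2) dt, t in [0, 1]
    dt = 1.0 / n
    area = length = 0.0
    for k in range(n):
        t = (k + 0.5) * dt
        x, y = curve(t)
        xd, yd = dcurve(t)
        area += 0.5 * (x * yd - y * xd) * dt
        length += math.hypot(xd, yd) * dt
    return area, length

r = 2.0
circle = lambda t: (r * math.cos(2 * math.pi * t), r * math.sin(2 * math.pi * t))
dcircle = lambda t: (-2 * math.pi * r * math.sin(2 * math.pi * t),
                     2 * math.pi * r * math.cos(2 * math.pi * t))

area, length = area_and_length(circle, dcircle)
assert abs(area - math.pi * r ** 2) < 1e-6          # Green's theorem gives pi r^2
assert abs(area - length ** 2 / (4 * math.pi)) < 1e-4  # isoperimetric equality
```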

More generally, consider $C^1$ functions

\begin{equation*}
\begin{split}
  y: \quad & [0, 1] \to \mathbb{R}^n \\
  & x \mapsto y(x)
\end{split}
\end{equation*}

with fixed endpoints, i.e. $y(0) = y_0$ and $y(1) = y_1$ for some $y_0, y_1 \in \mathbb{R}^n$.

Given the functionals

\begin{equation*}
\begin{split}
  I[y] &= \int_{0}^{1} L(y, y', x) \dd{x} \\
  J[y] &= \int_{0}^{1} K(y, y', x) \dd{x}
\end{split}
\end{equation*}

we want to extremise $I$ subject to $J = 0$.

Let $f,g : \mathbb{R}^n \to \mathbb{R}$ and let $x_0 \in \mathbb{R}^n$ be an extremum of $f$ subject to $g = 0$.

If $\nabla g \big|_{x_0} \ne 0$ (i.e. $x_0$ is not a critical point of $g$), then $\exists \lambda_0 \in \mathbb{R}$ (called a Lagrange multiplier) s.t. $\big( x_0, \lambda_0 \big) \in \mathbb{R}^n \times \mathbb{R}$ is a critical point of the function $F: \mathbb{R}^n \times \mathbb{R} \to \mathbb{R}$ defined by

\begin{equation*}
F(x, \lambda) = f(x) - \lambda g(x)
\end{equation*}

Suppose $y$ is an extremal of $I$ in the space of $C^1$ curves with $J[y] = 0$. Then

\begin{equation*}
\frac{d}{ds}\bigg|_{s = 0} I[Y_s] = 0 \quad \text{subject to} \quad J[Y_s] = 0
\end{equation*}

for small $s$, where $Y_s(x) = y(x) + s \varepsilon(x)$.

This might constrain

\begin{equation*}
\varepsilon = \frac{\partial Y_s}{\partial s}\bigg|_{s = 0}
\end{equation*}

and prevent use of the Fundamental Lemma of Calculus of Variations.

Idea:

  • Consider $Y_{r, s}(0) = y_0$ and $Y_{r, s}(1) = y_1$ for all $r, s$ near $0$, where $Y_{r, s} = y + s \varepsilon + r \eta$.
  • Then express one of the variations as the other, allowing us to eliminate one.

Let $f,g : \mathbb{R}^2 \to \mathbb{R}$ be defined

\begin{equation*}
\begin{split}
  f(r, s) &:= I[Y_{r,s}] \\
  g(r, s) &:= J[Y_{r, s}]
\end{split}
\end{equation*}

Assume now that $\nabla g \big|_{(0, 0)} \ne 0$ and $g(0, 0) = 0$.

If we assume (wlog) that $\frac{\partial g}{\partial r}\big|_{(0, 0)} \ne 0$, then the IFT gives a function $r(s)$ with $r(0) = 0$ such that

\begin{equation*}
g \big( r(s), s \big) = 0
\end{equation*}

for "small" $s$. Therefore,

\begin{equation*}
\frac{\partial Y_{r(s), s}}{\partial s}\bigg|_{s = 0}
\end{equation*}

is a variation which preserves the constraint.

Let

\begin{equation*}
\begin{split}
  J[y] &= \int_{0}^{1} L(y, y', x) \dd{x} \\
  I[y] &= \int_{0}^{1} K(y, y', x) \dd{x}
\end{split}
\end{equation*}

be functionals of functions $y: [0, 1] \to \mathbb{R}$ subject to BCs

\begin{equation*}
y(0) = y_0 \quad \text{and} \quad y(1) = y_1
\end{equation*}

Suppose that $y(x)$ is an extremal of $J$ subject to the isoperimetric constraint $I[y] = 0$. Then if $y$ is not an extremal of $I[y]$, there is a constant $\lambda \in \mathbb{R}$ so that $y$ is an extremal of $J[y] - \lambda I [y]$. That is,

\begin{equation*}
\exists \lambda \in \mathbb{R}: \quad \frac{d}{ds} \bigg|_{s = 0} \Big( J[y + s \varepsilon] - \lambda I [y + s \varepsilon] \Big) = 0
\end{equation*}

for all endpoint-fixed variations $\varepsilon$.

Method of Lagrange multipliers for functionals

Suppose that we wish to extremise a functional

\begin{equation*}
\begin{split}
  J[y] &= \int_{0}^{1} L(y, y', x) \dd{x} \\
  \text{subject to} \quad I[y] &= \int_{0}^{1} K(y, y', x) \dd{x} = 0 \\
  & y(0) = y_0,  \quad  y(1) = y_1
\end{split}
\end{equation*}

for $y: [0, 1] \to \mathbb{R}^n$.

Then the method consists of the following steps:

  1. Check that no extremal of $I$ satisfies the constraint $I[y] = 0$ (so that the multiplier theorem applies).
  2. Solve EL-equation for

    \begin{equation*}
M(y, y', x, \lambda) = L(y, y', x) - \lambda K(y, y', x)
\end{equation*}

    which is a second-order ODE for $y(x)$.

  3. Fix constants of integration from the BCs
  4. Fix the value of $\lambda$ using $I[y] = 0$.

Classical isoperimetric problem

The catenary

Notation
  • $H(s)$ denotes height as a function of arclength $s$
Catenary
  • Uniform chain of length $\ell$ hangs under its own weight from two poles of height $h$ a distance $2 \ell_0 < \ell$ apart
  • Potential energy is given by $\int_{0}^{\ell} H(s) \dd{s}$, which we can parametrize

    \begin{equation*}
y(x) = H \Big( s(x) \Big), \quad x \in [- \ell_0, \ell_0]
\end{equation*}

    giving us

    \begin{equation*}
\begin{split}
  & \int_{- \ell_0}^{\ell_0} y(x) \sqrt{1 + y'(x)^2} \dd{x} \\
  \text{subject to} \quad & \int_{- \ell_0}^{\ell_0} \sqrt{1 + y'(x)^2} \dd{x} = \ell, \\
  & y(- \ell_0) = y(\ell_0) = h
\end{split}
\end{equation*}
  • Observation:
    • All extremals of arclength are straight lines
    • The extremal of the constrained problem must have non-zero gradient of the constraint, i.e. it cannot be an extremal of the arclength functional
    • → straight lines are not the solutions
  • Consider lagrangian

    \begin{equation*}
L(y, y', \lambda) = \big( y - \lambda \big) \sqrt{1 + \big( y' \big)^2}
\end{equation*}
  • Using Beltrami's identity:

    \begin{equation*}
\frac{\partial L}{\partial y'} y' - L = \text{constant} \implies \frac{y - \lambda}{\sqrt{1 + (y')^2}} = c
\end{equation*}

    rewritten to

    \begin{equation*}
(y')^2 = \bigg( \frac{y - \lambda}{c} \bigg)^2 - 1
\end{equation*}

    which, given the BCs $y(- \ell_0) = y(\ell_0) = h$ we get the solution

    \begin{equation*}
y(x) = c \cosh \bigg( \frac{x}{c} \bigg) - c \cosh \bigg( \frac{\ell_0}{c} \bigg) + h
\end{equation*}

    which follows from taking the derivative of both sides and then solving.

  • Impose the isoperimetric condition:

    \begin{equation*}
\begin{split}
  \int_{- \ell_0}^{\ell_0} \sqrt{1 + (y')^2} \dd{x} &= \int_{- \ell_0}^{\ell_0} \sqrt{1 + \sinh(x / c)^2} \dd{x} \\
  &= \int_{- \ell_0}^{\ell_0} \cosh (x / c) \dd{x} \\
  \therefore \quad \ell &= 2c \sinh(\ell_0 / c)
\end{split}
\end{equation*}
  • Introducing $\zeta = \frac{1}{2c}$, we find the following transcendental equation

    \begin{equation*}
\ell \zeta = \sinh(2 \ell_0 \zeta)
\end{equation*}

    for which

    • $\zeta = 0$ is the trivial solution
    • For small $\zeta > 0$, the condition $\ell > 2 \ell_0$ ensures that $\ell \zeta > \sinh(2 \ell_0 \zeta)$
      • But for $\zeta \gg 1$ the exponential term dominates

        \begin{equation*}
\implies \ell \zeta < \sinh(2 \ell_0 \zeta) \implies \exists \zeta_0 > 0 : \ell \zeta_0 = \sinh(2 \ell_0 \zeta_0)
\end{equation*}

        by continuity.
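The transcendental equation has no closed form, but the nontrivial root $\zeta_0$ can be bracketed and bisected numerically. A minimal sketch (the values of $\ell$ and $\ell_0$ are illustrative, chosen so that $\ell > 2 \ell_0$):

```python
import math

# Illustrative values with l > 2*l0, as the problem requires (my own choices)
l, l0 = 3.0, 1.0

def f(z):
    return l * z - math.sinh(2 * l0 * z)  # positive for small z > 0

# Bracket the nontrivial root: f > 0 for small z, f < 0 once the sinh dominates
a, b = 1e-6, 1.0
while f(b) > 0:
    b *= 2

for _ in range(100):          # plain bisection
    m = 0.5 * (a + b)
    if f(m) > 0:
        a = m
    else:
        b = m

zeta = 0.5 * (a + b)
c = 1 / (2 * zeta)            # recover c from zeta = 1/(2c)
assert abs(l * zeta - math.sinh(2 * l0 * zeta)) < 1e-9
assert abs(2 * c * math.sinh(l0 / c) - l) < 1e-6  # isoperimetric condition
```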

Holonomic and nonholonomic constraints

We say a constraint is scleronomic if the constraint does not depend explicitly on $t$, and rheonomic if it does.

We say a constraint is holonomic if it does not depend explicitly on $v$, and nonholonomic if it does.

In the case where nonholonomic constraints are at most linear in $\dot{x}$, we say that the constraints are Pfaffian constraints.

Typical use cases

  • Finding geodesics on a surface defined as the zero locus of a function, which are scleronomic and holonomic.
  • Reducing higher-order Lagrangians to first-order Lagrangians
  • Mechanical problems: e.g. "rolling without sliding", which are typically nonholonomic

Holonomic constraints

Let $x: [0, 1] \to \mathbb{R}^n$ be a smooth extremal for

\begin{equation*}
\begin{split}
  J[x] &= \int_{0}^{1} L(x, \dot{x}, t) \dd{t} \\
  \text{subject to} \quad g(x, t) &= 0
\end{split}
\end{equation*}

with $\nabla g \big( x(t), t \big) \ne 0$ for all $t \in [0, 1]$.

Then there exists $\lambda: [0, 1] \to \mathbb{R}$ such that $x$ obeys the EL equations of

\begin{equation*}
K(x, \dot{x}, t, \lambda) = L(x, \dot{x}, t) - \lambda(t) g(x, t)
\end{equation*}

$\nabla g$ is the gradient wrt. $x^i$ for all $i$, NOT $t$.

Let $y: [0, 1] \times (- \delta, \delta) \longrightarrow \mathbb{R}^n$ be an admissible variation with $y(t, 0) = x(t)$.

Then the constraint $g$ requires

\begin{equation*}
g \Big( y(t, s), t \Big) = 0, \quad \forall t, s
\end{equation*}

Let

\begin{equation*}
\varepsilon(t) := \frac{\partial }{\partial s}\bigg|_{s = 0} y(t, s), \quad \varepsilon(0) = \varepsilon(1) = 0
\end{equation*}

Then the constraint above implies that

\begin{equation*}
\frac{d}{ds} \bigg|_{s = 0} g \Big( y(t, s), t \Big) = \nabla g \cdot \varepsilon = 0
\end{equation*}

i.e. $\varepsilon$ is tangent to the implicitly defined surface $g(x) = 0$.

Consider $n = 2$, then the above gives us

\begin{equation*}
\frac{\partial g}{\partial x^1} \varepsilon^1 + \frac{\partial g}{\partial x^2} \varepsilon^2 = 0
\end{equation*}

We therefore suppose that

\begin{equation*}
\frac{\partial g}{\partial x^2}\bigg|_{\big( x(t), t \big)} \ne 0, \quad \forall t
\end{equation*}

The implicit function theorem then implies that we can solve $g(x, t) = 0$ for one component. Then the above gives us

\begin{equation*}
\varepsilon^2 = - \frac{\frac{\partial g}{\partial x^1}}{\frac{\partial g}{\partial x^2}} \varepsilon^1
\end{equation*}

i.e. $\varepsilon^1$ is arbitrary and $\varepsilon^2$ is fully determined by $\varepsilon^1$ in this "small" neighborhood $\big( - \delta, \delta \big)$.

Then, defining $\lambda(t) := \Big( \pdv{L}{x^2} - \dv{t} \pdv{L}{\dot{x}^2} \Big) \Big/ \pdv{g}{x^2}$, we compute

\begin{equation*}
\begin{split}
  \dv{s}\bigg|_{s = 0} I[y] &= \int_{0}^{1} \bigg( \frac{\partial L}{\partial x^i} \varepsilon^i + \frac{\partial L}{\partial \dot{x}^i} \dot{\varepsilon}^i \bigg) \dd{t} \\
  &= \int_{0}^{1} \bigg( \frac{\partial L}{\partial x^i} - \dv{t} \frac{\partial L}{\partial \dot{x}^i} \bigg) \varepsilon^i \dd{t} \\
  &= \int_{0}^{1} \bigg[ \bigg( \frac{\partial L}{\partial x^1} - \dv{t} \frac{\partial L}{\partial \dot{x}^1} \bigg) \varepsilon^1 + \lambda(t) \underbrace{\frac{\partial g}{\partial x^2} \varepsilon^2}_{- \frac{\partial g}{\partial x^1} \varepsilon^1} \bigg] \dd{t} \\
  &= \int_{0}^{1} \bigg( \frac{\partial L}{\partial x^1} - \dv{t} \frac{\partial L}{\partial \dot{x}^1} - \lambda \frac{\partial g}{\partial x^1} \bigg) \varepsilon^1 \dd{t}
\end{split}
\end{equation*}

Since $\varepsilon^1$ is arbitrary, the FTCV implies that

\begin{equation*}
\dv{t} \frac{\partial L}{\partial \dot{x}^i} - \frac{\partial L}{\partial x^i} = - \lambda \frac{\partial g}{\partial x^i}
\end{equation*}

which is just the E-L equation for a Lagrangian given by

\begin{equation*}
M = L - \lambda g
\end{equation*}

Above we used the implicit function theorem to prove the existence of such extremals, but one can actually prove this using a so-called "smooth partition of unity".

In that case we basically do the above on a bunch of different neighborhoods, and then sum them together to give us (apparently) the same answer!

Nonholonomic constraints

Examples

(Non-holonomic) Higher order Lagrangians

$L(x, \dot{x}, \ddot{x}, t)$ can be replaced by $L(x, \dot{x}, \dot{y}, t)$ together with the constraint $y = \dot{x}$:

\begin{equation*}
\begin{split}
  & L(x, \dot{x}, \dot{y}, t) \\
  \text{subject to} \quad & y = \dot{x}
\end{split}
\end{equation*}

which are scleronomic and nonholonomic.

So we consider the Lagrangian

\begin{equation*}
M(x, \dot{x}, \dot{y}, t, \lambda ) = L(x, \dot{x}, \dot{y}, t) - \lambda \big( y - \dot{x} \big)
\end{equation*}

Then

\begin{equation*}
\begin{split}
  \pdv{M}{x} &= \pdv{L}{x} \\
  \pdv{M}{\dot{x}} &= \pdv{L}{\dot{x}} + \lambda \\
  \pdv{M}{y} &= - \lambda \\
  \pdv{M}{\dot{y}} &= \pdv{L}{\dot{y}}
\end{split}
\end{equation*}

So the E-L equations give us

\begin{equation*}
\begin{split}
  \dv{}{t} \bigg( \pdv{M}{\dot{x}} \bigg) &= \pdv{M}{x} \\
  \dv{}{t} \bigg( \pdv{M}{\dot{y}} \bigg) &= \pdv{M}{y} \\
  \dv{}{t} \bigg( \pdv{L}{\dot{x}} + \lambda \bigg) &= \pdv{L}{x} \\
  \dv{}{t} \bigg( \pdv{L}{\dot{y}} \bigg) &= - \lambda
\end{split}
\end{equation*}

which gives us

\begin{equation*}
\begin{split}
  - \dv{}{t} \bigg( \pdv{L}{\dot{x}} \bigg) + \dv[2]{}{t} \bigg( \pdv{L}{\dot{y}} \bigg) + \pdv{L}{x} &= 0 \\
  \iff \quad \dv[2]{}{t} \bigg( \pdv{L}{\ddot{x}} \bigg) - \dv{}{t} \bigg( \pdv{L}{\dot{x}} \bigg) + \pdv{L}{x} &= 0
\end{split}
\end{equation*}

where we've used

\begin{equation*}
y = \dot{x}, \quad \dot{y} = \ddot{x}
\end{equation*}

Variational PDEs

Notation

  • $y: D \subset \mathbb{R}^2 \to \mathbb{R}^n$ be a $C^2$ function on the set $D$
  • Lagrangian over a surface $L: \big( \mathbb{R}^n \times \mathbb{R}^{2n} \times \mathbb{R}^2 \big) \to \mathbb{R}$ with corresponding action

    \begin{equation*}
S[y] = \int_D L \big( y, y_u, y_v, u, v \big) \dd{u} \dd{v}
\end{equation*}

    where $y_u$ and $y_v$ denote collectively the $2n$ partial derivatives $\pdv{y^i}{u}$ and $\pdv{y^i}{v}$ for $i = 1,\dots, n$

  • BCs are given by

    \begin{equation*}
y(x) = \phi(x), \quad \forall x \in \partial D
\end{equation*}

    where $\phi: \partial D \to \mathbb{R}^n$ is given

  • Variations are $C^1$ functions $\varepsilon: D \to \mathbb{R}^n$ such that

    \begin{equation*}
\varepsilon(x) = \mathbf{0}, \quad \forall x \in \partial D
\end{equation*}

Stuff

\begin{equation*}
S[y] = \int_D L \big( y, y_u, y_v, u, v \big) \dd{u} \dd{v}
\end{equation*}

Then

\begin{equation*}
\begin{split}
  \dv{}{s}\bigg|_{s = 0} S[y + s \varepsilon] &= \int_{D}^{} \dv{}{s}\bigg|_{s = 0} L \big( y + s \varepsilon, y_u + s \varepsilon_u, y_v + s \varepsilon_v, u, v \big) \dd{u} \dd{v} \\
  &= \int_D \bigg( \frac{\partial L}{\partial y^i} \varepsilon^i + \frac{\partial L}{\partial y_u^i} \varepsilon_u^i + \frac{\partial L}{\partial y_v^i} \varepsilon_v^i \bigg) \dd{u} \dd{v} \\
  &= \int_D \bigg( \pdv{L}{y^i} - \pdv{}{u} \pdv{L}{y_u^i} - \pdv{}{v} \pdv{L}{y_v^i} \bigg) \varepsilon^i \dd{u} \dd{v} \\
  & \quad + \int_D \bigg[ \pdv{}{u} \bigg( \pdv{L}{y_u^i} \varepsilon^i \bigg) + \pdv{}{v} \bigg( \pdv{L}{y_v^i} \varepsilon^i \bigg) \bigg] \dd{u} \dd{v}
\end{split}
\end{equation*}

where we've used integration by parts in the last equality.

The Divergence (or rather, Stokes') theorem allows us to rewrite the last integral as

\begin{equation*}
\int_{D}^{} \bigg[ \pdv{}{u} \bigg( \pdv{L}{y_u^i}  \varepsilon^i \bigg) + \pdv{}{v} \bigg( \pdv{L}{y_v^i} \varepsilon^i \bigg) \bigg] \dd{u} \dd{v} = \int_{\partial D}^{} \bigg(  \pdv{L}{y_u^i} N^{u} + \pdv{L}{y_v^i} N^{v} \bigg) \varepsilon^i \dd{\ell}
\end{equation*}

where $\ell$ is the arclength. And since $\varepsilon|_{\partial D} = 0$ this vanishes.

By the generalisation of the FLCV, the first integral term must vanish, and so we get

\begin{equation*}
\pdv{L}{y^i} = \pdv{}{u} \pdv{L}{y_u^i} + \pdv{}{v} \pdv{L}{y_v^i}
\end{equation*}

We can generalise this to more than just 2D!

Multidimensional Euler-Lagrange equations

Let

  • $D \subset \mathbb{R}^m$ be a bounded region with (piecewise) smooth boundary.
  • $x^{\mu} = \big( x^1, \dots, x^m \big)$ denote the coordinates for $D$
  • $L \big( y, \nabla y, x \big)$ be the Lagrangian for maps $y: D \to \mathbb{R}^n$ where $\nabla y$ denotes collectively the $mn$ partial derivatives

    \begin{equation*}
y_{\mu}^i := \pdv{y^i}{x^{\mu}}, \quad i = 1, \dots, n, \ \mu = 1, \dots, m
\end{equation*}

Then the general Euler-Lagrange equations are given by

\begin{equation*}
\pdv{L}{y^i} = \pdv{}{x^{\mu}} \pdv{L}{y_{\mu}^i}
\end{equation*}

using Einstein summation.

Notice that we are treating $\pdv{L}{y_{\mu}^i}$ as a function of the x's and differentiating wrt. $x^{\mu}$, keeping all other x's fixed!
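A toy instance of these equations (my own example, not from the notes): the Dirichlet Lagrangian $L = \frac{1}{2} \sum_{\mu} \big( \partial_{\mu} u \big)^2$ has E-L equation $\Delta u = 0$, so a known harmonic function should satisfy it. A finite-difference spot check in 2D:

```python
import math

# Assumed toy Lagrangian L = (1/2)|grad u|^2, whose multidimensional E-L
# equation dL/du = d/dx^mu (dL/du_mu) reduces to Laplace's equation, Delta u = 0.

def laplacian(u, x, y, h=1e-3):
    # standard 5-point stencil for the Laplacian
    return (u(x + h, y) + u(x - h, y) + u(x, y + h) + u(x, y - h) - 4 * u(x, y)) / h**2

harmonic = lambda x, y: math.exp(x) * math.cos(y)   # a classical harmonic function

for (x, y) in [(0.0, 0.0), (0.5, -0.2), (-1.0, 0.7)]:
    assert abs(laplacian(harmonic, x, y)) < 1e-4
```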

(This is really Stokes' theorem)

Let

  • $D \subset \mathbb{R}^m$ be bounded open set with (piecewise) smooth boundary $\partial D$
  • $X = \big( X^1, \dots, X^m \big)$ be a smooth vector field defined on $D \cup \partial D$
  • $N$ be the unit outward-pointing normal of $\partial D$

Then,

\begin{equation*}
\int_D \partial_{\mu} X^{\mu} \dd{V} = \int_{\partial D} X^{\mu} N_{\mu} \dd{A}
\end{equation*}

(using Einstein summation) where $\dd{V}$ is the volume element in $\mathbb{R}^m$ and $\dd{A}$ is the area element in $\partial D$ and $\left\langle \cdot, \cdot \right\rangle$ denotes the Euclidean inner product in $\mathbb{R}^m$.
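A numeric spot check of this identity on the unit disc in $\mathbb{R}^2$, with the arbitrarily chosen field $X = (x^3, y^3)$ (my own example), so that $\partial_{\mu} X^{\mu} = 3 (x^2 + y^2)$ and both sides evaluate to $3\pi/2$:

```python
import math

# Divergence theorem on the unit disc with X = (x^3, y^3), div X = 3(x^2 + y^2)

def volume_integral(n=10000):
    # integrate div X = 3 r^2 over the disc in polar coordinates (area element r dr dtheta)
    total, dr = 0.0, 1.0 / n
    for i in range(n):
        r = (i + 0.5) * dr
        total += 3 * r**2 * r * dr
    return 2 * math.pi * total

def flux_integral(n=1000):
    # integrate <X, N> over the unit circle; N = (cos t, sin t)
    total, dt = 0.0, 2 * math.pi / n
    for k in range(n):
        t = (k + 0.5) * dt
        nx, ny = math.cos(t), math.sin(t)
        total += (nx**3 * nx + ny**3 * ny) * dt
    return total

exact = 3 * math.pi / 2   # both sides evaluate to 3*pi/2
assert abs(volume_integral() - exact) < 1e-6
assert abs(flux_integral() - exact) < 1e-6
```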

Let

  • $D \subset \mathbb{R}^m$ be bounded open with (piecewise) smooth boundary $\partial D$
  • $f: D \to \mathbb{R}^n$ be a continuous function which obeys

    \begin{equation*}
\int_D \left\langle f(x), h(x) \right\rangle \dd[m]{x} = 0
\end{equation*}

    for all $C^{\infty}$ functions $h: D \to \mathbb{R}^n$ vanishing on $\partial D$.

Then $f \equiv 0$.

Noether's theorem for multidimensional Lagrangians

Notation

  • $D \subset \mathbb{R}^m$
  • Lagrangian

    \begin{equation*}
S[u] = \int_D L \big( u, \nabla u, x \big) \dd[m]{x}
\end{equation*}

    where

    \begin{equation*}
\begin{split}
  u: \quad & D \to \mathbb{R}^n \\
  & x \mapsto \Big( u^1(x), \dots, u^n(x) \Big)
\end{split}
\end{equation*}
  • Use the notation

    \begin{equation*}
\pdv{u^i}{x^{\mu}} =: u_{\mu}^i
\end{equation*}
  • Conserved now refers to "divergenceless", that is, $J$ is a conserved quantity if

    \begin{equation*}
\pdv{}{x^{\mu}} J^{\mu} = 0
\end{equation*}

    where we're using Einstein summation.

  • $\left\{ \varphi_s: \mathbb{R}^n \to \mathbb{R}^n \right\}$ for $s \in \mathbb{R}$ denotes a one-parameter group of diffeomorphisms
  • $z: D \times \mathbb{R} \to \mathbb{R}^n$ is defined

    \begin{equation*}
z(x, s) = \varphi_s \big( u(x) \big)
\end{equation*}

Stuff

  • We say it's a conserved "current" because

    \begin{equation*}
\pdv{}{x^{\mu}} J^{\mu} = 0 \implies \int_{\partial D} \left\langle N, J \right\rangle \dd[m - 1]{x} = \int_D \underbrace{\text{div} \ J}_{= 0} \dd[m]{x} = 0
\end{equation*}

    where $N$ denotes the normal to the boundary

Consider

\begin{equation*}
\begin{split}
  \varphi_s: \quad & D \to D \\
  & x \mapsto \bar{x}(s, x)
\end{split}
\end{equation*}

And let

\begin{equation*}
\begin{split}
  \zeta^{i} &:= \pdv{\bar{u}^{i}}{s} \bigg|_{s = 0} \\
  \chi^{\mu} &:= \pdv{\bar{x}^{\mu}}{s} \bigg|_{s = 0}
\end{split}
\end{equation*}

such that

\begin{equation*}
\begin{split}
  \bar{x}^{\mu} &= x^{\mu} + \chi^{\mu} s + \mathcal{O}(s^2) \\
  \bar{u}^{i}(x) &= u^{i}(x) + \zeta^{i} s + \mathcal{O}(s^2)
\end{split}
\end{equation*}

We suppose that the action is invariant, so that the Lagrangian obeys

\begin{equation*}
L \Big( \bar{u}, \bar{\nabla} \bar{u}, \bar{x} \Big) \det \bigg( \dv{\bar{x}}{x} \bigg) = L \Big( u, \nabla u, x \Big)
\end{equation*}

or equivalently, one can (usually more easily) check that the following is true

\begin{equation*}
\pdv{L}{u^i} \zeta^i + \chi^{\mu} \big( \partial_{\mu} L \big) + L \dv{\chi^{\mu}}{x^{\mu}} + \pdv{L}{\big( \partial_{\mu} u^i \big)} \bigg( \dv{\zeta^i}{x^{\mu}} - \big( \partial_{\nu} u^i \big) \dv{\chi^{\nu}}{x^{\mu}} \bigg) = 0
\end{equation*}

The Noether current is then given by

\begin{equation*}
J^{\mu} = L \chi^{\mu} + \sum_{i}^{} \pdv{L}{\big( \partial_{\mu} u^i \big)} \bigg( \zeta^i - \sum_{\nu}^{} \big( \partial_{\nu} u^i \big) \chi^{\nu} \bigg)
\end{equation*}
Differentiating the invariance condition with respect to $s$ at $s = 0$ gives

\begin{equation*}
\dv{}{s}\bigg\rvert_{s = 0} L \Big( \bar{u}, \bar{\nabla} \bar{u}, \bar{x} \Big) \det \bigg( \dv{\bar{x}}{x} \bigg) = \underbrace{\dv{}{s}\bigg\rvert_{s = 0} L \Big( u, \nabla u, x \Big)}_{ = 0}
\end{equation*}

since the RHS is independent of $s$. The LHS on the other hand is given by

\begin{equation*}
\bigg( \sum_{i}^{} \pdv{L}{\bar{u}^i} \pdv{\bar{u}^i}{s} + \sum_{i}^{} \sum_{\mu}^{} \pdv{L}{\bar{u}_{\mu}^i} \pdv{\bar{u}_{\mu}^i}{s} + \sum_{\mu} \pdv{L}{\bar{x}^{\mu}} \pdv{\bar{x}^{\mu}}{s} \bigg) (D \bar{x}) + L \pdv{(D \bar{x})}{s} = 0
\end{equation*}

where $D \bar{x} := \det \dv{\bar{x}}{x}$.

Now we observe the following:

\begin{equation*}
\pdv{\bar{u}^i}{u^j} \bigg\rvert_{s = 0} = \delta_{j}^i, \qquad \pdv{\bar{u}^i}{x^{\mu}} \bigg|_{s = 0} = 0, \qquad \pdv{\bar{x}^{\mu}}{x^{\nu}} \bigg|_{s = 0} = \delta_{\nu}^{\mu}, \qquad \pdv{\bar{x}^{\mu}}{u^i} \bigg|_{s = 0} = 0
\end{equation*}

and

\begin{alignat*}{2}
\pdv[2]{\bar{u}^i}{s}{u^j} \bigg|_{s = 0} &= \pdv{\zeta^i}{u^j} \qquad \pdv[2]{\bar{x}^{\mu}}{s}{x^{\nu}} \bigg|_{s = 0} &= \pdv{\chi^{\mu}}{x^{\nu}} \\
\pdv[2]{\bar{u}^i}{s}{x^{\mu}} \bigg|_{s = 0} &= \pdv{\zeta^i}{x^{\mu}} \qquad \pdv[2]{\bar{x}^{\mu}}{s}{u^i} \bigg|_{s = 0} &= \pdv{\chi^{\mu}}{u^{i}}
\end{alignat*}

where we've simply taken the derivatives of the Taylor expansions. Hence, we are left with

\begin{equation*}
\bigg( \sum_{i}^{} \pdv{L}{\bar{u}^i} \zeta^i + \sum_{i}^{} \sum_{\mu}^{} \pdv{L}{\bar{u}_{\mu}^i} \pdv{\bar{u}_{\mu}^i}{s} \bigg|_{s = 0} + \sum_{\mu} \pdv{L}{\bar{x}^{\mu}} \chi^{\mu} \bigg) (D \bar{x}) \bigg|_{s = 0} + L \pdv{(D \bar{x})}{s}\bigg|_{s=0} = 0
\end{equation*}

Now we need to evaluate $(D \bar{x})$ and its derivative wrt. $s$. First we notice that

\begin{equation*}
\dv{\bar{x}^{\mu}}{x^{\nu}} = \pdv{\bar{x}^{\mu}}{x^{\nu}} + \sum_{i}^{} \pdv{\bar{x}^{\mu}}{u^i} \pdv{u^i}{x^{\nu}} \implies \dv{\bar{x}^{\mu}}{x^{\nu}} \bigg|_{s = 0} = \delta_{\nu}^{\mu} \quad \implies \quad D \bar{x} \big|_{s = 0} = 1
\end{equation*}

We now compute $\pdv{(D \bar{x})}{s} \Big|_{s = 0}$. We first have

\begin{equation*}
\begin{split}
  \pdv{}{s} \dv{\bar{x}^{\mu}}{x^{\nu}} \bigg|_{s = 0} &= \pdv[2]{\bar{x}^{\mu}}{s}{x^{\nu}} \bigg|_{s = 0} + \sum_{i}^{} \pdv[2]{\bar{x}^{\mu}}{s}{u^i}\bigg|_{s= 0} \pdv{u^i}{x^{\nu}} \\
   &= \pdv{\chi^{\mu}}{x^{\nu}} + \sum_{i}^{} \pdv{\chi^{\mu}}{u^i} \pdv{u^i}{x^{\nu}} \\
   &= \dv{\chi^{\mu}}{x^{\nu}}
\end{split}
\end{equation*}

Finally, using the fact that for a matrix $M$ of the form

\begin{equation*}
M(s) = 1 + s A + \mathcal{O}(s^2) \quad \implies \quad \det(M) = 1 + s \tr A + \mathcal{O}(s^2)
\end{equation*}

we have

\begin{equation*}
\pdv{(D \bar{x})}{s} \bigg|_{s = 0} = \sum_{\mu}^{} \dv{\chi^{\mu}}{x^{\mu}}
\end{equation*}

Finally we need to compute $\pdv{\bar{u}_{\mu}^i}{s}$. The remaining steps follow the same pattern: find an expression for this missing piece, manipulate it, obtain a term which vanishes because the E-L equations are satisfied by the non-transformed $u$, and end up with the Noether current for the multi-dimensional case.

Examples

Minimal surface

Let $f: D \subset \mathbb{R}^2 \to \mathbb{R}$ be a twice differentiable function.

The graph $z = f(x, y)$ defines a surface $\Sigma \subset \mathbb{R}^3$. The area of this surface is the functional of $f$ given by

\begin{equation*}
S[f] = \int_D \sqrt{1 + f_x^2 + f_y^2} \dd{x} \dd{y}
\end{equation*}

If $f$ is an extremal of this functional, we say that $\Sigma$ is a minimal surface.

In this case the Lagrangian is

\begin{equation*}
L = \sqrt{1 + f_x^2 + f_y^2}
\end{equation*}

with the EL-equations

\begin{equation*}
\pdv{}{x} \pdv{L}{f_x} + \pdv{}{y} \pdv{L}{f_y} = 0
\end{equation*}

where

\begin{equation*}
\pdv{L}{f_x} = \frac{f_x}{\sqrt{1 + f_x^2 + f_y^2}}, \quad \pdv{L}{f_y} = \frac{f_y}{\sqrt{1 + f_x^2 + f_y^2}}
\end{equation*}

Therefore,

\begin{equation*}
\begin{split}
  \pdv{}{x} \pdv{L}{f_x} = \frac{f_{xx}}{\sqrt{1 + f_x^2 + f_y^2}} - \frac{f_x \big( f_x f_{xx} + f_y f_{xy} \big)}{\big( 1 + f_x^2 + f_y^2 \big)^{3 / 2}}
\end{split}
\end{equation*}

and similarly for $y$. When combined, and multiplied by $\big( 1 + f_x^2 + f_y^2 \big)^{3 / 2}$ (allowed since the combination equals zero anyway), we're left with

\begin{equation*}
\big( 1 + f_y^2 \big) f_{xx} + \big( 1 + f_x^2 \big) f_{yy} - 2 f_x f_y f_{xy} = 0
\end{equation*}

where we've used the fact that $f_{xy} = f_{yx}$.

This is then the equation which must be satisfied by a minimal surface.
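As a spot check (the choice of surface is my own, not from the notes): Scherk's surface $f(x, y) = \ln(\cos y) - \ln(\cos x)$ is a classical minimal surface, with exact partials $f_x = \tan x$, $f_{xx} = \sec^2 x$, $f_y = -\tan y$, $f_{yy} = -\sec^2 y$, $f_{xy} = 0$, and plugging these in makes the equation vanish identically:

```python
import math

# Scherk's surface f(x, y) = ln(cos y) - ln(cos x): plug its exact partial
# derivatives into the minimal-surface equation and check the residual is zero.

def residual(x, y):
    fx, fy = math.tan(x), -math.tan(y)
    fxx, fyy, fxy = 1 / math.cos(x) ** 2, -1 / math.cos(y) ** 2, 0.0
    return (1 + fy**2) * fxx + (1 + fx**2) * fyy - 2 * fx * fy * fxy

for (x, y) in [(0.1, 0.2), (0.7, -0.5), (-1.2, 1.0)]:
    assert abs(residual(x, y)) < 1e-10
```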

$O(2)$ model
\begin{equation*}
L = \frac{1}{2} \sum_{i=1}^{2} \bigg( \Big( \phi_x^i \Big)^2 + \Big( \phi_y^i \Big)^2 \bigg) = \frac{1}{2} \delta_{ij} \phi_{\mu}^i \phi_{\mu}^j
\end{equation*}

then

\begin{equation*}
\begin{pmatrix}
  \phi^1 \\ \phi^2
\end{pmatrix}
\mapsto
\begin{pmatrix}
  \cos(s) & - \sin (s) \\ \sin(s) & \cos(s)
\end{pmatrix}
\begin{pmatrix}
  \phi^1 \\ \phi^2
\end{pmatrix}
= 
\begin{pmatrix}
  \overline{\phi}^1 \\ \overline{\phi}^2
\end{pmatrix}
\end{equation*}

Then

\begin{equation*}
\begin{pmatrix}
  \zeta^1 \\ \zeta^2
\end{pmatrix}
= \pdv{}{s}\bigg|_{s = 0}
\begin{pmatrix}
  \overline{\phi}^1 \\ \overline{\phi}^2
\end{pmatrix}
=
\begin{pmatrix}
  0 & -1 \\ 1 & 0
\end{pmatrix}
\begin{pmatrix}
  \phi^1 \\ \phi^2
\end{pmatrix}
= 
\begin{pmatrix}
  -\phi^2 \\ \phi^1
\end{pmatrix}
\end{equation*}

Then the Noether's current $J$ is

\begin{equation*}
J^{\mu} := \pdv{L}{\phi_{\mu}^i} \zeta^i
\end{equation*}

which explicitly in this case is

\begin{equation*}
\begin{split}
  J^x &= \pdv{L}{\phi_x^1} \zeta^1 + \pdv{L}{\phi_x^2} \zeta^2  \\
  &= - \phi^2 \phi_x^1 + \phi^1 \phi_x^2 \\
  J^y &= - \phi^2 \phi_y^1 + \phi^1 \phi_y^2
\end{split}
\end{equation*}

Then

\begin{equation*}
\begin{split}
  \text{div} \ J &= \bigg( J^x \bigg)_x + \bigg( J^y \bigg)_y \\
  &= - \phi^2 \phi_{xx}^1 + \phi^1 \phi_{xx}^2 - \phi^2 \phi_{yy}^1 + \phi^1 \phi_{yy}^2 \\
  &= - \phi^2 \underbrace{\Delta \phi^1}_{0 \text{ by E-L}} + \phi^1 \underbrace{\Delta \phi^2}_{0 \text{ by E-L}} \\
  &= 0
\end{split}
\end{equation*}

Noether's current for multidimensional Lagrangian

Classical Field Theory

Notation

  • $u: \mathbb{R}^{m + 1} \to \mathbb{R}^n$ denotes a field written $u \big( t, x^1, \dots, x^m \big)$
  • $x^0 := t$
  • Concerned with action functionals of the form

    \begin{equation*}
S[u] = \int_{\Omega} \mathcal{L} \big( u, \nabla u, x \big) \dd[m + 1]{x}
\end{equation*}

    where $\mathcal{L}$ is called a Lagrangian density and $\Omega = [0, 1] \times \mathbb{R}^m$, i.e.

    \begin{equation*}
\Omega := \left\{ \big( x^0, x^1, \dots, x^m  \big) \in \mathbb{R}^{m + 1} \mid x^0 \in [0, 1] \right\}
\end{equation*}
  • $\varepsilon \big( t, x^1, \dots, x^m \big) = 0$ for $\big( x^1 \big)^2 + \dots + \big( x^m \big)^2 > R^2$ for some $R$ and

    \begin{equation*}
\varepsilon \big( 0, x^1, \dots, x^m \big) = \varepsilon \big( 1, x^1, \dots, x^m \big) = 0, \quad \forall x^1, \dots, x^m
\end{equation*}
  • $C_R$ denotes a "cylindrical" region

    \begin{equation*}
C_R = [a, b] \times \overline{B}_R \subset \Omega
\end{equation*}
  • $N$ is the outward normal to the boundary $C_R$

Stuff

Lagrangian density $\mathcal{L}$ is just used to refer to the fact that we are now looking at a variation of the form

\begin{equation*}
S[u] = \int \mathcal{L} \dd[m + 1]{x} = \int \underbrace{\bigg( \int \mathcal{L} \dd[m]{x} \bigg)}_{= L} \dd{t}
\end{equation*}

So it's as if $\int \mathcal{L} \dd[m]{x}$ is the Lagrangian now, and the "inner" integrand $\mathcal{L}$ is a Lagrangian density.

Klein-Gordon equation in $\mathbb{R}^4$ (i.e. $m = 3$) is given by

\begin{equation*}
\pdv[2]{u}{\big( x^0 \big)} - \sum_{i=1}^{3} \pdv[2]{u}{\big( x^i \big)} + M^2 u = 0
\end{equation*}

where $M > 0$ is called the mass.

If $M = 0$, then this is the wave equation, whence the Klein-Gordon equation is a sort of massive wave equation.

More succinctly, introducing the matrix

\begin{equation*}
\tensor{\eta}{^{\mu\nu}} = 
\begin{pmatrix}
  -1 & 0 \\ 0 & \mathbb{1}_3
\end{pmatrix}
\end{equation*}

then the Klein-Gordon equation can be written

\begin{equation*}
\eta^{\mu \nu} \partial_{\mu} \partial_{\nu} u - M^2 u = 0
\end{equation*}

Note: you sometimes might see this written

\begin{equation*}
\partial_{\mu} \partial^{\mu} u - M^2 u = 0
\end{equation*}

where they use the notation $\partial^{\mu} = \eta^{\mu \nu} \partial_{\nu}$ so we have sort of "summed out" the $\nu$.
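A classical sanity check (plane wave, my own example): $u = \cos( k \cdot x - \omega t )$ solves the Klein-Gordon equation exactly when the dispersion relation $\omega^2 = \lVert k \rVert^2 + M^2$ holds, which we can confirm with finite differences:

```python
import math

# Plane wave u(t, x) = cos(k.x - omega t) with omega^2 = |k|^2 + M^2
# (the numeric values of M and k are arbitrary illustrative choices).

M = 1.3
k = (0.4, -0.7, 1.1)
omega = math.sqrt(sum(ki * ki for ki in k) + M * M)

def u(t, x):
    return math.cos(sum(ki * xi for ki, xi in zip(k, x)) - omega * t)

def kg_residual(t, x, h=1e-3):
    def dd(idx):  # second derivative in coordinate idx (0 = t, 1..3 = space)
        p = [t, *x]
        def at(delta):
            q = p.copy(); q[idx] += delta
            return u(q[0], tuple(q[1:]))
        return (at(h) - 2 * at(0.0) + at(-h)) / h**2
    # d^2u/dt^2 - sum_i d^2u/dx_i^2 + M^2 u
    return dd(0) - sum(dd(i) for i in (1, 2, 3)) + M**2 * u(t, x)

for (t, x) in [(0.0, (0.1, 0.2, 0.3)), (1.5, (-0.5, 0.0, 2.0))]:
    assert abs(kg_residual(t, x)) < 1e-4
```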

Calculus of variations with improper integrals

Noether's Theorem for improper actions

  • Consider action function for a classical field $u: \Omega \to \mathbb{R}^n$ which is invariant under continuous one-parameter symmetry with Noether current $J^{\mu}$
  • Integrate (zero) divergence of the current on a "cylindrical region" $C_R = [a, b] \times \overline{B}_R \subseteq \Omega$ and apply the Divergence Theorem

    \begin{equation*}
0 = \int_{C_R} \partial_{\mu} J^{\mu} \dd[m + 1]{x} = \int_{\partial C_R} N_{\mu} J^{\mu} \dd[m]{x}
\end{equation*}
  • $\partial C_R$ consists of
    • the "sides" $[a, b] \times S_R^{m - 1}$ of the "cylinder", where $S_R^{m - 1}$ is the $(m - 1)$-sphere of radius $R$
    • the top cap $\left\{ b \right\} \times \overline{B}_R$
    • the bottom cap $\left\{ a \right\} \times \overline{B}_R$
  • Can rewrite the above as

    \begin{equation*}
\begin{split}
  0 &= \int_{\overline{B}_R} J^0 \big( b, x^1, \dots, x^m \big) \dd[m]{x} - \int_{\overline{B}_R} J^0 \big( a, x^1, \dots, x^m \big) \dd[m]{x} \\
  & \qquad + \underbrace{\int_{a}^{b} \int_{S_R^{m - 1}} N_{\mu} J^{\mu} \big( x^0, x^1, \dots, x^m \big) \dd[m]{x}}_{\to 0 \text{ as } R \to \infty}
\end{split}
\end{equation*}

    using the fact that the outward normal $N$ at the bottom cap points along the negative $x^0$ axis

    • The last term vanishes because the BCs on the field imply $J^{\mu} \to 0$ on the sides as $R \to \infty$
  • Since $a, b$ arbitrary, we have

    \begin{equation*}
Q = \lim_{R \to \infty} \int_{\overline{B}_R} J^0 \big( t, x^1, \dots, x^m \big) \dd[m]{x}
\end{equation*}

    is conserved, i.e. we have a Noether's charge for the improper case!

Maxwell equations

Notation

  • $\mathbf{B}$ is the magnetic field
  • $\mathbf{E}$ is the electric field
  • $\rho$ is the electric charge density
  • $\mathbf{J}$ is the electric current density
  • $\mathbf{A}$ is the magnetic potential
  • Let

    \begin{equation*}
\mathbf{B} = \boldsymbol{\nabla} \times \mathbf{A}
\end{equation*}

Stuff

  • Maxwell's equations

    \begin{alignat*}{2}
  \boldsymbol{\nabla} \cdot \mathbf{B} &= 0 \quad & \quad \boldsymbol{\nabla} \cdot \mathbf{E} &= \rho \\
  \boldsymbol{\nabla} \times \mathbf{E} &= - \pdv{\mathbf{B}}{t} \quad & \quad \boldsymbol{\nabla} \times \mathbf{B} &= \pdv{\mathbf{E}}{t} + \mathbf{J}
\end{alignat*}
  • Observe that $\boldsymbol{\nabla} \cdot \mathbf{B} = 0$ can be solved by writing

    \begin{equation*}
\mathbf{B} = \boldsymbol{\nabla} \times \mathbf{A}
\end{equation*}
    • Does not determine $\mathbf{A}$ uniquely since

      \begin{equation*}
\mathbf{A} \mapsto \mathbf{A} + \boldsymbol{\nabla} \theta
\end{equation*}

      leaves $\mathbf{B}$ unchanged (since $\boldsymbol{\nabla} \times \boldsymbol{\nabla} = 0$), and is called a gauge transformation

  • Substituting $\mathbf{B} = \boldsymbol{\nabla} \times \mathbf{A}$ into Maxwell's equations:

    \begin{equation*}
\boldsymbol{\nabla} \times \mathbf{E} = - \pdv{}{t} \boldsymbol{\nabla} \times \mathbf{A} = - \boldsymbol{\nabla} \times \pdv{\mathbf{A}}{t} \implies \boldsymbol{\nabla} \times \bigg( \mathbf{E} + \pdv{\mathbf{A}}{t} \bigg) = 0
\end{equation*}
  • Thus there exists a function $\phi$ (since a curl-free field is locally a gradient), called the electric potential, such that

    \begin{equation*}
\mathbf{E} + \pdv{\mathbf{A}}{t} = - \boldsymbol{\nabla} \phi
\end{equation*}
  • Performing gauge transformation → changes $\mathbf{A}$ and $\mathbf{E}$ unless also transform

    \begin{equation*}
\phi \mapsto \phi - \pdv{\theta}{t}
\end{equation*}
  • In summary, two of Maxwell's equations can be solved by

    \begin{equation*}
\mathbf{B} = \boldsymbol{\nabla} \times \mathbf{A} \quad \text{and} \quad \mathbf{E} = - \pdv{\mathbf{A}}{t} - \boldsymbol{\nabla} \phi
\end{equation*}

    where $\mathbf{A}$ and $\phi$ are defined up to gauge transformations

    \begin{equation*}
\mathbf{A} \mapsto \mathbf{A} + \boldsymbol{\nabla} \theta \quad \text{and} \quad \phi \mapsto \phi - \pdv{\theta}{t}
\end{equation*}

    for some function $\theta(t, \mathbf{x})$

    • We can fix the "gauge freedom" (i.e. limit the space of functions $\theta$) by imposing restrictions on $\theta$, which is often referred to as a choice of gauge, e.g. the Lorenz gauge

The ambiguity in the definition of $\mathbf{A}$ and $\phi$ in the Maxwell's equations can be exploited to impose the Lorenz gauge condition:

\begin{equation*}
\boldsymbol{\nabla} \cdot \mathbf{A} + \pdv{\phi}{t} = 0
\end{equation*}

In which case the remaining two Maxwell equations become wave equations with "sources":

\begin{equation*}
\pdv[2]{\phi}{t} - \nabla^2 \phi = \rho \quad \text{and} \quad \pdv[2]{\mathbf{A}}{t} - \nabla^2 \mathbf{A} = \mathbf{J}
\end{equation*}

From these wave-equations we get electromagnetic waves!

Maxwell's equations are variational

  • Let $\rho = 0$ and $\mathbf{J} = 0$ at first
  • Consider Lagrangian density

    \begin{equation*}
\mathcal{L} = \frac{1}{2} \big( \norm{\mathbf{E}}^2 - \norm{\mathbf{B}}^2 \big)
\end{equation*}

    as functions of $\mathbf{A}$ and $\phi$, i.e.

    \begin{equation*}
\mathcal{L} = \frac{1}{2} \bigg[ \big( \partial_i \phi \big)^2 + \big( \partial_t A_i \big)^2 + 2 \big( \partial_i \phi \big) \big( \partial_t A_i \big)  \bigg] - \frac{1}{2} \bigg[ \big( \partial_i A_j \big)^2 - \big( \partial_i A_j  \big)\big( \partial_j A_i \big) \bigg]
\end{equation*}
  • Observe that

    \begin{equation*}
\begin{split}
  \pdv{\mathcal{L}}{\big( \partial_i \phi \big)} &= \partial_i \phi + \partial_t A_i = - E_i \\
  \pdv{\mathcal{L}}{\big( \partial_t A_i \big)} &= \partial_i \phi + \partial_t A_i = - E_i \\
  \pdv{\mathcal{L}}{\big( \partial_i A_j \big)} &= - \partial_i A_j + \partial_j A_i = - \epsilon_{ijk} B^k
\end{split}
\end{equation*}
  • $\mathcal{L}$ does not depend explicitly on $\mathbf{A}$ or $\phi$, only on their derivatives, so E-L are

    \begin{equation*}
\partial_i \pdv{\mathcal{L}}{\big( \partial_i \phi \big)} = 0 \implies \partial_i E^i = 0
\end{equation*}

    and

    \begin{equation*}
\partial_j \pdv{\mathcal{L}}{\big( \partial_j A_i \big)} + \partial_t \pdv{\mathcal{L}}{\big( \partial_t A_i \big)} = 0 \implies \big( \boldsymbol{\nabla} \times \mathbf{B} \big)_i - \partial_t E_i = 0
\end{equation*}

    which are precisely the two remaining Maxwell equations when $\rho = 0$ and $\mathbf{J} = 0$.

We can obtain the Maxwell equations with $\rho$ and $\mathbf{J}$ nonzero by modifying $\mathcal{L}$:

\begin{equation*}
\mathcal{L} = \frac{1}{2} \Big( \norm{\mathbf{E}}^2 - \norm{\mathbf{B}}^2 \Big)  + \mathbf{A} \cdot \mathbf{J} - \phi \rho  
\end{equation*}

We can rewrite $\mathcal{L}$ by introducing the electromagnetic 4-potential

\begin{equation*}
A_{\mu} = 
\begin{pmatrix}
  - \phi \\ \mathbf{A}
\end{pmatrix}
\end{equation*}

with $\mu = 0, 1, 2, 3$ so that $A_0 = - \phi$.

The electromagnetic 4-current is defined

\begin{equation*}
J_{\mu} = 
\begin{pmatrix}
  - \rho \\ \mathbf{J}
\end{pmatrix}
\end{equation*}

so that $J_0 = - \rho$.

We define the fieldstrength

\begin{equation*}
F_{\mu \nu} = \partial_{\mu} A_{\nu} - \partial_{\nu} A_{\mu}
\end{equation*}

which obeys $F_{\mu \nu} = - F_{\nu \mu}$.

We can think of $F_{\mu \nu}$ as entries of the $4 \times 4$ antisymmetric matrix

\begin{equation*}
\big[ F_{\mu \nu} \big] = 
\begin{pmatrix}
  0 &  -E_1 & - E_2 & -E_3 \\
  E_1 & 0 & B_3 & - B_2 \\
  E_2  & - B_3 & 0 & B_1 \\ 
  E_3 & B_2 & - B_1 & 0
\end{pmatrix}
\end{equation*}

where we have used that

\begin{equation*}
F_{0 i }  = \partial_0 A_i + \partial_i \phi = - E_i \quad \text{and} \quad F_{ij} = \partial_i A_j - \partial_j A_i = \epsilon_{ijk} B^k
\end{equation*}

In terms of the fieldstrength $F_{\mu \nu}$, the Lagrangian density can be written

\begin{equation*}
\mathcal{L} = - \frac{1}{4} F_{\mu \nu} F^{\mu \nu} + \eta^{\mu \nu} A_{\mu} J_{\nu}
\end{equation*}

where we have used the "raised indices" of $F_{\mu \nu}$ with $\eta^{\mu \nu}$ as follows:

\begin{equation*}
F^{\mu \nu} = \eta^{\mu \alpha} \eta^{\nu \beta} F_{\alpha \beta}
\end{equation*}

The Euler-Lagrange equations of $\mathcal{L}$ are given by

\begin{equation*}
- \eta^{\mu \nu} \partial_{\mu} F_{\nu \alpha} = J_{\alpha}
\end{equation*}

and the gauge transformations are

\begin{equation*}
A_{\mu} \mapsto A_{\mu} + \partial_{\mu} \theta
\end{equation*}

under which $F_{\mu \nu}$ are invariant.

In the absence of sources, so when $J_{\mu} = 0$, $\mathcal{L}$ is gauge invariant.

Let $I$ denote the action corresponding to $\mathcal{L}$, then

\begin{equation*}
\dv{}{s}\bigg|_{s = 0} I \big[ A_{\mu} + s \varepsilon_{\mu} \big] = 0 \quad \iff \quad \pdv{\mathcal{L}}{A_{\mu}} = \pdv{}{x^{\nu}} \pdv{\mathcal{L}}{\big( \partial_{\nu} A_{\mu} \big)}
\end{equation*}

where

\begin{equation*}
\pdv{\mathcal{L}}{A_{\mu}} = \tensor{\eta}{^{\mu \nu}} J_{\nu}
\end{equation*}

and

\begin{equation*}
\begin{split}
  \pdv{\mathcal{L}}{\big( \partial_{\nu} A_{\mu} \big)} &= - \frac{1}{4} \times 2 \times F^{\alpha \beta} \pdv{F_{\alpha \beta}}{\big( \partial_{\nu} A_{\mu} \big)}
\end{split}
\end{equation*}
\begin{equation*}
\begin{split}
  \pdv{F_{\alpha \beta}}{\big( \partial_{\nu} A_{\mu} \big)} &= \pdv{}{\big( \partial_{\nu} A_{\mu} \big)} \big( \partial_{\alpha} A_{\beta} - \partial_{\beta} A_{\alpha} \big) \\
  &= \tensor{\delta}{^{\nu}_{\alpha}} \tensor{\delta}{^{\mu}_{\beta}} - \tensor{\delta}{^{\nu}_{\beta}} \tensor{\delta}{^{\mu}_{\alpha}}
\end{split}
\end{equation*}

Therefore,

\begin{equation*}
\pdv{\mathcal{L}}{\big( \partial_{\nu} A_{\mu} \big)} = - \frac{1}{2} F^{\alpha \beta} \Big( \tensor{\delta}{^{\nu}_{\alpha}} \tensor{\delta}{^{\mu}_{\beta}} - \tensor{\delta}{^{\nu}_{\beta}} \tensor{\delta}{^{\mu}_{\alpha}} \Big) = - \frac{1}{2} \big( F^{\nu \mu} - F^{\mu \nu} \big) = F^{\mu \nu}
\end{equation*}

Substituting this into our E-L equations from above, we (apparently) get

\begin{equation*}
\eta^{\mu \nu} J_{\nu} = \partial_{\nu} F^{\mu \nu}
\end{equation*}

In the absence of sources, so when $J_{\mu} = 0$, $\mathcal{L}$ is also invariant under Poincaré transformations. This is seen by working to first order in the transformation:

  1. First show that $\mathcal{L}$ is invariant under the following

Consider

\begin{equation*}
A_{\mu} \mapsto A_{\mu} + s \big( \delta A_{\mu} \big) + \mathcal{O}(s^2)
\end{equation*}

where

\begin{equation*}
\delta A_{\mu} = \partial_{\mu} \xi^{\nu} A_{\nu} + \xi^{\nu} \partial_{\nu} A_{\mu}
\end{equation*}

and

\begin{equation*}
\xi^{\mu} = a^{\mu} + \tensor{\Lambda}{^{\mu}_{\nu}} x^{\nu}
\end{equation*}

where $\Lambda$ satisfies

\begin{equation*}
\Lambda_{\mu \nu} := \eta_{\mu \alpha} \tensor{\Lambda}{^{\alpha}_{\nu}} = - \Lambda_{\nu \mu}
\end{equation*}
  2. Find the Noether currents $\tensor{T}{^{\mu}_{\nu}} a^{\nu}$ and $\tensor{M}{^{\mu \nu}_{\alpha}} \tensor{\Lambda}{^{\alpha}_{\nu}}$

Examples

The Kepler Problem

  • Illustrates Noether's Theorem and some techniques for the calculation of Poisson brackets
  • Will set up problem both from a Lagrangian and a Hamiltonian point of view and show how to solve the system by exploiting conserved quantities

Notation

  • Two particles of masses $m_1$ and $m_2$ moving in $\mathbb{R}^3$, with $\mathbf{x}_1$ and $\mathbf{x}_2$ denoting the corresponding positions
  • Assuming the particles cannot occupy the same position at the same time, i.e. $\mathbf{x}_1(t) \ne \mathbf{x}_2(t)$ for all $t$.
  • We then have the total kinetic energy of the system given by

    \begin{equation*}
T = \frac{1}{2} m_1 \norm{\dot{\mathbf{x}}_1}^2 + \frac{1}{2} m_2 \norm{\dot{\mathbf{x}}_2}^2
\end{equation*}

    and potential energy

    \begin{equation*}
V = - \frac{k}{\norm{\mathbf{x}_1 - \mathbf{x}_2}}
\end{equation*}

Lagrangian description

  • Lagrangian is, as usual given by

    \begin{equation*}
L = T - V = \frac{1}{2} m_1 \norm{\dot{\mathbf{x}}_1}^2 + \frac{1}{2} m_2 \norm{\dot{\mathbf{x}}_2}^2 + \frac{k}{\norm{\mathbf{x}_1 - \mathbf{x}_2}}
\end{equation*}
  • $L$ is invariant under the diagonal action of the Euclidean group of $\mathbb{R}^3$ on the configuration space, i.e. if $A: \mathbb{R}^3 \to \mathbb{R}^3$ is an orthogonal transformation and $\mathbf{a} \in \mathbb{R}^3$, then

    \begin{equation*}
\big( \mathbf{x}_1, \mathbf{x}_2 \big) \mapsto \big( A \mathbf{x}_1 + \mathbf{a}, A \mathbf{x}_2 + \mathbf{a} \big)
\end{equation*}

    leaves the Lagrangian invariant.
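A numeric confirmation of this invariance (masses, states and the transformation below are illustrative choices of mine): a rotation preserves speeds and the separation $\lVert \mathbf{x}_1 - \mathbf{x}_2 \rVert$, and a translation drops out of the difference, so $L$ is unchanged:

```python
import math

# Illustrative masses, coupling, and phase-space points (assumptions)
m1, m2, k = 1.0, 2.0, 1.5

def L(x1, v1, x2, v2):
    # Kepler Lagrangian L = T - V with V = -k/|x1 - x2|
    T = 0.5 * m1 * sum(c * c for c in v1) + 0.5 * m2 * sum(c * c for c in v2)
    return T + k / math.dist(x1, x2)

def rot_z(t, v):
    # rotation by angle t about the z-axis; velocities transform the same way
    c, s = math.cos(t), math.sin(t)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1], v[2])

x1, v1 = (1.0, 0.0, 0.5), (0.1, 0.2, -0.3)
x2, v2 = (-0.5, 1.0, 0.0), (0.0, -0.1, 0.4)
a, t = (3.0, -2.0, 1.0), 0.77

X1 = tuple(p + q for p, q in zip(rot_z(t, x1), a))   # x -> A x + a
X2 = tuple(p + q for p, q in zip(rot_z(t, x2), a))
assert abs(L(x1, v1, x2, v2) - L(X1, rot_z(t, v1), X2, rot_z(t, v2))) < 1e-12
```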