Variational Calculus
Table of Contents
Sources
- Most of these notes come from the Variational Calculus course taught by Prof. José Figueroa-O'Farrill, University of Edinburgh
Overview
Notation
denotes the space of possible paths (i.e.
curves) between points
and 
Introduction
Precise analytical techniques to answer:
- Shortest path between two given points on a surface
- Curve between two given points in the plane that yields a surface of revolution of minimum area when revolved around a given axis
- Curve along which a bead will slide (under the effect of gravity) in the shortest time
Underpins much of modern mathematical physics, via Hamilton's principle of least action
Consider the "standard" directional derivative of
, with
, at
along some vector
:
where
is a critical point of
, i.e.
(since we know that
form a basis in
, so by lin. indep. we have the above!)
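As a hedged reference for the statement above, the standard directional derivative and the resulting critical-point condition can be written as follows. The names f, x_0, v and e_i are assumptions made for this sketch and may differ from the notation used in the course.

```latex
% Assumed notation: f : \mathbb{R}^n \to \mathbb{R} differentiable,
% x_0 \in \mathbb{R}^n a point, v = v^i e_i a vector, (e_i) the standard basis.
\[
  D_v f(x_0)
  = \left.\frac{d}{ds}\, f(x_0 + s v)\right|_{s=0}
  = \sum_{i=1}^{n} v^i \,\frac{\partial f}{\partial x^i}(x_0).
\]
% x_0 is a critical point when D_v f(x_0) = 0 for every v.
% Choosing v = e_i for each i in turn gives
\[
  \frac{\partial f}{\partial x^i}(x_0) = 0, \qquad i = 1, \dots, n.
\]
```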
Stuff
Let
be a continuous function which satisfies
for all
with
Then
.
Observe that if we only consider
, then we can simply generalize to arbitrary
since integration is a linear operation.
Let
which obeys
Assume, for the sake of contradiction, that
, i.e.
Let, w.l.o.g.,
.
Since
is continuous, there is some interval
such that
and some
such that
Suppose for the moment that there exists some
such that
for all
outside 

Then observe that
This is clearly a contradiction with our initial assumption, hence
such that
. Hence, by continuity,
Now we just prove that there exists such a function
which satisfies the properties we described earlier.
Let
which is a smooth function. Then let
which is a smooth function, since it's a product of smooth functions.
To make this vanish outside of
, we have
This function is clearly always positive, hence,
Hence, letting
we get a function used in the proof above!
This concludes our proof of the Fundamental Lemma of the Calculus of Variations
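The bump-function construction used in the proof can be checked numerically. The sketch below assumes the usual choice exp(-1/t) for the smooth building block; the names `h`, `bump` and `integrate` are hypothetical and chosen only for this illustration.

```python
import math

def h(t):
    """Assumed building block: exp(-1/t) for t > 0 and 0 for t <= 0 (smooth on R)."""
    return math.exp(-1.0 / t) if t > 0 else 0.0

def bump(t, a, b):
    """Product h(t - a) * h(b - t): positive on (a, b), zero outside it."""
    return h(t - a) * h(b - t)

def integrate(f, lo, hi, n=10_000):
    """Midpoint rule, enough to see that the integral is strictly positive."""
    step = (hi - lo) / n
    return step * sum(f(lo + (i + 0.5) * step) for i in range(n))

# Integral of the bump supported on (0, 1), taken over the larger interval (-1, 2).
total = integrate(lambda t: bump(t, 0.0, 1.0), -1.0, 2.0)
print(total)
```

Multiplying a continuous function that is positive on (a, b) by such a bump gives a positive integral, which is the contradiction the proof relies on.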
General variations
Suppose we want the shortest path in
between
and a curve
on
.
We assume that
with
differentiable, such that
.
Then observe that
, then
Then
where we've used the fact that
We cannot simply drop the endpoint terms anymore, since they no longer necessarily vanish as they did in the previous case.
Then, by our earlier assumption, we have
which implies that
Hence,
must hold for all
and
and
In particular,
, then
and
i.e.
is normal to
at the point where it intersects with
.
Euler-Lagrange equations
Notation
or
denotes the Lagrangian
Endpoint-fixed variations
Let
be the space of
curves
with
The Lagrangian is defined
where
and
. Let
be "sufficiently" differentiable (usually taken to be smooth in applications).
Then the function
, called the action, is defined
A path
is a critical point for the action if, for all endpoint-fixed variations
, we have
Bringing the differentiation into the integral, we have
Properties
If
, then the "energy" given by
is constant along extremals of the Lagrangian. Then observe that
where we've used the fact that
in the third equality.
Thus,
and,
Hence,
i.e. time invariance! This is an instance of Noether's Theorem.
If
, so that the Lagrangian does not depend explicitly on
, then the energy
is constant.
This is known as Beltrami's identity.
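For reference, the energy computation behind this identity can be sketched as follows. The notation (a single coordinate x and a Lagrangian L(x, \dot x, t)) is assumed for the sketch rather than taken from the notes.

```latex
% Assumed notation: L = L(x, \dot x, t) for a curve x(t).
\[
  E = \dot x \,\frac{\partial L}{\partial \dot x} - L,
  \qquad
  \frac{dE}{dt}
  = \dot x \left( \frac{d}{dt}\frac{\partial L}{\partial \dot x}
                  - \frac{\partial L}{\partial x} \right)
    - \frac{\partial L}{\partial t}.
\]
% Along extremals the bracket vanishes by the Euler-Lagrange equation, so
\[
  \frac{dE}{dt} = -\frac{\partial L}{\partial t},
  \qquad\text{hence } E \text{ is constant whenever }
  \frac{\partial L}{\partial t} = 0 .
\]
```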
Euler-Lagrange
Let
be the space of
curves
with
Let
where
and
, be sufficiently differentiable (typically smooth in applications) and let us consider the function
defined by
Then the extremals must satisfy the Euler-Lagrange equations:
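A hedged sketch of the standard form of these equations and of where they come from is given below. The interval [0, 1], the action name S and the variation \varepsilon are assumptions made for this sketch.

```latex
% Assumed notation: S[x] = \int_0^1 L(x, \dot x, t)\, dt with x = (x^1, \dots, x^n),
% \varepsilon a variation, summation over repeated indices.
\[
  \left.\frac{d}{ds}\, S[x + s\varepsilon]\right|_{s=0}
  = \int_0^1 \left( \frac{\partial L}{\partial x^i}
      - \frac{d}{dt}\frac{\partial L}{\partial \dot x^i} \right)
      \varepsilon^i \, dt
    + \left[ \frac{\partial L}{\partial \dot x^i}\, \varepsilon^i \right]_0^1 .
\]
% For endpoint-fixed variations, \varepsilon(0) = \varepsilon(1) = 0, the boundary
% term drops out and the fundamental lemma gives
\[
  \frac{\partial L}{\partial x^i}
  - \frac{d}{dt}\frac{\partial L}{\partial \dot x^i} = 0,
  \qquad i = 1, \dots, n .
\]
```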
Newtonian mechanics
Notation
worldlines refer to the trajectory of a particle:
with
Galilean relativity
- Affine transformations (don't assume a basis)
- Relativity group: group of transformations on the universe preserving whichever structure we've endowed the universe with
The subgroup of affine transformations of
which leave invariant the time interval between events and the distance between simultaneous events is called the Galilean group.
That is, the Galilean group consists of affine transformations of the form
These transformations can be written uniquely as a composition of three elementary Galilean transformations:
translations in space and time:
orthogonal transformations in space:
and Galilean boosts:
Observe that if we choose the action
which has Lagrangian
then we observe that the minimizing path
should satisfy
which is
where
denotes the force. Further,
which is the momentum! Then,
Hence we're left with Newton's 2nd law.
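The computation above can be summarised as follows, assuming the usual Lagrangian of a single particle of mass m in a potential V (both symbols are assumed for this sketch).

```latex
% Assumed: one particle of mass m moving in a potential V(x).
\[
  L(x, \dot x) = \tfrac12\, m\, |\dot x|^2 - V(x),
  \qquad
  \frac{\partial L}{\partial x} = -\nabla V(x) = F,
  \qquad
  \frac{\partial L}{\partial \dot x} = m \dot x = p .
\]
% The Euler-Lagrange equation then reads
\[
  \frac{d}{dt}\frac{\partial L}{\partial \dot x} = \frac{\partial L}{\partial x}
  \quad\Longleftrightarrow\quad
  \frac{dp}{dt} = m \ddot x = F .
\]
```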
Noether's Theorem
Notation
of
functions called one-parameter subgroup of
diffeomorphisms, which are differentiable wrt. 
and
are defined by
and


Stuff
We've seen the following continuous symmetries so far:
Momentum is conserved:
Energy is conserved:
We say that
is a symmetry of the Lagrangian
if
where

is a diffeomorphism
Equivalently, one says that
is invariant under
.
Let
of
functions, defined for all
and depending differentiably on
.
Moreover, let this family satisfy the following properties:
for all 
for all 
Then the family
is called a one-parameter subgroup of
diffeomorphisms on
.
Let
be an action for curves
, and let
be invariant under a one-parameter group of diffeomorphisms
.
Then the Noether charge
, defined by
is conserved; that is,
along physical trajectories.
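A hedged sketch of the usual form of the Noether charge, with two standard examples, is given below. The symbols \varphi_s, Q, a and n are assumptions made for this sketch.

```latex
% Assumed notation: \varphi_s a one-parameter group of diffeomorphisms with
% \varphi_0 = \mathrm{id}, and L = L(x, \dot x) invariant under it.
\[
  Q = \sum_{i} \frac{\partial L}{\partial \dot x^i}\,
      \left.\frac{\partial \varphi_s^i(x)}{\partial s}\right|_{s=0}.
\]
% Examples with L = \tfrac12 m |\dot x|^2 - V(x):
%   translations \varphi_s(x) = x + s a (V translation invariant along a):
\[
  Q = m\, \dot x \cdot a \quad \text{(momentum along } a\text{)},
\]
%   rotations about a unit axis n in \mathbb{R}^3 (V rotation invariant),
%   where \partial_s \varphi_s(x)|_{s=0} = n \times x:
\[
  Q = m\, \dot x \cdot (n \times x) = n \cdot (x \times m \dot x)
  \quad \text{(angular momentum along } n\text{)}.
\]
```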
Consider functions which are defined by Lagrangians
and such that they are invariant under a one-parameter family of diffeomorphisms of
such that
This means, in particular, that
where
Then the Noether charge
is conserved along extremals; that is, along curves which obey the Euler-Lagrange equation.
Hamilton's canonical formalism
Notation
Considering
for
curves
- Differentiable function

denotes the Hamiltonian
Canonical form of Euler-Lagrange equation
Can convert E-L into equivalent first-order ODE:
is equivalent to system
for the variables
.
Consider
Then letting
the above system becomes
Hamiltonian becomes
which has the property
known as Hamilton's equations.
Observe that
where the coefficient matrix, let's call it
, can be thought of as a bilinear form on
, which defines a symplectic structure.
- In the general case, existence of a solution to the
equations is guaranteed by the implicit function theorem in the case where the Hessian
is invertible
is then said to be regular (or non-degenerate)
General case:
Hamiltonian
Total derivative of
(or as we recognize, the exterior derivative of a differentiable function)
where we have used that
.
This gives us
First-order version of Euler-Lagrange equations in canonical (or Hamiltonian) form:
which we call Hamilton's equations.
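Hamilton's equations are first-order, so they can be stepped forward numerically. The sketch below assumes a harmonic-oscillator Hamiltonian H = p^2/2 + q^2/2 as a stand-in example (it is not taken from the notes) and uses a symplectic Euler step; the function names are hypothetical.

```python
def hamiltonian(q, p):
    """Assumed example: H(q, p) = p**2 / 2 + q**2 / 2 (unit mass and frequency)."""
    return 0.5 * p * p + 0.5 * q * q

def symplectic_euler(q, p, dt, steps):
    """Integrate dq/dt = dH/dp = p and dp/dt = -dH/dq = -q."""
    for _ in range(steps):
        p = p - dt * q  # momentum update uses the old position
        q = q + dt * p  # position update uses the new momentum
    return q, p

q0, p0 = 1.0, 0.0
q1, p1 = symplectic_euler(q0, p0, dt=1e-3, steps=20_000)

# The energy is not conserved exactly by the scheme, but the error stays small.
energy_drift = abs(hamiltonian(q1, p1) - hamiltonian(q0, p0))
print(energy_drift)
```

A symplectic scheme is chosen here because it respects the symplectic structure mentioned above, which keeps the energy error bounded over long runs; a plain explicit Euler step would let the energy grow steadily.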
In general, the Hamiltonian
is given by
Taking the total derivative, we're left with
where we've used
.
This gives us
Conserved quantity using Poisson brackets
Consider energy conserved, i.e.
, and a differentiable function
:
where we have introduced the Poisson bracket
for any two differentiable functions
of
.
Hence
If
depends explicitly on
, then the same calculation as above would show that
In this case
could still be conserved if
Given two conserved quantities
and
, i.e. quantities which Poisson-commute with the Hamiltonian
.
Then we can generate new conserved quantities from old using the Jacobi identity:
Therefore
is a conserved quantity.
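For reference, a hedged sketch of the bracket and of the Jacobi-identity argument is given below. The sign convention and the coordinate names q^i, p_i are assumptions made for this sketch.

```latex
% Assumed convention: canonical coordinates (q^i, p_i), Hamiltonian H, and
% Hamilton's equations \dot q^i = \partial H / \partial p_i,
% \dot p_i = -\partial H / \partial q^i.
\[
  \{f, g\} = \sum_{i} \left(
     \frac{\partial f}{\partial q^i}\frac{\partial g}{\partial p_i}
   - \frac{\partial f}{\partial p_i}\frac{\partial g}{\partial q^i} \right),
  \qquad
  \frac{df}{dt} = \{f, H\} + \frac{\partial f}{\partial t}.
\]
% Jacobi identity, applied to conserved F and G with \{F,H\} = \{G,H\} = 0:
\[
  \{\{F, G\}, H\}
  = -\{\{G, H\}, F\} - \{\{H, F\}, G\}
  = 0 .
\]
```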
Can we associate some * to a conserved charge?
Consider a conserved quantity
which satisfies
Then
defines a vector field on the phase space
which we may integrate to find through every point a solution to the differential equation
Then existence and uniqueness of solutions to IVPs, on some open interval
as given above, gives us the unique solutions
and
for some initial values
and
.
Therefore,
Thus we have a continuous symmetry of Hamilton's equations, which takes solutions to solutions.
That is, these solutions
which are in some sense generated from
and
contain symmetries!
E.g. a solution to the system of ODEs above could be a linear combination of
and
, in which case we would understand that the "symmetry" generated by
is rotational symmetry.
The caveat is that this may not actually extend to a one-parameter family of diffeomorphisms, as we used earlier in Noether's theorem as the invariant functions or symmetries.
Example:
Lagrangian:
Exists
TODO Integrability
A set of functions which Poisson-commute among themselves are said to be in involution.
Liouville's theorem says that if a Hamiltonian
on
admits
independent conserved quantities in involution, then there is a canonical transformation to so-called action/angle variables
such that Hamilton's equations imply that
Such a system is said to be integrable.
Constrained problems
Isoperimetric problems
Notation
Typically we talk about the constrained optimization problem:
for
.
Stuff
- Extremise a functional subject to a functional constraint
Consider a closed loop
enclosing some area
.
Dido's problem, or the (original) isoperimetric problem, is the problem of maximizing the area of
while keeping the length of
constant.
Consider
for
with initial conditions
Using Green's theorem, we have
and length
The problem is then that we want to extremise
subject to the constraint
.
More generally, given functions for
being endpoint fixed, e.g.
and
for some
.
Given the functionals
we want to extremise
subject to
.
Let
and let
be an extremum of
subject to
.
If
(i.e.
is not a critical point), then
(called a Lagrange multiplier) s.t.
is a critical point of the function
defined by
Suppose
is an extremal of
in the space of
with
. Then
for small
, where
.
This might constrain
and prevent use of the Fundamental Lemma of Calculus of Variations.
Idea:
- Consider
and
for all
near
, defined
.
- Then express one of the variations in terms of the other, allowing us to eliminate one.
Let
be defined
Assume now that
and
.
If we specifically consider (wlog)
, then IFT implies that
for "small"
. Therefore,
is a
Let
be functionals of functions
subject to BCs
Suppose that
is an extremal of
subject to the isoperimetric constraint
. Then if
is not an extremal of
, there is a constant
so that
is an extremal of
. That is,
Method of Lagrange multipliers for functionals
Suppose that we wish to extremise a functional
for
.
Then the method consists of the following steps:
- Ensure that
has no extremals satisfying
.
- Solve the EL equation for
which is a second-order ODE for
.
- Fix constants of integration from the BCs
- Fix the value of
using
.
Classical isoperimetric problem
The catenary
Notation
denotes height as a function of arclength 
Catenary
- Uniform chain of length
hangs under its own weight from two poles of height
a distance
apart
- Potential energy is given by
, which we can parametrize
giving us
- Observation:
- All extremals of arclength are straight lines
- Extremal to constrained problem has non-zero gradient of the constraint
- → straight lines are not the solutions
Consider lagrangian
Using Beltrami's identity:
rewritten to
which, given the BCs
we get the solution
which follows from taking the derivative of both sides and then solving.
Impose the isoperimetric condition:
Introducing
, we find the following transcendental equation
for which
is the trivial solution
small, together with the condition
ensures that
But for
the exponential term dominates
by continuity.
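The existence argument above (negative for small argument, positive for large argument, hence a root by continuity) is exactly what bisection needs. The sketch below assumes the transcendental equation has the standard catenary form sinh(xi) = k * xi with k > 1; this form, the parameter name `k` and the function names are assumptions, not taken from the notes.

```python
import math

def g(xi, k):
    """Assumed form of the transcendental equation: g(xi) = sinh(xi) - k * xi."""
    return math.sinh(xi) - k * xi

def solve_xi(k, tol=1e-12):
    """Find the positive root of sinh(xi) = k * xi by bisection, for k > 1."""
    if k <= 1.0:
        raise ValueError("need k > 1 for a non-trivial root")
    lo = 1.0
    while g(lo, k) >= 0.0 and lo > 1e-12:
        lo /= 2.0  # shrink until g(lo) < 0 (true for small xi when k > 1)
    hi = 1.0
    while g(hi, k) <= 0.0:
        hi *= 2.0  # grow until the exponential term dominates
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid, k) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

xi = solve_xi(2.0)
print(xi)
```

Bisection is used rather than Newton's method because it only needs the sign change established in the text, and it cannot converge to the trivial root at zero once the bracket excludes it.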
Holonomic and nonholonomic constraints
- Constraints are simply functions, i.e.
for some
- Instead of functionals
as we saw for isoperimetric problems
We say a constraint is scleronomic if the constraint does not depend explicitly on
, and rheonomic if it does.
We say a constraint is holonomic if it does not depend explicitly on
, and nonholonomic if it does.
In the case where nonholonomic constraints are at most linear in
, we say that the constraints are Pfaffian constraints.
Typical use cases
- Finding geodesics on a surface defined as the zero locus of a function, which are scleronomic and holonomic.
- Reducing higher-order Lagrangians to first-order Lagrangians
can be replaced by
which are scleronomic and nonholonomic.
- Mechanical problems: e.g. "rolling without sliding", which are typically nonholonomic
Holonomic constraints
is the gradient wrt.
for all
, NOT
.
Let
be admissible variation of
.
Then the constraint
requires
Let
Then the constraint above implies that
i.e. if
then
is tangent to the implicitly defined surface
.
Consider
, then the above gives us
We therefore suppose that
The implicit function theorem then implies that we can solve
for one component. Then the above gives us
i.e.
is arbitrary and
is fully determined by
in this "small" neighborhood
.
Then
Since
is arbitrary, the FLCV implies that
which is just the E-L equation for a Lagrangian given by
Above we use the implicit function theorem to prove the existence of such extremals, but one can actually prove this using something called a "smooth partition of unity".
In that case we will basically do the above for a bunch of different neighborhoods, and then sum them together to give us (apparently) the same answer!
Nonholonomic constraints
Examples
(Non-holonomic) Higher order lagrangians
can be replaced by
which are scleronomic and nonholonomic.
So we consider the Lagrangian
Then
So the E-L equations give us
which gives us
where we've used
Variational PDEs
Notation
be a
function on the set 
Lagrangian over a surface
with corresponding action
where
and
denote collectively the
partial derivatives
and
for
BCs are given by
where
is given
Variations are
functions
such that
Stuff
Then
where we've used integration by parts in the last equality.
The Divergence (or rather, Stokes') theorem allows us to rewrite the last integral as
where
is the arclength. And since
this vanishes.
By a generalisation of the FLCV, the first integral term must vanish, and so we get
We can generalise this to more than just 2D!
Multidimensional Euler-Lagrange equations
Let
be a bounded region with (piecewise) smooth boundary.
denote the coordinates for 
be the Lagrangian for maps
where
denotes collectively the
partial derivatives
Then the general Euler-Lagrange equations are given by
using Einstein summation.
Notice that we are treating
as a function of the x's and differentiate wrt.
keeping all other x's fixed!
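As a hedged reference, the scalar case of these equations reads as follows. The symbols u, u_i, D and S are assumptions made for this sketch, and only a single real-valued function u is treated.

```latex
% Assumed notation: S[u] = \int_D L(x, u, \nabla u)\, d^n x for u : D \to \mathbb{R},
% with u_i = \partial u / \partial x^i and summation over repeated indices.
\[
  \frac{\partial L}{\partial u}
  - \frac{\partial}{\partial x^i}
    \left( \frac{\partial L}{\partial u_i} \right) = 0 .
\]
% Example: L = \tfrac12\, u_i u_i has \partial L / \partial u = 0 and
% \partial L / \partial u_i = u_i, so the equation becomes Laplace's equation:
\[
  \frac{\partial^2 u}{\partial x^i \partial x^i} = \Delta u = 0 .
\]
```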
(This is really Stokes' theorem)
Let
be bounded open set with (piecewise) smooth boundary 
be a smooth vector field defined on 
be the unit outward-pointing normal of 
Then,
(using Einstein summation) where
is the volume element in
and
is the area element in
and
denotes the Euclidean inner product in
.
Let
be bounded open with (piecewise) smooth boundary 
be a continuous function which obeys
for all
functions
vanishing on
.
Then
.
Noether's theorem for multidimensional Lagrangians
Notation

Lagrangian
where
Use the notation
Conserved now refers to "divergenceless", that is,
is a conserved quantity if
where we're using Einstein summation.
for
denotes a one-parameter group of diffeomorphisms
is defined
Stuff
We say it's a conserved "current" because
where
denotes the normal to the boundary
Consider
And let
such that
We suppose that the action is invariant, so that the Lagrangian obeys
or equivalently, one can (usually more easily) check that the following is true
The Noether current is then given by
since the RHS is independent of
. The LHS on the other hand is given by
where
.
Now we observe the following:
and
where we've simply taken the derivatives of the Taylor expansions. Hence, we are left with
Now we need to evaluate
and its derivative wrt.
. First we notice that
We now compute
. We first have
Finally, using the fact that if we have some matrix
given by
we have
Finally we need to compute
, which one will find to be
The rest of this derivation is left unfinished in these notes. The general idea is as above: find an expression for the missing part, then manipulate it to obtain an expression which vanishes because the EL equations are satisfied by the non-transformed
, and you end up with the Noether current for the multi-dimensional case.
Examples
Minimal surface
Let
be a twice differentiable function.
The graph
defines a surface
. The area of this surface is the functional of
given by
If
is an extremal of this function, we say that
is a minimal surface.
In this case the Lagrangian is
with the EL-equations
where
Therefore,
and similarly for
. When combined and multiplied by
(which is allowed since the combination equals zero anyway), we're left with
where we've used the fact that
.
This is then the equation which must be satisfied by a minimal surface.
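For reference, a hedged sketch of the computation for a graph over the plane is given below. The symbols u and W are assumed names for the height function and the area integrand.

```latex
% Assumed notation: u = u(x, y), W = \sqrt{1 + u_x^2 + u_y^2}, Lagrangian L = W.
\[
  \frac{\partial}{\partial x}\left( \frac{u_x}{W} \right)
  + \frac{\partial}{\partial y}\left( \frac{u_y}{W} \right) = 0,
  \qquad
  \frac{\partial}{\partial x}\left( \frac{u_x}{W} \right)
  = \frac{(1 + u_y^2)\, u_{xx} - u_x u_y\, u_{xy}}{W^3}.
\]
% Adding the analogous y-term and multiplying by W^3 gives
\[
  (1 + u_y^2)\, u_{xx} - 2\, u_x u_y\, u_{xy} + (1 + u_x^2)\, u_{yy} = 0 .
\]
```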
model
Noether's current for multidimensional Lagrangian
Classical Field Theory
Notation
denotes a field written 

Concerned with action functionals of the form
where
is called a Lagrangian density and
, i.e.
for
for some
and
denotes a "cylindrical" region
is the outward normal to the boundary 
Stuff
Lagrangian density
is just used to refer to the fact that we are now looking at a variation of the form
So it's like the
is the Lagrangian now, and the "inner" functional is the Lagrangian density.
Klein-Gordon equation in
(i.e.
) is given by
where
is called the mass.
If
, then this is the wave equation, whence the Klein-Gordon equation is a sort of massive wave equation.
More succinctly, introducing the matrix
then the Klein-Gordon equation can be written
Note: you sometimes might see this written
where they use the notation
so we have sort of "summed out" the
.
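A hedged sketch of the Klein-Gordon equation in both forms is given below. The signature of the matrix \eta, the units (c = 1) and the field name \phi are conventions assumed for this sketch; other sources use the opposite overall sign.

```latex
% Assumed conventions: coordinates x^0 = t, x^1, \dots, x^n, units with c = 1,
% and \eta = \mathrm{diag}(-1, +1, \dots, +1).
\[
  \frac{\partial^2 \phi}{\partial t^2} - \Delta \phi + m^2 \phi = 0,
  \qquad
  \Delta = \sum_{i=1}^{n} \frac{\partial^2}{\partial (x^i)^2}.
\]
% Equivalently, with summation over \mu, \nu = 0, \dots, n:
\[
  \eta^{\mu\nu}\, \partial_\mu \partial_\nu \phi - m^2 \phi = 0 .
\]
% Setting m = 0 recovers the wave equation.
```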
Calculus of variations with improper integrals
is unbounded, and so we need to consider
where
i.e. the closed ball of radius
Vary the action
where we have omitted the boundary term
which is seen by applying the Divergence Theorem and using the BCs on the variation
Using Fundamental Lemma of Calculus of Variations we obtain the E-L equations
Noether's Theorem for improper actions
- Consider action function for a classical field
which is invariant under continuous one-parameter symmetry with Noether current 
Integrate (zero) divergence of the current on a "cylindrical region"
and apply the Divergence Theorem
consists of "sides"
of the "cylinder", where
is the m-sphere of radius 
- top cap

- bottom cap

Can rewrite the above as
using the fact that
points outward at the bottom cap → negative
axis
- Last term vanishes since the BCs on the field imply
on
as 
Since
arbitrary, we have
is conserved, i.e. we have a Noether charge for the improper case!
Maxwell equations
Notation
is the magnetic field
is the electric field
is the electric charge density
is the electric current density
is the magnetic potential
Stuff
Maxwell's equations
Observe that
can be solved by writing
Does not determine
uniquely since
leaves
unchanged (since
), and is called a gauge transformation
Substituting
into Maxwell's equations:
Thus there exists a function
(again since
), called the electric potential, such that
Performing gauge transformation → changes
and
unless also transform
In summary, two of Maxwell's equations can be solved by
where
and
are defined up to gauge transformations
for some function
- We can fix the "gauge freedom" (i.e. limit the space of functions
) by imposing restrictions on
, which is often referred to as a choice of gauge, e.g. Lorenz gauge
The ambiguity in the definition of
and
in Maxwell's equations can be exploited to impose the Lorenz gauge condition:
In which case the remaining two Maxwell equations become wave equations with "sources":
From these wave-equations we get electromagnetic waves!
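The chain of steps above can be sketched as follows. Units with c = 1 and no factors of 4\pi or \epsilon_0 are assumed, as are the symbol names; the notes may use different conventions.

```latex
% Assumed conventions: c = 1, Heaviside-Lorentz-style units, \theta the gauge function.
\[
  \mathbf{B} = \nabla \times \mathbf{A}, \qquad
  \mathbf{E} = -\nabla \phi - \frac{\partial \mathbf{A}}{\partial t},
  \qquad
  \mathbf{A} \mapsto \mathbf{A} + \nabla \theta, \quad
  \phi \mapsto \phi - \frac{\partial \theta}{\partial t}.
\]
% Lorenz gauge condition:
\[
  \nabla \cdot \mathbf{A} + \frac{\partial \phi}{\partial t} = 0 .
\]
% With it, \nabla \cdot \mathbf{E} = \rho and
% \nabla \times \mathbf{B} - \partial_t \mathbf{E} = \mathbf{j} become
\[
  \frac{\partial^2 \phi}{\partial t^2} - \Delta \phi = \rho, \qquad
  \frac{\partial^2 \mathbf{A}}{\partial t^2} - \Delta \mathbf{A} = \mathbf{j}.
\]
```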
Maxwell's equations are variational
- Let
and
at first. Consider the Lagrangian density
as functions of
and
, i.e.
Observe that
does not depend explicitly on
or
, only on their derivatives, so E-L are
and
which are precisely the two remaining Maxwell equations when
and
.
We can obtain the Maxwell equations with
and
nonzero by modifying
:
We can rewrite
by introducing the electromagnetic 4-potential
with
so that
.
The electromagnetic 4-current is defined
so that
.
We define the field strength
which obeys
.
We can think of
as entries of the
antisymmetric matrix
where we have used that
In terms of the field strength
we can write Maxwell's equations as
where we have used the "raised indices" of
with
as follows:
The Euler-Lagrange equations of
are given by
and the gauge transformations are
under which
are invariant.
In the absence of sources, so when
,
is gauge invariant.
Let
denote the action corresponding to
, then
where
and
Therefore,
Substituting this into our E-L equations from above, we (apparently) get
In the absence of sources, so when
,
is gauge invariant. This is seen by only considering the second-order of the transformation
- First show that
is invariant under the following
Consider
where
and
then
- Find the Noether currents
and 
Examples
The Kepler Problem
- Illustrates Noether's Theorem and some techniques for the calculation of Poisson brackets
- Will set up problem both from a Lagrangian and a Hamiltonian point of view and show how to solve the system by exploiting conserved quantities
Notation
- Two particles of masses
and
moving in
, with
and
denoting the corresponding positions
- Assuming particles cannot occupy the same position at the same time, i.e.
for all
. We then have the total kinetic energy of the system given by
and potential energy
Lagrangian description
The Lagrangian is, as usual, given by
is invariant under the diagonal action of the Euclidean group of
on the configuration space, i.e. if
is an orthogonal transformation and
, then
leaves the Lagrangian invariant.
for all
such that