Variational Calculus
Sources
- Most of these notes come from the Variational Calculus course taught by Prof. José Figueroa-O'Farrill, University of Edinburgh
Overview
Notation
- $C_{P,Q}$ denotes the space of possible paths (i.e. curves) between points $P$ and $Q$
Introduction
Precise analytical techniques to answer:
- Shortest path between two given points on a surface
- Curve between two given points in the plane that yields a surface of revolution of minimum area when revolved around a given axis
- Curve along which a bead will slide (under the effect of gravity) in the shortest time
Underpins much of modern mathematical physics, via Hamilton's principle of least action
Consider "standard" directional derivative of , with , at along some vector :
where is a critical point of , i.e.
(since we know that form a basis in , so by lin. indep. we have the above!)
Stuff
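Let $f: [0,1] \to \mathbb{R}^n$ be a continuous function which satisfies
$$\int_0^1 \langle f(t), h(t) \rangle \, dt = 0$$
for all $h \in C^\infty([0,1], \mathbb{R}^n)$ with $h(0) = h(1) = 0$.
Then $f \equiv 0$.
Observe that it suffices to consider $n = 1$: choosing $h$ with a single nonzero component reduces arbitrary $n$ to the scalar case, since integration is a linear operation.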
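Let $f: [0,1] \to \mathbb{R}$ be a continuous function which obeys
$$\int_0^1 f(t) h(t) \, dt = 0 \quad \text{for all smooth } h \text{ with } h(0) = h(1) = 0$$
Assume, for the sake of contradiction, that $f \not\equiv 0$, i.e. $f(t_0) \neq 0$ for some $t_0 \in (0,1)$.
Let, w.l.o.g., $f(t_0) > 0$.
Since $f$ is continuous, there is some interval $[a, b] \subset (0,1)$ such that $t_0 \in (a, b)$ and some $c > 0$ such that $f(t) > c$ for all $t \in [a, b]$.
Suppose for the moment that there exists some smooth $h$ such that
- $h(t) > 0$ for all $t \in (a, b)$
- $h(t) = 0$ for all $t$ outside $(a, b)$
Then observe that
$$\int_0^1 f(t) h(t) \, dt = \int_a^b f(t) h(t) \, dt > c \int_a^b h(t) \, dt > 0$$
This is clearly a contradiction with our initial assumption, hence there is no $t_0 \in (0,1)$ such that $f(t_0) \neq 0$. Hence, by continuity, $f(t) = 0$ for all $t \in [0,1]$.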
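Now we just prove that there exists such a function $h$ which satisfies the properties we described earlier.
Let
$$\theta(t) = \begin{cases} e^{-1/t} & t > 0 \\ 0 & t \le 0 \end{cases}$$
which is a smooth function. Then let
$$\varphi(t) = \theta(t) \, \theta(1 - t)$$
which is a smooth function, since it's a product of smooth functions.
To make this vanish outside of $(a, b)$, we have
$$\varphi_{a,b}(t) = \varphi\left( \frac{t - a}{b - a} \right)$$
This function is strictly positive on $(a, b)$ and zero elsewhere, hence,
$$\int_0^1 \varphi_{a,b}(t) \, dt > 0$$
Hence, letting $h = \varphi_{a,b}$ we get a function used in the proof above!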
This concludes our proof of the Fundamental Lemma of the Calculus of Variations.
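The key step of the proof can be checked numerically. The following is only a sketch: the test function `f`, the interval `[a, b]` and the helper names are illustrative assumptions, and the bump is built in the standard way from exp(-1/t).

```python
import math

def theta(t):
    """Smooth cutoff: exp(-1/t) for t > 0 and 0 for t <= 0."""
    return math.exp(-1.0 / t) if t > 0 else 0.0

def bump(t, a, b):
    """Smooth bump: positive on (a, b), identically zero outside."""
    s = (t - a) / (b - a)
    return theta(s) * theta(1.0 - s)

def integrate(g, lo, hi, n=20000):
    """Composite midpoint rule for the integral of g over [lo, hi]."""
    width = (hi - lo) / n
    return width * sum(g(lo + (i + 0.5) * width) for i in range(n))

def f(t):
    """Hypothetical continuous function that is nonzero at t0 = 0.3."""
    return math.cos(math.pi * t)

a, b = 0.2, 0.4   # interval around t0 = 0.3
c = 0.5 * f(b)    # f is decreasing on [0, 1], so f > c > 0 on [a, b]

# The inequality from the proof: int f*h > c * int h > 0.
lhs = integrate(lambda t: f(t) * bump(t, a, b), 0.0, 1.0)
rhs = c * integrate(lambda t: bump(t, a, b), 0.0, 1.0)
print(lhs > rhs > 0)
```

So a function that is nonzero somewhere always has a nonzero integral against some admissible bump, which is exactly what the lemma rules out.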
General variations
Suppose we want the shortest path in between and a curve on .
We assume that with differentiable, such that .
Then observe that , then
Then
where we've used the fact that
We cannot just drop the endpoint-terms anymore, since these are now not necessarily vanishing, as we had in the previous case.
Then, by our earlier assumption, we have
which implies that
Hence,
must hold for all and and
In particular, , then
and
i.e. is normal to at the point where it intersects with .
Euler-Lagrange equations
Notation
- or denotes the Lagrangian
Endpoint-fixed variations
Let be the space of curves with
The Lagrangian is defined
where and . Let be "sufficiently" differentiable (usually taken to be smooth in applications).
Then the function , called the action, is defined
A path is a critical point for the action if, for all endpoint-fixed variations , we have
Bringing the differentiation inside the integral, we have
Properties
If , then the "energy" given by
is constant along extremals of the Lagrangian. Then observe that
where we've used the fact that in the third equality. Thus,
and,
Hence,
i.e. time invariance! This is an instance of Noether's Theorem.
If $\frac{\partial L}{\partial t} = 0$, so that the Lagrangian does not depend explicitly on $t$, then the energy
$$E = \dot{x}^i \frac{\partial L}{\partial \dot{x}^i} - L$$
is constant.
This is known as Beltrami's identity.
Euler-Lagrange
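Let $C_{P,Q}$ be the space of curves $x: [0,1] \to \mathbb{R}^n$ with $x(0) = P$ and $x(1) = Q$.
Let
$$L: \mathbb{R}^n \times \mathbb{R}^n \times [0,1] \to \mathbb{R}, \qquad (x, v, t) \mapsto L(x, v, t)$$
where $x, v \in \mathbb{R}^n$ and $t \in [0,1]$, be sufficiently differentiable (typically smooth in applications) and let us consider the function $S: C_{P,Q} \to \mathbb{R}$ defined by
$$S[x] = \int_0^1 L(x(t), \dot{x}(t), t) \, dt$$
Then the extremals must satisfy the Euler-Lagrange equations:
$$\frac{\partial L}{\partial x^i} - \frac{d}{dt} \frac{\partial L}{\partial \dot{x}^i} = 0, \qquad i = 1, \dots, n$$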
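A quick numerical illustration of what "extremal" means. This is a sketch under stated assumptions: the Lagrangian L = v^2/2 - x^2/2, the boundary values x(0) = 0, x(1) = 1, the variation h(t) = sin(pi t) and all function names are chosen here for illustration only. For this L the Euler-Lagrange equation reads x'' = -x, solved by sin(t)/sin(1).

```python
import math

def action(path, n=1000):
    """Midpoint-rule approximation of S[x] = int_0^1 (x'^2/2 - x^2/2) dt."""
    dt = 1.0 / n
    total = 0.0
    for k in range(n):
        x0, x1 = path(k * dt), path((k + 1) * dt)
        v = (x1 - x0) / dt       # velocity on the k-th segment
        xm = 0.5 * (x0 + x1)     # position at the segment midpoint
        total += (0.5 * v * v - 0.5 * xm * xm) * dt
    return total

def first_variation(path, h, eps=1e-4):
    """Central-difference estimate of d/d(eps) S[x + eps*h] at eps = 0."""
    plus = action(lambda t: path(t) + eps * h(t))
    minus = action(lambda t: path(t) - eps * h(t))
    return (plus - minus) / (2.0 * eps)

def h(t):
    """Endpoint-fixed variation: h(0) = h(1) = 0."""
    return math.sin(math.pi * t)

def extremal(t):
    """Solves x'' = -x with x(0) = 0, x(1) = 1."""
    return math.sin(t) / math.sin(1.0)

def straight_line(t):
    """Same endpoints, but does not satisfy x'' = -x."""
    return t

var_extremal = first_variation(extremal, h)
var_line = first_variation(straight_line, h)
print(var_extremal, var_line)
```

The first variation is numerically zero at the extremal, while for the straight line it is close to -1/pi, the value of -int_0^1 t sin(pi t) dt.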
Newtonian mechanics
Notation
worldlines refer to the trajectory of a particle:
with
Galilean relativity
- Affine transformations (don't assume a basis)
- Relativity group: group of transformations on the universe preserving whichever structure we've endowed the universe with
The subgroup of affine transformations of which leave invariant the time interval between events and the distance between simultaneous events is called the Galilean group.
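That is, the Galilean group consists of affine transformations of the form
$$(t, x) \mapsto (t + \tau, \; R x + v t + a), \qquad \tau \in \mathbb{R}, \quad a, v \in \mathbb{R}^3, \quad R \in O(3)$$
These transformations can be written uniquely as a composition of three elementary Galilean transformations:
translations in space and time:
$$(t, x) \mapsto (t + \tau, \; x + a)$$
orthogonal transformations in space:
$$(t, x) \mapsto (t, \; R x)$$
and Galilean boosts:
$$(t, x) \mapsto (t, \; x + v t)$$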
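Observe that if we choose the action
$$S[x] = \int_{t_0}^{t_1} \left( \frac{1}{2} m |\dot{x}|^2 - V(x) \right) dt$$
which has Lagrangian
$$L(x, \dot{x}) = \frac{1}{2} m |\dot{x}|^2 - V(x)$$
then we observe that the extremal path should satisfy
$$\frac{d}{dt} \frac{\partial L}{\partial \dot{x}} = \frac{\partial L}{\partial x}$$
with right-hand side
$$\frac{\partial L}{\partial x} = -\nabla V =: F$$
where $F$ denotes the force. Further,
$$\frac{\partial L}{\partial \dot{x}} = m \dot{x} =: p$$
which is the momentum! Then,
$$\frac{dp}{dt} = m \ddot{x} = F$$
Hence we're left with Newton's 2nd law.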
Noether's Theorem
Notation
- of functions called one-parameter subgroup of diffeomorphisms , which are differentiable wrt.
and are defined by
and
Stuff
We've seen the following continuous symmetries this far:
Momentum is conserved:
Energy is conserved:
We say that is a symmetry of the Lagrangian if
where
- is a diffeomorphism
Equivalently, one says that is invariant under .
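Let $\varphi_s: \mathbb{R}^n \to \mathbb{R}^n$ be a family of functions, defined for all $s \in \mathbb{R}$ and depending differentiably on $s$.
Moreover, let this family satisfy the following properties:
- $\varphi_0(x) = x$ for all $x \in \mathbb{R}^n$
- $\varphi_s(\varphi_t(x)) = \varphi_{s+t}(x)$ for all $s, t \in \mathbb{R}$ and $x \in \mathbb{R}^n$
Then the family $\{\varphi_s\}$ is called a one-parameter subgroup of diffeomorphisms on $\mathbb{R}^n$.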
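Let $S[x] = \int L(x, \dot{x}, t) \, dt$ be an action for curves $x: [0,1] \to \mathbb{R}^n$, and let $L$ be invariant under a one-parameter group of diffeomorphisms $\varphi_s$.
Then the Noether charge $I$, defined by
$$I(x, \dot{x}, t) = \frac{\partial L}{\partial \dot{x}^i} \, \frac{\partial \varphi_s^i(x)}{\partial s} \bigg|_{s=0}$$
is conserved; that is, $\frac{dI}{dt} = 0$ along physical trajectories.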
Consider functions which are defined by Lagrangians and such that they are invariant under one-parameter family of diffeomorphisms of such that
This means, in particular, that
where
Then the Noether charge
is conserved along extremals; that is, along curves which obey the Euler-Lagrange equation.
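As a numerical illustration of the theorem, take a unit-mass particle in the plane with the rotation-invariant Lagrangian L = |v|^2/2 + 1/|x|. The potential, the initial state and the function names are assumptions made for this sketch. For rotations, d(phi_s)/ds at s = 0 is (-y, x), so the Noether charge is the angular momentum x*vy - y*vx.

```python
def accel(x, y):
    """Acceleration -grad V for V(r) = -1/r and unit mass."""
    r3 = (x * x + y * y) ** 1.5
    return -x / r3, -y / r3

def noether_charge(x, y, vx, vy):
    """I = dL/dv . (d phi_s/ds at s = 0), with rotation generator (-y, x)."""
    return vx * (-y) + vy * x

def verlet(x, y, vx, vy, dt=1e-3, steps=10000):
    """Velocity-Verlet integration of the equations of motion."""
    ax, ay = accel(x, y)
    for _ in range(steps):
        vx += 0.5 * dt * ax      # half kick
        vy += 0.5 * dt * ay
        x += dt * vx             # drift
        y += dt * vy
        ax, ay = accel(x, y)
        vx += 0.5 * dt * ax      # half kick
        vy += 0.5 * dt * ay
    return x, y, vx, vy

start = (1.0, 0.0, 0.0, 0.8)     # a bound (elliptical) orbit
end = verlet(*start)
drift = abs(noether_charge(*end) - noether_charge(*start))
print(drift)
```

The charge stays constant up to rounding even though the position and velocity change substantially along the orbit.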
Hamilton's canonical formalism
Notation
Considering
for curves
- Differentiable function
- denotes the Hamiltonian
Canonical form of Euler-Lagrange equation
Can convert E-L into equivalent first-order ODE:
is equivalent to system
for the variables .
Consider
Then letting
the above system becomes
Hamiltonian becomes
which has the property
known as Hamilton's equations.
Observe that
where the coefficient matrix, let's call it , can be thought of as a bilinear form on , which defines a symplectic structure.
- In general case, existence of solution set to the equations is guaranteed by the implicit function theorem in the case where the Hessian is invertible
- is then said to be regular (or non-degenerate)
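General case:
Hamiltonian
$$H(x, p, t) = p_i \dot{x}^i - L(x, \dot{x}, t)$$
Total derivative of $H$ (or as we recognize, the exterior derivative of a differentiable function)
$$dH = \dot{x}^i \, dp_i + p_i \, d\dot{x}^i - \frac{\partial L}{\partial x^i} dx^i - \frac{\partial L}{\partial \dot{x}^i} d\dot{x}^i - \frac{\partial L}{\partial t} dt = \dot{x}^i \, dp_i - \frac{\partial L}{\partial x^i} dx^i - \frac{\partial L}{\partial t} dt$$
where we have used that $p_i = \frac{\partial L}{\partial \dot{x}^i}$.
This gives us
$$\frac{\partial H}{\partial p_i} = \dot{x}^i, \qquad \frac{\partial H}{\partial x^i} = -\frac{\partial L}{\partial x^i}, \qquad \frac{\partial H}{\partial t} = -\frac{\partial L}{\partial t}$$
First-order version of Euler-Lagrange equations in canonical (or Hamiltonian) form:
$$\dot{x}^i = \frac{\partial H}{\partial p_i}, \qquad \dot{p}_i = -\frac{\partial H}{\partial x^i}$$
which we call Hamilton's equations.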
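Because Hamilton's equations are first order, they are easy to step numerically. Below is a minimal sketch for a separable Hamiltonian H = p^2/2 + V(x); the harmonic potential V = x^2/2, the leapfrog scheme and the function names are assumptions made for illustration, not part of the course notes.

```python
def leapfrog(x, p, dV, dt, steps):
    """Integrate x' = dH/dp = p and p' = -dH/dx = -V'(x) for H = p^2/2 + V(x)."""
    for _ in range(steps):
        p -= 0.5 * dt * dV(x)    # half step in p using -dH/dx
        x += dt * p              # full step in x using dH/dp
        p -= 0.5 * dt * dV(x)    # second half step in p
    return x, p

def hamiltonian(x, p):
    """H for the unit harmonic oscillator, V(x) = x^2/2."""
    return 0.5 * p * p + 0.5 * x * x

x0, p0 = 1.0, 0.0
# Roughly one period (2*pi is about 6.283) of the exact solution x = cos(t).
x1, p1 = leapfrog(x0, p0, dV=lambda x: x, dt=1e-3, steps=6283)
energy_drift = abs(hamiltonian(x1, p1) - hamiltonian(x0, p0))
print(x1, p1, energy_drift)
```

After one period the state returns close to (1, 0) and the energy is unchanged to high accuracy, consistent with H being conserved when it has no explicit time dependence.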
In general, the Hamiltonian is given by
Taking the total derivative, we're left with
where we've used .
This gives us
Conserved quantity using Poisson brackets
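Consider energy conserved, i.e. $\frac{\partial H}{\partial t} = 0$, and a differentiable function $f(x, p)$:
$$\frac{df}{dt} = \frac{\partial f}{\partial x^i} \dot{x}^i + \frac{\partial f}{\partial p_i} \dot{p}_i = \frac{\partial f}{\partial x^i} \frac{\partial H}{\partial p_i} - \frac{\partial f}{\partial p_i} \frac{\partial H}{\partial x^i} = \{f, H\}$$
where we have introduced the Poisson bracket
$$\{f, g\} = \frac{\partial f}{\partial x^i} \frac{\partial g}{\partial p_i} - \frac{\partial f}{\partial p_i} \frac{\partial g}{\partial x^i}$$
for any two differentiable functions $f, g$ of $(x, p)$.
Hence
$$f \text{ is conserved} \iff \{f, H\} = 0$$
If $f$ depends explicitly on $t$, then the same calculation as above would show that
$$\frac{df}{dt} = \frac{\partial f}{\partial t} + \{f, H\}$$
In this case $f$ could still be conserved if
$$\frac{\partial f}{\partial t} = -\{f, H\}$$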
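Suppose we are given two conserved quantities $f$ and $g$, i.e. both Poisson-commute with the Hamiltonian $H$: $\{f, H\} = \{g, H\} = 0$.
Then we can generate new conserved quantities from old using the Jacobi identity:
$$\{\{f, g\}, H\} = \{f, \{g, H\}\} - \{g, \{f, H\}\} = 0$$
Therefore $\{f, g\}$ is a conserved quantity.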
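Poisson brackets can be sanity-checked numerically with finite differences. The sketch below assumes the sign convention {f, g} = sum_i (df/dx_i dg/dp_i - df/dp_i dg/dx_i), a phase space point stored as z = (x1, x2, x3, p1, p2, p3), and the Kepler Hamiltonian as a test case; the helper names are hypothetical.

```python
def partial(f, z, k, eps=1e-5):
    """Central-difference partial derivative of f with respect to z[k]."""
    zp, zm = list(z), list(z)
    zp[k] += eps
    zm[k] -= eps
    return (f(zp) - f(zm)) / (2.0 * eps)

def poisson(f, g, n=3):
    """Return the function {f, g} on phase space z = (x_1..x_n, p_1..p_n)."""
    def bracket(z):
        return sum(
            partial(f, z, i) * partial(g, z, n + i)
            - partial(f, z, n + i) * partial(g, z, i)
            for i in range(n)
        )
    return bracket

# Components of angular momentum x cross p.
def Lx(z): return z[1] * z[5] - z[2] * z[4]
def Ly(z): return z[2] * z[3] - z[0] * z[5]
def Lz(z): return z[0] * z[4] - z[1] * z[3]

def H(z):
    """Kepler Hamiltonian |p|^2/2 - 1/|x| (unit mass and coupling assumed)."""
    r = (z[0] ** 2 + z[1] ** 2 + z[2] ** 2) ** 0.5
    return 0.5 * (z[3] ** 2 + z[4] ** 2 + z[5] ** 2) - 1.0 / r

z0 = [1.0, 0.5, -0.3, 0.2, -0.7, 0.4]
print(poisson(Lx, Ly)(z0), Lz(z0))   # {Lx, Ly} should equal Lz
print(poisson(Lx, H)(z0))            # should vanish: Lx is conserved
```

Here two conserved quantities Lx and Ly produce a third one, Lz, as their bracket, which is the mechanism described above.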
Can we associate some * to a conserved charge?
Consider a conserved quantity which satisfies
Then
defines a vector field on the phase space which we may integrate to find through every point a solution to the differential equation
Then, by existence and uniqueness of solutions to IVPs on some open interval as given above, gives us the unique solutions and for some initial values and .
Therefore,
Thus we have a continuous symmetry of Hamilton's equations, which takes solutions to solutions.
That is, these solutions which are in some sense generated from and contain symmetries!
E.g. solution to the system of ODEs above could for example be a linear combination of and , in which case we would then understand that "symmetry" generated by is rotational symmetry.
The caveat is that this may not actually extend to a one-parameter family of diffeomorphisms, as we used earlier in Noether's theorem as the invariant functions or symmetries.
Example:
Lagrangian:
Exists
TODO Integrability
A set of functions which Poisson-commute among themselves are said to be in involution.
Liouville's theorem says that if a Hamiltonian on $\mathbb{R}^{2n}$ admits $n$ independent conserved quantities in involution, then there is a canonical transformation to so-called action / angle variables $(I, \theta)$ such that Hamilton's equations imply that
$$\dot{I}_i = 0, \qquad \dot{\theta}^i = \omega^i(I)$$
Such a system is said to be integrable.
Constrained problems
Isoperimetric problems
Notation
Typically we talk about the constrained optimization problem:
for .
Stuff
- Extremise a functional subject to a functional constraint
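Consider a closed loop $C$ enclosing some region $D$.
Dido's problem, or the (original) isoperimetric problem, is the problem of maximizing the area of $D$ while keeping the length of $C$ constant.
Consider $(x(t), y(t))$ for $t \in [0, 1]$ with boundary conditions $(x(0), y(0)) = (x(1), y(1))$
Using Green's theorem, we have area
$$A[x, y] = \frac{1}{2} \int_0^1 (x \dot{y} - y \dot{x}) \, dt$$
and length
$$\ell[x, y] = \int_0^1 \sqrt{\dot{x}^2 + \dot{y}^2} \, dt$$
The problem is then that we want to extremise $A$ subject to the constraint $\ell = \text{const}$.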
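A discrete version of the two functionals makes the problem concrete. The sketch below applies the Green's theorem area formula to closed polygons (the shoelace formula) and compares regular polygons of equal perimeter; the choice of regular polygons and the function names are illustrative assumptions.

```python
import math

def polygon_area(points):
    """Discrete Green's theorem: area = 1/2 * sum of (x dy - y dx) over edges."""
    total = 0.0
    n = len(points)
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]
        total += x0 * y1 - x1 * y0
    return 0.5 * total

def polygon_length(points):
    """Total length of the closed polygon."""
    n = len(points)
    return sum(math.dist(points[i], points[(i + 1) % n]) for i in range(n))

def regular_polygon(n, perimeter):
    """Counterclockwise regular n-gon with the given perimeter."""
    radius = perimeter / (2.0 * n * math.sin(math.pi / n))
    return [
        (radius * math.cos(2.0 * math.pi * k / n),
         radius * math.sin(2.0 * math.pi * k / n))
        for k in range(n)
    ]

perimeter = 1.0
areas = [polygon_area(regular_polygon(n, perimeter)) for n in (3, 4, 8, 1000)]
circle_area = perimeter ** 2 / (4.0 * math.pi)
print(areas, circle_area)
```

At fixed length the enclosed area grows with the number of sides and approaches, but stays below, the area of the circle with the same circumference.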
More generally, given functions for
being endpoint fixed, e.g. and for some .
Given the functionals
we want to extremise subject to .
Let and let be an extremum of subject to .
If (i.e. is not a critical point), then (called a Lagrange multiplier) s.t. is a critical point of the function defined by
Suppose is an extremal of in the space of with . Then
for small , where .
This might constrain
and prevent use of the Fundamental Lemma of Calculus of Variations.
Idea:
- Consider and for all near , defined .
- Then express one of the variations as the other, allowing us to eliminate one.
Let be defined
Assume now that and .
If we specifically consider (wlog) , then IFT implies that
for "small" . Therefore,
is a
Let
be functionals of functions subject to BCs
Suppose that is an extremal of subject to the isoperimetric constraint . Then if is not an extremal of , there is a constant so that is an extremal of . That is,
Method of Lagrange multipliers for functionals
Suppose that we wish to extremise a functional
for .
Then the method consists of the following steps:
- Ensure that has no extremals satisfying .
- Solve the EL-equation for , which is a second-order ODE for .
- Fix constants of integration from the BCs
- Fix the value of using .
Classical isoperimetric problem
The catenary
Notation
- denotes height as a function of arclength
Catenary
- Uniform chain of length hangs under its own weight from two poles of height a distance apart
Potential energy is given by , which we can parametrize
giving us
- Observation:
- All extremals of arclength are straight lines
- Extremal to constrained problem has non-zero gradient of the constraint
- → straight lines are not the solutions
Consider lagrangian
Using Beltrami's identity:
rewritten to
which, given the BCs we get the solution
which follows from taking the derivative of both sides and then solving.
Impose the isoperimetric condition:
Introducing , we find the following transcendental equation
for which
- is the trivial solution
- small, together with the condition ensures that
But for the exponential term dominates
by continuity.
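The continuity argument above is exactly a bracketing argument, so the nonzero root can be found by bisection. The sketch assumes one common normalisation that may differ from the one used in these notes: poles a horizontal distance `d` apart, chain length `ell > d`, profile y = c cosh(x/c) + const, so that the length condition 2c sinh(d/(2c)) = ell becomes sinh(xi) = (ell/d) xi with xi = d/(2c). All names are hypothetical.

```python
import math

def catenary_xi(ell, d):
    """Positive root of sinh(xi) = (ell/d) * xi, assuming ell > d."""
    if ell <= d:
        raise ValueError("the chain must be longer than the gap between the poles")
    k = ell / d

    def g(xi):
        return math.sinh(xi) - k * xi

    lo = 1.0
    while g(lo) >= 0.0:      # small xi: the linear term wins, so g < 0
        lo /= 2.0
    hi = 1.0
    while g(hi) <= 0.0:      # large xi: the exponential dominates, so g > 0
        hi *= 2.0
    for _ in range(200):     # bisection on the bracket [lo, hi]
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

ell, d = 3.0, 2.0            # hypothetical chain length and pole separation
xi = catenary_xi(ell, d)
c = d / (2.0 * xi)           # scale parameter of the catenary
print(xi, c, 2.0 * c * math.sinh(d / (2.0 * c)))
```

The last printed value reproduces the chain length, confirming that the computed parameter satisfies the isoperimetric condition under the stated normalisation.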
Holonomic and nonholonomic constraints
- Constraints are simply functions, i.e. for some
- Instead of functionals as we saw for isoperimetric problems
We say a constraint is scleronomic if the constraint does not depend explicitly on $t$, and rheonomic if it does.
We say a constraint is holonomic if it does not depend explicitly on $\dot{x}$, and nonholonomic if it does.
In the case where nonholonomic constraints are at most linear in $\dot{x}$, we say that the constraints are Pfaffian constraints.
Typical usecases
- Finding geodesics on a surface as defined as the zero locus of a function, which are scleronomic and holonomic.
- Reducing higher-order Lagrangians to first-order Lagrangians
can be replaced by
which are scleronomic and nonholonomic.
- Mechanical problems: e.g. "rolling without sliding", which are typically nonholonomic
Holonomic constraints
is the gradient wrt. for all , NOT .
Let be admissible variation of .
Then the constraint requires
Let
Then the constraint above implies that
i.e. if then is tangent to the implicitly defined surface .
Consider , then the above gives us
We therefore suppose that
The implicit function theorem then implies that we can solve for one component. Then the above gives us
i.e. is arbitrary and is fully determined by in this "small" neighborhood .
Then
Since is arbitrary, the FLCV implies that
which is just the E-L equation for a Lagrangian given by
Above we use the implicit function theorem to prove the existence of such extremals, but one can actually prove this using something called a "smooth partition of unity".
In that case we basically repeat the argument above on a bunch of different neighborhoods, and then sum the results together to give us (apparently) the same answer!
Nonholonomic constraints
Examples
(Non-holonomic) Higher order lagrangians
can be replaced by
which are scleronomic and nonholonomic.
So we consider the Lagrangian
Then
So the E-L equations give us
which gives us
where we've used
Variational PDEs
Notation
- be a function on the set
Lagrangian over a surface with corresponding action
where and denote collectively the partial derivatives and for
BCs are given by
where is given
Variations are functions such that
Stuff
Then
where we've used integration by parts in the last equality.
The Divergence (or rather, Stokes') theorem allows us to rewrite the last integral as
where is the arclength. And since this vanishes.
By a generalisation of the FLCV, the first integral term must vanish, and so we get
We can generalise this to more than just 2D!
Multidimensional Euler-Lagrange equations
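Let
- $D \subset \mathbb{R}^m$ be a bounded region with (piecewise) smooth boundary.
- $x^\mu$ denote the coordinates for $D$, with $\mu = 1, \dots, m$
- $L(x, u, \nabla u)$ be the Lagrangian for maps $u: D \to \mathbb{R}^n$, where $\nabla u$ denotes collectively the partial derivatives $u^a_\mu = \frac{\partial u^a}{\partial x^\mu}$
Then the general Euler-Lagrange equations are given by
$$\frac{\partial L}{\partial u^a} - \frac{\partial}{\partial x^\mu} \frac{\partial L}{\partial u^a_\mu} = 0, \qquad a = 1, \dots, n$$
using Einstein summation.
Notice that we are treating $\frac{\partial L}{\partial u^a_\mu}$ as a function of the x's and differentiate wrt. $x^\mu$ keeping all other x's fixed!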
(This is really Stokes' theorem)
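Let
- $D \subset \mathbb{R}^m$ be a bounded open set with (piecewise) smooth boundary $\partial D$
- $X$ be a smooth vector field defined on $D \cup \partial D$
- $N$ be the unit outward-pointing normal of $\partial D$
Then,
$$\int_D \frac{\partial X^\mu}{\partial x^\mu} \, dV = \int_{\partial D} \langle X, N \rangle \, dA$$
(using Einstein summation) where $dV$ is the volume element in $D$, $dA$ is the area element in $\partial D$, and $\langle \cdot, \cdot \rangle$ denotes the Euclidean inner product in $\mathbb{R}^m$.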
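Let
- $D \subset \mathbb{R}^m$ be bounded open with (piecewise) smooth boundary $\partial D$
- $f: D \to \mathbb{R}$ be a continuous function which obeys
$$\int_D f(x) h(x) \, dV = 0$$
for all smooth functions $h$ vanishing on $\partial D$.
Then $f \equiv 0$.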
Noether's theorem for multidimensional Lagrangians
Notation
Lagrangian
where
Use the notation
Conserved now refers to "divergenceless", that is, $J^\mu$ is a conserved quantity if
$$\frac{\partial J^\mu}{\partial x^\mu} = 0$$
where we're using Einstein summation.
- for denotes a one-parameter group of diffeomorphisms
is defined
Stuff
We say it's a conserved "current" because
where denotes the normal to the boundary
Consider
And let
such that
We suppose that the action is invariant, so that the Lagrangian obeys
or equivalently, one can (usually more easily) check that the following is true
The Noether current is then given by
since the RHS is independent of . The LHS on the other hand is given by
where .
Now we observe the following:
and
where we've simply taken the derivatives of the Taylor expansions. Hence, we are left with
Now we need to evaluate and its derivative wrt. . First we notice that
We now compute . We first have
Finally, using the fact that if we have some matrix given by
we have
Finally we need to compute , which one will find to be
The rest of this derivation is left unfinished. The general ideas are the above: find an expression for the missing part, do some manipulation to obtain an expression which vanishes due to the EL equations being satisfied by the non-transformed , and you end up with the Noether current for the multi-dimensional case.
Examples
Minimal surface
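Let $u: D \to \mathbb{R}$, with $D \subset \mathbb{R}^2$, be a twice differentiable function.
The graph $z = u(x, y)$ defines a surface $\Sigma \subset \mathbb{R}^3$. The area of this surface is the functional of $u$ given by
$$A[u] = \int_D \sqrt{1 + u_x^2 + u_y^2} \, dx \, dy$$
If $u$ is an extremal of this functional, we say that $\Sigma$ is a minimal surface.
In this case the Lagrangian is
$$L = \sqrt{1 + u_x^2 + u_y^2}$$
with the EL-equations
$$\frac{\partial}{\partial x} \frac{\partial L}{\partial u_x} + \frac{\partial}{\partial y} \frac{\partial L}{\partial u_y} = 0$$
where
$$\frac{\partial L}{\partial u_x} = \frac{u_x}{L}, \qquad \frac{\partial L}{\partial u_y} = \frac{u_y}{L}$$
Therefore,
$$\frac{\partial}{\partial x} \frac{u_x}{L} = \frac{u_{xx}}{L} - \frac{u_x (u_x u_{xx} + u_y u_{xy})}{L^3} = \frac{(1 + u_y^2) u_{xx} - u_x u_y u_{xy}}{L^3}$$
and similarly for $y$. When combined, and multiplied by $L^3$ (which is harmless since the combination equals zero anyway), we're left with
$$(1 + u_y^2) u_{xx} - 2 u_x u_y u_{xy} + (1 + u_x^2) u_{yy} = 0$$
where we've used the fact that $u_{xy} = u_{yx}$.
This is then the equation which must be satisfied by a minimal surface.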
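The minimal surface equation, in the standard form (1 + u_y^2) u_xx - 2 u_x u_y u_xy + (1 + u_x^2) u_yy = 0, can be checked against a known example. The sketch below uses central finite differences and Scherk's surface u = log(cos y / cos x); the choice of test surfaces, the evaluation point and the function names are assumptions made for illustration.

```python
import math

def minimal_surface_residual(u, x, y, h=1e-3):
    """Finite-difference value of (1+u_y^2) u_xx - 2 u_x u_y u_xy + (1+u_x^2) u_yy."""
    ux = (u(x + h, y) - u(x - h, y)) / (2.0 * h)
    uy = (u(x, y + h) - u(x, y - h)) / (2.0 * h)
    uxx = (u(x + h, y) - 2.0 * u(x, y) + u(x - h, y)) / (h * h)
    uyy = (u(x, y + h) - 2.0 * u(x, y) + u(x, y - h)) / (h * h)
    uxy = (u(x + h, y + h) - u(x + h, y - h)
           - u(x - h, y + h) + u(x - h, y - h)) / (4.0 * h * h)
    return (1.0 + uy * uy) * uxx - 2.0 * ux * uy * uxy + (1.0 + ux * ux) * uyy

def scherk(x, y):
    """Scherk's surface, defined for |x|, |y| < pi/2."""
    return math.log(math.cos(y) / math.cos(x))

def paraboloid(x, y):
    """A graph that is not a minimal surface."""
    return x * x + y * y

res_scherk = minimal_surface_residual(scherk, 0.3, -0.2)
res_paraboloid = minimal_surface_residual(paraboloid, 0.3, -0.2)
print(res_scherk, res_paraboloid)
```

The residual vanishes (up to discretisation error) for Scherk's surface, whereas for the paraboloid it equals 4 + 8(x^2 + y^2), which is 5.04 at the chosen point.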
model
Noether's current for multidimensional Lagrangian
Classical Field Theory
Notation
- denotes a field written
Concerned with action functionals of the form
where is called a Lagrangian density and , i.e.
for for some and
denotes a "cylindrical" region
- is the outward normal to the boundary
Stuff
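The term Lagrangian density just refers to the fact that we are now looking at an action of the form
$$S[\phi] = \int \left( \int \mathcal{L} \, dV \right) dt$$
where the inner integral is over space. So the inner spatial integral now plays the role of the Lagrangian, and its integrand $\mathcal{L}$ is the Lagrangian density.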
Klein-Gordon equation in (i.e. ) is given by
where is called the mass.
If , then this is the wave equation, whence the Klein-Gordon equation is a sort of massive wave equation.
More succinctly, introducing the matrix
then the Klein-Gordon equation can be written
Note: you sometimes might see this written
where they use the notation so we have sort of "summed out" the .
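A plane wave gives a quick check of the "massive wave equation" interpretation. The sketch assumes one space dimension, units with c = 1, and the sign convention phi_tt - phi_xx + m^2 phi = 0, which may differ from the convention used in these notes; the function names and parameter values are hypothetical. With that convention, cos(kx - wt) is a solution exactly when w^2 = k^2 + m^2.

```python
import math

def kg_residual(phi, t, x, mass, h=1e-3):
    """Finite-difference value of phi_tt - phi_xx + mass^2 * phi at (t, x)."""
    phi_tt = (phi(t + h, x) - 2.0 * phi(t, x) + phi(t - h, x)) / (h * h)
    phi_xx = (phi(t, x + h) - 2.0 * phi(t, x) + phi(t, x - h)) / (h * h)
    return phi_tt - phi_xx + mass * mass * phi(t, x)

def plane_wave(k, omega):
    """phi(t, x) = cos(k x - omega t)."""
    return lambda t, x: math.cos(k * x - omega * t)

mass, k = 1.5, 2.0
on_shell = plane_wave(k, math.sqrt(k * k + mass * mass))  # omega = 2.5
massless = plane_wave(k, k)                               # solves the wave equation only

res_on_shell = kg_residual(on_shell, 0.7, 0.4, mass)
res_massless = kg_residual(massless, 0.7, 0.4, mass)
print(res_on_shell, res_massless)
```

The wave-equation solution with w = k fails the massive equation by exactly the term m^2 phi, which is the sense in which the mass modifies the wave equation.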
Calculus of variations with improper integrals
is unbounded, and so we need to consider
where
i.e. the closed ball of radius
Vary the action
where we have omitted the boundary term
which is seen by applying the Divergence Theorem and using the BCs on the variation
Using Fundamental Lemma of Calculus of Variations we obtain the E-L equations
Noether's Theorem for improper actions
- Consider action function for a classical field which is invariant under continuous one-parameter symmetry with Noether current
Integrate (zero) divergence of the current on a "cylindrical region" and apply the Divergence Theorem
- consists of "sides" of the "cylinder", where
- is the m-sphere of radius
- top cap
- bottom cap
Can rewrite the above as
using the fact that points outward at the bottom cap → negative axis
- Last term vanishes due to BCs on the field implies on as
Since arbitrary, we have
is conserved, i.e. we have a Noether's charge for the improper case!
Maxwell equations
Notation
- $B$ is the magnetic field
- $E$ is the electric field
- $\rho$ is the electric charge density
- $j$ is the electric current density
- $A$ is the magnetic potential
Let
Stuff
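Maxwell's equations (written here in Heaviside-Lorentz units with $c = 1$)
$$\nabla \cdot B = 0, \qquad \nabla \times E = -\frac{\partial B}{\partial t}, \qquad \nabla \cdot E = \rho, \qquad \nabla \times B = \frac{\partial E}{\partial t} + j$$
Observe that $\nabla \cdot B = 0$ can be solved by writing
$$B = \nabla \times A$$
This does not determine $A$ uniquely since
$$A \mapsto A + \nabla \theta$$
leaves $B$ unchanged (since $\nabla \times \nabla \theta = 0$), and is called a gauge transformation
Substituting into Maxwell's equations:
$$\nabla \times \left( E + \frac{\partial A}{\partial t} \right) = 0$$
Thus there exists a function $\phi$ (again since $\nabla \times \nabla \phi = 0$), called the electric potential, such that
$$E = -\nabla \phi - \frac{\partial A}{\partial t}$$
A gauge transformation of $A$ would change $E$ unless we also transform
$$\phi \mapsto \phi - \frac{\partial \theta}{\partial t}$$
In summary, two of Maxwell's equations can be solved by
$$B = \nabla \times A, \qquad E = -\nabla \phi - \frac{\partial A}{\partial t}$$
where $A$ and $\phi$ are defined up to gauge transformations
$$A \mapsto A + \nabla \theta, \qquad \phi \mapsto \phi - \frac{\partial \theta}{\partial t}$$
for some function $\theta$
- We can fix the "gauge freedom" (i.e. limit the space of functions $\theta$) by imposing restrictions on $(\phi, A)$, which is often referred to as a choice of gauge, e.g. Lorenz gauge
The ambiguity in the definition of $\phi$ and $A$ in Maxwell's equations can be exploited to impose the Lorenz gauge condition:
$$\nabla \cdot A + \frac{\partial \phi}{\partial t} = 0$$
In which case the remaining two Maxwell equations become wave equations with "sources":
$$\frac{\partial^2 \phi}{\partial t^2} - \nabla^2 \phi = \rho, \qquad \frac{\partial^2 A}{\partial t^2} - \nabla^2 A = j$$
From these wave equations we get electromagnetic waves!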
Maxwell's equations are variational
- Let and at first
Consider Lagrangian density
as functions of and , i.e.
Observe that
does not depend explicitly on or , only on their derivatives, so E-L are
and
which are precisely the two remaining Maxwell equations when and .
We can obtain the Maxwell equations with and nonzero by modifying :
We can rewrite by introducing the electromagnetic 4-potential
with so that .
The electromagnetic 4-current is defined
so that .
We define the fieldstrength
which obeys .
We can think of as entries of the antisymmetric matrix
where we have used that
In the terms of the fieldstrength we can write Maxwell's equations as
where we have used the "raised indices" of with as follows:
The Euler-Lagrange equations of are given by
and the gauge transformations are
under which are invariant.
In the absence of sources, so when , is gauge invariant.
Let denote the action corresponding to , then
where
and
Therefore,
Substituting this into our E-L equations from above, we (apparently) get
In the absence of sources, so when , is gauge invariant. This is seen by only considering the second-order of the transformation
- First show that is invariant under the following
Consider
where
and
then
- Find the Noether currents and
Examples
The Kepler Problem
- Illustrates Noether's Theorem and some techniques for the calculation of Poisson brackets
- Will set up problem both from a Lagrangian and a Hamiltonian point of view and show how to solve the system by exploiting conserved quantities
Notation
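- Two particles of masses $m_1$ and $m_2$ moving in $\mathbb{R}^3$, with $x_1$ and $x_2$ denoting the corresponding positions
- Assuming particles cannot occupy same position at same time, i.e. $x_1(t) \neq x_2(t)$ for all $t$.
We then have the total kinetic energy of the system given by
$$T = \frac{1}{2} m_1 |\dot{x}_1|^2 + \frac{1}{2} m_2 |\dot{x}_2|^2$$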
and potential energy
Lagrangian description
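The Lagrangian is, as usual, given by
$$L = T - V$$
$L$ is invariant under the diagonal action of the Euclidean group of $\mathbb{R}^3$ on the configuration space, i.e. if $A: \mathbb{R}^3 \to \mathbb{R}^3$ is an orthogonal transformation and $a \in \mathbb{R}^3$, then
$$(x_1, x_2) \mapsto (A x_1 + a, \; A x_2 + a)$$
leaves the Lagrangian invariant.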