Measure theory
Table of Contents
Notation
and
are used to denote the indicator or characteristic function
Definition
Motivation
The motivation behind defining such a thing is related to the Banach-Tarski paradox, which says that it is possible to decompose the 3-dimensional solid unit ball into finitely many pieces and, using only rotations and translations, reassemble the pieces into two solid balls each with the same volume as the original. The pieces in the decomposition, constructed using the axiom of choice, are non-measurable sets.
Informally, the axiom of choice says that, given a collection of bins, each containing at least one object, it is possible to select exactly one object from each bin.
Measure space
If
is a set with the sigma-algebra
and the measure
, then we have a measure space .
Product measure
Given two measurable spaces and measures on them, one can obtain a product measurable space and a product measure on that space.
A product measure
is defined to be a measure on the measurable space
, where we've let
be the sigma-algebra on the Cartesian product
. This sigma-algebra is called the tensor-product sigma-algebra on the product space, which is defined
A product measure
is defined to be a measure on the measurable space
satisfying the property
and 
Let
be a sequence of extended real numbers.
The limit inferior is defined
The limit superior is defined
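A quick numerical sanity check of my own (not from the notes): for $a_n = (-1)^n + 1/n$ we have $\liminf_n a_n = -1$ and $\limsup_n a_n = 1$, which a sketch approximating the tail infima/suprema reflects.

```python
import numpy as np

# a_n = (-1)^n + 1/n for n = 1..N; liminf is -1, limsup is +1.
N = 10_000
n = np.arange(1, N + 1)
a = (-1.0) ** n + 1.0 / n

# tail_inf[i] = min_{k >= i} a_k and tail_sup[i] = max_{k >= i} a_k,
# computed as suffix minima/maxima.
tail_inf = np.minimum.accumulate(a[::-1])[::-1]
tail_sup = np.maximum.accumulate(a[::-1])[::-1]

# Evaluate the monotone tail sequences deep into the sequence.
print("approx liminf:", tail_inf[N // 2])   # close to -1
print("approx limsup:", tail_sup[N // 2])   # close to +1
```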
Premeasure
Given a space
, and a collection of sets
is an algebra of sets on
if

- If
, then 
- If
and
are in
, then 
Thus, an algebra of sets allows only finite unions, unlike σ-algebras, where we allow countable unions.
Given a space
and an algebra
, a premeasure is a function
such that

For every finite or countable collection of disjoint sets
with
, if
then
Observe that the last property says that IF this "possibly large" union is in the algebra, THEN that sum exists.
A premeasure space is a triple
where
is a space,
is an algebra, and a premeasure
.
Complete measure
A complete measure (or, more precisely, a complete measure space) is a measure space in which every subset of every null set is measurable (and has measure zero).
More formally,
is complete if and only if
If
is a premeasure space, then there is a complete measure space
such that

we have 
If
is σ-finite, then
is the only measure on
that is equal to
on
.
Atomic measure
Let
be a measure space.
Then a set
is called an atom if
and
A measure
which has no atoms is called non-atomic or diffuse
In other words, a measure
is non-atomic if for any measurable set
with
, there exists a measurable subset
s.t.
π-system
Let
be any set. A family
of subsets of
is called a π-system if

If
, then
So this is an even weaker notion than being a (Boolean) algebra. We introduce it because it's sufficient to prove uniqueness of measures:
Theorems
Jensen's inequality
Let
be a probability space
be random variable
be a convex function
Then
is the supremum of a sequence of affine functions
for
, with
.
Then
is well-defined, and
Taking the supremum over
in this inequality, we obtain
be a convex function
Then
is the supremum of a sequence of affine functions
Suppose
is convex, then for each point
there exists an affine function
s.t.
- the line
corresponding to
passes through 
- the graph of
lies entirely above 
Let
be the set of all such functions. We have
because
passes through the point 
because all
lie below 
Hence
(note this is for each
, i.e. pointwise).
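A quick Monte Carlo illustration of my own (a sketch, not part of the proof): for the convex function $\varphi(x) = x^2$ and an integrable random variable $X$, Jensen gives $\varphi(\mathbb{E}[X]) \le \mathbb{E}[\varphi(X)]$. The distribution below is just an arbitrary choice for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.exponential(scale=2.0, size=1_000_000)  # a rv with finite variance

phi = lambda x: x ** 2   # convex function

lhs = phi(X.mean())      # phi(E[X])  ~ 4
rhs = phi(X).mean()      # E[phi(X)]  ~ Var(X) + E[X]^2 = 8
print(lhs, "<=", rhs)    # Jensen's inequality: lhs <= rhs
```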
Sobolev space
Notation
is an open subset of 
denotes an infinitely differentiable function
with compact support
is a multi-index of order
, i.e.
Definition
Vector space of functions equipped with a norm that is a combination of
norms of the function itself and its derivatives up to a given order.
Intuitively, a Sobolev space is a space of functions with sufficiently many derivatives for some application domain, e.g. PDEs, and equipped with a norm that measures both size and regularity of a function.
The Sobolev spaces
combine the concepts of weak differentiability and Lebesgue norms (i.e.
spaces).
For a proper definition for different cases of dimension of the space
, have a look at Wikipedia.
Motivation
Integration by parts yields that for every
where
, and for all infinitely differentiable functions with compact support
:
Observe that the LHS only makes sense if we assume
to be locally integrable. If there exists a locally integrable function
, such that
we call
the weak
-th partial derivative of
. If this exists, then it is uniquely defined almost everywhere, and thus it is uniquely determined as an element of a Lebesgue space (i.e.
function space).
On the other hand, if
, then the classical and the weak derivative coincide!
Thus, if
, we may denote it by
.
Example
is not continuous at zero, and not differentiable at −1, 0, or 1. Yet the function
satisfies the definition of being the weak derivative of
, which then qualifies as being in the Sobolev space
(for any allowed
).
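A numerical sanity check of my own for the motivating identity: with $u(x) = |x|$, $v(x) = \operatorname{sign}(x)$ and a smooth bump test function $\varphi$ with compact support, we should see $\int u\, \varphi' \,dx = -\int v\, \varphi\, dx$, which is exactly the statement that $v$ is the weak derivative of $u$. The particular bump below is just one convenient choice.

```python
import numpy as np

def phi(x):
    """Smooth bump supported in (-0.7, 1.3) so the check is not symmetric."""
    out = np.zeros_like(x)
    inside = np.abs(x - 0.3) < 1.0
    out[inside] = np.exp(-1.0 / (1.0 - (x[inside] - 0.3) ** 2))
    return out

x = np.linspace(-2.0, 2.0, 400_001)
dx = x[1] - x[0]
u = np.abs(x)        # u(x) = |x|
v = np.sign(x)       # candidate weak derivative of u

dphi = np.gradient(phi(x), dx)          # numerical phi'
lhs = np.sum(u * dphi) * dx             # integral of u * phi'
rhs = -np.sum(v * phi(x)) * dx          # -integral of v * phi
print(lhs, rhs)                         # the two values agree closely
```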
Lebesgue measure
Notation
denotes the collection of all measurable sets
Stuff
Given a subset
, with the length of a closed interval
given by
, the Lebesgue outer measure
is defined as
Lebesgue outer-measure has the following properties:
Idea: Cover by
.
(Monotonicity)
if
, then
Idea: a cover of
is a cover of
.
(Countable subadditivity) For every set
and every sequence of sets
if
then
Idea: construct a cover of each
,
such that
:
- Every point in
is in one of the 

- Every point in
Q: Is it possible for every
to find a cover
such that
?
A: No. Consider
. Given
, consider
.
This is a cover of
so
.
If
is a cover by open intervals of
, then there is at least one
such that
is a nonempty open interval, so it has a strictly positive length, and
If
, then
Idea:
, so
.
For reverse, cover
by intervals giving a sum within
.
Then cover
and
by intervals of length
.
Put the 2 new sets at the start of the sequence, to get a cover of
, and sum of the lengths is at most
. Hence,
If
is an open interval, then
.
Idea: lower bound from
.
Only bounded nonempty intervals are interesting.
Take the closure to get a compact set. Given a countable cover by open intervals, reduce to a finite subcover.
Then arrange a finite collection of intervals in something like increasing order, possibly dropping unnecessary sets.
Call these new intervals
and let
be the number of such intervals, and such that
i.e. the left-most interval covers the starting point, and the right-most interval covers the end point. Then
Taking the infimum,
The Lebesgue measure is then defined on the Lebesgue sigma-algebra, which is the collection of all the sets
which satisfy the condition that, for every
For any set in the Lebesgue sigma-algebra, its Lebesgue measure is given by its Lebesgue outer measure
.
IMPORTANT!!! This is not necessarily related to the Lebesgue integral! It CAN be, but the integral is more general than JUST over some Lebesgue measure.
Intuition
- First part of definition states that the subset
is reduced to its outer measure by coverage by sets of closed intervals
- Each set of intervals
covers
in the sense that when the intervals are combined together by union, they contain 
- Total length of any covering interval set can easily overestimate the measure of
, because
is a subset of the union of the intervals, and so the intervals include points which are not in 
Lebesgue outer measure emerges as the greatest lower bound (infimum) of the lengths from among all possible such sets. Intuitively, it is the total length of those interval sets which fit
most tightly and do not overlap.
In my own words: the Lebesgue outer measure is the smallest sum of the lengths of subintervals
s.t. the union of these subintervals
completely "covers" (i.e. are equivalent to)
.
If you take a real interval
, then the Lebesgue outer measure is simply
.
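In the same spirit, a small Python sketch of my own (not from the notes): for a finite union of intervals, the outer measure is just the total length after merging overlaps, which a greedy left-to-right sweep computes.

```python
def outer_measure_of_intervals(intervals):
    """Total length of a finite union of intervals = its Lebesgue (outer) measure."""
    merged = []
    for a, b in sorted(intervals):          # sweep left to right
        if merged and a <= merged[-1][1]:   # overlaps the previous merged interval
            merged[-1][1] = max(merged[-1][1], b)
        else:
            merged.append([a, b])
    return sum(b - a for a, b in merged)

# Overlapping parts are only counted once: [0,1] ∪ [0.5,2] ∪ [3,4] has measure 3.
print(outer_measure_of_intervals([(0, 1), (0.5, 2), (3, 4)]))  # -> 3.0
```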
Properties
Notation
For
and
, we let
Stuff
The collection of Lebesgue measurable sets is a sigma-algebra.
Easy to see
is in this collection:
Closed under complements is clear: let
be Lebesgue measurable, then
hence this is also true for
, and so
is Lebesgue measurable.
- Closed under countable unions:
Finite case:
.
Consider
both Lebesgue measurable and some set
.
Since
is L. measurable:
Since
is L. measurable:
which allows us to rewrite the above equation for
:
Observe that
By subadditivity:
Hence,
Then this follows for all finite cases by induction.
Countable disjoint case: Let
, and
. Further, let
.
Hence
is L. measurable. Thus,
Since the
are disjoint
and
:
Let
and note that
.
Thus, by induction
Thus,
Taking
:
Thus,
is L. measurable if the
are disjoint and L. measurable!
- Countable (not-necessarily-disjoint) case:
If
are not disjoint, let
and let
, which gives a sequence of disjoint sets, hence the above proof applies.
Every open interval is Lebesgue measurable, and the Borel sigma-algebra is a subset of the sigma-algebra of Lebesgue measurable sets.
Want to prove measurability of intervals of the form
.
Idea:
- split any set
into the left and right part
- split any cover in the same way
- extend covers by
to make them open
is a measure space, and for all intervals
, the measure is the length.
Cantor set
Define
For
, with
being identity, and
Let
and
. Then the Cantor set is defined
The Cantor set has a Lebesgue measure zero.
We make the following observations:
- Scaled and shifted closed sets are closed
is a finite union of closed intervals and so is in the Borel sigma-algebra
- σ-algebras are closed under countable intersections, hence the Cantor set is in the Borel σ-algebra
- Finally, the Borel σ-algebra is a subset of the Lebesgue measurable sets, hence the Cantor set is Lebesgue measurable!
Since the Lebesgue measure satisfies
for any Lebesgue measurable set
with finite measure and any
with
. Since Lebesgue measure is subadditive, we have for any
Since
, by induction, it follows that
Taking the infimum over
, we have that the Cantor set has measure zero:
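A quick numerical restatement of this argument (my own sketch): at stage $n$ the set $C_n$ is a union of $2^n$ closed intervals of length $3^{-n}$, so $m(C_n) = (2/3)^n \to 0$; equivalently, the removed open middle thirds have total length $\sum_{k \ge 1} 2^{k-1} 3^{-k} = 1$.

```python
# Measure of the n-th stage C_n of the Cantor construction, and the total
# length removed up to that stage; the first tends to 0, the second to 1.
for n in [1, 2, 5, 10, 20, 40]:
    measure_Cn = (2 / 3) ** n
    removed = sum(2 ** (k - 1) / 3 ** k for k in range(1, n + 1))
    print(n, measure_Cn, removed)
```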
Cardinality of the Cantor set
Let
.
The ternary expansion is a sequence
with
such that
The Cantor set
is uncountable.
We observe that if the first
elements of the expansion for
are in
, then
. But importantly, observe that some numbers have more than one ternary expansion, i.e.
in the ternary expansion. One can show that a number
if and only if
has a ternary expansion with no 1 digits. Hence, the Cantor set
is uncountable!
One can see that
if and only if it has a ternary expansion with no 1 digits, since such an
would land in the "gaps" created by the construction of the Cantor set.
Uncountable Lebesgue measurable set
There exist uncountable Lebesgue measurable sets.
Menger sponge
- Generalization of Cantor set to

Vitali sets
Let
if and only if
.
- There are uncountably many equivalence classes, with each equivalence class being countable (as a set).
- By axiom of choice, we can pick one element from each equivalence class.
- Can assume each representative picked is in
, and this set we denote 
Suppose, for the sake of contradiction, that
is measurable.
Observe if
, then there is a
and
s.t.
, i.e.
Then, by countable additivity
where we've used
Hence, we have our contradiction and so this set, the Vitali set, is not measurable!
There exists a subset of
that is not measurable wrt. Lebesgue measure.
Lebesgue Integral
The Lebesgue integral of a function
over a measure space
is written
which means we're taking the integral wrt. the measure
.
Special case: non-negative real-valued function
Suppose that
is a non-negative real-valued function.
Using the "partitioning of range of
" philosophy, the integral of
should be the sum over
of the elementary area contained in the thin horizontal strip between
and
, which is just
Letting
The Lebesgue integral of
is then defined by
where the integral on the right is an ordinary improper Riemann integral. For the set of measurable functions, this defines the Lebesgue integral.
Radon measure
- Hard to find a good notion of measure on a topological space that is compatible with the topology in some sense
- One way is to define a measure on the Borel set of the topological space
Let
be a measure on the sigma-algebra of Borel sets of a Hausdorff topological space
.
is called inner regular or tight if, for any Borel set
,
is the supremum of
over all compact subsets of
of
, i.e.
where
denotes the compact interior, i.e. union of all compact subsets
.
is called outer regular if, for any Borel set
,
is the infimum of
over all open sets
containing
, i.e.
where
denotes the closure of
.
is called locally finite if every point of
has a neighborhood
for which
is finite (if
is locally finite, then it follows that
is finite on compact sets)
The measure
is called a Radon measure if it is inner regular and locally finite.
Suppose
and
are two
measures on a measurable space
and
is absolutely continuous wrt.
.
Then there exists a non-negative, measurable function
on
such that
The function
is called the density or Radon-Nikodym derivative of
wrt.
.
If a Radon-Nikodym derivative of
wrt.
exists, then
denotes the equivalence class of measurable functions that are Radon-Nikodym derivatives of
wrt.
.
is often used to denote
, i.e.
is just in the equivalence class of measurable functions such that this is the case.
This comes from the fact that we have
Suppose
and
are Radon-Nikodym derivatives of
wrt.
iff
.
The δ measure cannot have a Radon-Nikodym derivative since integrating
gives us zero for all measurable functions.
Continuity of measure
Suppose
and
are two sigma-finite measures on a measure space
.
Then we say that
is absolutely continuous wrt.
if
We say that
and
are equivalent if each measure is absolutely continuous wrt. to the other.
Density
Suppose
and
are two sigma-finite measures on a measure space
and that
is absolutely continuous wrt.
. Then there exists a non-negative, measurable function
on
such that
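A concrete example of my own (a sketch, not from the notes): the standard Gaussian measure on $\mathbb{R}$ is absolutely continuous wrt. Lebesgue measure, with Radon-Nikodym derivative the usual density $f(x) = (2\pi)^{-1/2} e^{-x^2/2}$, so $\nu([a,b]) = \int_a^b f \, dx$. A numerical check against the closed form via the error function:

```python
import math

def gauss_density(x):
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)

def nu_of_interval(a, b, steps=100_000):
    """nu([a,b]) = integral of the density over [a,b] wrt Lebesgue measure (midpoint rule)."""
    h = (b - a) / steps
    return sum(gauss_density(a + (i + 0.5) * h) for i in range(steps)) * h

def gauss_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

a, b = -1.0, 2.0
print(nu_of_interval(a, b))          # integral of the RN derivative over [a, b]
print(gauss_cdf(b) - gauss_cdf(a))   # the same measure computed from the CDF
```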
Measure-preserving transformation
is a measure-preserving transformation on the measure space
if
Measure
A measure on a set is a systematic way of assigning a number to each subset of that set, intuitively interpreted as its size.
In this sense, a measure is a generalization of the concepts of length, area, volume, etc.
Formally, let
be a
of subsets of
.
Suppose
is a function. Then
is a measure if

Whenever
are pairwise disjoint subsets of
in
, then
- Called σ-additivity or countable additivity
Properties
Let
be a measure space, and
such that
.
Then
.
Let
Then
, and by the finite additivity property of a measure:
since
by definition of a measure.
If
are
subsets of
, then
We know for a sequence of disjoint sets
we have
So we just let
Then,
Thus,
Concluding our proof!
Let
be an increasing sequence of measurable sets.
Then
Let
be sets from some
.
If
, then
Examples of measures
Let
be a space
The δ-measure (at
) is
Sigma-algebra
Definition
Let
be some set, and let
be its power set. Then the subset
is a called a σ-algebra on
if it satisfies the following three properties:

is closed under complement: if 
is closed under countable unions: if 
These properties also imply the following:

is closed under countable intersections: if 
Generated σ-algebras
Given a space
and a collection of subsets
, the σ-algebra generated by
, denoted
, is defined to be the intersection of all σ-algebras on
that contain
, i.e.
where
Let
be a measurable space and
a function from some space
to
.
The σ-algebra generated by
is
Observe that though this is similar to the σ-algebra generated by a MEASURABLE function, the definition differs in the sense that the preimage does not have to be measurable. In particular, the σ-algebra generated by a measurable function can be defined as above, where
is measurable by definition of
being a measurable function, hence corresponding exactly to the other definition.
Let
and
be measure spaces and
a measurable function.
The σ-algebra generated by
is
Let
be a space.
If
is a collection of σ-algebras, then
is also a σ-algebra.
σ-finite
A measure or premeasure space
is finite if
.
A measure
on a measure space
is said to be sigma-finite if
can be written as a countable union of measurable sets of finite measure.
Example: counting measure on uncountable set is not σ-finite
Let
be a space.
The counting measure is defined to be
such that
On any uncountable set, the counting measure is not σ-finite, since if a set has finite counting measure it has countably many elements, and a countable union of finite sets is countable.
Properties
Let
be a
of subsets of a set
. Then

If
, then
- If
then 
Borel sigma-algebra
Any set in a topological space that can be formed from the open sets through the operations of:
- countable union
- countable intersection
- complement
is called a Borel set.
Thus, for some topological space
, the collection of all Borel sets on
forms a σ-algebra, called the Borel algebra or Borel σ-algebra .
More compactly, the Borel σ-algebra on
is
where
is the σ-algebra generated by the standard topology on
.
Borel sets are important in measure theory, since any measure defined on the open sets of a space, or on the closed sets of a space, must also be defined on all Borel sets of that space.
Any measure defined on the Borel sets is called a Borel measure.
Lebesgue sigma-algebra
Basically the same as the Borel sigma-algebra but the Lebesgue sigma-algebra forms a complete measure.
Note to self
Suppose we have the Lebesgue measure on the real line, with measure space
.
Suppose that
is a non-measurable subset of the real line, such as the Vitali set. Then the
measure of
is not defined, but
and this larger set (
) does have
measure zero, i.e. it's not complete !
Motivation
Suppose we have constructed Lebesgue measure on the real line: denote this measure space by
. We now wish to construct some two-dimensional Lebesgue measure
on the plane
as a product measure.
Naïvely, we could take the sigma-algebra on
to be
, the smallest sigma-algebra containing all measurable "rectangles"
for
.
While this approach does define a measure space, it has a flaw: since every singleton set has one-dimensional Lebesgue measure zero,
for any subset of
.
What follows is the important part!
However, suppose that
is a non-measurable subset of the real line, such as the Vitali set. Then the
measure of
is not defined (since we just supposed that
is non-measurable), but
and this larger set (
) does have
measure zero, i.e. it's not complete !
Construction
Given a (possibly incomplete) measure space
, there is an extension
of this measure space that is complete .
The smallest such extension (i.e. the smallest sigma-algebra
) is called the completion of the measure space.
It can be constructed as follows:
- Let
be the set of all
measure zero subsets of
(intuitively, those elements of
that are not already in
are the ones preventing completeness from holding true)
- Let
be the sigma-algebra generated by
and
(i.e. the smallest sigma-algebra that contains every element of
and of
)
has an extension to
(which is unique if
is sigma-finite), called the outer measure of
, given by the infimum
Then
is a complete measure space, and is the completion of
.
What we're saying here is:
- For the "multi-dimensional" case we need to take into account the zero-elements in the resulting sigma-algebra due the product between the 1D zero-element and some element NOT in our original sigma-algebra
- The above point means that we do NOT necessarily get completeness, despite the sigma-algebras defined on the sets individually prior to taking the Cartesian product being complete
- To "fix" this, we construct a outer measure
on the sigma-algebra where we have included all those zero-elements which are "missed" by the naïve approach, 
Measurable functions
Let
and
be measurable spaces.
A function
is a measurable function if
where
denotes the preimage of the
for the measurable set
.
Let
.
We define the indicator function of
to be the function
given by
Let
. Then
is measurable if and only if
.
Let
be a measure space or a probability space.
Let
be a sequence of measurable functions.
- For each
, the function
is measurable
- The function
is measurable
- Thus, if
converge pointwise,
is measurable.
Let
be a measurable space, and let
.
The following statements are equivalent:
is measurable.
we have
.
we have
.
we have
.
we have
.
A function
is measurable if
We also observe that by Proposition proposition:equivalent-statements-to-being-a-measurable-function, it's sufficient to prove
so that's what we set out to do.
For
and
, consider the following equivalent statements:
Thus,
so
Recall that for each
, the sequence
is an increasing sequence in
. Therefore, similarly, the following are equivalent:
Thus,
Hence,
concluding our proof!
Basically says the same as Prop. proposition:limits-of-measurable-functions-are-measurable, but a bit more "concrete".
Let
be a
of subsets of a set
, and let
with
be a sequence of measurable functions.
Furthermore, let
Then
is a measurable function.
Simple functions
Let
be a
of subsets of a set
.
A function
is called a simple function if
- it is measurable
- only takes a finite number of values
Let
be a
of subsets of a set
.
Let
be a nonnegative measurable function.
Then there exists a sequence
of simple functions such that
for all 
Converges to
:
Define a function
as follows. Let
and let
Then the function
obeys the required properties!
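A sketch of the standard construction in code (my own, following the usual dyadic recipe rather than anything specific in the notes): $\varphi_n(x) = \min\big(\lfloor 2^n f(x) \rfloor / 2^n,\; n\big)$ is simple, and increases pointwise to $f$.

```python
import numpy as np

def simple_approx(f, n):
    """n-th dyadic simple-function approximation of a nonnegative function f."""
    def phi_n(x):
        return np.minimum(np.floor((2.0 ** n) * f(x)) / (2.0 ** n), n)
    return phi_n

f = lambda x: np.exp(x)             # some nonnegative measurable function
x = np.linspace(-2.0, 2.0, 7)       # a few sample points

prev = np.zeros_like(x)
for n in range(1, 11):
    cur = simple_approx(f, n)(x)
    assert np.all(cur >= prev - 1e-12)   # pointwise increasing in n
    prev = cur
print(np.max(np.abs(prev - f(x))))       # small: phi_n -> f pointwise
```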
Almost everywhere and almost surely
Let
be a measure or probability space.
Let
be a sequence of measurable functions
- For each
the function
is measurable
- The function
is measurable
- Thus, if the
converge pointwise, then
is measurable
Let
be a measure space. Let
be a condition in one variable.
holds almost everywhere (a.e.) if
Let
be a probability space and
be a condition in one variable, then
holds almost surely (a.s.) if
also denoted
Let
be a complete measure space.
- If
is measurable and if
a.e. then
is measurable.
- Being equal a.e. is an equivalence relation on measurable functions.
Convergence theorems for nonnegative functions
Problems
Clearly if
with
s.t.
, then
hence
Therefore it's sufficient to prove that if
, then there exists a non-degenerate open interval
s.t.
. (first I said contained in
, but that is an unnecessarily strong statement; if contained then what we want would hold, but what we want does not imply containment).
As we know, for every
there exists
such that
and
Which implies
which implies
Letting
, this implies that there exists an open cover
s.t.
and
(that this is true can be seen by considering
for all
and see that this would imply
not being a cover of
, and if
, then since
there exists a "smaller" cover).
Thus,
Hence, letting
be s.t.
we have
as wanted!
, we have
for almost every
if and only if for almost every
,
for all
.
This is equivalent to saying
if and only if
i.e.
is a set of measure zero.
Then clearly
by the assumption.
Follows by the same logic:
This concludes our proof.
Integration
Notation
We let
where
Stuff
Let
where
are a set of positive values.
Then the integral
of
over
wrt.
is given by
Let
be a sequence of nonnegative measurable functions on
. Assume that
for each 
for each
.
Then, we write
pointwise.
Then
is measurable, and
Let
. By Proposition proposition:limit-of-measurable-functions-is-measurable,
is measurable.
Since each
satisfies
, we know
.
- If
, then since
and for all
we have
, and
.
Let
and
.
Step 1: Approximate
by a simple function.
Let
be a simple function such that
and
.
Such an
exists by definition of Lebesgue integral. Thus, there are
such that
, and disjoint measurable sets
such that
If any
, it doesn't contribute to the integral, so we may ignore it and assume that there are no such sets.
Step 2: Find sets of large measure where the convergence is controlled.
Note that for all
we have
That is, for each
and
,
For
and
, let
And since it's easier to work with disjoint sets,
Observe that,
Then,
We don't have a "rate of convergence" on
, but on
we know that we are
close, and so we can "control" the convergence.
Step 3: Approximate
from below.
For each
if
, then let
be such that
and otherwise, let
be such that
Let
, and let
.
For each
,
and
we have
Thus,
, and
,
If there is a
such that
, then
Otherwise (if the integral is finite), then
For every
and
, there is an
such that
For every
such that
Therefore
Thus,
as wanted.
Let
be any nonnegative measurable functions on
.
Then
Let
and observe
are pointwise increasing
Properties of integrals
Let
be a measure space.
If
is a nonnegative measurable function, then there is an increasing sequence of simple functions
such that
Given
as above and
for
, let
and
Or a bit more explicit (and maybe a bit clearer),
For each
,
is a cover of
. On each
we have
, hence
on entirety of
.
Consider
. If
, then for
which in turn implies
Hence
.
Finally, if
, then
and for all
take on values
Hence,
for all cases.
Furthermore, for any
and
, there is the nesting property
so on
we have
.
(This can be seen by observing that what we're really doing here is dividing the values
takes on into a grid, and observing that if we're in
then we're either in
or
).
For
, then
so again
and
is pointwise increasing.
Let
be a measure space.
Let
be nonnegative, measurable functions
s.t.
is defined
be a sequence of nonnegative measurable functions.
Then
Finite sum
Scalar multiplication
Infinite sums
Let
and
be increasing sequence of simple functions converging to
,
, respectively.
Note
is also increasing to
.
By monotone convergence theorem
The argument is similar for products.
Finally,
is an increasing sequence of nonnegative measurable functions, since sums of measurable functions are measurable.
Thus, by monotone convergence and the result for finite sums
Integrals on sets
Let
be a measure or probability space.
If
is a sequence of disjoint measurable sets then
Let
be a measure or probability space.
If
is a simple function and
is a measurable set, then
is a simple function.
Let
be a measure or probability space.
Let
be a nonnegative measurable function and
.
The integral of
on
is defined to be
Let
be a measure or probability space.
Let
be a nonnegative measurable function.
If
and
are disjoint measurable sets, then
If
are disjoint measurable sets, then
Let
be a measure or probability space.
If
is a nonnegative measurable function, then
defined by
:
is a measure on
.
If
, then
defined by
:
The (real) Gaussian measure on
is defined as:
where
denotes the Lebesgue measure.
A Gaussian probability measure can also be defined for an arbitrary Banach space
as follows:
Then, we say
is a Gaussian probability measure on
if and only if
is a Borel measure, i.e.
such that
is a real Gaussian probability measure on
for every linear functional
, i.e.
.
Here we have used the notation
, defined
where
denotes the Borel measures on
.
Integrals of general functions
Let
be a measure or probability space.
If
is a measurable function, then the positive and negative parts are defined by
Note:
and
are nonnegative.
Let
be a measure or probability space.
If
is a measurable function, then
and
are measurable functions.
Let
be a measure or probability space.
- A nonnegative function is defined to be integrable if it is measurable and
. - A function
is defined to be integrable if it is measurable and
is integrable.
For an integrable function
, the integral of
is defined to be
On a set
, the integral is defined to be
Note that
, but in the actual definition of the integral, we use
.
Let
be a measure or probability space.
If
and
are real-valued integrable functions and
, then
(Scalar multiplication)
(Additive)
Let
be a measure or probability space.
Let
and
be measurable functions s.t.
If
is integrable then
is integrable.
Examples
Consider
with Lebesgue measure. Is
integrable?
And
and
therefore
Thus,
is integrable.
Lebesge dominated convergence theorem
Let
be a measure or probability space.
Let
be a nonnegative integrable function and let
be a sequence of (not necessarily nonnegative!) measurable functions.
Asssume
and all
are real-valued.
If
and
such that
and the pointwise limit
exists.
Then
That is, if there exists a "dominating function"
, then we can "move" the limit into the integral.
Since
and
such that
, we find
that
and
are nonnegative.
Consider
From Fatou's lemma, we have
Therefore
Consider
, then
(this looks very much like Fatou's lemma, but it ain't;
does not necessarily have to be nonnegative as in Fatou's lemma)
Consider
Therefore,
Which implies
Since
, we then have
exists and is equal to
.
Examples of failure of dominated convergence
Where dominated convergence does not work
On
with Lebesgue measure, consider
such that
instead of
as "usual" with
.
Both of these are nonnegative sequences that converge to
pointwise.
Notice there is no integrable dominating function for either of these sequences:
would require a dominating function to have infinite integral, therefore no dominating integrable function exists.
on the right, and so a dominating function would have to be above
on some interval
which would lead to infinite integral.
Thus, Lebesgue dominated convergence does not apply
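Numerically (a sketch of my own): for $f_n = n \cdot 1_{(0, 1/n)}$ every $\int f_n \, dm = 1$, while the pointwise limit is $0$, so $\lim_n \int f_n \ne \int \lim_n f_n$, consistent with no integrable dominating function existing.

```python
import numpy as np

x = np.linspace(0.0, 1.0, 1_000_001)
dx = x[1] - x[0]

for n in [1, 10, 100, 1000]:
    f_n = n * ((x > 0) & (x < 1.0 / n))   # f_n = n on (0, 1/n), 0 elsewhere
    print(n, np.sum(f_n) * dx)            # each integral is ~1

# The pointwise limit is identically 0, whose integral is 0 != 1.
```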
Noncommutative limits: simple case
Noncommutative limits: another one
Consider
with Lebesgue measure and
Consider
and $ b > 1$ and
Note that
, so
is not integrable.
Consider
Commutative limits
Consider
We know that
is integrable and for all
and
,
By multiple applications of LDCT
Showing that in this case the limits do in fact commute.
Riemann integrable functions are measurable
All Riemann integrable functions are measurable.
For any Riemann integrable function, the Riemann integral and the Lebesgue integral are equal.
Almost everywhere and Lp spaces
If
is a nonnegative, measurable function, and
, then
.
For
, let
Observe the
are disjoint and
Suppose that
. This implies that
on a set of positive measure, i.e.
but this implies that
Thus,
which is a contradiction, hence
.
Let
and
be integrable.
is the set of all equivalence classes of integrable functions wrt. the equivalence relation given by a.e. equality, i.e.
If
is an integrable function, the
norm is
If
and
, the integral and norm are defined to be
If
, then
, and
is a real vector space with addition and scalar multiplication given pointwise almost everywhere.
Functions taking on
on a set of zero measure are fine!
These functions are still almost everywhere equal to some integrable function (even though these infinite-valued functions are integrable), hence these are in
.
Let
be a Cauchy sequence. Since the
are integrable, we may assume we choose
valued representatives.
For
, let
be such that for
,
and
.
Thus,
and
Thus,
is finite almost everywhere. Thus, this series is infinite on a set of measure zero, so we may assume the representatives
are zero there and the sum is finite at each
.
Thus,
converges everywhere.
Let
(observe that the last part is just rewriting the
).
By monotone convergence theorem
Observe that pointwise
Applications to Probability
Notation
is a probability space
- Random variable
is a measurable function
denotes the Borel sigma-algebra on 
denotes the probability distribution measure for 
be a sequence of random events
be a sequence of finitely many random events
Probability and cumulative distributions
An elementary event is an element of
.
A random event is an element of
A random variable is a measurable function from
to
.
Let
be a measure space and
be a measurable space
be a measurable function
Then we say that the push-forward of
by
is defined
The probability distribution measure of
, denoted
, is defined
Equivalently, it's the push-forward of
by
:
In certain circles not including measure-theorists (the existence of such circles is trivial), you might hear talk about "probability distributions". Usually what is meant by this is
for some random variable
.
That is, a "distribution of
" usually means that there is some probability space
in which
is a random variable, i.e.
and the "distribution of
" is the corresponding probability distribution measure!
Confusingly enough, they will often talk about "
distribution of
", in which case
is NOT a probability measure, but denotes a probability distribution measure of the random variable.
The cumulative distribution function of
, denoted
, is defined by
where
is the probability distribution measure of
.
The probability distribution measure
is a probability measure on the Borel sets
.
If
is a disjoint sequence of sets in
, then
so
satisfies countable additivity and is a measure.
Finally,
so
is a probability measure.
is increasing
and 
is right continuous (i.e. continuous from the right)
If
, then
Consider the limit as
. Let
so
Then,
which, since
is increasing implies
Let
and
. Let
The
are nested, and similarly
are nested.
Thus, given
, there exists
such that
Let
so
Radon-Nikodym derivatives and expectations
Let
be a rv.
its probability distribution measure
its cumulative distribution function
a Borel measureable function
The following are equivalent:

is a Radon-Nikodym derivative for
wrt.
(the Lebesgue measure but restricted to Borel measurable sets)
(2) and (3) are immediately equivalent:
iff (2) or (3) holds when considering only sets of the form
.
This statement is also equivalent to (1).
Thus (1) is equivalent to (2) or (3) restricted to sets of the form
.
However, sets of the form
generate
, so from the Carathéodory extension theorem this gives
.
To prove
more rigorously, let
for
s.t.
and none of these intervals overlap. That is, all finite unions of left-closed, right-open, disjoint intervals.
Also let
Observe that
and that
One can show that
is a premeasure space. Therefore, by the Carathéodory extension theorem, there is a measure
on
s.t.
Furthermore, since
,
is unique! But both the measures
and
satisfy these properties, thus
which is the definition of
being a Radon-Nikodym derivative of
wrt. Lebesgue measure restricted to the Borel σ-algebra, as wanted.
A function
is a probability density function for
if
is a Radon-Nikodym derivative of the probability distribution measure
, wrt. Lebesgue measure restricted to Borel sets, i.e.
Expectation via distributions
Expectation of a random variable is
If
is a nonnegative function that is
measurable, then
If
is the characteristic function, then, if
,
so
Multiplying by constants and summing over different characteristic functions, we get the result to be true for any simple function.
Given a nonnegative function
, let
be an increasing sequence of simple functions converging pointwise to
.
Note
is the increasing limit of
. By two applications of Monotone Convergence
This technique, of going from characteristic functions → simple functions → general functions, is used heavily, not just in probability theory.
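A small sketch of my own of the change-of-variables identity $\mathbb{E}[g(X)] = \int g \, d\mu_X$: for $X \sim \mathrm{Exp}(1)$ and $g(x) = x^2$, the sample average of $g(X)$ should match the integral of $g$ against the density $e^{-x}$ (both equal $2$).

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.exponential(scale=1.0, size=2_000_000)   # samples with law mu_X
g = lambda x: x ** 2

# Expectation on the probability space, approximated by a sample average.
lhs = g(X).mean()

# Integral of g against the distribution mu_X (density e^{-x} on [0, inf)).
t = np.linspace(0.0, 40.0, 2_000_001)
dt = t[1] - t[0]
rhs = np.sum(g(t) * np.exp(-t)) * dt

print(lhs, rhs)   # both ~ 2
```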
Independent events & Borel-Cantelli theorem
A collection of random events
are independent events if for every finite collection of distinct indices
,
A random event
occurs at
if
.
The probability that the event occurs is
.
If
are independent then
are also independent.
Prove that
are independent.
Consider
, we want to prove
RHS can be written
which is equal to LHS above, and implies that the complement is indeed independent.
The condition that infinitely many of the events occur at
is
This is equivalent to
where we have converted the
and
.
Furthermore,
is itself a random event.
If
then the probability of infinitely many of the events occurring is 0, i.e.
If the
are independent and
, then the probability of infinitely many of the events occurring is 1, i.e.
Suppose
.
Suppose
are now independent and that
.
Fix
. Then
Chebyshev's inequality
Let
be a probability space.
If
is a random variable with mean
and variance
, then
Let
Then
everywhere, so
Hence,
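A Monte Carlo check of the bound, as a sketch of my own (the exponential distribution is just a convenient choice): compare $\mathbb{P}(|X - \mu| \ge k\sigma)$ with $1/k^2$.

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.exponential(scale=1.0, size=1_000_000)   # mean 1, variance 1
mu, sigma = X.mean(), X.std()

for k in [1.5, 2.0, 3.0, 5.0]:
    empirical = np.mean(np.abs(X - mu) >= k * sigma)
    print(k, empirical, "<=", 1.0 / k ** 2)       # Chebyshev's bound
```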
Independent random variables
Let
be a probability space.
A collection of σ-algebras
, where
for all
, is independent if for every collection of events
s.t
for all
, then
is a set of independent events.
A collection of random variables
is independent if the collection of σ-algebras they generate is independent.
A sequence of random variables
is independent and identically distributed (i.i.d) if they are independent variables and for
we have
where
is the cumulative distribution function for
.
Let
and
be independent.
- We have
- If
or
then $\mathbb{E} \big[ \left| XY \right| \big] = 0$
If
and
, then
- If
Furthermore, if
and
, then
Consider
- first nonnegative functions
- subcase
$\mathbb{E} \big[ X \big] = 0$
Since
is nonnegative
Thus,
so
.
Now consider the subcase where
and
.
Let
and
be the σ-algebras generated by
and
.
Observe that
and
are measure spaces. Let
be an increasing sequence of simple functions that are measurable wrt.
and similarily
simple increasing to
and
measurable.
As simple functions, these can be written as
Then,
Since
increases to
, by MCT
Dividing into positive & negative parts & summing gives
.
Strong Law of Large numbers
Notation
are i.i.d. random variables, and we will assume
Stuff
Let
be a probability space and
be a sequence of i.i.d. random variables with
Then the sequence of random variables
converges almost surely to
, i.e.
This is equivalent to
occurring with probability 0, and this is the approach we will take.
First consider
.
For
and
, let
and
Since
are i.i.d. we have
and since variance rescales quadratically,
Using Chebyshev's inequality
Observe then that with
, we have
And so by Borel-Cantelli, since this is a sequence of independent random variables, we have
In particular, for any
, there are almost surely only finitely many
with
Step: showing that we can do this for any
.
Consider
. Observe that by countable subadditivity,
Now let
, which occurs almost surely from the above. For any
, let
Since
, there are only finitely many
s.t.
as found earlier (the parentheses are indeed different here, compared to before). Therefore
is arbitrary, so this is true for all
. Hence,
This proves that there is a subsequential limit almost surely.
Step: subsequential limit to "sequential" limit.
Given
, let
be such that
. Since
are nonnegative
and therefore
and since
,
Since the first and the last expressions converge to
, by the squeeze theorem we have
Step: Relaxing nonnegativity assumption on
.
Suppose
is not necessarily nonnegative. Since, by assumption,
has finite expectation,
is integrable. Therefore we know that the positive and negative parts of
, denoted
, are also integrable. Therefore we can compute the expectations
Similarly, we have that the variance of
is finite, which allows us to the apply the result we found for
being nonnegative to both
and
:
Let
be the set where the mean of the positive / negative part converges. Since
(since otherwise the limit would not converge almost surely). We then have
Thus, almost surely,
, and on this we have convergence, so
Concluding our proof.
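A simulation of the statement, as a sketch of my own: running means of i.i.d. samples converge to the common mean almost surely, so a simulated path of $\frac{1}{n}\sum_{i\le n} X_i$ should settle near $\mathbb{E}[X_1]$ (here $1$, for the arbitrarily chosen exponential distribution).

```python
import numpy as np

rng = np.random.default_rng(3)
n = 1_000_000
X = rng.exponential(scale=1.0, size=n)            # i.i.d. with mean 1

running_mean = np.cumsum(X) / np.arange(1, n + 1)
for m in [10, 1_000, 100_000, n]:
    print(m, running_mean[m - 1])                 # drifts towards 1
```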
Ergodic Theory
Let
be a measure-preserving transformation on a measure space
with
, i.e. it's a probability space.
Then
is ergodic if for every
we have
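A standard example, as a sketch of my own (leaning on Birkhoff's ergodic theorem, which is not stated in these notes): the irrational rotation $T(x) = x + \alpha \bmod 1$ preserves Lebesgue measure on $[0,1)$ and is ergodic for irrational $\alpha$, so its time averages match the space average.

```python
import numpy as np

alpha = np.sqrt(2.0) - 1.0                  # irrational rotation angle
f = lambda x: (x < 0.5).astype(float)       # indicator of [0, 1/2); space average = 1/2

x0 = 0.1                                    # arbitrary starting point
N = 1_000_000
orbit = (x0 + alpha * np.arange(N)) % 1.0   # T^n(x0) = x0 + n*alpha mod 1

time_average = f(orbit).mean()
print(time_average)                         # ~ 0.5 = Lebesgue measure of [0, 1/2)
```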
Bochner integrable
The Bochner integral is a notion of integrability on Banach spaces, and is defined in very much the same way as integrability wrt. Lebesgue-measure.
Let
be a measure space and
a Banach space.
A simple function is defined similarly to before, but now taking values in a Banach space instead. That is,
with the integral
A measurable function
is said to be Bochner integrable if there exists a sequence of integrable simple functions
such that
where the integral on the LHS is an ordinary Lebesgue integral.
If this is the case, then the Bochner integral is defined
It can indeed be shown that a function
is Bochner integrable if and only if
, the
Bochner space, defined similarly to the L1-space for functions but with the absolute value replaced by the
.
Concentration inequalities
Stochastic processes
Let
be a filtration
be an
martingale
be an
stopping time
such that one of the following holds:
such that 
and there exists a constant
s.t. for all
,
almost surely on the event that
.
such that
almost surely for all 
Then
is a.s. well-defined and
.
Furthermore, when
is a super-/sub-martingale rather than a martingale, then the equality is replaced with ≤ / ≥, respectively.
Let
be a supermartingale with
a.s. for all
.
Then for any
Let
be the event that
and
, where we assume
so that
if
for all
.
Clearly
is a stopping time and
. Then by Doob's optional stopping theorem and an elementary calculation
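A simulation sketch of my own illustrating the optional stopping theorem itself (not the calculation above): a symmetric simple random walk is a martingale, and the hitting time of $\pm a$ truncated at $N$ is a bounded stopping time, so $\mathbb{E}[M_{\tau \wedge N}] = M_0 = 0$, which the sample average reflects.

```python
import numpy as np

rng = np.random.default_rng(4)
a, N, trials = 10, 2000, 5_000

stopped_values = np.empty(trials)
for t in range(trials):
    steps = rng.choice([-1, 1], size=N)
    walk = np.cumsum(steps)                  # martingale M_1, ..., M_N (M_0 = 0)
    hits = np.nonzero(np.abs(walk) >= a)[0]  # times at which |M_n| first reaches a
    stop = hits[0] if hits.size else N - 1   # tau ∧ N
    stopped_values[t] = walk[stop]

print(stopped_values.mean())                 # ~ 0 = M_0
```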
Course: Advanced Probability
Notation
is used as a binary operation which takes the minimum of the two arguments
is used as a binary operation which takes the maximum of the two arguments
Lecture 1
Notation
denotes a measurable space (with a measure
it becomes a measure space)
denotes the set of measurable functions wrt.
and non-negative measurable functions
We write
Stuff
Let
be a measure space.
Then there exists a unique
s.t.
for all 
Linearity
for all
with
.
-
for
pointwise.
There exists a unique measure
on
called the product measure
Let
. For
define
Then
is
. Hence, we can define
Then
is
and
where
is the product measure.
Applying the above in both directions, we have
with
and
Conclusion:
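A numerical illustration of my own (under the assumption that iterated Riemann sums approximate the iterated Lebesgue integrals here): for a nonnegative measurable $f$ on $[0,1]^2$, the two iterated integrals and the double integral against the product measure agree.

```python
import numpy as np

n = 2000
x = (np.arange(n) + 0.5) / n        # midpoints in [0, 1]
y = (np.arange(n) + 0.5) / n
X, Y = np.meshgrid(x, y, indexing="ij")

f = X * np.exp(-X * Y)              # a nonnegative function on the unit square

dx = dy = 1.0 / n
double_integral = f.sum() * dx * dy
iterated_xy = (f.sum(axis=1) * dy).sum() * dx   # integrate in y first, then in x
iterated_yx = (f.sum(axis=0) * dx).sum() * dy   # integrate in x first, then in y

print(double_integral, iterated_xy, iterated_yx)   # all three agree
```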
Lecture 2: conditional expectation
Notation
a probability space, i.e. 
denotes rv, i.e.
is
and integrable, with expectation
Also write
or, as we're used to,
instead of
Stuff
Let
with
.
Then
is called the conditional probability of
given
.
Similarily, we define
to be the conditional expectation of
given
.
- Quite restrictive since we require probability of
to be non-zero
- Goal: improve prediction for
if additional "information" is available
- "Information" is modelled by a sigma-algebra

- "Information" is modelled by a sigma-algebra
Let
be a sequence of disjoint events, whose union is
. Set
For any integrable random variable
, we can define
where we set
Notice that
in (discrete) definition of conditional expectation is
Let
, then
because
, and each of these sets are measurable
Notice that this is simply the union of intersections
which is just
But this is just
since
! That is,
Which means we end up with
which is union of
sets and so
is
random variable.
is integrable and
This is easily seen from
since
and
is integrable.
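A tiny finite example of my own: on $\Omega = \{0, \dots, 5\}$ with uniform probability and the partition $G_1 = \{0,1,2\}$, $G_2 = \{3,4,5\}$, the conditional expectation of $X$ given the generated σ-algebra is constant on each block, equal to the average of $X$ over that block, and satisfies $\mathbb{E}\big[\mathbb{E}[X\mid\mathcal{G}]\,1_{G_i}\big] = \mathbb{E}[X\,1_{G_i}]$.

```python
import numpy as np

omega = np.arange(6)
p = np.full(6, 1.0 / 6.0)                   # uniform probability on Omega
X = np.array([1.0, 4.0, 7.0, 2.0, 2.0, 8.0])
partition = [np.array([0, 1, 2]), np.array([3, 4, 5])]

# E[X | G]: on each block G_i, the P-weighted average of X over G_i.
condE = np.empty_like(X)
for G in partition:
    condE[G] = (X[G] * p[G]).sum() / p[G].sum()

for G in partition:
    lhs = (condE[G] * p[G]).sum()           # E[ E[X|G] 1_{G_i} ]
    rhs = (X[G] * p[G]).sum()               # E[ X 1_{G_i} ]
    print(lhs, rhs)                         # equal, as the defining property requires
```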
There's an issue with the (discrete) definition of conditional expectation though. Example:
,
and
be the Lebesgue measure
- Consider the case
- Then consider
is a rv. Then let
Then
Issue: if
has an absolutely continuous distribution, e.g.
, i.e.
then the set we're summing over
is the empty set!
- This motivates the more general definition, which comes next!
Let
with
is a σ-algebra.
A random variable
is called (a version of) the conditional expectation of
given by
if
is 
And
So we write
can be replaced by
throughout
- If
with
it suffices to check for all 
If
with
a rv., then
is
by condition (1) in def of conditional expectation, so it's of the form
for some function
; therefore it's common to define
Let
with
a sigma-algebra.
Then
exists
- Any two versions of
coincide 
- Let
be as in conditional expectation and let
satisfy the conditions in the same def for some
with
almost surely.
- Let
with
(in
because both
are
) Then
since
. The first equality is due to condition (2) in def of cond. expectation.
implies, by def of
, that
- If
, a similar argument shows that
(using
and
)
- The reason why we did the inequality first is because we'll need that later on.
- Let
- We're going to do this by orthogonal projection in
.
Assume
. Since
is a complete subspace of
, so such
has an orthogonal projection
on
, i.e.
Choosing
for some
, we get
so
satisfies (1) and (2) in def of cond expectation, from equation above.
But this is assuming
which is not strict enough for the case when
! So we gotta do some more work.
Assume
. Then
and
for some
. By Step 1, we know that
and
a.s. (by proof of (2) above).
Further, let
with
which is just the set where the sequence is increasing. Then
is
and by MCT we get
Then, letting
,
so
a.s. (and thus
) and
satisfies the conditions in def of cond expectation
For general
, apply Step 2 on
and
to obtain
and
. Then
satisfies the conditions in def of cond expectation.
Let
, i.e. integrable random variable, and let
be a σ-algebra.
We have the following properties:
$\mathbb{E} \big[ \mathbb{E}[X \mid \mathcal{G}] \big] = \mathbb{E}[X]$
- If
is
, then $\mathbb{E}[X \mid \mathcal{G}] \overset{\text{a.s.}}{=} X$
- If
is independent of
, then $\mathbb{E}[X \mid \mathcal{G}] \overset{\text{a.s.}}{=} \mathbb{E}[X]$
- If
, then
. For
and any integrable random variable
, we have
Let
be a sequence of random variables. Suppose further
a.s., then
a.s., for some
random variable
.
(conditional MCT) By MCT, we therefore have
which implies that
This is basically the conditional MCT:
(conditional Fatou's lemma)
(conditional Dominated convergence) If
and
for all
, almost surely, for some integrable variable
, then
(conditional Jensen's inequality) If
is convex, then
In particular, for
, we have
where we used Jensen's inequality for the inequality. Thus we have
For any σ-algebra
, the rv.
is
and satisfies, for all
,
(Tower property)
"Take out what is known": if
is bounded and
, then
since
is then
and
(actually, you need to first consider
for some
, and then extend to all measurable functions
using simple functions as usual)
If
is independent of
, then
Because suppose
and
, then
The set of such intersections
is a π-system generating
, so the desired formula follows from [SOME PROPOSITION].
From defintion of conditional expectation we know that
And since
, we must also have
Let
. Observe that
since
and both
and
are
, thus the their preimages must be in
and
Moreover, from the definition of
, we know that
which implies
i.e.
[TODO]
being independent of
means that
Let
since
is
.
Then
by def of expectation. But
- Proven in proof:1.4.-lec-existence-and-uniqueness-of-conditional-expectation
If we can show that properties (1) and (2) from def:conditional-expectation are satisfied by
, then by (2) we get LHS immediately.
Measurability follows from linear combination of measurables being measurable. And observe that for all
by the fact that
and
are both
.
Let
. Then the set of random variables
of the form
where
is a σ-algebra is uniformly integrable.
Given
, we can find
so that
Then choose
so that
Suppose
, then
. In particular,
so, by Markov's inequality, we have
Then
Since our choice of
was independent of
, we have our proof for any σ-algebra
.
Martingales in discrete time
Let
be a probability space.
A filtration on this space is a sequence
of σ-algebras such that,
We also define
Then
.
A random process (in discrete time) is a sequence of random variables
.
Let
be a random process (discrete time).
Then we define the natural filtration of
to be
, given by
Then
models what we know about
by time
.
We say
is adapted to
if
is
for all
.
This is equivalent to requiring that
for all
.
Let
be a filtration
is
if
is
for each
.
We say a random process is integrable if
is an integrable random variable for all
.
A martingale is an adapted random process (discrete time)
such that
If instead
we say
is a supermartingale.
And if instead
we say
is a submartingale.
Every process which is martingale wrt. a given filtration
is also martingale wrt. its natural filtration.
We say a random variable
is a stopping time if
for all
.
For a stopping time
, we set
Let
and
be stopping times and let
be an adapted process. Then
is a stopping time
is a σ-algebra
- If
, then 
is an
random variable
is adapted
- If
is integrable, then
is integrable.
Let
denote a probability space
be a filtration
adapted to 
Note that
And, since
and
are a stopping times,
which also implies
, since a σ-algebra is closed under complements
- Similarly for

- σ-algebra is closed under finite intersections and unions
Hence
Importantly this holds for all
, and so
i.e.
is a stopping time.
Recall
which can equivalently be written
Problem sheets
PS1
1.1
Let
and let
be a σ-algebra. Then
First off,
is
because each
and
are
and we know that linear combinations of measurables are measurable.
Second,
1.2
Let
be a non-negative rv.
be a version of $\mathbb{E}[X \mid \mathcal{G}]$
Then
i.e.
Further,
Course: Advanced Financial Models
Notation
Lecture 2
If
is
, then
a.s.
Conversely, if
a.s. then
s.t
i.e. everything "interesting" happens in the intersection of
and
If
then
since
is
.
Suppose
where
.
Note that
where in the last equality we've used
be two finite measures on
be a nonnegative
of
and
be finite or
.
defined by
$\mathbb{P}(A \mid \mathcal{G}) = \mathbb{E} \big[ 1_A \mid \mathcal{G} \big]$