Exercises
Information Theory, Inference and Learning Algorithms
6.14
Since there is no covariance between the different dimensions, i.e. the $x_n$ are independent of all $x_m$ with $m \neq n$, we know that

$$P(\mathbf{x}) = \prod_{n=1}^{N} \frac{1}{\sqrt{2 \pi \sigma^2}} \exp \left( - \frac{x_n^2}{2 \sigma^2} \right)$$

where each of the $x_n$ follows $\mathcal{N}(0, \sigma^2)$, hence

$$\mathbb{E}[x_n^2] = \sigma^2$$

Hence, writing $r^2 = \sum_{n=1}^{N} x_n^2$,

$$\mathbb{E}[r^2] = \sum_{n=1}^{N} \mathbb{E}[x_n^2] = N \sigma^2$$

The variance is then given by

$$\operatorname{var}(r^2) = \sum_{n=1}^{N} \operatorname{var}(x_n^2) + \sum_{n \neq m} \operatorname{cov}(x_n^2, x_m^2)$$

But since there is no covariance between the different $x_n^2$, the second sum vanishes, and since

$$\operatorname{var}(x_n^2) = \mathbb{E}[x_n^4] - \mathbb{E}[x_n^2]^2 = 3 \sigma^4 - \sigma^4 = 2 \sigma^4$$

(which we knew from the hint to exercise 6.14 in the book, $\mathbb{E}[x^4] = 3 \sigma^4$). Hence

$$\operatorname{var}(r^2) = 2 N \sigma^4$$

This all means that for large $N$, we will have

$$r^2 = N \sigma^2 \pm \sqrt{2 N} \, \sigma^2$$

And since $\sqrt{2 N} \, \sigma^2$ will be negligible for large $N$, compared to $N \sigma^2$ (of course assuming $\sigma$ is finite), then

$$r \approx \sqrt{N} \, \sigma$$

as wanted. The "thickness" will simply be $\sqrt{2} \, \sigma$, i.e. twice the standard deviation of $r$, since $\operatorname{std}(r) \approx \operatorname{std}(r^2) / (2 \sqrt{N} \sigma) = \sigma / \sqrt{2}$.
Either by:

- Computing an $N$-dimensional integral :)
- Empirically looking at $P(\mathbf{x})$ for some sampled $\mathbf{x}$, making use of the symmetry of the Gaussian to infer that all $\mathbf{x}$ with the same radius have the same probability density, and that $P(\mathbf{x})$ decreases as $\mathbf{x}$ moves away (in whatever "direction" / dimension) from the mean.
We can observe that the majority of the probability mass is clustered about this "shell".
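The empirical route above can be sketched as follows: estimate the fraction of sampled points whose radius falls within a few shell thicknesses of $\sqrt{N}\sigma$. The values of $N$, $\sigma$ and the three-standard-deviation window are arbitrary choices for illustration:

```python
import numpy as np

# Arbitrary choices for the sketch.
rng = np.random.default_rng(1)
N, sigma, samples = 1000, 1.0, 10_000

# Radii of samples from an N-dimensional isotropic Gaussian.
r = np.linalg.norm(rng.normal(0.0, sigma, size=(samples, N)), axis=1)

# Count the samples within 3 standard deviations of r around the
# shell radius sqrt(N) * sigma (std(r) ~ sigma / sqrt(2)).
shell_radius = np.sqrt(N) * sigma
half_width = 3 * sigma / np.sqrt(2)
inside = np.abs(r - shell_radius) < half_width

print(inside.mean())  # close to 1: nearly all mass is in the thin shell
```

Almost every sample lands inside this thin shell, even though the density itself is largest at the origin.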
Bibliography
- [mackay2003information] MacKay, D. J. C., Information Theory, Inference and Learning Algorithms, Cambridge University Press (2003).