Exercises
Information Theory, Inference and Learning Algorithms
6.14
Since there is no covariance between the different dimensions, i.e. the $x_i$ are independent of all $x_j$ with $j \neq i$, we know that

$$\mathbb{E}\big[r^2\big] = \mathbb{E}\left[ \sum_{i=1}^{N} x_i^2 \right] = \sum_{i=1}^{N} \mathbb{E}\big[x_i^2\big]$$

where each of the $x_i$ follows $\mathcal{N}(0, \sigma^2)$, hence

$$\mathbb{E}\big[x_i^2\big] = \sigma^2$$

Hence,

$$\mathbb{E}\big[r^2\big] = N \sigma^2$$
The variance is then given by

$$\mathrm{Var}\big[r^2\big] = \mathrm{Var}\left[ \sum_{i=1}^{N} x_i^2 \right] = \sum_{i=1}^{N} \mathrm{Var}\big[x_i^2\big] + \sum_{i \neq j} \mathrm{Cov}\big[x_i^2, x_j^2\big]$$

where

$$\mathrm{Var}\big[x_i^2\big] = \mathbb{E}\big[x_i^4\big] - \left( \mathbb{E}\big[x_i^2\big] \right)^2$$

But since there is no covariance between the different $x_i^2$, the second sum vanishes, and since

$$\mathbb{E}\big[x_i^4\big] = 3 \sigma^4$$

(which we knew from the hint for exercise 6.14 in the book), we get $\mathrm{Var}\big[x_i^2\big] = 3\sigma^4 - \sigma^4 = 2\sigma^4$. Hence

$$\mathrm{Var}\big[r^2\big] = 2 N \sigma^4$$
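The two moments above are easy to check numerically. The following is a sketch (not part of the original solution); the values of $N$, $\sigma$, and the sample count are arbitrary choices:

```python
import numpy as np

# Monte Carlo sanity check: draw x_i ~ N(0, sigma^2) and compare the
# empirical mean and variance of r^2 = sum_i x_i^2 against the derived
# values N * sigma^2 and 2 * N * sigma^4.
rng = np.random.default_rng(0)
N, sigma, samples = 100, 2.0, 200_000  # arbitrary illustrative choices

x = rng.normal(0.0, sigma, size=(samples, N))
r2 = (x ** 2).sum(axis=1)

print(r2.mean())  # should be close to N * sigma^2  = 400
print(r2.var())   # should be close to 2*N*sigma^4 = 3200
```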
This all means that for large $N$, we will have

$$r^2 \approx N \sigma^2 \pm \sqrt{2N}\, \sigma^2$$

And since $\sqrt{2N}\, \sigma^2$ will be negligible for large $N$, compared to $N \sigma^2$ (of course assuming $\sigma$ is finite), then

$$r \approx \sqrt{N}\, \sigma$$

as wanted. The "thickness" will simply be $2 \sqrt{2N}\, \sigma^2$, i.e. twice the standard deviation of $r^2$.
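The concentration of $r$ around $\sqrt{N}\,\sigma$ can also be illustrated empirically. A minimal sketch (the sample sizes and the set of $N$ values are arbitrary choices, not from the book):

```python
import numpy as np

# As N grows, r = ||x|| concentrates around sqrt(N) * sigma: the ratio
# mean(r) / (sqrt(N) * sigma) approaches 1, and the relative spread
# std(r) / mean(r) shrinks (roughly like 1 / sqrt(2N)).
rng = np.random.default_rng(1)
sigma, samples = 1.0, 100_000  # arbitrary illustrative choices

for N in (10, 100, 1000):
    x = rng.normal(0.0, sigma, size=(samples, N))
    r = np.linalg.norm(x, axis=1)
    print(N, r.mean() / (np.sqrt(N) * sigma), r.std() / r.mean())
```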
Either by:
- Computing an $N$-dimensional integral :)
- Empirically evaluating $p(\mathbf{x})$ for some $\mathbf{x}$, and making use of the symmetry of the Gaussian to infer that all $\mathbf{x}$ with the same radius have the same probability density, and that $p(\mathbf{x})$ decreases as $\mathbf{x}$ moves away from the mean (in whatever "direction" / dimension)
We can observe that the majority of the probability mass is clustered about this "shell".
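This observation can be made concrete by sampling: count how many draws land within a thin shell around $\sqrt{N}\,\sigma$. A sketch under arbitrary choices of $N$, $\sigma$, shell width, and sample count:

```python
import numpy as np

# Even though the density p(x) is largest at the origin, almost all of
# the probability mass lies in a thin shell of radius sqrt(N) * sigma:
# count the fraction of samples with |r - sqrt(N)*sigma| < 2*sigma.
rng = np.random.default_rng(2)
N, sigma, samples = 1000, 1.0, 100_000  # arbitrary illustrative choices

x = rng.normal(0.0, sigma, size=(samples, N))
r = np.linalg.norm(x, axis=1)
in_shell = np.abs(r - np.sqrt(N) * sigma) < 2 * sigma
print(in_shell.mean())  # the vast majority of samples fall in the shell
```

The shell half-width of $2\sigma$ is several times the standard deviation of $r$ (about $\sigma/\sqrt{2}$), so nearly all samples land inside it, while essentially none are found near the origin where the density peaks.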
Bibliography
- [mackay2003information] MacKay, D. J. C., Information Theory, Inference and Learning Algorithms, Cambridge University Press (2003).