Linear Algebra


Definitions

Ill-conditioned matrix

A matrix $A$ is ill-conditioned if small changes in its entries can produce large changes in the solutions to $A\mathbf{x} = \mathbf{b}$. If small changes in the entries of $A$ produce only small changes in the solutions to $A\mathbf{x} = \mathbf{b}$, then $A$ is called well-conditioned.

Singular values

For any $m \times n$ matrix $A$, the $n \times n$ matrix $A^T A$ is symmetric and thus orthogonally diagonalizable by the Spectral Theorem.

The eigenvalues of $A^T A$ are real and non-negative, since

$$0 \le \|A\mathbf{v}\|^2 = (A\mathbf{v})^T (A\mathbf{v}) = \mathbf{v}^T A^T A \mathbf{v} = \lambda \, \mathbf{v}^T \mathbf{v} = \lambda$$

where $\mathbf{v}$ is a unit eigenvector of $A^T A$ corresponding to the eigenvalue $\lambda$. Hence, we can take the square root of all these eigenvalues, and define what we call singular values:

If $A$ is an $m \times n$ matrix, the singular values of $A$ are the square roots of the eigenvalues of $A^T A$ and are denoted $\sigma_1, \dots, \sigma_n$:

$$\sigma_i = \sqrt{\lambda_i}$$

By convention we order the singular values s.t. $\sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_n \ge 0$.
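
As a sanity check, here is a minimal numpy sketch (the matrix is an arbitrary example of my own) computing the singular values both ways: via the eigenvalues of $A^T A$, and via numpy's built-in SVD.

```python
import numpy as np

# An arbitrary example matrix; any m x n matrix works.
A = np.array([[1.0, 1.0],
              [1.0, 0.0],
              [0.0, 1.0]])

# Eigenvalues of the symmetric matrix A^T A (eigvalsh exploits symmetry).
eigvals = np.linalg.eigvalsh(A.T @ A)

# Singular values: square roots, ordered decreasingly by convention.
sigma = np.sqrt(np.maximum(eigvals, 0.0))[::-1]  # clip round-off negatives

print(sigma)                               # [1.732..., 1.0]
print(np.linalg.svd(A, compute_uv=False))  # same values, straight from LAPACK
```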

Cholesky decomposition

The Cholesky decomposition or Cholesky factorization is a decomposition of a Hermitian positive-definite matrix into the product of a lower triangular matrix and its conjugate transpose:

$$A = L L^*$$

where $L$ is lower-triangular.

Every Hermitian positive-definite matrix $A$ has a unique Cholesky decomposition.
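
A quick numpy illustration (my own example; `np.linalg.cholesky` returns the lower-triangular factor):

```python
import numpy as np

# Build a symmetric positive-definite matrix: M^T M + I is always SPD.
rng = np.random.default_rng(0)
M = rng.standard_normal((4, 4))
B = M.T @ M + np.eye(4)

L = np.linalg.cholesky(B)            # lower-triangular Cholesky factor

assert np.allclose(L, np.tril(L))    # L really is lower-triangular
assert np.allclose(L @ L.T, B)       # B = L L^* (real case: L^* = L^T)
```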

Theorems

Spectral Theorem

Let $A$ be an $n \times n$ real matrix. Then $A$ is symmetric iff it is orthogonally diagonalizable.

Singular Value Decomposition

Let $A$ be an $m \times n$ matrix with singular values $\sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_r > 0$ and $\sigma_{r+1} = \sigma_{r+2} = \dots = \sigma_n = 0$. Then there exist:

  • an $m \times m$ orthogonal matrix $U$
  • an $n \times n$ orthogonal matrix $V$
  • an $m \times n$ "diagonal" matrix $\Sigma = \begin{bmatrix} D & O \\ O & O \end{bmatrix}$, where $D = \operatorname{diag}(\sigma_1, \dots, \sigma_r)$

such that:

$$A = U \Sigma V^T$$

See page 615 of "Linear Algebra: A Modern Introduction" :)

Constructing V

To construct the orthogonal matrix $V$, we first find an orthonormal basis $\{\mathbf{v}_1, \dots, \mathbf{v}_n\}$ for $\mathbb{R}^n$ consisting of eigenvectors of the $n \times n$ symmetric matrix $A^T A$. Then

$$V = \begin{bmatrix} \mathbf{v}_1 & \mathbf{v}_2 & \cdots & \mathbf{v}_n \end{bmatrix}$$

is an orthogonal matrix.

Constructing U

For the orthogonal matrix $U$, we first note that $\{A\mathbf{v}_1, \dots, A\mathbf{v}_n\}$ is an orthogonal set of vectors in $\mathbb{R}^m$ (we're NOT saying they form a basis of $\mathbb{R}^m$!). To see this, suppose that $\mathbf{v}_i$ is an eigenvector of $A^T A$ corresponding to the eigenvalue $\lambda_i$. Then, for $i \neq j$, we have

$$(A\mathbf{v}_i) \cdot (A\mathbf{v}_j) = \mathbf{v}_i^T A^T A \mathbf{v}_j = \lambda_j \, \mathbf{v}_i^T \mathbf{v}_j = 0$$

since the eigenvectors are orthogonal.

Since $\|A\mathbf{v}_i\| = \sigma_i > 0$ for $i = 1, \dots, r$, we can normalize $A\mathbf{v}_i$ by

$$\mathbf{u}_i = \frac{1}{\sigma_i} A\mathbf{v}_i$$

This guarantees that $\{\mathbf{u}_1, \dots, \mathbf{u}_r\}$ is an orthonormal set in $\mathbb{R}^m$, but if $r < m$ it is not a basis. In this case, we extend $\{\mathbf{u}_1, \dots, \mathbf{u}_r\}$ to an orthonormal basis $\{\mathbf{u}_1, \dots, \mathbf{u}_m\}$ for $\mathbb{R}^m$ by some technique, e.g. the Gram-Schmidt Process, giving us

$$U = \begin{bmatrix} \mathbf{u}_1 & \mathbf{u}_2 & \cdots & \mathbf{u}_m \end{bmatrix}$$
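
The whole construction can be sketched in a few lines of numpy. This is a pedagogical sketch of the recipe above, not how one should compute an SVD in practice (use `np.linalg.svd`); the tolerance `tol` for deciding the numerical rank is my own choice.

```python
import numpy as np

def svd_via_ata(A, tol=1e-12):
    """Construct U, Sigma, V as above: eigenvectors of A^T A give V,
    u_i = (1/sigma_i) A v_i for i <= r, then extend to a basis of R^m."""
    m, n = A.shape
    eigvals, V = np.linalg.eigh(A.T @ A)        # ascending order
    order = np.argsort(eigvals)[::-1]           # we want descending
    V = V[:, order]
    sigma = np.sqrt(np.clip(eigvals[order], 0.0, None))
    r = int(np.sum(sigma > tol))                # numerical rank

    U_r = A @ V[:, :r] / sigma[:r]              # normalize each A v_i
    # Extend {u_1, ..., u_r} to an orthonormal basis of R^m via QR.
    U, _ = np.linalg.qr(np.hstack([U_r, np.eye(m)]))
    U[:, :r] = U_r                              # undo QR's possible sign flips

    Sigma = np.zeros((m, n))
    Sigma[:r, :r] = np.diag(sigma[:r])
    return U, Sigma, V

A = np.array([[1.0, 1.0], [1.0, 0.0], [0.0, 1.0]])
U, S, V = svd_via_ata(A)
assert np.allclose(U @ S @ V.T, A)
```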

Outer Product Form of the SVD

Let $A$ be an $m \times n$ matrix with singular values $\sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_r > 0$ and $\sigma_{r+1} = \sigma_{r+2} = \dots = \sigma_n = 0$. Then

$$A = \sigma_1 \mathbf{u}_1 \mathbf{v}_1^T + \sigma_2 \mathbf{u}_2 \mathbf{v}_2^T + \dots + \sigma_r \mathbf{u}_r \mathbf{v}_r^T$$
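
Numerically (my own example), the rank-1 expansion rebuilds $A$ exactly:

```python
import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 3.0],
              [0.0, 2.0]])
U, s, Vt = np.linalg.svd(A)    # s holds sigma_1 >= sigma_2 >= ...

# Sum of rank-1 outer products sigma_i * u_i v_i^T.
A_rebuilt = sum(s[i] * np.outer(U[:, i], Vt[i, :]) for i in range(len(s)))
assert np.allclose(A_rebuilt, A)
```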

Geometric insight into the effect of matrix transformations

Let $A$ be an $m \times n$ matrix with rank $r$. Then the image of the unit sphere in $\mathbb{R}^n$ under the matrix transformation that maps $\mathbf{x}$ to $A\mathbf{x}$ is

  1. the surface of an ellipsoid in $\mathbb{R}^m$ if $r = n$
  2. a solid ellipsoid in $\mathbb{R}^m$ if $r < n$

We have $A = U \Sigma V^T$ with $\sigma_1 \ge \dots \ge \sigma_r > 0$ and $\sigma_{r+1} = \dots = \sigma_n = 0$. Then

$$A\mathbf{x} = U \Sigma V^T \mathbf{x}$$

Let $\|\mathbf{x}\| = 1$ (i.e. a unit vector, so on the surface of the unit sphere) in $\mathbb{R}^n$. $V$ is orthogonal, thus $V^T$ is orthogonal, and then $V^T \mathbf{x}$ is a unit vector too:

$$\|V^T \mathbf{x}\| = \|\mathbf{x}\| = 1$$

From the Outer Product Form we have:

$$A\mathbf{x} = \sigma_1 (\mathbf{v}_1^T \mathbf{x}) \mathbf{u}_1 + \dots + \sigma_r (\mathbf{v}_r^T \mathbf{x}) \mathbf{u}_r = \sigma_1 y_1 \mathbf{u}_1 + \dots + \sigma_r y_r \mathbf{u}_r$$

where we are letting $y_i$ denote the scalar $\mathbf{v}_i^T \mathbf{x}$.

(1) If $r = n$, then we must have $y_1^2 + \dots + y_n^2 = 1$ and

$$A\mathbf{x} = \sigma_1 y_1 \mathbf{u}_1 + \dots + \sigma_n y_n \mathbf{u}_n$$

Since $U$ is orthogonal, the coordinates of $A\mathbf{x}$ with respect to $\mathbf{u}_1, \dots, \mathbf{u}_n$ are $z_i = \sigma_i y_i$, so $y_i = z_i / \sigma_i$, thus

$$\frac{z_1^2}{\sigma_1^2} + \dots + \frac{z_n^2}{\sigma_n^2} = y_1^2 + \dots + y_n^2 = 1$$

which shows that the vectors $A\mathbf{x}$ form the surface of an ellipsoid in $\mathbb{R}^m$!

(2) If $r < n$, then $y_1^2 + \dots + y_r^2 \le 1$, and swapping equality for inequality (but performing the same steps):

$$\frac{z_1^2}{\sigma_1^2} + \dots + \frac{z_r^2}{\sigma_r^2} \le 1$$

where the inequality corresponds to a solid ellipsoid in $\mathbb{R}^m$!

Furthermore, we can view each of the matrix transformations of the SVD separately (see the sketch after this list):

  1. $V^T$ is an orthogonal matrix, thus it maps the unit sphere onto itself (but in a different basis)
  2. the $m \times n$ matrix $\Sigma$ first collapses $n - r$ dimensions of the unit sphere, leaving a unit sphere of $r$ dimensions, and then "distorts" it into an ellipsoid
  3. $U$ is orthogonal, aligning the axes of this ellipsoid with the orthonormal basis vectors $\mathbf{u}_1, \dots, \mathbf{u}_r$ in $\mathbb{R}^m$
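
Here is the promised sketch: a small numpy check (example matrix of my own) that the image of the unit circle under an invertible $2 \times 2$ matrix satisfies the ellipse equation derived above, in the coordinates given by $\mathbf{u}_1, \mathbf{u}_2$.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [0.0, 1.0]])                     # rank 2, so r = n = 2
U, s, Vt = np.linalg.svd(A)

theta = np.linspace(0.0, 2.0 * np.pi, 200)
X = np.vstack([np.cos(theta), np.sin(theta)])  # points on the unit circle
Z = U.T @ (A @ X)                              # coordinates of Ax w.r.t. u_1, u_2

# z_1^2 / sigma_1^2 + z_2^2 / sigma_2^2 = 1: the surface of an ellipse.
lhs = (Z[0] / s[0]) ** 2 + (Z[1] / s[1]) ** 2
assert np.allclose(lhs, 1.0)
```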

Least squares solution for dependent column-vectors

The least squares problem $A\mathbf{x} = \mathbf{b}$ has a unique least squares solution $\bar{\mathbf{x}}$ of minimal length, given by:

$$\bar{\mathbf{x}} = A^+ \mathbf{b}$$

where $A^+$ denotes the pseudoinverse of $A$.

See page 627 of "Linear Algebra: A Modern Introduction" :)
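
A small numpy demonstration (my own example with deliberately dependent columns): `np.linalg.pinv` computes the pseudoinverse, and `np.linalg.lstsq` also returns the minimal-norm least squares solution, so the two should agree.

```python
import numpy as np

# Dependent columns (second = 2 * first): infinitely many least squares
# solutions exist, and the pseudoinverse picks the shortest one.
A = np.array([[1.0, 2.0],
              [1.0, 2.0],
              [0.0, 0.0]])
b = np.array([1.0, 1.0, 1.0])

x_bar = np.linalg.pinv(A) @ b                    # minimal-length solution
x_lstsq = np.linalg.lstsq(A, b, rcond=None)[0]   # SVD-based, also minimal-norm

assert np.allclose(x_bar, x_lstsq)
```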

Trace and Determinant

Trace is the sum of the eigenvalues

Let $A \in \mathbb{R}^{n \times n}$, then

$$\operatorname{tr}(A) = \sum_{i=1}^{n} \lambda_i$$

We start with the characteristic equation

$$\det(A - \lambda I) = 0$$

We then notice that

$$\det(A - \lambda I) = (a_{11} - \lambda)(a_{22} - \lambda) \cdots (a_{nn} - \lambda) + (\text{terms of degree at most } n - 2 \text{ in } \lambda)$$

which can be seen by expanding the determinant along the first row:

$$\det(A - \lambda I) = (a_{11} - \lambda) \det(M_{11}) - a_{12} \det(M_{12}) + \dots \pm a_{1n} \det(M_{1n})$$

Here we see that the only "sub-determinant" which can contribute a factor of $\lambda^{n-1}$ is the very first one, which is multiplied by $(a_{11} - \lambda)$: each of the other "sub-determinants" $\det(M_{1j})$, $j \ge 2$, loses both the row and the column of two diagonal entries, $(a_{11} - \lambda)$ and $(a_{jj} - \lambda)$, hence contains at most a factor of $\lambda^{n-2}$. Repeating the argument on $\det(M_{11})$ shows the product of the diagonal entries is the only source of $\lambda^{n-1}$ terms.

Further, since

$$\det(A - \lambda I) = (\lambda_1 - \lambda)(\lambda_2 - \lambda) \cdots (\lambda_n - \lambda)$$

we observe that the coefficient of $\lambda^{n-1}$ on this side is $(-1)^{n-1} (\lambda_1 + \dots + \lambda_n)$, which we also know to be the $(-1)^{n-1} (a_{11} + \dots + a_{nn})$ from above, i.e.

$$\operatorname{tr}(A) = a_{11} + a_{22} + \dots + a_{nn} = \lambda_1 + \lambda_2 + \dots + \lambda_n$$

as wanted.

Determinant is the product of the eigenvalues

Let $A \in \mathbb{R}^{n \times n}$, then

$$\det(A) = \prod_{i=1}^{n} \lambda_i$$

Let $A \in \mathbb{R}^{n \times n}$. Then

$$\det(A - \lambda I) = 0$$

defines the characteristic equation, which has the eigenvalues $\lambda_i$ as its roots, therefore

$$\det(A - \lambda I) = (\lambda_1 - \lambda)(\lambda_2 - \lambda) \cdots (\lambda_n - \lambda)$$

Since $\lambda$ is just a variable, we let $\lambda = 0$ and get

$$\det(A) = \lambda_1 \lambda_2 \cdots \lambda_n$$

as wanted.
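
Both identities are easy to check numerically (random example of my own; for a real matrix the eigenvalues may be complex, but the sum and product come out real):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 5))
eigs = np.linalg.eigvals(A)     # complex in general; imaginary parts cancel

assert np.isclose(np.trace(A), eigs.sum().real)
assert np.isclose(np.linalg.det(A), eigs.prod().real)
```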

Cramer's Rule

Consider the system of $n$ linear equations in $n$ unknowns; in matrix form

$$A\mathbf{x} = \mathbf{b}$$

where $A$ has nonzero determinant. Then

$$x_i = \frac{\det(A_i)}{\det(A)}$$

where $A_i$ is the matrix formed by replacing the $i$-th column of $A$ by the column vector $\mathbf{b}$.
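
A direct transcription into numpy (my own toy example; fine for intuition, though `np.linalg.solve` is the sensible route in practice):

```python
import numpy as np

def cramer(A, b):
    """Solve Ax = b by Cramer's rule: x_i = det(A_i) / det(A)."""
    det_A = np.linalg.det(A)
    x = np.empty(len(b))
    for i in range(len(b)):
        A_i = A.copy()
        A_i[:, i] = b              # replace the i-th column of A by b
        x[i] = np.linalg.det(A_i) / det_A
    return x

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([3.0, 5.0])
assert np.allclose(cramer(A, b), np.linalg.solve(A, b))
```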

The reason why we can think of OLS as projection

The proof of the following theorem is really important in my opinion, as it provides an "intuitive" way of viewing least-squares approximations.

The Best Approximation Theorem says the following:

If $W$ is a finite-dimensional subspace of an inner product space $V$ and if $\mathbf{v}$ is a vector in $V$, then $\operatorname{proj}_W(\mathbf{v})$ is the best approximation to $\mathbf{v}$ in $W$.

Let $\mathbf{w}$ be a vector in $W$ different from $\operatorname{proj}_W(\mathbf{v})$. Then $\operatorname{proj}_W(\mathbf{v}) - \mathbf{w}$ is also in $W$, so $\mathbf{v} - \operatorname{proj}_W(\mathbf{v})$ is orthogonal to $\operatorname{proj}_W(\mathbf{v}) - \mathbf{w}$.

Pythagoras' Theorem now implies that

$$\|\mathbf{v} - \operatorname{proj}_W(\mathbf{v})\|^2 + \|\operatorname{proj}_W(\mathbf{v}) - \mathbf{w}\|^2 = \|\mathbf{v} - \mathbf{w}\|^2$$

But $\|\operatorname{proj}_W(\mathbf{v}) - \mathbf{w}\|^2 > 0$, so

$$\|\mathbf{v} - \operatorname{proj}_W(\mathbf{v})\|^2 < \|\mathbf{v} - \mathbf{w}\|^2$$

i.e. $\|\mathbf{v} - \operatorname{proj}_W(\mathbf{v})\| < \|\mathbf{v} - \mathbf{w}\|$, and thus $\operatorname{proj}_W(\mathbf{v})$ is the best approximation.
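
The theorem is easy to poke at numerically (random subspace and competitors of my own choosing): the least squares prediction $A\hat{\mathbf{x}}$ is exactly $\operatorname{proj}_W(\mathbf{v})$ for $W = \operatorname{col}(A)$, and no other member of $W$ gets closer to $\mathbf{v}$.

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((6, 2))    # columns span a 2-dim subspace W of R^6
v = rng.standard_normal(6)

x_hat, *_ = np.linalg.lstsq(A, v, rcond=None)
proj = A @ x_hat                   # proj_W(v), the OLS prediction

for _ in range(100):               # random competitors w in W
    w = A @ rng.standard_normal(2)
    assert np.linalg.norm(v - proj) <= np.linalg.norm(v - w) + 1e-12
```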

Distance and Approximation

Applications

Function-approximation

Given a continuous function $f$ on an interval $[a, b]$ and a subspace $W$ of $\mathscr{C}[a, b]$, find the function "closest" to $f$ in $W$.

  • $\mathscr{C}[a, b]$ denotes the space of all continuous functions defined on the domain $[a, b]$
  • $W$ is a subspace of $\mathscr{C}[a, b]$

The problem is analogous to least squares fitting of data points, except that we have infinitely many data points, namely the points in the graph of the function $f$.

What should "approximate" mean in this context? Best Approximation Theorem has the answer!

The given function $f$ lives in the vector space $\mathscr{C}[a, b]$ of continuous functions on the interval $[a, b]$. This is an inner product space, with inner product:

$$\langle f, g \rangle = \int_a^b f(x) g(x) \, dx$$

If $W$ is a finite-dimensional subspace of $\mathscr{C}[a, b]$, then the best approximation to $f$ in $W$ is given by the projection of $f$ onto $W$.

Furthermore, if $\{u_1, \dots, u_k\}$ is an orthogonal basis for $W$, then

$$\operatorname{proj}_W(f) = \frac{\langle u_1, f \rangle}{\langle u_1, u_1 \rangle} u_1 + \dots + \frac{\langle u_k, f \rangle}{\langle u_k, u_k \rangle} u_k$$

The error of this approximation is just

$$\|f - \operatorname{proj}_W(f)\| = \sqrt{\int_a^b \big( f(x) - \operatorname{proj}_W(f)(x) \big)^2 \, dx}$$

and is often called the root mean square error. For functions we can think of this as measuring the area between the graphs.
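
As a concrete sketch (my own example, and it assumes SciPy is available for the integrals): project $f(x) = e^x$ onto $W = \operatorname{span}\{1, x\}$ on $[0, 1]$. Since this basis is not orthogonal, instead of the projection formula above we solve the equivalent Gram (normal equation) system $G\mathbf{c} = \mathbf{d}$ with $G_{ij} = \langle g_i, g_j \rangle$ and $d_i = \langle g_i, f \rangle$.

```python
import numpy as np
from scipy.integrate import quad   # assumes SciPy is installed

a, b = 0.0, 1.0
f = np.exp
basis = [lambda x: 1.0, lambda x: x]          # W = span{1, x}

def inner(g, h):
    """Inner product <g, h> = integral of g(x) h(x) over [a, b]."""
    return quad(lambda x: g(x) * h(x), a, b)[0]

G = np.array([[inner(g, h) for h in basis] for g in basis])  # Gram matrix
d = np.array([inner(g, f) for g in basis])
c = np.linalg.solve(G, d)

g_best = lambda x: c[0] + c[1] * x            # best approximation in W
rmse = np.sqrt(quad(lambda x: (f(x) - g_best(x)) ** 2, a, b)[0])
print(c, rmse)   # c is approx [0.873, 1.690], rmse approx 0.06
```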

Fourier approximation

A function of the form

$$p(x) = a_0 + a_1 \cos x + \dots + a_n \cos nx + b_1 \sin x + \dots + b_n \sin nx$$

is called a trigonometric polynomial.

Restricting our attention to the space of continuous functions on the interval $[-\pi, \pi]$, i.e. $\mathscr{C}[-\pi, \pi]$, with the inner product

$$\langle f, g \rangle = \int_{-\pi}^{\pi} f(x) g(x) \, dx$$

and since the trigonometric polynomials are linear combinations of the set

$$B = \{ 1, \cos x, \cos 2x, \dots, \cos nx, \sin x, \sin 2x, \dots, \sin nx \}$$

the best approximation to a function $f$ in $\mathscr{C}[-\pi, \pi]$ by trigonometric polynomials of order $n$ is therefore $\operatorname{proj}_W(f)$, where $W = \operatorname{span}(B)$. It turns out that $B$ is an orthogonal set, hence a basis for $W$.

The approximation to $f$ using trigonometric polynomials is called the $n$-th order Fourier approximation to $f$ on $[-\pi, \pi]$, and the coefficients are known as the Fourier coefficients of $f$.
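
For instance (my own example, again assuming SciPy for the integrals), the Fourier coefficients of $f(x) = x$ on $[-\pi, \pi]$ come straight from the projection formula, using $\langle 1, 1 \rangle = 2\pi$ and $\langle \cos kx, \cos kx \rangle = \langle \sin kx, \sin kx \rangle = \pi$:

```python
import numpy as np
from scipy.integrate import quad   # assumes SciPy is installed

f = lambda x: x                    # the function to approximate
n = 3                              # order of the approximation

a0 = quad(f, -np.pi, np.pi)[0] / (2 * np.pi)   # <1, f> / <1, 1>
a = [quad(lambda x: f(x) * np.cos(k * x), -np.pi, np.pi)[0] / np.pi
     for k in range(1, n + 1)]
b = [quad(lambda x: f(x) * np.sin(k * x), -np.pi, np.pi)[0] / np.pi
     for k in range(1, n + 1)]

def p(x):
    """n-th order Fourier approximation of f."""
    return a0 + sum(a[k - 1] * np.cos(k * x) + b[k - 1] * np.sin(k * x)
                    for k in range(1, n + 1))

print(b)   # for f(x) = x: b_k = 2 (-1)^(k+1) / k, and a0 = a_k = 0
```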

Rayleigh principle

A lot of the stuff here I got from these lecture notes.

Notation

  • $R(A, \mathbf{x})$ denotes the Rayleigh quotient of some Hermitian / real symmetric matrix $A$
  • $\Lambda$ is the diagonal matrix whose entries are the eigenvalues of $A$
  • Eigenvalues are ordered as $\lambda_1 \le \lambda_2 \le \dots \le \lambda_n$
  • $Q$ is the orthogonal matrix with the eigenvectors of $A$ as columns
  • $\mathbf{y} = Q^T \mathbf{x}$, thus $\mathbf{x} = Q \mathbf{y}$
  • $A = Q \Lambda Q^T$

Definition

For a given complex Hermitian matrix $A$ and non-zero vector $\mathbf{x}$, the Rayleigh quotient is defined as:

$$R(A, \mathbf{x}) = \frac{\mathbf{x}^* A \mathbf{x}}{\mathbf{x}^* \mathbf{x}}$$

In the case where $\mathbf{x} \in \mathbb{R}^n$ and $A$ is real and symmetric, this becomes

$$R(A, \mathbf{x}) = \frac{\mathbf{x}^T A \mathbf{x}}{\mathbf{x}^T \mathbf{x}}$$

Further, in general,

$$R(A, B; \mathbf{x}) = \frac{\mathbf{x}^* A \mathbf{x}}{\mathbf{x}^* B \mathbf{x}}$$

is the generalized Rayleigh quotient, where $B$ is a positive-definite Hermitian matrix.

The generalized Rayleigh quotient can be reduced to the ordinary Rayleigh quotient $R(D, C^* \mathbf{x})$ through the transformation

$$D = C^{-1} A (C^*)^{-1}$$

where $C C^*$ is the Cholesky decomposition of the Hermitian positive-definite matrix $B$.
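
A numeric sanity check of this reduction (random matrices of my own; real-symmetric case, so the conjugate transpose is just the transpose):

```python
import numpy as np

def rayleigh(A, x):
    return (x @ A @ x) / (x @ x)

rng = np.random.default_rng(3)
M = rng.standard_normal((4, 4))
A = (M + M.T) / 2                    # real symmetric
N = rng.standard_normal((4, 4))
B = N.T @ N + np.eye(4)              # symmetric positive-definite
x = rng.standard_normal(4)

R_gen = (x @ A @ x) / (x @ B @ x)    # generalized Rayleigh quotient

C = np.linalg.cholesky(B)            # B = C C^T
D = np.linalg.inv(C) @ A @ np.linalg.inv(C).T
assert np.isclose(R_gen, rayleigh(D, C.T @ x))
```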

Minimization

Personal explanation

$$\min_{\mathbf{x} \neq 0} R(A, \mathbf{x}) = \min_{\mathbf{x} \neq 0} \frac{\mathbf{x}^T A \mathbf{x}}{\mathbf{x}^T \mathbf{x}}$$

At an extremum we must have

$$\nabla_{\mathbf{x}} R(A, \mathbf{x}) = 0$$

But observe that for whatever scaling $\alpha \neq 0$ we have here, we have

$$R(A, \alpha \mathbf{x}) = \frac{\alpha^2 \, \mathbf{x}^T A \mathbf{x}}{\alpha^2 \, \mathbf{x}^T \mathbf{x}} = R(A, \mathbf{x})$$

i.e. it's invariant to scaling.

Further, since $A$ is symmetric, we have the diagonalization

$$A = Q \Lambda Q^T$$

which means that if we let

$$\mathbf{x} = Q \mathbf{y}$$

for some $\mathbf{y} \in \mathbb{R}^n$, we get

$$R(A, \mathbf{x}) = \frac{\mathbf{y}^T Q^T Q \Lambda Q^T Q \mathbf{y}}{\mathbf{y}^T Q^T Q \mathbf{y}} = \frac{\mathbf{y}^T \Lambda \mathbf{y}}{\mathbf{y}^T \mathbf{y}} = \sum_{i=1}^{n} \lambda_i y_i^2$$

where we've applied the constraint $\mathbf{y}^T \mathbf{y} = 1$, which we can do since, again, the quotient is invariant under scaling.

Therefore, we're left with the following optimization problem:

$$\min_{\mathbf{y}} \sum_{i=1}^{n} \lambda_i y_i^2 \quad \text{subject to} \quad \sum_{i=1}^{n} y_i^2 = 1$$

Since we assume that $\lambda_1 \le \lambda_2 \le \dots \le \lambda_n$, we clearly obtain the minimum when

$$\mathbf{y} = \mathbf{e}_1 = (1, 0, \dots, 0)^T$$

Hence,

$$\min_{\mathbf{x} \neq 0} R(A, \mathbf{x}) = \lambda_1, \quad \text{attained at} \quad \mathbf{x} = Q \mathbf{e}_1 = \mathbf{q}_1$$

where $\mathbf{q}_1$ denotes the corresponding eigenvector.

Now suppose we'd like to find the minimum over vectors which are NOT in the space spanned by the first eigenvector; then we only consider the subspace

$$\{ \mathbf{x} \in \mathbb{R}^n : \mathbf{q}_1^T \mathbf{x} = 0 \}$$

Thus, the optimization problem is

$$\min_{\mathbf{q}_1^T \mathbf{x} = 0} R(A, \mathbf{x})$$

Finally, observe that $\mathbf{q}_1^T \mathbf{x} = \mathbf{q}_1^T Q \mathbf{y} = \mathbf{e}_1^T \mathbf{y} = y_1$, hence

$$\mathbf{q}_1^T \mathbf{x} = 0 \iff y_1 = 0$$

And the subspace $\{ \mathbf{y} : y_1 = 0 \}$ is just all $\mathbf{y} = (0, y_2, \dots, y_n)^T$, giving us

$$\min_{\mathbf{q}_1^T \mathbf{x} = 0} R(A, \mathbf{x}) = \lambda_2, \quad \text{attained at} \quad \mathbf{y} = \mathbf{e}_2, \ \text{i.e.} \ \mathbf{x} = \mathbf{q}_2$$

And if we just keep going, heyooo, we get all our eigenvalues and eigenvectors!
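
The whole sequential story can be checked numerically (random symmetric matrix of my own; `np.linalg.eigh` returns the eigenvalues in ascending order, matching the convention here):

```python
import numpy as np

rng = np.random.default_rng(4)
M = rng.standard_normal((5, 5))
A = (M + M.T) / 2                  # real symmetric
lam, Q = np.linalg.eigh(A)         # lambda_1 <= ... <= lambda_n, Q orthogonal

rayleigh = lambda x: (x @ A @ x) / (x @ x)

samples = rng.standard_normal((10000, 5))

# R is never below lambda_1, and q_1 attains it.
assert all(rayleigh(x) >= lam[0] - 1e-12 for x in samples)
assert np.isclose(rayleigh(Q[:, 0]), lam[0])

# Restricted to vectors orthogonal to q_1, the minimum becomes lambda_2.
perp = lambda x: x - (Q[:, 0] @ x) * Q[:, 0]
assert all(rayleigh(perp(x)) >= lam[1] - 1e-12 for x in samples)
assert np.isclose(rayleigh(Q[:, 1]), lam[1])
```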

Stuff

We're going to assume linear_algebra_8939ac39ed16887f3b3c033a47d4ab60d120251a.png is real.

Our objective is:

$$\min_{\mathbf{x} \neq 0} \frac{\mathbf{x}^T A \mathbf{x}}{\mathbf{x}^T \mathbf{x}}$$

Observe the following:

  • The objective is invariant to scaling, since we'd be scaling the denominator too
  • $A$ is by assumption symmetric: all real eigenvalues

Thus, we have

$$A = Q \Lambda Q^T$$

Further,

$$\mathbf{x}^T A \mathbf{x} = \mathbf{x}^T Q \Lambda Q^T \mathbf{x} = (Q^T \mathbf{x})^T \Lambda (Q^T \mathbf{x})$$

Hence,

$$R(A, \mathbf{x}) = \frac{(Q^T \mathbf{x})^T \Lambda (Q^T \mathbf{x})}{(Q^T \mathbf{x})^T (Q^T \mathbf{x})}$$

Thus, by letting $\mathbf{y} = Q^T \mathbf{x}$, we get

$$R(A, \mathbf{x}) = \frac{\mathbf{y}^T \Lambda \mathbf{y}}{\mathbf{y}^T \mathbf{y}} = \frac{\sum_{i=1}^{n} \lambda_i y_i^2}{\sum_{i=1}^{n} y_i^2}$$

Since this quotient is invariant under scaling, we frame it as follows:

$$\min_{\mathbf{y}} \mathbf{y}^T \Lambda \mathbf{y} \quad \text{subject to} \quad \mathbf{y}^T \mathbf{y} = 1$$

implying that $\lambda_1 \le R(A, \mathbf{x}) \le \lambda_n$ (seen from $\sum_i \lambda_i y_i^2 \ge \lambda_1 \sum_i y_i^2$ and $\sum_i \lambda_i y_i^2 \le \lambda_n \sum_i y_i^2$), since we assume the eigenvalues to be ordered.

Hence, we have proved what's called the Rayleigh Principle:

$$\min_{\mathbf{x} \neq 0} R(A, \mathbf{x}) = \lambda_1$$

where $\lambda_1$ is the smallest eigenvalue of $A$.

Planar Slices

Suppose we'd like to find the answer to the following:

$$\min_{\mathbf{z}^T \mathbf{x} = 0} R(A, \mathbf{x})$$

i.e. minimizing the Rayleigh quotient over some hyperplane defined by $\mathbf{z}^T \mathbf{x} = 0$.

This minimum is bounded below as follows:

$$\min_{\mathbf{z}^T \mathbf{x} = 0} R(A, \mathbf{x}) \ge \min_{\mathbf{x} \neq 0} R(A, \mathbf{x}) = \lambda_1$$

Further, we can obtain an upper bound for this by considering $\mathbf{y} = Q^T \mathbf{x}$ again:

$$\min_{\mathbf{z}^T \mathbf{x} = 0} R(A, \mathbf{x}) = \min_{\mathbf{w}^T \mathbf{y} = 0} \frac{\mathbf{y}^T \Lambda \mathbf{y}}{\mathbf{y}^T \mathbf{y}}, \quad \text{where} \quad \mathbf{w} = Q^T \mathbf{z}$$

By scaling we can assume $\|\mathbf{y}\| = 1$, giving us the optimization problem

$$\min_{\mathbf{y}} \sum_{i=1}^{n} \lambda_i y_i^2 \quad \text{subject to} \quad \mathbf{w}^T \mathbf{y} = 0, \quad \|\mathbf{y}\| = 1$$

There is at least one vector satisfying both constraints of the form

$$\mathbf{y} = (y_1, y_2, 0, \dots, 0)^T$$

If $\mathbf{y}$ has this form, then the above translates to

$$\min_{y_1, y_2} \ \lambda_1 y_1^2 + \lambda_2 y_2^2 \quad \text{subject to} \quad w_1 y_1 + w_2 y_2 = 0, \quad y_1^2 + y_2^2 = 1$$

Excluding $y_1 = y_2 = 0$ (it does not satisfy the second constraint), the first constraint can be represented as:

$$w_1 y_1 + w_2 y_2 = 0$$

which is just a line through the origin. The second constraint is the unit circle. Thus, there are two points where both of these constraints are satisfied (a line through the origin crosses the circle in two places). Let $\tilde{\mathbf{y}}$ be one of these points of intersection, then

$$\min_{\mathbf{z}^T \mathbf{x} = 0} R(A, \mathbf{x}) \le \lambda_1 \tilde{y}_1^2 + \lambda_2 \tilde{y}_2^2 \le \lambda_2$$

Which means we have bounded the minimum of $R(A, \mathbf{x})$ on the hyperplane $\mathbf{z}^T \mathbf{x} = 0$ between $\lambda_1$ and $\lambda_2$! This becomes incredibly useful when we have a look at the Minimax Principle for the Second Smallest Eigenvalue.

Minimax Principle for the Second Smallest Eigenvalue

This is a slightly different way of viewing the problem of finding the second eigenvalue, and poses the "answer" in a frame where we might have asked the following question:

"Can we obtain $\lambda_2$ without finding $\lambda_1$?"

And it ends with the answer:

$$\lambda_2 = \max_{\mathbf{z}} \ \min_{\mathbf{z}^T \mathbf{x} = 0} R(A, \mathbf{x})$$

which is very useful computationally, as we can alternate between minimizing and maximizing $R(A, \mathbf{x})$ to obtain $\lambda_2$.

What if we could attain the maximum in this equation? Then we would know that maximizing, over hyperplanes $\mathbf{z}^T \mathbf{x} = 0$, the constrained minimum of $R(A, \mathbf{x})$ gives us the second smallest eigenvalue $\lambda_2$!

Or equivalently, the minimum in the best-chosen hyperplane $\mathbf{z}^T \mathbf{x} = 0$ is given by $\lambda_2$!

Thus, if we can somehow force $\tilde{y}_1 = 0$ and $\tilde{y}_2 = \pm 1$, then this upper bound is attained. If we had $\mathbf{z} = \mathbf{q}_1$, then the only solutions of the constrained two-variable problem would be:

$$\tilde{\mathbf{y}} = (0, \pm 1, 0, \dots, 0)^T$$

To show that the maximum is indeed $\lambda_2$, we only consider solutions for $\mathbf{x}$ where $y_1 = 0$, i.e. let

$$\mathbf{z} = \mathbf{q}_1$$

$$\min_{\mathbf{q}_1^T \mathbf{x} = 0} R(A, \mathbf{x}) = \min_{\|\mathbf{y}\| = 1, \ y_1 = 0} \ \sum_{i=2}^{n} \lambda_i y_i^2$$

which is clearly attained when $y_2 = \pm 1$ and $y_i = 0$ for $i \ge 3$, since $\lambda_2 \le \lambda_i$ for all $i \ge 3$.

Hence,

$$\lambda_2 = \max_{\mathbf{z}} \ \min_{\mathbf{z}^T \mathbf{x} = 0} R(A, \mathbf{x})$$

Thus we've shown that min-maxing the function $R(A, \mathbf{x})$ is equivalent to finding the second smallest eigenvalue!

Since $\mathbf{z}^T \mathbf{x} = 0$ means that $\mathbf{x}$ lies in the orthogonal complement of $\mathbf{z}$, which is an $(n-1)$-dimensional subspace, we may rewrite the above as

$$\lambda_2 = \max_{\dim S = n - 1} \ \min_{\mathbf{x} \in S, \ \mathbf{x} \neq 0} R(A, \mathbf{x})$$

In fact, it turns out this works for general $k$:

$$\lambda_k = \max_{\dim S = n - k + 1} \ \min_{\mathbf{x} \in S, \ \mathbf{x} \neq 0} R(A, \mathbf{x})$$

All this means that if we want to find some vector $\mathbf{x}$ which minimizes the Rayleigh quotient, we can simply compute the smallest eigenvalue of $A$ and pick a corresponding eigenvector, giving us the solution!
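
To close the loop, here's a numeric check of the minimax characterization (my own construction): the constrained minimum over a hyperplane $\mathbf{z}^T \mathbf{x} = 0$ can be computed exactly by restricting $A$ to an orthonormal basis of the complement of $\mathbf{z}$, and it always lands in $[\lambda_1, \lambda_2]$, with $\mathbf{z} = \mathbf{q}_1$ attaining $\lambda_2$.

```python
import numpy as np

rng = np.random.default_rng(5)
M = rng.standard_normal((5, 5))
A = (M + M.T) / 2
lam, Q = np.linalg.eigh(A)             # lambda_1 <= ... <= lambda_n

def min_over_hyperplane(A, z):
    """min of R(A, x) subject to z^T x = 0: restrict A to the
    orthogonal complement of z and take the smallest eigenvalue."""
    Qz, _ = np.linalg.qr(z.reshape(-1, 1), mode="complete")
    B = Qz[:, 1:]                      # orthonormal basis of {x : z^T x = 0}
    return np.linalg.eigvalsh(B.T @ A @ B)[0]

# For any z the constrained minimum sits between lambda_1 and lambda_2 ...
for _ in range(100):
    m = min_over_hyperplane(A, rng.standard_normal(5))
    assert lam[0] - 1e-12 <= m <= lam[1] + 1e-12

# ... and z = q_1 attains the maximum, namely lambda_2.
assert np.isclose(min_over_hyperplane(A, Q[:, 0]), lam[1])
```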