Rayleigh Quotients

There is a close connection between the quadratic form of a symmetric matrix \(\mathbf{A}\) and its eigenvalues. This connection is provided by the Rayleigh quotient, defined for any \(\mathbf{x} \neq \mathbf{0}\) as

\[R_\mathbf{A}(\mathbf{x}) = \frac{\mathbf{x}^{\!\top\!}\mathbf{A}\mathbf{x}}{\mathbf{x}^{\!\top\!}\mathbf{x}}\]

The Rayleigh quotient has two important properties:

Lemma (Properties of the Rayleigh Quotient)

(i) Scale invariance: for any vector \(\mathbf{x} \neq \mathbf{0}\) and any scalar \(\alpha \neq 0\), \(R_\mathbf{A}(\mathbf{x}) = R_\mathbf{A}(\alpha\mathbf{x})\).

(ii) If \(\mathbf{x}\) is an eigenvector of \(\mathbf{A}\) with eigenvalue \(\lambda\), then \(R_\mathbf{A}(\mathbf{x}) = \lambda\).

Proof. (i)

\[R_\mathbf{A}(\alpha\mathbf{x}) = \frac{(\alpha\mathbf{x})^{\!\top\!}\mathbf{A}(\alpha\mathbf{x})}{(\alpha\mathbf{x})^{\!\top\!}(\alpha\mathbf{x})} = \frac{\alpha^2}{\alpha^2}\frac{\mathbf{x}^{\!\top\!}\mathbf{A}\mathbf{x}}{\mathbf{x}^{\!\top\!}\mathbf{x}}=R_\mathbf{A}(\mathbf{x}).\]

(ii) Let \(\mathbf{x}\) be an eigenvector of \(\mathbf{A}\) with eigenvalue \(\lambda\), so that \(\mathbf{A}\mathbf{x} = \lambda\mathbf{x}\) and \(\mathbf{x} \neq \mathbf{0}\). Then

\[R_\mathbf{A}(\mathbf{x})= \frac{\mathbf{x}^{\!\top\!}\mathbf{A}\mathbf{x}}{\mathbf{x}^{\!\top\!}\mathbf{x}} = \frac{\mathbf{x}^{\!\top\!}(\lambda\mathbf{x})}{\mathbf{x}^{\!\top\!}\mathbf{x}}=\lambda\frac{\mathbf{x}^{\!\top\!}\mathbf{x}}{\mathbf{x}^{\!\top\!}\mathbf{x}} = \lambda.\]
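Both properties are easy to check numerically. The following is a minimal sketch, assuming NumPy is available; the helper name `rayleigh_quotient` and the 2×2 matrix are hypothetical choices made for illustration and do not come from the text.

```python
import numpy as np


def rayleigh_quotient(A, x):
    """Return x^T A x / x^T x for a nonzero vector x (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    return float(x @ A @ x) / float(x @ x)


# Illustrative symmetric matrix (an assumption, not taken from the text).
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# (i) Scale invariance: rescaling x by a nonzero scalar leaves R_A unchanged.
x = np.array([3.0, -1.0])
r_x = rayleigh_quotient(A, x)
r_scaled = rayleigh_quotient(A, -2.5 * x)

# (ii) At an eigenvector, R_A equals the corresponding eigenvalue.
# np.linalg.eigh returns eigenvalues in ascending order and the
# matching eigenvectors as the columns of the second output.
eigvals, eigvecs = np.linalg.eigh(A)
r_eig = [rayleigh_quotient(A, eigvecs[:, i]) for i in range(len(eigvals))]

print("R(x), R(-2.5 x):", r_x, r_scaled)
print("eigenvalues:", eigvals)
print("R at eigenvectors:", r_eig)
```

The two printed quotients in the first line should agree up to floating-point rounding, and the last two lines should list the same values.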

We can further show that, for a symmetric matrix \(\mathbf{A}\), the Rayleigh quotient is bounded below by the smallest eigenvalue of \(\mathbf{A}\) and above by the largest. We first prove this for unit-length vectors, a useful special case of the final result.

Theorem (Bound on the Rayleigh Quotient)

Let \(\mathbf{A}\) be a symmetric matrix. For any \(\mathbf{x}\) such that \(\|\mathbf{x}\|_2 = 1\),

\[\lambda_{\min}(\mathbf{A}) \leq \mathbf{x}^{\!\top\!}\mathbf{A}\mathbf{x} \leq \lambda_{\max}(\mathbf{A})\]

with equality on either side if and only if \(\mathbf{x}\) is an eigenvector for the corresponding eigenvalue.

Proof. We show only the \(\max\) case because the argument for the \(\min\) case is entirely analogous.

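Since \(\mathbf{A}\) is symmetric, the spectral theorem lets us decompose it as \(\mathbf{A} = \mathbf{Q}\mathbf{\Lambda}\mathbf{Q}^{\!\top\!}\), where \(\mathbf{Q}\) is orthogonal and \(\mathbf{\Lambda}\) is the diagonal matrix of eigenvalues \(\lambda_1, \dots, \lambda_n\).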

Next, change variables to \(\mathbf{y} = \mathbf{Q}^{\!\top\!}\mathbf{x}\). The relationship between \(\mathbf{x}\) and \(\mathbf{y}\) is one-to-one, and \(\|\mathbf{y}\|_2 = \|\mathbf{x}\|_2 = 1\) since \(\mathbf{Q}\) is orthogonal.

Hence

\[\max_{\|\mathbf{x}\|_2 = 1} \mathbf{x}^{\!\top\!}\mathbf{A}\mathbf{x} = \max_{\|\mathbf{y}\|_2 = 1} \mathbf{y}^{\!\top\!}\mathbf{\Lambda}\mathbf{y} = \max_{y_1^2+\dots+y_n^2=1} \sum_{i=1}^n \lambda_i y_i^2\]

Written this way, the sum is a weighted average of the eigenvalues with weights \(y_i^2 \geq 0\) that add up to 1, so it cannot exceed the largest eigenvalue. It follows that \(\mathbf{y}\) maximizes this expression if and only if it satisfies \(\sum_{i \in I} y_i^2 = 1\) where \(I = \{i : \lambda_i = \max_{j=1,\dots,n} \lambda_j = \lambda_{\max}(\mathbf{A})\}\) and \(y_j = 0\) for \(j \not\in I\).

That is, \(I\) contains the index or indices of the largest eigenvalue.

In this case, the maximal value of the expression is

\[\sum_{i=1}^n \lambda_i y_i^2 = \sum_{i \in I} \lambda_i y_i^2 = \lambda_{\max}(\mathbf{A}) \sum_{i \in I} y_i^2 = \lambda_{\max}(\mathbf{A})\]

Then writing \(\mathbf{q}_1, \dots, \mathbf{q}_n\) for the columns of \(\mathbf{Q}\), we have

\[\mathbf{x} = \mathbf{Q}\mathbf{Q}^{\!\top\!}\mathbf{x} = \mathbf{Q}\mathbf{y} = \sum_{i=1}^n y_i\mathbf{q}_i = \sum_{i \in I} y_i\mathbf{q}_i\]

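where the first equality uses \(\mathbf{Q}\mathbf{Q}^{\!\top\!} = \mathbf{I}\), the third writes the matrix-vector product \(\mathbf{Q}\mathbf{y}\) as a linear combination of the columns of \(\mathbf{Q}\), and the last uses \(y_j = 0\) for \(j \not\in I\).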

Recall that \(\mathbf{q}_1, \dots, \mathbf{q}_n\) are eigenvectors of \(\mathbf{A}\) and form an orthonormal basis for \(\mathbb{R}^n\).

Therefore by construction, the set \(\{\mathbf{q}_i : i \in I\}\) forms an orthonormal basis for the eigenspace of \(\lambda_{\max}(\mathbf{A})\).

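Hence \(\mathbf{x}\), which is a nonzero linear combination of these, lies in that eigenspace and thus is an eigenvector of \(\mathbf{A}\) corresponding to \(\lambda_{\max}(\mathbf{A})\). Conversely, any unit eigenvector for \(\lambda_{\max}(\mathbf{A})\) attains the maximum by property (ii) of the lemma.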

We have shown that \(\max_{\|\mathbf{x}\|_2 = 1} \mathbf{x}^{\!\top\!}\mathbf{A}\mathbf{x} = \lambda_{\max}(\mathbf{A})\), from which we have the general inequality \(\mathbf{x}^{\!\top\!}\mathbf{A}\mathbf{x} \leq \lambda_{\max}(\mathbf{A})\) for all unit-length \(\mathbf{x}\). ◻
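The steps of this proof can be traced numerically. The sketch below assumes NumPy and uses a randomly generated symmetric matrix (a hypothetical example, not from the text): it checks that the change of variable preserves the norm, that \(\mathbf{x}^{\!\top\!}\mathbf{A}\mathbf{x} = \sum_i \lambda_i y_i^2\), and that sampled unit vectors stay within the eigenvalue bounds. Random sampling only illustrates the inequality; it does not prove it.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical symmetric test matrix, built by symmetrizing a random one.
B = rng.standard_normal((4, 4))
A = (B + B.T) / 2

# Spectral decomposition A = Q diag(lam) Q^T, eigenvalues in ascending order.
lam, Q = np.linalg.eigh(A)

# Change of variable y = Q^T x for a random unit vector x.
x = rng.standard_normal(4)
x /= np.linalg.norm(x)
y = Q.T @ x

quad_x = x @ A @ x            # x^T A x
quad_y = np.sum(lam * y**2)   # sum_i lam_i * y_i^2

# Evaluate the quadratic form on many random unit vectors (one per row).
X = rng.standard_normal((1000, 4))
X /= np.linalg.norm(X, axis=1, keepdims=True)
vals = np.einsum("ij,jk,ik->i", X, A, X)

# The eigenvector for the largest eigenvalue attains the upper bound.
q_max = Q[:, -1]
attained = q_max @ A @ q_max

print("||y||:", np.linalg.norm(y))
print("x^T A x vs sum lam_i y_i^2:", quad_x, quad_y)
print("sampled range:", vals.min(), vals.max())
print("lambda_min, lambda_max:", lam[0], lam[-1])
print("value at q_max:", attained)
```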

By the scale invariance of the Rayleigh quotient, we immediately obtain the following corollary.

Theorem (Min-Max Theorem)

For all \(\mathbf{x} \neq \mathbf{0}\),

\[\lambda_{\min}(\mathbf{A}) \leq R_\mathbf{A}(\mathbf{x}) \leq \lambda_{\max}(\mathbf{A})\]

with equality on either side if and only if \(\mathbf{x}\) is an eigenvector for the corresponding eigenvalue.

Proof. Let \(\mathbf{x} \neq \mathbf{0}\). Then

\[R_\mathbf{A}(\mathbf{x}) = \frac{\mathbf{x}^{\!\top\!}\mathbf{A}\mathbf{x}}{\mathbf{x}^{\!\top\!}\mathbf{x}} = \frac{\mathbf{x}^{\!\top\!}\mathbf{A}\mathbf{x}}{\|\mathbf{x}\|_2^2} = \left(\frac{\mathbf{x}}{\|\mathbf{x}\|_2}\right)^{\!\top\!}\mathbf{A}\left(\frac{\mathbf{x}}{\|\mathbf{x}\|_2}\right)\]

Thus the minimum and maximum of the Rayleigh quotient are identical to the minimum and maximum of the quadratic form \(\mathbf{y}^{\!\top\!}\mathbf{A}\mathbf{y}\) over unit-norm vectors \(\mathbf{y}=\mathbf{x}/\|\mathbf{x}\|_2\). Since \(\mathbf{y}\) is a nonzero multiple of \(\mathbf{x}\), it is an eigenvector exactly when \(\mathbf{x}\) is, so the previous theorem gives

\[\lambda_{\min}(\mathbf{A}) \leq R_\mathbf{A}(\mathbf{x}) \leq \lambda_{\max}(\mathbf{A})\]

◻
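As a small worked check of these bounds, consider the following hypothetical matrix (chosen for illustration, not taken from the text). Its eigenvalues are \(1\) and \(3\), with eigenvectors \((1,-1)^{\!\top\!}\) and \((1,1)^{\!\top\!}\) respectively.

\[\mathbf{A} = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}, \qquad R_\mathbf{A}(\mathbf{x}) = \frac{2x_1^2 + 2x_1x_2 + 2x_2^2}{x_1^2 + x_2^2}\]

Evaluating at the two eigenvectors and at one vector that is not an eigenvector gives

\[R_\mathbf{A}\begin{pmatrix}1\\1\end{pmatrix} = \frac{6}{2} = 3, \qquad R_\mathbf{A}\begin{pmatrix}1\\-1\end{pmatrix} = \frac{2}{2} = 1, \qquad R_\mathbf{A}\begin{pmatrix}1\\0\end{pmatrix} = \frac{2}{1} = 2\]

so the bounds \(1 \leq R_\mathbf{A}(\mathbf{x}) \leq 3\) are attained at the eigenvectors, while the non-eigenvector lands strictly between them.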

[Figure: Rayleigh quotient on the unit circle (left panel) and level sets of the quadratic form (right panel).]

This combined visualization brings together the Rayleigh quotient and the level sets of the quadratic form \(\mathbf{x}^\top \mathbf{A} \mathbf{x}\):

Left panel: Rayleigh quotient \(R_\mathbf{A}(\mathbf{x})\) on the unit circle
- Color shows how the value varies with direction.
- Extremes occur at eigenvector directions (marked with arrows).

Right panel: Level sets (contours) of the quadratic form
- Elliptical shapes aligned with eigenvectors.
- Red vectors indicate principal axes (scaled eigenvectors).

Together, these panels illustrate how the direction of a vector determines how strongly it is scaled by the symmetric matrix, and how this scaling relates to the matrix’s eigenstructure.

✅ As guaranteed by the Min–Max Theorem, the maximum and minimum of the Rayleigh quotient occur precisely at the eigenvectors corresponding to the largest and smallest eigenvalues.
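The observation about the unit circle can be reproduced without plotting. The sketch below is not the code that produced the figure; it assumes NumPy and a hypothetical 2×2 symmetric matrix, sweeps directions on the unit circle, and compares the extremal directions with the eigenvectors returned by `np.linalg.eigh`.

```python
import numpy as np

# Hypothetical symmetric matrix (an assumption; the figure's matrix is not given).
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# Half the circle suffices because R_A(x) = R_A(-x) by scale invariance.
theta = np.linspace(0.0, np.pi, 1801)
X = np.stack([np.cos(theta), np.sin(theta)], axis=1)   # unit vectors, one per row

# For unit vectors the denominator x^T x is 1, so R_A(x) = x^T A x.
R = np.einsum("ij,jk,ik->i", X, A, X)

lam, Q = np.linalg.eigh(A)       # ascending eigenvalues, eigenvectors in columns
x_max = X[np.argmax(R)]          # direction with the largest quotient
x_min = X[np.argmin(R)]          # direction with the smallest quotient

# |cosine| between the extremal directions and the eigenvectors (sign is arbitrary).
align_max = abs(x_max @ Q[:, -1])
align_min = abs(x_min @ Q[:, 0])

print("max of R on the grid:", R.max(), "lambda_max:", lam[-1])
print("min of R on the grid:", R.min(), "lambda_min:", lam[0])
print("alignment with eigenvectors:", align_max, align_min)
```

Both alignment values should be close to 1, up to the resolution of the angular grid, which is the numerical counterpart of the arrows in the left panel.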