This note uses Named tensor notation.
The singular value decomposition (SVD) of a matrix $\ndef{\axl}{left}\ndef{\axr}{right}A \in \R^{\axl\times \axr}$ is a complete description of how it scales different directions in the spaces $\R^\axl$ and $\R^\axr$. Any matrix $A$ can be expressed as the sum of $r \ce \rank(A)$ outer products
$$
A = \sum_\sing \sigma UV,
$$
where
- $U \in \R^{\sing\times \axl}$ is made of $r$ orthonormal vectors that span $A$ within $\R^\axl$;
- $V \in \R^{\sing \times \axr}$ is made of $r$ orthonormal vectors that span $A$ within $\R^{\axr}$;
- the singular values $\sigma \in \R^\sing$ describe how these vectors are scaled by $A$.
- This shows that any matrix is diagonal up to an orthogonal change of basis (a rotation, possibly with a reflection) of the spaces it acts on!
- Unlike in an eigendecomposition, the vectors $U_{\sing(i)}$ and $V_{\sing(i)}$ do not need to point in the same direction (indeed, they are not even part of the same space!).
- Since we limited the sum to $r$ terms, $\sigma$ contains only the nonzero singular values (this is known as the compact SVD).
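The decomposition above can be checked numerically. The following is a minimal NumPy sketch, not part of the original note: it assumes positional axes in the order (left, right), and the helper `compact_svd` and its tolerance `tol` are hypothetical names introduced for illustration. `np.linalg.svd` returns the left singular vectors as columns, so they are transposed to match the $\sing\times\axl$ layout of $U$.

```python
import numpy as np

def compact_svd(A, tol=1e-10):
    """Return (U, sigma, V) with U of shape (r, left) and V of shape (r, right)."""
    u, s, vh = np.linalg.svd(A, full_matrices=False)
    # Keep only singular values that are nonzero relative to the largest one.
    r = int(np.sum(s > tol * s[0])) if s.size and s[0] > 0 else 0
    return u[:, :r].T, s[:r], vh[:r, :]

rng = np.random.default_rng(0)
# A rank-2 matrix with axes (left=4, right=3), built from two thin factors.
A = rng.normal(size=(4, 2)) @ rng.normal(size=(2, 3))

U, sigma, V = compact_svd(A)

# Sum over the `sing` axis of sigma times the outer product of U and V.
A_rebuilt = np.einsum("s,sl,sr->lr", sigma, U, V)
```

Here `U @ U.T` and `V @ V.T` should both be close to the $r \times r$ identity, which is the orthonormality stated in the list above.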
Linear map
When we apply $A$ to a vector $x \in \R^\axl$, $x$’s component in the direction of $U_{\sing(i)}$ gets scaled by a factor $\sigma_i$ and transformed into direction $V_{\sing(i)}$:
$$
\begin{aligned}
A \ndot\axl x
&= \p{\sum_\sing \sigma UV} \ndot\axl x\\
&= \sum_\sing \UB{\sigma}_\text{scaling}\UB{\p{U \ndot\axl x}}_\text{``coordinates of $x$''}\UB{V}_\text{new directions}.
\end{aligned}
$$
By symmetry, $A \ndot\axr y$ transforms direction $V_{\sing(i)}$ into direction $U_{\sing(i)}$ and scales by $\sigma_i$.
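As a quick sanity check of this reading, here is a small NumPy sketch under the same assumptions as before (positional axes ordered (left, right); all variable names are illustrative). Contracting over $\axl$ corresponds to `x @ A`, and contracting over $\axr$ to `A @ y`.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.normal(size=(4, 3))            # axes (left, right); full rank here, so r = 3
u, sigma, V = np.linalg.svd(A, full_matrices=False)
U = u.T                                # shape (sing, left)

x = rng.normal(size=4)                 # a vector in R^left
coords = U @ x                         # coordinates of x along the rows of U
Ax = x @ A                             # A contracted with x over `left`
Ax_from_svd = (sigma * coords) @ V     # sum over sing of sigma * coords * V

# A single left singular vector is mapped to the scaled right singular vector,
image_of_U0 = U[0] @ A
# and, by symmetry, a right singular vector is mapped back to the left one.
image_of_V0 = A @ V[0]
```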
Geometric interpretation
In particular, the transformation $x \mapsto A \ndot\axl x$ turns the unit ball
$$
B\ce \setco{x \in \R^\axl}{\norm{x}_\axl \le 1}
$$
into an ellipsoid of dimension $r$ within $\R^\axr$, whose semi-axes are $\sigma_i V_{\sing(i)}$, so the lengths of the semi-axes are given by the singular values. Indeed, we have
$$
\begin{aligned}
A \ndot\axl x
&= \p{\sum_\sing \sigma U V} \ndot\axl x\\
&= \p{U \ndot\axl x} \ndot\sing \p{\sigma V},
\end{aligned}
$$
and the only constraint that $x$ being in $B$ imposes on $U \ndot\axl x \in \R^\sing$ is that it should have norm at most $1$.
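The ellipsoid picture can be illustrated numerically. The sketch below (NumPy, illustrative names, axes ordered (left, right)) samples points of the unit ball, maps them through $A$, and expresses the images in coordinates along $V$. If $c_i$ denotes the coordinate along $V_{\sing(i)}$, membership in the ellipsoid amounts to $\sum_i (c_i/\sigma_i)^2 \le 1$.

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.normal(size=(4, 3))                      # axes (left, right), full rank
u, sigma, V = np.linalg.svd(A, full_matrices=False)

# Sample points inside the unit ball of R^left: random directions, random radii.
X = rng.normal(size=(1000, 4))
X /= np.linalg.norm(X, axis=1, keepdims=True)
X *= rng.uniform(size=(1000, 1))

Y = X @ A                                        # images, shape (1000, right)
c = Y @ V.T                                      # coordinates of the images along V
ellipsoid_value = np.sum((c / sigma) ** 2, axis=1)

# The images of the left singular vectors are the semi-axes sigma_i * V_i.
semi_axes = sigma[:, None] * V
images_of_U = u.T @ A
```

Every entry of `ellipsoid_value` should be at most 1, and the norms of the rows of `semi_axes` are the singular values.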
In a similar vein, when we look at the components of $x$ and $y$ relative to $U$ and $V$, the singular values give the coefficients of the bilinear form associated with $A$:
$$
\begin{aligned}
x \ndot\axl A \ndot\axr y
&= x \ndot\axl \p{\sum_\sing \sigma U V}\ndot\axr y\\
&= \sum_\sing \sigma \UB{\p{U \ndot\axl x}}_\text{``coordinates of $x$''}\ \UB{\p{V\ndot\axr y}}_\text{``coordinates of $y$''}.
\end{aligned}
$$
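The bilinear-form identity can be verified in a few lines. This is again a hedged NumPy sketch with illustrative names and positional axes ordered (left, right).

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.normal(size=(4, 3))                      # axes (left, right)
u, sigma, V = np.linalg.svd(A, full_matrices=False)
U = u.T                                          # shape (sing, left)

x = rng.normal(size=4)                           # vector in R^left
y = rng.normal(size=3)                           # vector in R^right

bilinear_direct = x @ A @ y                      # contract x over left, y over right
bilinear_svd = np.sum(sigma * (U @ x) * (V @ y)) # sum over sing of sigma * coords
```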
Relation to eigenvalues
The singular values of $A$ are the square roots of the nonzero eigenvalues of
$$
A \ndot\axr A\nt\axl \in \R^{\axl \times \axl'},
$$
which corresponds to the linear map that uses $A$ to take a vector $x \in \R^{\axl}$ into $\R^\axr$ and then back to $\R^\axl$. Indeed,
$$
\begin{aligned}
A \ndot\axr A\nt\axl
&= \p{\sum_\sing \sigma U V} \ndot\axr \p{\sum_{\sing'} \p{\sigma U\nt\axl V}\nt\sing}\\
&= \sum_{\substack{\sing\\\sing'}}\sigma U V\ndot\axr\p{\sigma U\nt\axl V}\nt\sing\\
&= \sum_{\substack{\sing\\\sing'}}\sigma U \p{\sigma U\nt\axl}\nt\sing \UB{\p{V \ndot\axr V\nt\sing}}_{I_{\sing,\sing'}}\\
&= \sum_\sing \sigma^2UU\nt\axl,
\end{aligned}
$$
which is an eigendecomposition since the vectors of $U$ (in $\R^{\axl}$) are orthonormal: each $U_{\sing(i)}$ is an eigenvector with eigenvalue $\sigma_i^2$.
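A numerical check of this relation, as a NumPy sketch with illustrative names (axes ordered (left, right), so that contracting $A$ with itself over $\axr$ corresponds to `A @ A.T`). The example matrix is assumed to have rank 2, so the remaining eigenvalues should vanish up to rounding.

```python
import numpy as np

rng = np.random.default_rng(4)
# Rank-2 matrix with axes (left=4, right=3).
A = rng.normal(size=(4, 2)) @ rng.normal(size=(2, 3))
u, sigma, _ = np.linalg.svd(A, full_matrices=False)

gram = A @ A.T                                   # shape (left, left')
eigvals = np.linalg.eigvalsh(gram)[::-1]         # eigenvalues in descending order

# The first left singular vector should be an eigenvector with eigenvalue sigma_0^2.
lhs = gram @ u[:, 0]
rhs = sigma[0] ** 2 * u[:, 0]
```

`np.linalg.eigvalsh` is used because `gram` is symmetric; it returns eigenvalues in ascending order, hence the reversal.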