# SVD — Where V and U Come From
## The Intuition
SVD decomposes any matrix as $A = U\Sigma V^T$. But where do $V$ and $U$ actually come from computationally?
- **Columns of $V$** are eigenvectors of $A^TA$
- **Columns of $U$** are eigenvectors of $AA^T$
- **Singular values** are $\sigma_i = \sqrt{\lambda_i(A^TA)}$
This is not a coincidence — it connects directly to the geometric picture. $A$ transforms the unit sphere into an ellipsoid. The columns of $V$ are the input directions that map cleanly onto the ellipsoid's axes, and the columns of $U$ are those axes in the output space. $A^TA$ and $AA^T$ are the matrices that expose these directions: $A^TA$ measures how $A$ stretches the input space, and $AA^T$ measures how $A$ stretches the output space. Their eigenvectors are the natural axes of each space.
In general $A^TA \neq AA^T$, so their eigenvectors differ — the input and output spaces have independent natural bases. This is precisely why $U \neq V$ in general. They do share the same nonzero eigenvalues $\sigma_i^2$, because the stretching magnitude is the same regardless of which side you measure from.
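This recipe can be checked numerically. The sketch below (NumPy, with an arbitrary random $5\times 3$ matrix and seed) builds $V$ and the $\sigma_i$ from the eigendecomposition of $A^TA$, recovers each $u_i$ as $Av_i/\sigma_i$, and reassembles $A$:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))  # any full-rank matrix works here

# Eigendecomposition of the 3x3 Gram matrix A^T A gives V and sigma^2.
lam, V = np.linalg.eigh(A.T @ A)   # eigh returns eigenvalues ascending
lam, V = lam[::-1], V[:, ::-1]     # reorder descending, the SVD convention
sigma = np.sqrt(lam)

# Recover each u_i as A v_i / sigma_i (divides each column of A V by sigma_i).
U = (A @ V) / sigma

# U, Sigma, V assemble back into A, and sigma matches numpy's own SVD.
assert np.allclose(U @ np.diag(sigma) @ V.T, A)
assert np.allclose(sigma, np.linalg.svd(A, compute_uv=False))
```

Computing $u_i = Av_i/\sigma_i$ rather than eigendecomposing $AA^T$ separately sidesteps a practical wrinkle: eigenvector routines are free to flip any eigenvector's sign, and this construction keeps $U$ consistent with $V$.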
---
## The Derivation
The two fundamental SVD relations are symmetric counterparts of each other, both baked into $A = U\Sigma V^T$:
$Av_i = \sigma_i u_i \qquad\qquad A^T u_i = \sigma_i v_i$
**Deriving $Av_i = \sigma_i u_i$:** Apply $A = U\Sigma V^T$ directly to $v_i$:
$Av_i = U\Sigma V^T v_i$
Since $V$ has orthonormal columns, $V^T v_i$ stacks the dot products $v_j^T v_i$, which are $0$ for $j \neq i$ and $1$ for $j = i$ — so $V^T v_i = e_i$, the standard basis vector. Then $\Sigma e_i = \sigma_i e_i$, and $Ue_i = u_i$:
$U\Sigma V^T v_i = U\Sigma e_i = U(\sigma_i e_i) = \sigma_i u_i$
So $Av_i = \sigma_i u_i$ is not an assumption — it falls straight out of the definition $A = U\Sigma V^T$.
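A quick numerical check of this relation (NumPy, on an arbitrary random matrix): $AV$ has columns $Av_i$, and scaling column $i$ of $U$ by $\sigma_i$ should give the same thing.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3))
U, s, Vt = np.linalg.svd(A, full_matrices=False)  # Vt holds the v_i as rows

# A v_i = sigma_i u_i for every i at once: A @ V  ==  U scaled columnwise by s.
assert np.allclose(A @ Vt.T, U * s)

# And one column explicitly, i = 0:
assert np.allclose(A @ Vt[0], s[0] * U[:, 0])
```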
The second relation follows from $A^T = V\Sigma U^T$. Applying it to $u_i$:
$A^T u_i = V\Sigma U^T u_i$
Since $U$ is orthogonal, $U^T u_i$ computes the dot product of $u_i$ against every column of $U$. Orthonormality means $u_j^T u_i = 0$ for $j \neq i$ and $u_i^T u_i = 1$, so the result is the standard basis vector $e_i$ — a vector with $1$ in position $i$ and zeros elsewhere:
$U^T u_i = e_i$
Then $\Sigma e_i$ picks out the $i$-th diagonal entry of $\Sigma$:
$\Sigma e_i = \sigma_i e_i$
And $Ve_i$ picks out the $i$-th column of $V$, which is $v_i$:
$V(\sigma_i e_i) = \sigma_i v_i$
So the full chain $V\Sigma U^T u_i = \sigma_i v_i$ confirms $A^T u_i = \sigma_i v_i$.
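The chain can also be verified link by link. Here is a sketch with an arbitrary random square matrix, checking $U^T u_i = e_i$, then $\Sigma e_i = \sigma_i e_i$, then the full product:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))   # square, so U, Sigma, V are all 4x4
U, s, Vt = np.linalg.svd(A)

i = 1
u_i = U[:, i]
e_i = np.eye(4)[i]                # standard basis vector: 1 in position i

assert np.allclose(U.T @ u_i, e_i)                      # U^T u_i = e_i
assert np.allclose(np.diag(s) @ e_i, s[i] * e_i)        # Sigma e_i = sigma_i e_i
assert np.allclose(Vt.T @ (s[i] * e_i), s[i] * Vt[i])   # V(sigma_i e_i) = sigma_i v_i
assert np.allclose(A.T @ u_i, s[i] * Vt[i])             # full chain: A^T u_i = sigma_i v_i
```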
---
### Columns of V are eigenvectors of $A^TA$
Start from $Av_i = \sigma_i u_i$ and hit both sides on the left with $A^T$:
$A^TAv_i = A^T(\sigma_i u_i) = \sigma_i(A^T u_i) = \sigma_i(\sigma_i v_i) = \sigma_i^2 v_i$
So $v_i$ is an eigenvector of $A^TA$ with eigenvalue $\sigma_i^2$.
### Columns of U are eigenvectors of $AA^T$
Start from $A^T u_i = \sigma_i v_i$ and hit both sides on the left with $A$:
$AA^T u_i = A(\sigma_i v_i) = \sigma_i(Av_i) = \sigma_i(\sigma_i u_i) = \sigma_i^2 u_i$
So $u_i$ is an eigenvector of $AA^T$ with eigenvalue $\sigma_i^2$.
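Both eigenvector claims can be confirmed in one shot for all $i$ (NumPy sketch on an arbitrary random matrix): multiplying $V$ by $A^TA$ should just scale its columns by $\sigma_i^2$, and likewise for $U$ under $AA^T$.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))
U, s, Vt = np.linalg.svd(A, full_matrices=False)
V = Vt.T

# A^T A v_i = sigma_i^2 v_i  and  A A^T u_i = sigma_i^2 u_i, all i at once:
assert np.allclose(A.T @ A @ V, V * s**2)
assert np.allclose(A @ A.T @ U, U * s**2)
```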
In both derivations, scalars factor out freely at each step — $A^T(\sigma_i u_i) = \sigma_i(A^T u_i)$ — because any linear map commutes with scalar multiplication. The singular values follow as:
$\sigma_i = \sqrt{\lambda_i(A^TA)} = \sqrt{\lambda_i(AA^T)}$
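This also checks out numerically, and it makes the shared-eigenvalue claim from the intuition section concrete: for a non-square $A$, the two Gram matrices have different sizes, but their nonzero eigenvalues agree, and the extras are numerically zero. A sketch with an arbitrary $5\times 3$ matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))
s = np.linalg.svd(A, compute_uv=False)       # sigma_i, descending

lam_in = np.linalg.eigvalsh(A.T @ A)[::-1]   # eigenvalues of A^T A (3 of them)
lam_out = np.linalg.eigvalsh(A @ A.T)[::-1]  # eigenvalues of A A^T (5 of them)

assert np.allclose(np.sqrt(lam_in), s)
# A A^T is 5x5 but rank 3: its top 3 eigenvalues match, the rest are ~0.
assert np.allclose(np.sqrt(np.abs(lam_out[:3])), s)
assert np.allclose(lam_out[3:], 0)
```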