# SVD — Where V and U Come From

## The Intuition

SVD decomposes any matrix as $A = U\Sigma V^T$. But where do $V$ and $U$ actually come from computationally?

- **Columns of $V$** are eigenvectors of $A^TA$
- **Columns of $U$** are eigenvectors of $AA^T$
- **Singular values** are $\sigma_i = \sqrt{\lambda_i(A^TA)}$

This is not a coincidence — it connects directly to the geometric picture. $A$ transforms the unit sphere into an ellipsoid. The columns of $V$ are the input directions that map cleanly onto the ellipsoid's axes, and the columns of $U$ are those axes in the output space.

$A^TA$ and $AA^T$ are the matrices that expose these directions: $A^TA$ measures how $A$ stretches the input space, and $AA^T$ measures how $A$ stretches the output space. Their eigenvectors are the natural axes of each space. In general $A^TA \neq AA^T$, so their eigenvectors differ — the input and output spaces have independent natural bases. This is precisely why $U \neq V$ in general. They do share the same nonzero eigenvalues $\sigma_i^2$, because the stretching magnitude is the same regardless of which side you measure from.

---

## The Derivation

The two fundamental SVD relations are symmetric counterparts of each other, both baked into $A = U\Sigma V^T$:

$$Av_i = \sigma_i u_i \qquad\qquad A^T u_i = \sigma_i v_i$$

**Deriving $Av_i = \sigma_i u_i$:**

Apply $A = U\Sigma V^T$ directly to $v_i$:

$$Av_i = U\Sigma V^T v_i$$

Since $V$ is orthogonal, $V^T v_i = e_i$: the columns of $V$ are orthonormal, so dotting $v_i$ against each of them gives $1$ in position $i$ and $0$ everywhere else. Then $\Sigma e_i = \sigma_i e_i$, and $Ue_i = u_i$:

$$U\Sigma V^T v_i = U\Sigma e_i = U(\sigma_i e_i) = \sigma_i u_i$$

So $Av_i = \sigma_i u_i$ is not an assumption — it falls straight out of the definition $A = U\Sigma V^T$.

**Deriving $A^T u_i = \sigma_i v_i$:**

The second relation follows from $A^T = V\Sigma U^T$. Applying it to $u_i$:

$$A^T u_i = V\Sigma U^T u_i$$

Since $U$ is orthogonal, $U^T u_i$ computes the dot product of $u_i$ against every column of $U$.
Orthonormality means $u_j^T u_i = 0$ for $j \neq i$ and $u_i^T u_i = 1$, so the result is the standard basis vector $e_i$ — a vector with $1$ in position $i$ and zeros elsewhere:

$$U^T u_i = e_i$$

Then $\Sigma e_i$ picks out the $i$-th diagonal entry of $\Sigma$:

$$\Sigma e_i = \sigma_i e_i$$

And $Ve_i$ picks out the $i$-th column of $V$, which is $v_i$:

$$V(\sigma_i e_i) = \sigma_i v_i$$

So the full chain $V\Sigma U^T u_i = \sigma_i v_i$ confirms $A^T u_i = \sigma_i v_i$.

---

### Columns of V are eigenvectors of $A^TA$

Start from $Av_i = \sigma_i u_i$ and hit both sides on the left with $A^T$:

$$A^TAv_i = A^T(\sigma_i u_i) = \sigma_i(A^T u_i) = \sigma_i(\sigma_i v_i) = \sigma_i^2 v_i$$

So $v_i$ is an eigenvector of $A^TA$ with eigenvalue $\sigma_i^2$.

### Columns of U are eigenvectors of $AA^T$

Start from $A^T u_i = \sigma_i v_i$ and hit both sides on the left with $A$:

$$AA^T u_i = A(\sigma_i v_i) = \sigma_i(Av_i) = \sigma_i(\sigma_i u_i) = \sigma_i^2 u_i$$

So $u_i$ is an eigenvector of $AA^T$ with eigenvalue $\sigma_i^2$.

In both derivations, scalars factor out freely at each step — $A^T(\sigma_i u_i) = \sigma_i(A^T u_i)$ — because any linear map commutes with scalar multiplication.

The singular values follow as:

$$\sigma_i = \sqrt{\lambda_i(A^TA)} = \sqrt{\lambda_i(AA^T)}$$
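The whole chain above can be checked numerically. The sketch below (a minimal NumPy illustration; the matrix shape, random seed, and sign-fixing step are incidental choices, not part of the theory) builds $V$ and $U$ from eigendecompositions of $A^TA$ and $AA^T$, takes $\sigma_i = \sqrt{\lambda_i}$, and verifies both fundamental relations:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3))  # any rectangular matrix works

# Eigendecompose the two Gram matrices. Both are symmetric, so
# np.linalg.eigh applies (eigenvalues come back in ascending order).
lam_v, V = np.linalg.eigh(A.T @ A)   # columns of V: eigenvectors of A^T A
lam_u, U = np.linalg.eigh(A @ A.T)   # columns of U: eigenvectors of A A^T

# Sort both descending so sigma_1 >= sigma_2 >= ...
V = V[:, np.argsort(lam_v)[::-1]]
U = U[:, np.argsort(lam_u)[::-1]]
sigma = np.sqrt(np.sort(lam_v)[::-1])  # sigma_i = sqrt(lambda_i(A^T A))

# eigh fixes each eigenvector only up to sign; flip u_i where needed
# so that A v_i = +sigma_i u_i rather than -sigma_i u_i.
for i in range(len(sigma)):
    if sigma[i] > 1e-12 and np.dot(A @ V[:, i], U[:, i]) < 0:
        U[:, i] = -U[:, i]

# Verify the two fundamental relations for each nonzero sigma_i.
for i in range(len(sigma)):
    assert np.allclose(A @ V[:, i], sigma[i] * U[:, i])    # A v_i   = sigma_i u_i
    assert np.allclose(A.T @ U[:, i], sigma[i] * V[:, i])  # A^T u_i = sigma_i v_i

# The nonzero eigenvalues of A^T A and A A^T coincide...
assert np.allclose(np.sort(lam_u)[::-1][:len(sigma)], np.sort(lam_v)[::-1])

# ...and the sigmas match NumPy's own SVD routine.
assert np.allclose(sigma, np.linalg.svd(A, compute_uv=False))
print("all checks passed")
```

The sign-fixing loop exists because an eigendecomposition determines each eigenvector only up to $\pm 1$: the pair $(v_i, u_i)$ must be chosen consistently so that $Av_i = \sigma_i u_i$ holds with a positive $\sigma_i$. Production code would use `np.linalg.svd` directly; forming $A^TA$ squares the condition number, so this route is for understanding, not numerics.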