1  Mathematical Foundations of SVD

1.1 Mathematical Foundations

1.1.1 Definition and Basic Properties

The Singular Value Decomposition (SVD) of a matrix \(A \in \mathbb{R}^{n\times p}\) is a factorization of the form:

\[A = U\Sigma V^T\]

where:

  • \(U \in \mathbb{R}^{n\times n}\) is orthogonal (\(U^TU = UU^T = I_n\))
  • \(\Sigma \in \mathbb{R}^{n\times p}\) is rectangular diagonal with non-negative entries \(\sigma_i\) on the diagonal
  • \(V \in \mathbb{R}^{p\times p}\) is orthogonal (\(V^TV = VV^T = I_p\))

Let’s illustrate this with Python:

Code
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

# Create a sample matrix
np.random.seed(42)
A = np.random.rand(5, 3)

# Compute SVD
U, s, Vt = np.linalg.svd(A, full_matrices=True)

# Print the shapes of resulting matrices
print(f"Original matrix A shape: {A.shape}")
print(f"U matrix shape: {U.shape}")
print(f"Singular values shape: {s.shape}")
print(f"V^T matrix shape: {Vt.shape}")
Original matrix A shape: (5, 3)
U matrix shape: (5, 5)
Singular values shape: (3,)
V^T matrix shape: (3, 3)

1.1.2 Properties Visualization

Let’s visualize some key properties:

Code
# Verify orthogonality of U
U_orthogonality = np.round(U.T @ U, decimals=2)

plt.figure(figsize=(8, 6))
sns.heatmap(U_orthogonality, annot=True, cmap='RdBu', center=0)
plt.title("Orthogonality of U: $U^TU = I$")
plt.tight_layout()
plt.show()
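Beyond the heatmap, the defining properties can be verified numerically. A minimal check, reusing A, U, s, and Vt from the first code block above:

Code
# Embed the singular values in an n x p rectangular diagonal matrix
Sigma = np.zeros(A.shape)
Sigma[:len(s), :len(s)] = np.diag(s)

# The factorization reconstructs A exactly (up to floating-point error)
print("A == U Sigma V^T:", np.allclose(A, U @ Sigma @ Vt))
print("V^T V == I:", np.allclose(Vt @ Vt.T, np.eye(Vt.shape[0])))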

1.1.3 Singular Values

The diagonal entries of \(\Sigma\) are the singular values \(\sigma_i\), arranged in descending order:

\[\sigma_1 \geq \sigma_2 \geq ... \geq \sigma_{\min(n,p)} \geq 0\]

Code
# Plot singular values
plt.figure(figsize=(8, 4))
plt.stem(s)
plt.title("Singular Values")
plt.xlabel("Index")
plt.ylabel("Value")
plt.grid(True)
plt.show()

1.1.4 Reduced Form

For a matrix of rank \(r\), we can use the reduced form:

\[A = U_r\Sigma_r V_r^T\]

where:

  • \(U_r\) consists of the first \(r\) columns of \(U\)
  • \(\Sigma_r\) is \(r \times r\)
  • \(V_r\) consists of the first \(r\) columns of \(V\)
Code
def plot_svd_reconstruction(A, k):
    U, s, Vt = np.linalg.svd(A)
    # Reconstruct with k components
    A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]
    
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 5))
    
    im1 = ax1.imshow(A, cmap='RdBu')
    ax1.set_title('Original Matrix')
    plt.colorbar(im1, ax=ax1)
    
    im2 = ax2.imshow(A_k, cmap='RdBu')
    ax2.set_title(f'Rank-{k} Approximation')
    plt.colorbar(im2, ax=ax2)
    
    plt.tight_layout()
    plt.show()
    
    error = np.linalg.norm(A - A_k, 'fro')
    print(f"Frobenius norm error: {error:.4f}")

# Create a test matrix and show reconstruction
A = np.random.rand(10, 8)
plot_svd_reconstruction(A, 2)

Frobenius norm error: 1.8496
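NumPy can return a reduced (thin) form directly with full_matrices=False. Note that it keeps all \(\min(n,p)\) components rather than only the first \(r\), so a true rank-\(r\) form would slice further:

Code
# Thin SVD: U_thin is n x min(n,p), Vt_thin is min(n,p) x p
U_thin, s_thin, Vt_thin = np.linalg.svd(A, full_matrices=False)
print(f"U_thin: {U_thin.shape}, s: {s_thin.shape}, Vt_thin: {Vt_thin.shape}")

# Reconstruction is exact, since no components are discarded
print(np.allclose(A, U_thin @ np.diag(s_thin) @ Vt_thin))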

1.2 Geometric Interpretation

The SVD decomposes the action of \(A\) into three geometric steps:

  1. \(V^T\) rotates (or reflects) the unit vectors
  2. \(\Sigma\) scales them along the coordinate axes by the singular values
  3. \(U\) rotates the scaled vectors to their final position

As a result, the unit circle is mapped to an ellipse whose semi-axes have lengths \(\sigma_1\) and \(\sigma_2\).
Code
def plot_svd_transformation(A):
    U, s, Vt = np.linalg.svd(A)
    
    # Create unit circle points
    theta = np.linspace(0, 2*np.pi, 100)
    circle = np.vstack([np.cos(theta), np.sin(theta)])
    
    # Apply the three stages: rotate by V^T, scale by Sigma, rotate by U
    rotated_scaled = np.diag(s) @ (Vt @ circle)
    transformed = U @ rotated_scaled  # identical to A @ circle
    
    plt.figure(figsize=(12, 4))
    
    plt.subplot(131)
    plt.plot(circle[0], circle[1], 'b-')
    plt.title('Original Unit Circle')
    plt.axis('equal')
    plt.grid(True)
    
    plt.subplot(132)
    plt.plot(rotated_scaled[0], rotated_scaled[1], 'g-')
    plt.title(r'After $\Sigma V^T$')
    plt.axis('equal')
    plt.grid(True)
    
    plt.subplot(133)
    plt.plot(transformed[0], transformed[1], 'r-')
    plt.title(r'After $U\Sigma V^T$ (= $A$)')
    plt.axis('equal')
    plt.grid(True)
    
    plt.tight_layout()
    plt.show()

# Create a 2x2 transformation matrix and visualize
A = np.array([[2, 1], [1, 3]])
plot_svd_transformation(A)

1.3 Properties and Theorems

1.3.1 Fundamental Properties

Let \(A \in \mathbb{R}^{n\times p}\) with SVD \(A = U\Sigma V^T\). The following properties hold:

  1. Existence

Theorem 1.1 Every real matrix \(A \in \mathbb{R}^{n\times p}\) has a singular value decomposition.

  2. Uniqueness:

    • The singular values \(\sigma_i\) are unique and can be ordered: \(\sigma_1 \geq \sigma_2 \geq ... \geq \sigma_r > 0\)
    • If the singular values are distinct, the corresponding singular vectors are unique up to sign
  3. Rank:

    • \(\text{rank}(A) = r\), the number of non-zero singular values
    • \(r \leq \min(n,p)\)
  4. Norms:

    • \(\|A\|_2 = \sigma_1\) (spectral norm)
    • \(\|A\|_F^2 = \sum_{i=1}^r \sigma_i^2\) (Frobenius norm)

Let’s verify these properties:

Code
import numpy as np

# Create a matrix
A = np.array([[4, 0], [0, 3], [0, 0]])
U, s, Vt = np.linalg.svd(A)

print(f"Singular values: {s}")
print(f"Matrix rank: {np.linalg.matrix_rank(A)}")
print(f"Number of non-zero singular values: {np.sum(s > 1e-10)}")
print(f"Spectral norm: {np.linalg.norm(A, 2)}")
print(f"Frobenius norm: {np.linalg.norm(A, 'fro')}")
print(f"Sum of squared singular values: {np.sum(s**2)}")
Singular values: [4. 3.]
Matrix rank: 2
Number of non-zero singular values: 2
Spectral norm: 4.0
Frobenius norm: 5.0
Sum of squared singular values: 25.0

1.3.2 Relationship with Eigenvalues

Several important relationships exist between singular values and eigenvalues:

  1. For symmetric matrices:

    • If \(\lambda\) is an eigenvalue of \(A\), then \(|\lambda|\) is a singular value of \(A\)
    • The singular vectors can be chosen as eigenvectors (up to sign)
  2. For general matrices:

    • The singular values of \(A\) are the square roots of the eigenvalues of \(A^TA\) or \(AA^T\)
    • The right singular vectors are eigenvectors of \(A^TA\)
    • The left singular vectors are eigenvectors of \(AA^T\)
Code
def compare_svd_eigen(A):
    # SVD
    U, s, Vt = np.linalg.svd(A)
    
    # Eigenvalues of A^T A
    eigenvals_ATA = np.linalg.eigvals(A.T @ A)
    eigenvals_ATA = np.sort(eigenvals_ATA)[::-1]  # sort in descending order
    
    print("Singular values:", s)
    print("Square root of eigenvalues of A^T A:", np.sqrt(np.abs(eigenvals_ATA)))

# Test with symmetric matrix
A_sym = np.array([[4, 1], [1, 4]])
print("Symmetric matrix:")
print(f"A = {A_sym}")
compare_svd_eigen(A_sym)

# Test with general matrix
A_gen = np.array([[2, 1], [1, 3], [0, 2]])
print("\nGeneral matrix:")
print(f"A = {A_gen}")
compare_svd_eigen(A_gen)
Symmetric matrix:
A = [[4 1]
 [1 4]]
Singular values: [5. 3.]
Square root of eigenvalues of A^T A: [5. 3.]

General matrix:
A = [[2 1]
 [1 3]
 [0 2]]
Singular values: [4.02825173 1.66528916]
Square root of eigenvalues of A^T A: [4.02825173 1.66528916]

1.3.3 Important Theorems

Theorem 1.2 (Best Low-Rank Approximation) For any matrix \(A\) of rank \(r\), and any \(k < r\), the best rank-\(k\) approximation to \(A\) in both Frobenius and spectral norms is:

\[A_k = \sum_{i=1}^k \sigma_i u_i v_i^T\]

This is known as the Eckart-Young-Mirsky theorem.

Code
def demonstrate_low_rank_approx(A, k):
    U, s, Vt = np.linalg.svd(A)
    
    # Best rank-k approximation
    Ak = U[:,:k] @ np.diag(s[:k]) @ Vt[:k,:]
    
    # Random rank-k matrix
    B = np.random.randn(A.shape[0], k) @ np.random.randn(k, A.shape[1])
    
    print(f"Error with best rank-{k} approximation: {np.linalg.norm(A - Ak, 'fro')}")
    print(f"Error with random rank-{k} matrix: {np.linalg.norm(A - B, 'fro')}")

# Create a test matrix
A = np.random.randn(5, 4)
demonstrate_low_rank_approx(A, 2)
Error with best rank-2 approximation: 1.682214710482391
Error with random rank-2 matrix: 7.867547407071715

Theorem 1.3 (Pseudo-inverse) The Moore-Penrose pseudo-inverse of \(A\) is given by:

\[A^+ = V\Sigma^+ U^T\]

where \(\Sigma^+\) is obtained by taking the reciprocal of non-zero singular values.

Code
def compare_pseudoinverse(A):
    # SVD pseudo-inverse
    U, s, Vt = np.linalg.svd(A)
    s_plus = np.zeros_like(s)
    s_plus[s > 1e-10] = 1/s[s > 1e-10]
    
    # Numerical rank: number of non-zero singular values
    r = np.sum(s > 1e-10)
    
    # A^+ = V Sigma^+ U^T, using only the first min(n, p) rows of U^T
    k = len(s)
    A_plus = Vt.T @ np.diag(s_plus) @ U.T[:k]
    
    # NumPy's pseudo-inverse
    A_pinv = np.linalg.pinv(A)
    
    print(f"rank(A) = {r}")
    print("Max difference between methods:", np.max(np.abs(A_plus - A_pinv)))

# Test with a rectangular matrix
A = np.random.randn(4, 3)
compare_pseudoinverse(A)
rank(A) = 3
Max difference between methods: 2.220446049250313e-16

Theorem 1.4 (Singular Values of a Product) For matrices \(A\) and \(B\) of compatible dimensions:

\[\sigma_i(AB) \leq \sigma_1(A)\sigma_i(B)\]

This is known as the multiplicative singular value inequality.
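We can check this inequality numerically. A minimal sketch on random matrices, with shapes chosen so that \(AB\) and \(B\) have the same number of singular values:

Code
# Verify sigma_i(AB) <= sigma_1(A) * sigma_i(B)
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3))
B = rng.standard_normal((3, 3))

s_AB = np.linalg.svd(A @ B, compute_uv=False)  # singular values of the product
s_A = np.linalg.svd(A, compute_uv=False)
s_B = np.linalg.svd(B, compute_uv=False)

# Small tolerance guards against floating-point ties
print(np.all(s_AB <= s_A[0] * s_B + 1e-12))  # True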

Key Applications

These properties make the SVD particularly useful for:

  1. Dimensionality reduction
  2. Optimal low-rank approximation (Theorem 1.2)
  3. Solving least squares problems (see the sketch below)
  4. Computing matrix rank
  5. Determining the matrix condition number
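For instance, the pseudo-inverse from Theorem 1.3 solves the least squares problem \(\min_x \|Ax - b\|_2\). A minimal sketch comparing it against np.linalg.lstsq:

Code
# Least squares via the SVD-based pseudo-inverse: x = A^+ b
rng = np.random.default_rng(1)
A = rng.standard_normal((6, 3))
b = rng.standard_normal(6)

x_pinv = np.linalg.pinv(A) @ b             # uses the SVD internally
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)
print(np.allclose(x_pinv, x_lstsq))        # True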
Computational Considerations
  • The singular values can be computed without computing \(U\) and \(V\)
  • For large sparse matrices, iterative methods are preferred
  • The condition number \(\kappa(A) = \sigma_1/\sigma_r\) governs numerical stability (see the check below)
  • In practice, most software computes the SVD without explicitly forming \(AA^T\) or \(A^TA\), since doing so squares the condition number and is numerically unstable
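As a quick check of the last bullet on condition numbers (a minimal sketch; np.linalg.cond defaults to the 2-norm):

Code
# kappa(A) = sigma_1 / sigma_r for a full-rank square matrix
rng = np.random.default_rng(2)
A = rng.standard_normal((5, 5))
s = np.linalg.svd(A, compute_uv=False)
print(np.isclose(s[0] / s[-1], np.linalg.cond(A)))  # True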

1.4 Connection with Eigendecomposition

For symmetric matrices, the SVD is closely related to the eigendecomposition:

  • The singular values are the absolute values of the eigenvalues
  • The left and right singular vectors are eigenvectors, with \(u_i = \text{sign}(\lambda_i)\, v_i\)
  • In particular, \(U = V\) when \(A\) is symmetric positive semidefinite
Code
# Create a symmetric matrix
A_sym = np.array([[4, 1], [1, 4]])

# Compare SVD and eigendecomposition
U, s, Vt = np.linalg.svd(A_sym)
eigvals, eigvecs = np.linalg.eigh(A_sym)  # eigh returns eigenvalues in ascending order

print("Singular values:", s)
print("Absolute eigenvalues (descending):", np.sort(np.abs(eigvals))[::-1])
Singular values: [5. 3.]
Absolute eigenvalues (descending): [5. 3.]
Key Points
  • SVD exists for any real matrix
  • Singular values are unique
  • Number of non-zero singular values equals the rank
  • Provides optimal low-rank approximation (Eckart-Young theorem)
