In this chapter, we generalize the Singular Value Decomposition (SVD) to account for row weights and column metrics defined by symmetric positive definite matrices. This generalized SVD (GSVD) is particularly useful when the rows of the matrix are associated with weights and the space of columns has a metric.
4.1 Problem Setup
Given a matrix \(A \in \mathbb{R}^{n \times p}\), suppose:
The rows of \(A\) are associated with weights that are represented by a symmetric positive definite matrix \(D \in \mathbb{R}^{n \times n}\).
The space of columns of \(A\) has a metric defined by a symmetric positive definite matrix \(M \in \mathbb{R}^{p \times p}\).
The goal is to define a generalized SVD (GSVD) that incorporates both the row weights and the column metrics.
4.2 Generalized SVD Definition
The Generalized Singular Value Decomposition (GSVD) of \(A\) in the weighted and metric space is defined as:
\[
A = U \Sigma V^T
\]
where:
\(U \in \mathbb{R}^{n \times n}\) is the matrix of generalized left singular vectors with respect to \(D\), satisfying the condition \(U^T D U = I_n\).
\(\Sigma \in \mathbb{R}^{n \times p}\) is a diagonal matrix containing the generalized singular values.
\(V \in \mathbb{R}^{p \times p}\) is the matrix of generalized right singular vectors with respect to \(M\), satisfying the condition \(V^T M V = I_p\).
4.2.1 Weighted Row Space
The matrix \(D\) defines a weighted inner product on the row space of \(A\). For any two vectors \(u, v \in \mathbb{R}^n\), the weighted inner product is given by:
\[
\langle u, v \rangle_D = u^T D v
\]
Thus, the generalized left singular vectors matrix \(U\) satisfies:
\[
U^T D U = I_n
\]
4.2.2 Metric on Column Space
The matrix \(M\) defines a metric on the column space of \(A\). For any two vectors \(x, y \in \mathbb{R}^p\), the metric is defined as:
\[
\langle x, y \rangle_M = x^T M y
\]
Thus, the generalized right singular vectors matrix \(V\) satisfies:
\[
V^T M V = I_p
\]
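Both inner products have exactly the same form, so one helper function covers the row and the column case. A minimal NumPy sketch (the helper name weighted_inner and the numerical values are illustrative choices, not taken from this chapter):

import numpy as np

def weighted_inner(u, v, W):
    """Weighted inner product <u, v>_W = u^T W v for an SPD matrix W."""
    return u @ W @ v

# Illustrative SPD metric on R^2 and two vectors
M = np.array([[4.0, 1.0],
              [1.0, 3.0]])
x = np.array([1.0, 2.0])
y = np.array([3.0, -1.0])

print(weighted_inner(x, y, M))  # 11.0, versus the Euclidean x @ y = 1.0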
4.3 Generalized SVD Computation
4.3.1 Step 1: Form the Weighted Matrix
We first form the weighted matrix \(\tilde{A}\) using the matrices \(D\) and \(M\) as follows:
\[
\tilde{A} = D^{1/2} A M^{1/2}
\]
where \(D^{1/2}\) and \(M^{1/2}\) are the square root matrices of \(D\) and \(M\), respectively.
Reminder: Square Root of a Matrix
In matrix algebra, a square root of a symmetric positive definite matrix \(D \in \mathbb{R}^{n \times n}\) is often computed using the Cholesky decomposition. This decomposition expresses the matrix \(D\) as:
\[
D = L L^T
\]
where \(L \in \mathbb{R}^{n \times n}\) is a lower triangular matrix. The factor \(L\) is a square root of \(D\) in the sense that:
\[
L L^T = D
\]
Note, however, that \(L\) is in general not equal to the symmetric square root \(D^{1/2}\) obtained from the eigendecomposition of \(D\); either factor may play the role of \(D^{1/2}\) in the computation below, as long as the transposes are placed consistently (as the code at the end of this chapter does). The Cholesky decomposition is often preferred because it is guaranteed to exist, and is inexpensive to compute, whenever \(D\) is symmetric positive definite.
Example 4.1 In Python, the numpy and scipy libraries can be used to compute the Cholesky decomposition as follows:

import numpy as np
from scipy.linalg import cholesky

# Define a symmetric positive definite matrix D
D = np.array([[4, 1], [1, 3]])

# Compute the square root of the matrix (Cholesky decomposition)
D_sqrt = cholesky(D, lower=True)

print("Square root of matrix D (Cholesky factor):")
print(D_sqrt)

Square root of matrix D (Cholesky factor):
[[2.        0.       ]
 [0.5       1.6583124]]
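The Cholesky factor is only one of several valid square roots. A short sketch contrasting it with the symmetric square root from scipy.linalg.sqrtm (both factorizations reproduce \(D\), but the factors themselves differ):

import numpy as np
from scipy.linalg import cholesky, sqrtm

D = np.array([[4.0, 1.0],
              [1.0, 3.0]])

L = cholesky(D, lower=True)  # lower triangular factor: D = L @ L.T
S = sqrtm(D)                 # symmetric square root:   D = S @ S

print(np.allclose(L @ L.T, D))  # True
print(np.allclose(S @ S, D))    # True
print(np.allclose(L, S))        # False: two different square roots of D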
Example 4.2 In R, the same computation uses the built-in chol() function. Note that chol() returns the upper triangular factor \(R\) with \(D = R^T R\), i.e., the transpose of the lower factor from Example 4.1:

# Define a symmetric positive definite matrix D
D <- matrix(c(4, 1, 1, 3), nrow = 2, byrow = TRUE)

# Compute the square root of the matrix (Cholesky decomposition)
D_sqrt <- chol(D)

print("Square root of matrix D (Cholesky factor):")
print(D_sqrt)

[1] "Square root of matrix D (Cholesky factor):"
     [,1]     [,2]
[1,]    2 0.500000
[2,]    0 1.658312
4.3.2 Step 2: Compute Standard SVD
Next, we compute the standard SVD of the matrix \(\tilde{A}\):
\[
\tilde{A} = \tilde{U} \Sigma \tilde{V}^T
\]
where:
\(\tilde{U}\) and \(\tilde{V}\) are orthogonal matrices,
\(\Sigma\) is the diagonal matrix of singular values.
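In NumPy this step is a single call. A minimal sketch, using the compact form that keeps only the \(\min(n, p)\) singular triplets (the matrix A_tilde here is a small stand-in for \(D^{1/2} A M^{1/2}\)):

import numpy as np

# Stand-in for the weighted matrix D^{1/2} A M^{1/2}
A_tilde = np.array([[1.0, 2.0],
                    [3.0, 4.0],
                    [5.0, 6.0]])

U_t, s, Vt = np.linalg.svd(A_tilde, full_matrices=False)
V_t = Vt.T  # np.linalg.svd returns V^T, not V

# The compact factorization reproduces A_tilde
print(np.allclose(U_t @ np.diag(s) @ V_t.T, A_tilde))  # True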
4.3.3 Step 3: Recover Generalized SVD
From the standard SVD of \(\tilde{A}\), we can recover the generalized singular vectors and singular values as follows:
The generalized left singular vectors are given by:
\[
U = D^{-1/2} \tilde{U}
\]
The generalized right singular vectors are given by:
\[
V = M^{-1/2} \tilde{V}
\]
The matrix \(\Sigma\) remains the same.
Thus, the generalized SVD of \(A\) is:
\[
A = U \Sigma V^T
\]
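The three steps can be collected into a single routine. The following is a minimal NumPy sketch, not a library API: the names gsvd and spd_sqrt are ours, the symmetric square root is taken via the eigendecomposition, and the compact SVD is used throughout. A full R version appears at the end of the chapter.

import numpy as np

def spd_sqrt(W):
    """Symmetric square root of an SPD matrix via its eigendecomposition."""
    vals, vecs = np.linalg.eigh(W)
    return vecs @ np.diag(np.sqrt(vals)) @ vecs.T

def gsvd(A, D, M):
    """GSVD of A with row weights D and column metric M (compact form)."""
    D_half, M_half = spd_sqrt(D), spd_sqrt(M)
    A_tilde = D_half @ A @ M_half                             # Step 1
    U_t, s, Vt = np.linalg.svd(A_tilde, full_matrices=False)  # Step 2
    U = np.linalg.solve(D_half, U_t)      # Step 3: U = D^{-1/2} U~
    V = np.linalg.solve(M_half, Vt.T)     #         V = M^{-1/2} V~
    return U, s, V

# Same data as the R example at the end of the chapter
A = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
D = np.diag([2.0, 3.0, 1.0])
M = np.diag([4.0, 1.0])

U, s, V = gsvd(A, D, M)
print(np.allclose(U @ np.diag(s) @ V.T, A))  # True: A = U Sigma V^T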
4.4 Properties of the Generalized SVD
4.4.1 Generalized Singular Values
The entries of \(\Sigma\) are the generalized singular values of \(A\). They play the same role as ordinary singular values, but measure the scaling action of \(A\) along the corresponding pairs of generalized singular vectors, with lengths taken in the geometries defined by \(D\) and \(M\).
4.4.2 Orthogonality
The matrices \(U\) and \(V\) satisfy the following orthogonality conditions with respect to the weighted and metric space:
For the row space:
\[
U^T D U = I_n
\]
For the column space:
\[
V^T M V = I_p
\]
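These conditions are easy to verify numerically; a short check continuing with U, s, V, D, and M from the gsvd sketch in Section 4.3.3 (in the compact form, the identities have size \(\min(n, p)\) rather than \(n\) and \(p\)):

# Continuing from the gsvd sketch: U, s, V, D, M as computed there
k = len(s)
print(np.allclose(U.T @ D @ U, np.eye(k)))  # True: columns of U are D-orthonormal
print(np.allclose(V.T @ M @ V, np.eye(k)))  # True: columns of V are M-orthonormal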
4.4.3 Best Rank-\(k\) Approximation in Generalized Space
Similar to the standard SVD, the GSVD provides the best rank-\(k\) approximation of a matrix in the generalized space. The best rank-\(k\) approximation of \(A\) in the generalized sense is given by:
\[
A_k = U_k \Sigma_k V_k^T
\]
where \(U_k\) and \(V_k\) are the first \(k\) columns of \(U\) and \(V\), respectively, and \(\Sigma_k\) is the diagonal matrix containing the top \(k\) generalized singular values.
This approximation minimizes the generalized Frobenius norm:
\[
\| A - A_k \|_{D, M}
\]
which is defined as:
\[
\| A - A_k \|_{D, M}^2 = \text{trace}\big( (A - A_k)^T D (A - A_k) M \big)
\]
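Both the truncation and the norm reduce to a few lines of code. A minimal sketch, again continuing with U, s, V, D, and M from the gsvd example in Section 4.3.3 (the helper name dm_norm_sq is ours); it also makes visible a fact inherited from the standard SVD: the squared residual equals the sum of the squared discarded singular values.

def dm_norm_sq(X, D, M):
    """Squared (D, M)-norm: trace(X^T D X M) = ||D^{1/2} X M^{1/2}||_F^2."""
    return np.trace(X.T @ D @ X @ M)

k = 1
A_k = U[:, :k] @ np.diag(s[:k]) @ V[:, :k].T  # best rank-k approximation
print(np.isclose(dm_norm_sq(A - A_k, D, M), np.sum(s[k:] ** 2)))  # True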
4.5 Applications of Generalized SVD
4.5.1 Generalized Principal Component Analysis (GPCA)
In Generalized PCA, the data matrix \(A\) is projected onto new axes that maximize variance while taking into account the metric and weights provided by \(M\) and \(D\). This is useful when different features in the data have different scales or when specific constraints are imposed on the data space.
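As a sketch of how the pieces fit together (the weighted centering convention below is an assumption made here, not something fixed by this chapter): center the columns of \(A\) using the row weights, take the GSVD, and read the row scores off the decomposition, since \(A M V = U \Sigma V^T M V = U \Sigma\).

# Hedged GPCA sketch, reusing gsvd() and the matrices A, D, M from above.
# Assumption: rows are weighted by the (normalized) diagonal of D.
w = np.diag(D) / np.diag(D).sum()   # normalized row weights
A_c = A - w @ A                     # column means under w, subtracted row-wise

Uc, sc, Vc = gsvd(A_c, D, M)
scores = A_c @ M @ Vc               # principal row scores
print(np.allclose(scores, Uc @ np.diag(sc)))  # True: A_c M V = U Sigma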
4.5.2 Canonical Correlation Analysis (CCA)
Canonical Correlation Analysis finds linear relationships between two sets of variables. By using GSVD, we can decompose two datasets with different metrics or weightings, enabling us to understand their correlation structure in a generalized framework.
4.5.3 Weighted Data Clustering
In clustering algorithms, GSVD can be used to decompose data where different features or variables have different importance (weights). This allows clustering algorithms to operate in a space where some variables are more heavily weighted than others.
The complete computation described in Section 4.3 can be carried out in R as follows:

# Define the matrix A (n x p)
A <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 3, byrow = TRUE)

# Define the symmetric positive definite matrices D (n x n) and M (p x p)
D <- matrix(c(2, 0, 0, 0, 3, 0, 0, 0, 1), nrow = 3, byrow = TRUE)
M <- matrix(c(4, 0, 0, 1), nrow = 2, byrow = TRUE)

# Step 1: Compute the square roots of D and M via the Cholesky decomposition
# (chol() returns the upper triangular factor)
D_sqrt <- chol(D)
M_sqrt <- chol(M)

# Step 2: Form the weighted matrix A_tilde = D^{1/2} A M^{1/2}
A_tilde <- D_sqrt %*% A %*% t(M_sqrt)

# Step 3: Compute the SVD of A_tilde
svd_res <- svd(A_tilde)
U_tilde <- svd_res$u
Sigma <- svd_res$d
V_tilde <- svd_res$v

# Step 4: Recover the generalized singular vectors
U <- solve(D_sqrt) %*% U_tilde
V <- solve(M_sqrt) %*% V_tilde

# Display the results
cat("Generalized Left Singular Vectors (U):\n")
print(U)
cat("Generalized Singular Values:\n")
print(Sigma)
cat("Generalized Right Singular Vectors (V):\n")
print(V)
The Generalized Singular Value Decomposition (GSVD) extends the classical SVD by incorporating row weights and column metrics defined by symmetric positive definite matrices. This decomposition is essential in scenarios where the data is subject to weighting or metric constraints, and it provides powerful tools for approximation, compression, and analysis in generalized spaces.