Weights \(D\in\mathbb{R}^{n\times n}\) (diagonal matrix with weights \(d_i\))
The key transformations are:
\[
\begin{aligned}
X_c &= X - 1_n\bar{x}^T\\
Y &= D^{1/2} X_c M^{1/2}
\end{aligned}
\]
The SVD of \(Y\) then yields all the PCA elements:
\[Y = U\Sigma V^T\]
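To see why the SVD of \(Y\) captures the metric PCA problem, write \(S = X_c^T D X_c\) for the weighted covariance matrix (a notation introduced here just for this derivation):
\[
Y^T Y = M^{1/2} X_c^T D X_c M^{1/2} = M^{1/2} S M^{1/2} = M^{1/2}\,(S M)\,M^{-1/2},
\]
so \(Y^T Y\) is similar to \(SM\), the matrix whose eigendecomposition defines PCA with metric \(M\) and weights \(D\), and the squared singular values of \(Y\) are exactly its eigenvalues.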
Let’s implement this transformation:
Code
import numpy as np
from scipy import linalg

# Example with standardized data
def standardize(X):
    """Standardize data matrix"""
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    return (X - mean) / std

def centering(X):
    """Center data matrix"""
    mean = X.mean(axis=0)
    return X - mean

def transform_for_svd(X, M, D):
    """
    Transform data matrix for SVD computation

    Parameters
    ----------
    X : array-like, shape (n_samples, n_features)
        Data matrix
    M : array-like, shape (n_features, n_features)
        Metric matrix
    D : array-like, shape (n_samples,)
        Observation weights (diagonal of the weight matrix)

    Returns
    -------
    Y : array-like
        Transformed matrix for SVD
    """
    # Compute matrix square roots
    M_sqrt = linalg.sqrtm(M)
    D_sqrt = np.sqrt(D)
    # Center the data
    Xc = centering(X)
    # Transform: Y = D^{1/2} Xc M^{1/2}
    Y = np.diag(D_sqrt) @ Xc @ M_sqrt
    return Y

# Example usage
n, p = 100, 5
X = np.random.randn(n, p)
M = np.eye(p)        # Identity metric
D = np.full(n, 1/n)  # Uniform weights
Y = transform_for_svd(X, M, D)
3.2 Recovering PCA Elements
From the SVD of \(Y\), we can recover all PCA elements:
Principal axes: \(A = M^{-1/2}V\)
Eigenvalues: \(\lambda_i = \sigma_i^2\)
Principal components: \(C = D^{-1/2}U\Sigma\)
Principal factors: \(F = D^{-1/2}U\)
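These identities follow by substituting the SVD back into the transformation. For the principal components, for instance, using \(YV = U\Sigma\):
\[
C = D^{-1/2} U \Sigma = D^{-1/2} Y V = X_c M^{1/2} V = X_c M A,
\]
which is the centered data projected onto the principal axes with respect to the metric \(M\).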
Code
def compute_pca_from_svd(X, M, D):
    """Compute PCA elements using SVD"""
    # Transform data
    Y = transform_for_svd(X, M, D)
    # Compute SVD
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    # Recover PCA elements
    M_sqrt = linalg.sqrtm(M)
    M_inv_sqrt = linalg.inv(M_sqrt)
    D_sqrt = np.sqrt(D)
    # Principal components
    C = np.diag(1/D_sqrt) @ U @ np.diag(s)
    # Principal factors
    F = np.diag(1/D_sqrt) @ U
    # Principal axes
    A = M_inv_sqrt @ Vt.T
    # Eigenvalues
    eigenvals = s**2
    return C, F, A, eigenvals

X_std = standardize(X)
C, F, A, eigenvals = compute_pca_from_svd(X_std, M, D)
3.3 Properties and Verification
Let’s verify the key properties:
M-orthonormality of axes
D-orthogonality of components
Variance decomposition
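All three properties drop out of the orthonormality of \(U\) and \(V\):
\[
A^T M A = V^T M^{-1/2} M M^{-1/2} V = V^T V = I, \qquad
C^T D C = \Sigma U^T D^{-1/2} D D^{-1/2} U \Sigma = \Sigma^2,
\]
so the axes are \(M\)-orthonormal, the components are \(D\)-orthogonal with \(D\)-norms \(\lambda_i = \sigma_i^2\), and summing the \(\lambda_i\) recovers the total variance.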
Code
def verify_pca_properties(C, A, M, D):
    """Verify PCA properties"""
    # M-orthonormality of axes: A^T M A should be the identity
    axes_orthonormality = A.T @ M @ A
    # D-orthogonality of components: C^T D C should be diagonal
    components_orthogonality = C.T @ np.diag(D) @ C
    return axes_orthonormality, components_orthogonality

# Verify properties
axes_ortho, comp_ortho = verify_pca_properties(C, A, M, D)

import matplotlib.pyplot as plt
import seaborn as sns

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 5))
sns.heatmap(axes_ortho, annot=True, ax=ax1)
ax1.set_title('M-orthonormality of axes')
sns.heatmap(comp_ortho, annot=True, ax=ax2)
ax2.set_title('D-orthogonality of components')
plt.tight_layout()
plt.show()
3.4 Special Cases
3.4.1 Standard PCA
When \(M = I_p\) and \(D = \dfrac{1}{n}I_n\):
Code
def standard_pca(X):
    """Standard PCA with identity metric and uniform weights"""
    n = X.shape[0]
    M = np.eye(X.shape[1])
    D = np.full(n, 1/n)
    return compute_pca_from_svd(X, M, D)

# Compare with the reference implementation from scikit-learn
from sklearn.decomposition import PCA

pca_sklearn = PCA()
pca_sklearn.fit(X_std)
print("Eigenvalues from our implementation:", eigenvals)
print("Eigenvalues from sklearn:", pca_sklearn.explained_variance_)
Eigenvalues from our implementation: [1.37874308 1.15645883 0.89165175 0.83267751 0.74046884]
Eigenvalues from sklearn: [1.39266978 1.16814023 0.90065833 0.84108839 0.74794832]
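The small gap between the two outputs is a normalization effect rather than a bug: with uniform weights \(d_i = 1/n\), our eigenvalues follow the biased \(1/n\) variance convention, while sklearn's explained_variance_ uses the unbiased \(1/(n-1)\). A minimal check, reusing the arrays from the run above:
Code
# Rescaling by n/(n-1) converts our 1/n-convention eigenvalues
# to sklearn's 1/(n-1) convention
rescaled = eigenvals * n / (n - 1)
print("Rescaled eigenvalues:", rescaled)
print("Match sklearn:", np.allclose(rescaled, pca_sklearn.explained_variance_))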
3.4.2 Normalized PCA
When \(M = \text{diag}(1/s_j^2)\) where \(s_j\) are variable standard deviations:
Code
def normalized_pca(X):
    """PCA with variables normalized to unit variance"""
    std = X.std(axis=0)
    M = np.diag(1/std**2)
    n = X.shape[0]
    D = np.full(n, 1/n)
    return compute_pca_from_svd(X, M, D)
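With this metric, PCA on the raw data coincides with standard PCA on the standardized data: the weights \(1/s_j^2\) absorb the scaling, since \(X_c\,\text{diag}(1/s_j)\) is exactly the standardized matrix. A minimal sketch of that check, reusing the functions defined above:
Code
# Normalized PCA on the raw data ...
_, _, _, eig_norm = normalized_pca(X)
# ... should match standard PCA on the standardized data
_, _, _, eig_std = standard_pca(standardize(X))
print("Equal eigenvalues:", np.allclose(eig_norm, eig_std))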