R/AllGenerics.R, R/getRPCA.R
getRPCA.RdThese functions implement robust principal component analysis (RPCA)
for single tables and joint RPCA for multiple tables.
*RPCA functions run RPCA for single table. *JointRPCA runs the
analysis jointly for multiple tables. get* functions return the
results of the analysis while add* adds them to the input object.
getRPCA(x, ...)
addRPCA(x, ...)
getJointRPCA(x, ...)
addJointRPCA(x, ...)
# S4 method for class 'SingleCellExperiment'
getRPCA(x, ...)
# S4 method for class 'SummarizedExperiment'
getRPCA(x, assay.type = "counts", ...)
# S4 method for class 'SummarizedExperiment'
addRPCA(x, name = "RPCA", ...)
# S4 method for class 'MultiAssayExperiment'
getJointRPCA(x, experiments, assay.types, ...)
# S4 method for class 'SingleCellExperiment'
getJointRPCA(x, experiments, assay.types, ...)
# S4 method for class 'SingleCellExperiment'
addJointRPCA(x, name = "JointRPCA", ...)
# S4 method for class 'MultiAssayExperiment'
addJointRPCA(x, name = "JointRPCA", ...)a SummarizedExperiment object.
additional arguments:
ncomponents:Integer scalar. The number of components
to estimate. (Default: 3)
max.iterations:Integer scalar. The number of iterations
run in OptSpace algorithm. (Default: 3)
tolerance:Numeric scalar. Accepted error between
lower rank representation obtained by OptSpace algorithm and original
table. (Default: 3)
Character scalar. Specifies the name of assay
used in calculation. (Default: "counts")
Character scalar. Name of results stored in x.
(Default: "RPCA" or "JointRPCA" depending on the method)
Character vector or integer scalar. Names
or indices of experiments selected from x.
Character vector. Names assays selected from
experiments.
matrix, TreeSummarizedExperiment or MultiAssayExperiment
object.
These functions perform robust principal component analysis (RPCA) using a low-rank matrix approximation followed by principal component analysis. Missing values are handled using matrix completion based on the OptSpace algorithm.
Single-table RPCA
For a single assay, the workflow is:
Extract the selected assay matrix
Estimate a low-rank approximation of the matrix using the OptSpace algorithm, which reconstructs missing values and reduces noise.
Apply double centering (row and column centering).
Apply PCA (via singular value decomposition) to the centered matrix.
Return the sample coordinates in PCA space together with additional information such as loadings, explained variance, and pairwise distances.
Joint RPCA
When multiple assays are provided, joint RPCA estimates a shared low-rank representation across all tables. Each table may contain different features, but they must share the same samples.
The workflow is:
Extract the selected experiments and assays.
Estimate a joint low-rank representation using a joint OptSpace algorithm that learns shared sample factors while allowing table-specific feature loadings.
Apply double centering (row and column centering).
Perform PCA on the centered matrix. to obtain the final sample coordinates.
Return the sample coordinates in PCA space together with additional information such as loadings, explained variance, and pairwise distances.
To assess generalization, the joint RPCA procedure optionally splits the samples into training and test sets. The low-rank model is learned using the training data and the test samples are projected into the resulting PCA space. Reconstruction error for the test set is returned as a measure of model fit.
The returned object contains the PCA sample scores as the main result, with additional information (e.g., rotation matrix, explained variance, reconstructed matrix, and reconstruction error) stored in attributes.
Martino, C. and Shenhav, L. et al. (2020) Context-aware dimensionality reduction deconvolutes gut microbial community dynamics. Nat. Biotechnol. doi:10.1038/s41587-020-0660-7
data("ibdmdb")
mae <- ibdmdb
# Apply data transformations
mae[[1]] <- transformAssay(mae[[1]], assay.type = "mgx", method = "rclr")
mae[[2]] <- transformAssay(mae[[2]], assay.type = "mtx", method = "rclr")
# Run joint-RPCA
res <- getJointRPCA(
mae,
experiments = c(1, 2),
assay.types = c("rclr", "rclr")
)
# Run RPCA for single experiment
res <- getRPCA(mae[[1]], assay.type = "rclr")