These functions implement robust principal component analysis (RPCA) for single tables and joint RPCA for multiple tables. *RPCA functions run RPCA for single table. *JointRPCA runs the analysis jointly for multiple tables. get* functions return the results of the analysis while add* adds them to the input object.

getRPCA(x, ...)

addRPCA(x, ...)

getJointRPCA(x, ...)

addJointRPCA(x, ...)

# S4 method for class 'SingleCellExperiment'
getRPCA(x, ...)

# S4 method for class 'SummarizedExperiment'
getRPCA(x, assay.type = "counts", ...)

# S4 method for class 'SummarizedExperiment'
addRPCA(x, name = "RPCA", ...)

# S4 method for class 'MultiAssayExperiment'
getJointRPCA(x, experiments, assay.types, ...)

# S4 method for class 'SingleCellExperiment'
getJointRPCA(x, experiments, assay.types, ...)

# S4 method for class 'SingleCellExperiment'
addJointRPCA(x, name = "JointRPCA", ...)

# S4 method for class 'MultiAssayExperiment'
addJointRPCA(x, name = "JointRPCA", ...)

Arguments

x

a SummarizedExperiment object.

...

additional arguments:

  • ncomponents:Integer scalar. The number of components to estimate. (Default: 3)

  • max.iterations:Integer scalar. The number of iterations run in OptSpace algorithm. (Default: 3)

  • tolerance:Numeric scalar. Accepted error between lower rank representation obtained by OptSpace algorithm and original table. (Default: 3)

assay.type

Character scalar. Specifies the name of assay used in calculation. (Default: "counts")

name

Character scalar. Name of results stored in x. (Default: "RPCA" or "JointRPCA" depending on the method)

experiments

Character vector or integer scalar. Names or indices of experiments selected from x.

assay.types

Character vector. Names assays selected from experiments.

Value

matrix, TreeSummarizedExperiment or MultiAssayExperiment object.

Details

These functions perform robust principal component analysis (RPCA) using a low-rank matrix approximation followed by principal component analysis. Missing values are handled using matrix completion based on the OptSpace algorithm.

Single-table RPCA

For a single assay, the workflow is:

  1. Extract the selected assay matrix

  2. Estimate a low-rank approximation of the matrix using the OptSpace algorithm, which reconstructs missing values and reduces noise.

  3. Apply double centering (row and column centering).

  4. Apply PCA (via singular value decomposition) to the centered matrix.

  5. Return the sample coordinates in PCA space together with additional information such as loadings, explained variance, and pairwise distances.

Joint RPCA

When multiple assays are provided, joint RPCA estimates a shared low-rank representation across all tables. Each table may contain different features, but they must share the same samples.

The workflow is:

  1. Extract the selected experiments and assays.

  2. Estimate a joint low-rank representation using a joint OptSpace algorithm that learns shared sample factors while allowing table-specific feature loadings.

  3. Apply double centering (row and column centering).

  4. Perform PCA on the centered matrix. to obtain the final sample coordinates.

  5. Return the sample coordinates in PCA space together with additional information such as loadings, explained variance, and pairwise distances.

To assess generalization, the joint RPCA procedure optionally splits the samples into training and test sets. The low-rank model is learned using the training data and the test samples are projected into the resulting PCA space. Reconstruction error for the test set is returned as a measure of model fit.

The returned object contains the PCA sample scores as the main result, with additional information (e.g., rotation matrix, explained variance, reconstructed matrix, and reconstruction error) stored in attributes.

References

Martino, C. and Shenhav, L. et al. (2020) Context-aware dimensionality reduction deconvolutes gut microbial community dynamics. Nat. Biotechnol. doi:10.1038/s41587-020-0660-7

See also

Examples

data("ibdmdb")
mae <- ibdmdb

# Apply data transformations
mae[[1]] <- transformAssay(mae[[1]], assay.type = "mgx", method = "rclr")
mae[[2]] <- transformAssay(mae[[2]], assay.type = "mtx", method = "rclr")

# Run joint-RPCA
res <- getJointRPCA(
    mae,
    experiments = c(1, 2),
    assay.types = c("rclr", "rclr")
)

# Run RPCA for single experiment
res <- getRPCA(mae[[1]], assay.type = "rclr")