Non-negative Matrix Factorization

These functions perform Non-negative Matrix Factorization on data stored in a TreeSummarizedExperiment object.

getNMF(x, ...)

addNMF(x, ...)

# S4 method for class 'SummarizedExperiment'
getNMF(x, k = 2, assay.type = "counts", eval.metric = "evar", ...)

# S4 method for class 'SummarizedExperiment'
addNMF(
  x,
  k = 2,
  assay.type = "counts",
  eval.metric = "evar",
  name = "NMF",
  ...
)

Arguments

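x: A TreeSummarizedExperiment object.
...: Optional arguments passed to NMF::nmf.
k: Numeric vector. The number of latent vectors/topics. If several values are given, a model is fitted for each value and the best one is selected according to eval.metric. (Default: 2)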
assay.type: Character scalar. Specifies which assay to use for NMF ordination. (Default: "counts")
eval.metric: Character scalar. Specifies the evaluation metric that will be used to select the model with the best fit. Must be one of the following options. (Default: "evar")
  "evar": explained variance; maximized
  "sparseness.basis": degree of sparsity in the basis matrix; maximized
  "sparseness.coef": degree of sparsity in the coefficient matrix; maximized
  "rss": residual sum of squares; minimized
  "silhouette.coef": quality of clustering based on the coefficient matrix; maximized
  "silhouette.basis": quality of clustering based on the basis matrix; maximized
  "cophenetic": correlation between cophenetic distances and original distances; maximized
  "dispersion": spread of data points within clusters; minimized
name: Character scalar. The name to be used to store the result in the reducedDims of the output. (Default: "NMF")

Value

For getNMF, the ordination matrix is returned, with the feature loadings matrix stored as attribute "loadings".
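
Because the loadings travel as an attribute of the returned matrix, they can be read with base R's attr(). The sketch below uses a small hand-made matrix as a stand-in for the result of getNMF (the values, sample names and taxon names are invented for illustration), and also shows that subsetting with `[` drops custom attributes, so the loadings are best extracted before any subsetting.

```r
# Mock ordination matrix (3 samples x 2 components); values are invented
ord <- matrix(c(0.2, 0.8,
                0.5, 0.5,
                0.9, 0.1),
              nrow = 3, byrow = TRUE,
              dimnames = list(paste0("sample", 1:3), NULL))

# Attach a mock loadings matrix (2 features x 2 components) as an attribute
attr(ord, "loadings") <- matrix(c(0.7, 0.3,
                                  0.1, 0.9),
                                nrow = 2, byrow = TRUE,
                                dimnames = list(c("taxonA", "taxonB"), NULL))

# Read the attribute back
loadings <- attr(ord, "loadings")
dim(loadings)

# Subsetting with `[` keeps only dim and dimnames, so "loadings" is lost
ord_sub <- ord[1:2, , drop = FALSE]
is.null(attr(ord_sub, "loadings"))
```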

For addNMF, a TreeSummarizedExperiment object is returned containing the ordination matrix in reducedDims(x, name) with the following attributes:

"loadings" which is a matrix containing the feature loadings
"NMF_output" which is the output of function NMF::nmf
"best_fit" which is the result of the best fit if k is a vector of integers

Details

The functions getNMF and addNMF internally use NMF::nmf to compute the ordination matrix and feature loadings.
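
To give a feel for what such a factorization produces, here is a minimal base R sketch of NMF using Lee-Seung multiplicative updates on a toy count matrix. It is an illustration only: the helper nmf_sketch, the toy data, and the mapping of the two factors to "ordination" and "loadings" are assumptions made for this example, and the algorithm and defaults used by the NMF package may differ.

```r
# Minimal NMF with multiplicative updates (Frobenius loss), base R only.
# Hypothetical helper for illustration; not the NMF package's implementation.
nmf_sketch <- function(X, k, n_iter = 200, seed = 1) {
  stopifnot(is.matrix(X), all(X >= 0), k >= 1)
  set.seed(seed)
  eps <- .Machine$double.eps                    # guards against division by zero
  W <- matrix(runif(nrow(X) * k), nrow(X), k)   # features x k
  H <- matrix(runif(k * ncol(X)), k, ncol(X))   # k x samples
  for (i in seq_len(n_iter)) {
    H <- H * (t(W) %*% X) / (t(W) %*% W %*% H + eps)
    W <- W * (X %*% t(H)) / (W %*% H %*% t(H) + eps)
  }
  list(W = W, H = H)
}

# Toy non-negative "counts": 6 features (rows) x 5 samples (columns)
set.seed(42)
X <- matrix(rpois(30, lambda = 20), nrow = 6,
            dimnames = list(paste0("feature", 1:6), paste0("sample", 1:5)))

fit <- nmf_sketch(X, k = 2)

scores <- t(fit$H)     # samples x k, analogous to an ordination matrix
loadings <- fit$W      # features x k, analogous to feature loadings

# Two fit measures: residual sum of squares, and explained variance
# computed here as 1 - RSS / sum(X^2) (one common definition)
rss <- sum((X - fit$W %*% fit$H)^2)
evar <- 1 - rss / sum(X^2)
```

Both factors stay non-negative because every update multiplies the current values by a ratio of non-negative terms.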

If k is a vector of integers, an NMF model is fitted for each rank value contained in k, and the best fit is selected according to the value of eval.metric.
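
The selection step can be pictured as follows. This is a sketch under stated assumptions, not the package's internal code: select_best_rank is a hypothetical helper, the metric values are made up, and the only information taken from this page is whether each metric is maximized or minimized.

```r
# Direction of each metric, as listed under the eval.metric argument
metric_direction <- c(
  evar = "max", sparseness.basis = "max", sparseness.coef = "max",
  rss = "min", silhouette.coef = "max", silhouette.basis = "max",
  cophenetic = "max", dispersion = "min"
)

# Hypothetical helper: return the rank whose metric value is best
select_best_rank <- function(k, values, eval.metric = "evar") {
  stopifnot(length(k) == length(values),
            eval.metric %in% names(metric_direction))
  idx <- if (metric_direction[[eval.metric]] == "max") {
    which.max(values)
  } else {
    which.min(values)
  }
  k[idx]
}

# Made-up metric values for models fitted with k = 2, 3, 4
k <- c(2, 3, 4)
rss_values <- c(120, 75, 40)          # minimized
sil_values <- c(0.62, 0.71, 0.55)     # maximized

best_by_rss <- select_best_rank(k, rss_values, "rss")
best_by_sil <- select_best_rank(k, sil_values, "silhouette.coef")
best_by_rss
#> [1] 4
best_by_sil
#> [1] 3
```

As the made-up numbers show, different metrics can favor different ranks, so the choice of eval.metric affects which model ends up as the best fit.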

Examples

data(GlobalPatterns)
tse <- GlobalPatterns

# Reduce the number of features
tse <- agglomerateByPrevalence(tse, rank = "Phylum")

# Run NMF and add the result to reducedDim(tse, "NMF").
tse <- addNMF(tse, k = 2, name = "NMF")
#> Loading required package: registry
#> Loading required package: rngtools
#> Loading required package: cluster
#> NMF - BioConductor layer [OK] | Shared memory capabilities [NO: bigmemory] | Cores 2/2
#>   To enable shared memory capabilities, try: install.extras('
#> NMF
#> ')
#> 
#> Attaching package: ‘NMF’
#> The following object is masked from ‘package:S4Vectors’:
#> 
#>     nrun
#> The following object is masked from ‘package:generics’:
#> 
#>     fit

# Extract feature loadings
loadings_NMF <- getReducedDimAttribute(tse, "NMF", "loadings")
head(loadings_NMF)
#>                         [,1]         [,2]
#> AD3             1.797648e-07 7.635452e-04
#> Acidobacteria   1.751856e-05 2.833056e-02
#> Actinobacteria  3.296424e-02 8.478666e-02
#> Armatimonadetes 2.220446e-16 4.113226e-04
#> BRC1            9.691581e-08 5.520033e-05
#> Bacteroidetes   2.487102e-01 7.497906e-02

# Estimate models with number of topics from 2 to 4. Perform 2 runs.
tse <- addNMF(tse, k = c(2, 3, 4), name = "NMF_4", nrun = 2)

# Extract feature loadings
loadings_NMF_4 <- getReducedDimAttribute(tse, "NMF_4", "loadings")
head(loadings_NMF_4)
#>                         [,1]         [,2]         [,3]         [,4]
#> AD3             2.220446e-16 1.445435e-03 8.620202e-08 2.220446e-16
#> Acidobacteria   2.029072e-10 5.352125e-02 2.337403e-05 4.185550e-05
#> Actinobacteria  7.091942e-02 1.580967e-02 1.217112e-02 5.252424e-03
#> Armatimonadetes 5.026734e-05 4.160350e-04 2.220446e-16 1.123469e-04
#> BRC1            2.220446e-16 9.936617e-05 7.607841e-08 2.315111e-06
#> Bacteroidetes   7.744470e-02 1.831469e-02 1.654822e-01 1.460432e-02