R/calculateUnifrac.R
calculateUnifrac.Rd
This function calculates the Unifrac distance for all sample-pairs
in a TreeSummarizedExperiment
object. The function utilizes rbiom:unifrac()
.
calculateUnifrac(x, tree, ...)
# S4 method for class 'ANY,phylo'
calculateUnifrac(x, tree, weighted = FALSE, ...)
# S4 method for class 'TreeSummarizedExperiment,missing'
calculateUnifrac(
x,
assay.type = assay_name,
assay_name = exprs_values,
exprs_values = "counts",
tree.name = tree_name,
tree_name = "phylo",
transposed = FALSE,
...
)
runUnifrac(
x,
tree,
weighted = FALSE,
node.label = nodeLab,
nodeLab = NULL,
...
)
a numeric matrix with samples as rows or a
SummarizedExperiment
object.
if x
is a matrix, a
phylo
object matching the
matrix. This means that the phylo object and the columns should relate
to the same type of features (aka. microorganisms).
optional arguments not used.
Logical scalar
. Should use weighted-Unifrac
calculation? Weighted-Unifrac takes into account the relative abundance of
species/taxa shared between samples, whereas unweighted-Unifrac only
considers presence/absence. Default is FALSE
, meaning the
unweighted-Unifrac distance is calculated for all pairs of samples.
(Default: FALSE
)
Character scalar
. Specifies the name of the
assay used in calculation. (Default: "counts"
)
Deprecated. Use assay.type
instead.
Deprecated. Use assay.type
instead.
Character scalar
. Specifies the name of the
tree used in calculation. (Default: "phylo"
)
Deprecated. Use tree.name
instead.
Logical scalar
. Is x transposed with samples in rows?
(Default: FALSE
)
if x
is a matrix,
a character
vector specifying links between rows/columns and tips of tree
.
The length must equal the number of rows/columns of x
. Furthermore, all the
node labs must be present in tree
.
Deprecated. Use node.label
instead.
a sample-by-sample distance matrix, suitable for NMDS, etc.
Please note that if calculateUnifrac
is used as a FUN
for
runMDS
, the argument ntop
has to be set to nrow(x)
.
Please note that runUnifrac
expects a matrix with samples per row
and not per column. This is implemented to be compatible with other
distance calculations such as dist
as much as
possible.
http://bmf.colorado.edu/unifrac/
See also additional descriptions of Unifrac in the following articles:
Lozupone, Hamady and Knight, “Unifrac - An Online Tool for Comparing Microbial Community Diversity in a Phylogenetic Context.”, BMC Bioinformatics 2006, 7:371
Lozupone, Hamady, Kelley and Knight, “Quantitative and qualitative (beta) diversity measures lead to different insights into factors that structure microbial communities.” Appl Environ Microbiol. 2007
Lozupone C, Knight R. “Unifrac: a new phylogenetic method for comparing microbial communities.” Appl Environ Microbiol. 2005 71 (12):8228-35.
data(esophagus)
library(scater)
calculateUnifrac(esophagus, weighted = FALSE)
#> B C
#> C 0.5175550
#> D 0.5182284 0.5422394
calculateUnifrac(esophagus, weighted = TRUE)
#> B C
#> C 0.1050480
#> D 0.1401124 0.1422409
# for using calculateUnifrac in conjunction with runMDS the tree argument
# has to be given separately. In addition, subsetting using ntop must
# be disabled
esophagus <- runMDS(esophagus, FUN = calculateUnifrac, name = "Unifrac",
tree = rowTree(esophagus),
assay.type = "counts",
ntop = nrow(esophagus))
reducedDim(esophagus)
#> [,1] [,2]
#> B -0.003662032 0.2941472
#> C -0.269272297 -0.1500536
#> D 0.272934329 -0.1440936
#> attr(,"eig")
#> [1] 0.1470141 0.1298017 0.0000000
#> attr(,"GOF")
#> [1] 1 1