Principal Coordinate Analysis (PCoA) is an ordination method that places samples in a few dimensions so that the distances between them reproduce a chosen pairwise dissimilarity as closely as possible; the first axes capture most of the variation. The dissimilarity between samples (beta diversity) can be expressed with several ecological indices, such as Bray-Curtis and Aitchison dissimilarities. If Euclidean distance is used, PCoA is equivalent to Principal Component Analysis (PCA). You can learn more about PCoA in OMA chapter 7.
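The equivalence between PCoA and PCA under Euclidean distance can be checked on toy data. The sketch below uses only base R (`cmdscale()` for classical MDS and `prcomp()` for PCA); the random matrix `x` is a made-up stand-in for real abundance data.

```r
# Toy data: 10 samples (rows) described by 4 variables (columns)
set.seed(1)
x <- matrix(rnorm(40), nrow = 10, ncol = 4)

# PCoA (classical MDS) on Euclidean distances between samples
pcoa <- cmdscale(dist(x, method = "euclidean"), k = 2)

# PCA scores on the first two principal components
pca <- prcomp(x, center = TRUE, scale. = FALSE)$x[, 1:2]

# Axes are only defined up to sign, so compare absolute values
same <- isTRUE(all.equal(abs(unname(pcoa)), abs(unname(pca))))
same
```

With any other dissimilarity, such as Bray-Curtis, the two methods generally give different coordinates.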
The following packages are necessary to execute the code in this presentation:
scater: utils to visualise data stored in TreeSE objects
patchwork: framework to combine multiple ggplot objects
Example 1.1
To get started, we import Tengeler2020 from mia and store it into a variable.
# Load the packages used in the examples
library(mia)
library(scater)

# Load dataset and store it into tse
data("Tengeler2020", package = "mia")
tse <- Tengeler2020

# Get summary about the object
# What dimensions does the data have?
tse
After that, we transform the counts assay to relative abundances and store the new assay back into the TreeSE.
# Transform counts to relative abundance
tse <- transformAssay(tse, method = "relabundance")
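To make the transformation concrete, here is a minimal base R sketch of what a relative abundance transformation does, assuming it divides each count by the total count of its sample. The `counts` matrix is hypothetical and is not taken from Tengeler2020.

```r
# Hypothetical counts: 3 taxa (rows) x 2 samples (columns)
counts <- matrix(c(10, 30, 60,
                    5,  5, 10),
                 nrow = 3,
                 dimnames = list(c("taxonA", "taxonB", "taxonC"),
                                 c("sample1", "sample2")))

# Divide every count by the total count of its sample (column)
relabundance <- sweep(counts, 2, colSums(counts), "/")
relabundance

# Each sample now sums to 1
colSums(relabundance)
```

After this step the samples are comparable even though their sequencing depths differ (100 versus 20 reads in this toy case).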
Here, we run multidimensional scaling (MDS; classical MDS is another name for PCoA) on the relative abundance assay to reduce the data to fewer dimensions.
# Reduce the 151 features to a few PCoA axes (runMDS keeps 2 by default)
tse <- runMDS(tse,
              assay.type = "relabundance",
              FUN = vegan::vegdist,
              method = "bray",
              name = "Bray")

# The new dimensions are stored in the reducedDim slot
head(reducedDim(tse, "Bray"))
# The new dimensions can be used to visualise diversity among samples
p1 <- plotReducedDim(tse, "Bray", colour_by = "patient_status")
p1
Figure 1: Ordination plot based on Bray-Curtis dissimilarity. Samples are coloured by patient status.
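For intuition about the index behind this ordination, the Bray-Curtis dissimilarity between two samples can be computed by hand. This is a minimal sketch with hypothetical relative abundances; `bray_curtis()` is an illustrative helper, not a function from mia or vegan.

```r
# Bray-Curtis dissimilarity between two abundance vectors:
# sum of absolute differences divided by the total abundance
bray_curtis <- function(x, y) {
  sum(abs(x - y)) / sum(x + y)
}

# Hypothetical relative abundances: 3 samples (rows) x 3 taxa (columns)
rel <- rbind(s1 = c(0.5, 0.3, 0.2),
             s2 = c(0.2, 0.3, 0.5),
             s3 = c(0.5, 0.3, 0.2))

bc_12 <- bray_curtis(rel["s1", ], rel["s2", ])  # different composition
bc_13 <- bray_curtis(rel["s1", ], rel["s3", ])  # identical composition
c(bc_12 = bc_12, bc_13 = bc_13)
```

The value ranges from 0 (identical composition) to 1 (no shared taxa), so samples that lie close together in the ordination have similar taxonomic profiles.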
Example 2.1
By default, vegan::vegdist computes Bray-Curtis dissimilarity. However, other metrics can be specified with the method argument, which runMDS passes on to the distance function.
# Reduce number of dimensions with a different metric
tse <- runMDS(tse,
              assay.type = "relabundance",
              FUN = vegan::vegdist,
              method = "jaccard",
              name = "Jaccard")

reducedDimNames(tse)
[1] "Bray" "Jaccard"
Example 2.2
# Visualise samples with the newly reduced dimensions
p3 <- plotReducedDim(tse, "Jaccard", colour_by = "patient_status")
p3
Figure 2: Ordination plot based on Jaccard index. Samples are coloured by patient status.
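The Jaccard and Bray-Curtis ordinations are closely related. As a hedged sketch, assuming the quantitative (abundance-based) form of the Jaccard index that vegan::vegdist uses unless presence/absence data are requested, the index can be computed directly or derived from Bray-Curtis. The vectors `x` and `y` are hypothetical.

```r
# Hypothetical relative abundances of two samples
x <- c(0.5, 0.3, 0.2)
y <- c(0.2, 0.3, 0.5)

# Bray-Curtis dissimilarity
bray <- sum(abs(x - y)) / sum(x + y)

# Quantitative Jaccard: 1 - (shared abundance / total "union" abundance)
jaccard_direct <- 1 - sum(pmin(x, y)) / sum(pmax(x, y))

# The same value obtained from Bray-Curtis
jaccard_from_bray <- 2 * bray / (1 + bray)

c(bray = bray, jaccard = jaccard_direct)
```

Because one index is a monotonic function of the other, they rank sample pairs identically, although the ordination coordinates can still differ.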
Example 3.1
The FUN argument of runMDS also accepts a different distance function, which can be a custom one.
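As an illustration of a custom distance function, the sketch below defines a hypothetical `aitchison_dist()` helper (Euclidean distance between CLR-transformed samples) and runs classical MDS on it with base R. The pseudocount of 1 and the toy `counts` matrix are assumptions chosen for this example only.

```r
# Custom distance: Aitchison distance, i.e. Euclidean distance between
# CLR-transformed samples. Expects samples in rows and taxa in columns.
aitchison_dist <- function(x, pseudocount = 1) {
  log_x <- log(x + pseudocount)      # pseudocount avoids log(0)
  clr <- log_x - rowMeans(log_x)     # centre each sample on its mean log
  dist(clr, method = "euclidean")
}

# Hypothetical counts: 4 samples (rows) x 3 taxa (columns)
counts <- rbind(s1 = c(10, 30, 60),
                s2 = c(5, 5, 10),
                s3 = c(20, 60, 120),
                s4 = c(0, 50, 50))

d <- aitchison_dist(counts)

# PCoA (classical MDS) on the custom distances
coords <- cmdscale(d, k = 2)
coords

# Assumption, untested here: a function with this shape could be supplied
# to runMDS in place of vegan::vegdist, for example
# runMDS(tse, assay.type = "counts", FUN = aitchison_dist, name = "Aitchison")
```

Any function that takes a samples-by-features matrix and returns a `dist` object should fit the same pattern.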