This notebook walks you through a basic alpha diversity analysis: you estimate alpha diversity with a few indices, plot the indices for the different study groups, and compare the results across indices.
The following packages are needed to successfully run the examples in this notebook:
First of all, we import Tengeler2020 from the mia package and store it in a variable.
We calculate alpha diversity in terms of the coverage, Shannon, inverse Simpson and Faith indices based on the counts assay. The first three indices differ from one another in how much weight they give to rare taxa: coverage considers all taxa equally important, whereas Shannon and, even more so, inverse Simpson give more importance to abundant taxa. Unlike the others, the Faith index measures phylogenetic diversity and thus requires a phylogenetic tree (stored as rowTree in the TreeSE).
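To make the weighting of abundant taxa concrete, here is a minimal base R sketch of the Shannon and inverse Simpson formulas. The count vectors `even` and `skewed` and the helper functions `shannon()` and `inverse_simpson()` are hypothetical and are not part of mia or of the Tengeler2020 data. The natural logarithm is assumed for Shannon.

```r
# Toy illustration with made-up counts (not taken from Tengeler2020)

# Shannon index: -sum(p * log(p)) over the taxa that are present
shannon <- function(counts) {
  p <- counts[counts > 0] / sum(counts)  # relative abundances
  -sum(p * log(p))
}

# Inverse Simpson index: 1 / sum(p^2)
inverse_simpson <- function(counts) {
  p <- counts / sum(counts)
  1 / sum(p^2)
}

even   <- c(25, 25, 25, 25)  # four equally abundant taxa
skewed <- c(97, 1, 1, 1)     # one dominant taxon and three rare ones

communities <- list(even = even, skewed = skewed)
sapply(communities, shannon)
sapply(communities, inverse_simpson)
```

For the even community, Shannon equals log(4) and inverse Simpson equals the number of taxa, 4. For the skewed community both values drop, and inverse Simpson falls close to 1 because the dominant taxon contributes almost all of the sum of squared proportions.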
Next, we plot the four indices, with patient status on the x axis and alpha diversity on the y axis. We can also colour by cohort to check for batch effects.
The four metrics for alpha diversity follow different scales, but they seem to agree when comparing the distributions of the two patient groups.
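library(mia)       # estimateDiversity() and the Tengeler2020 dataset
library(scater)    # plotColData()
library(patchwork) # wrap_plots(), plot_layout(), plot_annotation()
# Load the example dataset and store it in a variable
data("Tengeler2020", package = "mia")
tse <- Tengeler2020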
# Calculate diversity metrics
tse <- estimateDiversity(tse, assay.type = "counts",
                         index = c("coverage", "shannon", "inverse_simpson", "faith"))
# Generate a plot for each metric
plots <- lapply(c("coverage", "shannon", "inverse_simpson", "faith"),
                plotColData, object = tse, x = "patient_status",
                colour_by = "cohort", show_median = TRUE)
# Combine plots
wrap_plots(plots) +
  plot_layout(guides = "collect") +
  plot_annotation(tag_levels = "A")
Extra:
These two exercises could be explained in a second example.