This page provides examples for analysing alpha diversity. See a separate page for beta diversity.
Load example data:
library(microbiome)
library(knitr)
data(dietswap)
pseq <- dietswap
A comprehensive list of global indicators of the ecosystem state can be obtained as follows. This includes various measures of richness, evenness, diversity, dominance, and rarity with default parameters. See the individual functions for more options regarding parameter tuning.
tab <-microbiome::alpha(pseq, index = "all")
kable(head(tab))
observed | chao1 | diversity_inverse_simpson | diversity_gini_simpson | diversity_shannon | diversity_fisher | diversity_coverage | evenness_camargo | evenness_pielou | evenness_simpson | evenness_evar | evenness_bulla | dominance_dbp | dominance_dmn | dominance_absolute | dominance_relative | dominance_simpson | dominance_core_abundance | dominance_gini | rarity_log_modulo_skewness | rarity_low_abundance | rarity_rare_abundance | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Sample-1 | 104 | 113.2308 | 7.561722 | 0.8677550 | 2.940698 | 16.69360 | 4 | 0.2014036 | 0.6331719 | 0.0727089 | 0.1709714 | 0.3298916 | 0.3279347 | 0.4297198 | 2774 | 0.3279347 | 0.1322450 | 0.9276510 | 0.8625360 | 2.057691 | 0.0289632 | 0.0146589 |
Sample-2 | 110 | 120.0000 | 8.102943 | 0.8765881 | 2.822472 | 15.20257 | 3 | 0.2261992 | 0.6004646 | 0.0736631 | 0.1334372 | 0.2755652 | 0.2428626 | 0.4656170 | 5121 | 0.2428626 | 0.1234119 | 0.9330361 | 0.8843695 | 2.057552 | 0.0299725 | 0.0349047 |
Sample-3 | 103 | 106.6000 | 4.291085 | 0.7669587 | 2.407963 | 13.42077 | 2 | 0.2109176 | 0.5195476 | 0.0416610 | 0.1373098 | 0.2541325 | 0.4594744 | 0.5603989 | 13271 | 0.4594744 | 0.2330413 | 0.9514247 | 0.9083094 | 2.052324 | 0.0339646 | 0.0093134 |
Sample-4 | 105 | 110.6250 | 7.930799 | 0.8739093 | 2.992482 | 15.56061 | 4 | 0.3422309 | 0.6429969 | 0.0755314 | 0.1711334 | 0.3176017 | 0.3230653 | 0.3957720 | 4279 | 0.3230653 | 0.1260907 | 0.8621367 | 0.8569405 | 2.051175 | 0.0348056 | 0.0367686 |
Sample-5 | 103 | 109.5000 | 3.170738 | 0.6846160 | 2.106022 | 14.53671 | 1 | 0.1407817 | 0.4544002 | 0.0307839 | 0.1661081 | 0.2310307 | 0.5450144 | 0.6317579 | 9456 | 0.5450144 | 0.3153840 | 0.9535447 | 0.9212609 | 2.059634 | 0.0441499 | 0.0105476 |
Sample-6 | 105 | 112.5833 | 2.953753 | 0.6614476 | 2.071136 | 14.55619 | 1 | 0.1709221 | 0.4450266 | 0.0281310 | 0.1625409 | 0.2308364 | 0.5693811 | 0.6430163 | 11243 | 0.5693811 | 0.3385524 | 0.9258078 | 0.9192203 | 2.059421 | 0.0375772 | 0.0168642 |
This returns a table with selected diversity indicators. See a separate page on Beta diversity.
tab <-microbiome::alpha(pseq, index = "all")
kable(head(tab))
observed | chao1 | diversity_inverse_simpson | diversity_gini_simpson | diversity_shannon | diversity_fisher | diversity_coverage | evenness_camargo | evenness_pielou | evenness_simpson | evenness_evar | evenness_bulla | dominance_dbp | dominance_dmn | dominance_absolute | dominance_relative | dominance_simpson | dominance_core_abundance | dominance_gini | rarity_log_modulo_skewness | rarity_low_abundance | rarity_rare_abundance | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Sample-1 | 104 | 113.2308 | 7.561722 | 0.8677550 | 2.940698 | 16.69360 | 4 | 0.2014036 | 0.6331719 | 0.0727089 | 0.1709714 | 0.3298916 | 0.3279347 | 0.4297198 | 2774 | 0.3279347 | 0.1322450 | 0.9276510 | 0.8625360 | 2.057691 | 0.0289632 | 0.0146589 |
Sample-2 | 110 | 120.0000 | 8.102943 | 0.8765881 | 2.822472 | 15.20257 | 3 | 0.2261992 | 0.6004646 | 0.0736631 | 0.1334372 | 0.2755652 | 0.2428626 | 0.4656170 | 5121 | 0.2428626 | 0.1234119 | 0.9330361 | 0.8843695 | 2.057552 | 0.0299725 | 0.0349047 |
Sample-3 | 103 | 106.6000 | 4.291085 | 0.7669587 | 2.407963 | 13.42077 | 2 | 0.2109176 | 0.5195476 | 0.0416610 | 0.1373098 | 0.2541325 | 0.4594744 | 0.5603989 | 13271 | 0.4594744 | 0.2330413 | 0.9514247 | 0.9083094 | 2.052324 | 0.0339646 | 0.0093134 |
Sample-4 | 105 | 110.6250 | 7.930799 | 0.8739093 | 2.992482 | 15.56061 | 4 | 0.3422309 | 0.6429969 | 0.0755314 | 0.1711334 | 0.3176017 | 0.3230653 | 0.3957720 | 4279 | 0.3230653 | 0.1260907 | 0.8621367 | 0.8569405 | 2.051175 | 0.0348056 | 0.0367686 |
Sample-5 | 103 | 109.5000 | 3.170738 | 0.6846160 | 2.106022 | 14.53671 | 1 | 0.1407817 | 0.4544002 | 0.0307839 | 0.1661081 | 0.2310307 | 0.5450144 | 0.6317579 | 9456 | 0.5450144 | 0.3153840 | 0.9535447 | 0.9212609 | 2.059634 | 0.0441499 | 0.0105476 |
Sample-6 | 105 | 112.5833 | 2.953753 | 0.6614476 | 2.071136 | 14.55619 | 1 | 0.1709221 | 0.4450266 | 0.0281310 | 0.1625409 | 0.2308364 | 0.5693811 | 0.6430163 | 11243 | 0.5693811 | 0.3385524 | 0.9258078 | 0.9192203 | 2.059421 | 0.0375772 | 0.0168642 |
This returns observed richness with given detection threshold(s).
tab <- richness(pseq)
kable(head(tab))
observed | chao1 | |
---|---|---|
Sample-1 | 104 | 113.2308 |
Sample-2 | 110 | 120.0000 |
Sample-3 | 103 | 106.6000 |
Sample-4 | 105 | 110.6250 |
Sample-5 | 103 | 109.5000 |
Sample-6 | 105 | 112.5833 |
The dominance index refers to the abundance of the most abundant species. Various dominance indices are available (see the function help for a list of options).
# Absolute abundances for the single most abundant taxa in each sample
tab <- dominance(pseq, index = "all")
kable(head(tab))
dbp | dmn | absolute | relative | simpson | core_abundance | gini | |
---|---|---|---|---|---|---|---|
Sample-1 | 0.3279347 | 0.4297198 | 2774 | 0.3279347 | 0.1322450 | 0.9276510 | 0.8625360 |
Sample-2 | 0.2428626 | 0.4656170 | 5121 | 0.2428626 | 0.1234119 | 0.9330361 | 0.8843695 |
Sample-3 | 0.4594744 | 0.5603989 | 13271 | 0.4594744 | 0.2330413 | 0.9514247 | 0.9083094 |
Sample-4 | 0.3230653 | 0.3957720 | 4279 | 0.3230653 | 0.1260907 | 0.8621367 | 0.8569405 |
Sample-5 | 0.5450144 | 0.6317579 | 9456 | 0.5450144 | 0.3153840 | 0.9535447 | 0.9212609 |
Sample-6 | 0.5693811 | 0.6430163 | 11243 | 0.5693811 | 0.3385524 | 0.9258078 | 0.9192203 |
We also have a function to list the dominating (most abundant) taxa in each sample.
dominant(pseq)
The rarity indices quantify the concentration of rare or low abundance taxa. Various rarity indices are available (see the function help for a list of options).
tab <- rarity(pseq, index = "all")
kable(head(tab))
log_modulo_skewness | low_abundance | rare_abundance | |
---|---|---|---|
Sample-1 | 2.057691 | 0.0289632 | 0.0146589 |
Sample-2 | 2.057552 | 0.0299725 | 0.0349047 |
Sample-3 | 2.052324 | 0.0339646 | 0.0093134 |
Sample-4 | 2.051175 | 0.0348056 | 0.0367686 |
Sample-5 | 2.059634 | 0.0441499 | 0.0105476 |
Sample-6 | 2.059421 | 0.0375772 | 0.0168642 |
The coverage index gives the number of groups needed to have a given proportion of the ecosystem occupied (by default 0.5 ie 50%).
tab <- coverage(pseq, threshold = 0.5)
kable(head(tab))
The core_abundance function refers to the relative proportion of the core species. Non-core abundance provides the complement (1-x; see rare_abundance).
tab <- core_abundance(pseq, detection = .1/100, prevalence = 50/100)
Gini index is a common measure for inequality in economical income. The inverse gini index (1/x) can also be used as a community diversity measure.
tab <- inequality(pseq)
Various evenness measures are also available.
tab <- evenness(pseq, "all")
kable(head(tab))
camargo | pielou | simpson | evar | bulla | |
---|---|---|---|---|---|
Sample-1 | 0.2014036 | 0.6331719 | 0.0727089 | 0.1709714 | 0.3298916 |
Sample-2 | 0.2261992 | 0.6004646 | 0.0736631 | 0.1334372 | 0.2755652 |
Sample-3 | 0.2109176 | 0.5195476 | 0.0416610 | 0.1373098 | 0.2541325 |
Sample-4 | 0.3422309 | 0.6429969 | 0.0755314 | 0.1711334 | 0.3176017 |
Sample-5 | 0.1407817 | 0.4544002 | 0.0307839 | 0.1661081 | 0.2310307 |
Sample-6 | 0.1709221 | 0.4450266 | 0.0281310 | 0.1625409 | 0.2308364 |
To visualize diversity measures, the package provides a simple wrapper around ggplot2. Currently onnly one measure can be visualized at a time.
p.shannon <- boxplot_alpha(pseq,
index = "shannon",
x_var = "sex",
fill.colors = c(female="cyan4", male="deeppink4"))
p.shannon <- p.shannon + theme_minimal() +
labs(x="Sex", y="Shannon diversity") +
theme(axis.text = element_text(size=12),
axis.title = element_text(size=16),
legend.text = element_text(size=12),
legend.title = element_text(size=16))
p.shannon
We recommend the non-parametric Kolmogorov-Smirnov test for two-group comparisons when there are no relevant covariates.
# Construct the data
d <- meta(pseq)
d$diversity <- microbiome::diversity(pseq, "shannon")$shannon
# Split the values by group
spl <- split(d$diversity, d$sex)
# Kolmogorov-Smironv test
pv <- ks.test(spl$female, spl$male)$p.value
# Adjust the p-value
padj <- p.adjust(pv)
For tutorial on plotting the output of alpha diversity please check PlotDiversity