Global Ecosystem State Variables

This page provides examples for analysing alpha diversity. See a separate page for beta diversity.

Load example data:

library(microbiome)
library(knitr)
data(dietswap)
pseq <- dietswap

Global indicators

A comprehensive list of global indicators of the ecosystem state can be obtained as follows. This includes various measures of richness, evenness, diversity, dominance, and rarity with default parameters. See the individual functions for more options regarding parameter tuning.

tab <-microbiome::alpha(pseq, index = "all")
kable(head(tab))
observed chao1 diversity_inverse_simpson diversity_gini_simpson diversity_shannon diversity_fisher diversity_coverage evenness_camargo evenness_pielou evenness_simpson evenness_evar evenness_bulla dominance_dbp dominance_dmn dominance_absolute dominance_relative dominance_simpson dominance_core_abundance dominance_gini rarity_log_modulo_skewness rarity_low_abundance rarity_rare_abundance
Sample-1 104 113.2308 7.561722 0.8677550 2.940698 16.69360 4 0.2014036 0.6331719 0.0727089 0.1709714 0.3298916 0.3279347 0.4297198 2774 0.3279347 0.1322450 0.9276510 0.8625360 2.057691 0.0289632 0.0146589
Sample-2 110 120.0000 8.102943 0.8765881 2.822472 15.20257 3 0.2261992 0.6004646 0.0736631 0.1334372 0.2755652 0.2428626 0.4656170 5121 0.2428626 0.1234119 0.9330361 0.8843695 2.057552 0.0299725 0.0349047
Sample-3 103 106.6000 4.291085 0.7669587 2.407963 13.42077 2 0.2109176 0.5195476 0.0416610 0.1373098 0.2541325 0.4594744 0.5603989 13271 0.4594744 0.2330413 0.9514247 0.9083094 2.052324 0.0339646 0.0093134
Sample-4 105 110.6250 7.930799 0.8739093 2.992482 15.56061 4 0.3422309 0.6429969 0.0755314 0.1711334 0.3176017 0.3230653 0.3957720 4279 0.3230653 0.1260907 0.8621367 0.8569405 2.051175 0.0348056 0.0367686
Sample-5 103 109.5000 3.170738 0.6846160 2.106022 14.53671 1 0.1407817 0.4544002 0.0307839 0.1661081 0.2310307 0.5450144 0.6317579 9456 0.5450144 0.3153840 0.9535447 0.9212609 2.059634 0.0441499 0.0105476
Sample-6 105 112.5833 2.953753 0.6614476 2.071136 14.55619 1 0.1709221 0.4450266 0.0281310 0.1625409 0.2308364 0.5693811 0.6430163 11243 0.5693811 0.3385524 0.9258078 0.9192203 2.059421 0.0375772 0.0168642

Alpha diversity

This returns a table with selected diversity indicators. See a separate page on Beta diversity.

tab <-microbiome::alpha(pseq, index = "all")
kable(head(tab))
observed chao1 diversity_inverse_simpson diversity_gini_simpson diversity_shannon diversity_fisher diversity_coverage evenness_camargo evenness_pielou evenness_simpson evenness_evar evenness_bulla dominance_dbp dominance_dmn dominance_absolute dominance_relative dominance_simpson dominance_core_abundance dominance_gini rarity_log_modulo_skewness rarity_low_abundance rarity_rare_abundance
Sample-1 104 113.2308 7.561722 0.8677550 2.940698 16.69360 4 0.2014036 0.6331719 0.0727089 0.1709714 0.3298916 0.3279347 0.4297198 2774 0.3279347 0.1322450 0.9276510 0.8625360 2.057691 0.0289632 0.0146589
Sample-2 110 120.0000 8.102943 0.8765881 2.822472 15.20257 3 0.2261992 0.6004646 0.0736631 0.1334372 0.2755652 0.2428626 0.4656170 5121 0.2428626 0.1234119 0.9330361 0.8843695 2.057552 0.0299725 0.0349047
Sample-3 103 106.6000 4.291085 0.7669587 2.407963 13.42077 2 0.2109176 0.5195476 0.0416610 0.1373098 0.2541325 0.4594744 0.5603989 13271 0.4594744 0.2330413 0.9514247 0.9083094 2.052324 0.0339646 0.0093134
Sample-4 105 110.6250 7.930799 0.8739093 2.992482 15.56061 4 0.3422309 0.6429969 0.0755314 0.1711334 0.3176017 0.3230653 0.3957720 4279 0.3230653 0.1260907 0.8621367 0.8569405 2.051175 0.0348056 0.0367686
Sample-5 103 109.5000 3.170738 0.6846160 2.106022 14.53671 1 0.1407817 0.4544002 0.0307839 0.1661081 0.2310307 0.5450144 0.6317579 9456 0.5450144 0.3153840 0.9535447 0.9212609 2.059634 0.0441499 0.0105476
Sample-6 105 112.5833 2.953753 0.6614476 2.071136 14.55619 1 0.1709221 0.4450266 0.0281310 0.1625409 0.2308364 0.5693811 0.6430163 11243 0.5693811 0.3385524 0.9258078 0.9192203 2.059421 0.0375772 0.0168642

Richness

This returns observed richness with given detection threshold(s).

tab <- richness(pseq)
kable(head(tab))
observed chao1
Sample-1 104 113.2308
Sample-2 110 120.0000
Sample-3 103 106.6000
Sample-4 105 110.6250
Sample-5 103 109.5000
Sample-6 105 112.5833

Dominance

The dominance index refers to the abundance of the most abundant species. Various dominance indices are available (see the function help for a list of options).

# Absolute abundances for the single most abundant taxa in each sample
tab <- dominance(pseq, index = "all")
kable(head(tab))
dbp dmn absolute relative simpson core_abundance gini
Sample-1 0.3279347 0.4297198 2774 0.3279347 0.1322450 0.9276510 0.8625360
Sample-2 0.2428626 0.4656170 5121 0.2428626 0.1234119 0.9330361 0.8843695
Sample-3 0.4594744 0.5603989 13271 0.4594744 0.2330413 0.9514247 0.9083094
Sample-4 0.3230653 0.3957720 4279 0.3230653 0.1260907 0.8621367 0.8569405
Sample-5 0.5450144 0.6317579 9456 0.5450144 0.3153840 0.9535447 0.9212609
Sample-6 0.5693811 0.6430163 11243 0.5693811 0.3385524 0.9258078 0.9192203

We also have a function to list the dominating (most abundant) taxa in each sample.

dominant(pseq)

Rarity and low abundance

The rarity indices quantify the concentration of rare or low abundance taxa. Various rarity indices are available (see the function help for a list of options).

tab <- rarity(pseq, index = "all")
kable(head(tab))
log_modulo_skewness low_abundance rare_abundance
Sample-1 2.057691 0.0289632 0.0146589
Sample-2 2.057552 0.0299725 0.0349047
Sample-3 2.052324 0.0339646 0.0093134
Sample-4 2.051175 0.0348056 0.0367686
Sample-5 2.059634 0.0441499 0.0105476
Sample-6 2.059421 0.0375772 0.0168642

Coverage

The coverage index gives the number of groups needed to have a given proportion of the ecosystem occupied (by default 0.5 ie 50%).

tab <- coverage(pseq, threshold = 0.5)
kable(head(tab))

Core abundance

The core_abundance function refers to the relative proportion of the core species. Non-core abundance provides the complement (1-x; see rare_abundance).

tab <- core_abundance(pseq, detection = .1/100, prevalence = 50/100)

Gini index

Gini index is a common measure for inequality in economical income. The inverse gini index (1/x) can also be used as a community diversity measure.

tab <- inequality(pseq)

Evenness

Various evenness measures are also available.

tab <- evenness(pseq, "all")
kable(head(tab))
camargo pielou simpson evar bulla
Sample-1 0.2014036 0.6331719 0.0727089 0.1709714 0.3298916
Sample-2 0.2261992 0.6004646 0.0736631 0.1334372 0.2755652
Sample-3 0.2109176 0.5195476 0.0416610 0.1373098 0.2541325
Sample-4 0.3422309 0.6429969 0.0755314 0.1711334 0.3176017
Sample-5 0.1407817 0.4544002 0.0307839 0.1661081 0.2310307
Sample-6 0.1709221 0.4450266 0.0281310 0.1625409 0.2308364

Visualization

To visualize diversity measures, the package provides a simple wrapper around ggplot2. Currently onnly one measure can be visualized at a time.

p.shannon <- boxplot_alpha(pseq, 
                           index = "shannon",
                           x_var = "sex",
                           fill.colors = c(female="cyan4", male="deeppink4"))

p.shannon <- p.shannon + theme_minimal() + 
  labs(x="Sex", y="Shannon diversity") +
  theme(axis.text = element_text(size=12),
        axis.title = element_text(size=16),
        legend.text = element_text(size=12),
        legend.title = element_text(size=16))
p.shannon

Testing differences in alpha diversity

We recommend the non-parametric Kolmogorov-Smirnov test for two-group comparisons when there are no relevant covariates.

# Construct the data
d <- meta(pseq)
d$diversity <- microbiome::diversity(pseq, "shannon")$shannon
# Split the values by group
spl <- split(d$diversity, d$sex)
# Kolmogorov-Smironv test
pv <- ks.test(spl$female, spl$male)$p.value
# Adjust the p-value
padj <- p.adjust(pv)

For tutorial on plotting the output of alpha diversity please check PlotDiversity