This function calculates community dominance indices. This includes the ‘Absolute’, ‘Berger-Parker’, ‘Core abundance’, ‘Gini’, ‘McNaughton’s’, ‘Relative’, and ‘Simpson's’ indices.

estimateDominance(
  x,
  assay.type = assay_name,
  assay_name = "counts",
  index = c("absolute", "dbp", "core_abundance", "gini", "dmn", "relative",
    "simpson_lambda"),
  ntaxa = 1,
  aggregate = TRUE,
  name = index,
  ...,
  BPPARAM = SerialParam()
)

# S4 method for SummarizedExperiment
estimateDominance(
  x,
  assay.type = assay_name,
  assay_name = "counts",
  index = c("absolute", "dbp", "core_abundance", "gini", "dmn", "relative",
    "simpson_lambda"),
  ntaxa = 1,
  aggregate = TRUE,
  name = index,
  ...,
  BPPARAM = SerialParam()
)

Arguments

x

a SummarizedExperiment object

assay.type

A single character value for selecting the assay to calculate the sample-wise estimates.

assay_name

a single character value for specifying which assay to use for calculation. (Please use assay.type instead. At some point assay_name will be disabled.)

index

a character vector, specifying the indices to be calculated.

ntaxa

Optional and only used for the Absolute and Relative dominance indices: The n-th position of the dominant taxa to consider (default: ntaxa = 1). Disregarded for the indices “dbp”, “core_abundance”, “Gini”, “dmn”, and “Simpson”.

aggregate

Optional and only used for the Absolute, dbp, Relative, and dmn dominance indices: Aggregate the values for top members selected by ntaxa or not. If TRUE, then the sum of relative abundances is returned. Otherwise the relative abundance is returned for the single taxa with the indicated rank (default: aggregate = TRUE). Disregarded for the indices “core_abundance”, “gini”, “dmn”, and “simpson”.

name

A name for the column(s) of the colData where the calculated Dominance indices should be stored in.

...

additional arguments currently not used.

BPPARAM

A BiocParallelParam object specifying whether calculation of estimates should be parallelized. (Currently not used)

Value

x with additional colData named *name*

Details

A dominance index quantifies the dominance of one or few species in a community. Greater values indicate higher dominance.

Dominance indices are in general negatively correlated with alpha diversity indices (species richness, evenness, diversity, rarity). More dominant communities are less diverse.

estimateDominance calculates the following community dominance indices:

  • 'absolute' Absolute index equals to the absolute abundance of the most dominant n species of the sample (specify the number with the argument ntaxa). Index gives positive integer values.

  • 'dbp' Berger-Parker index (See Berger & Parker 1970) calculation is a special case of the 'relative' index. dbp is the relative abundance of the most abundant species of the sample. Index gives values in interval 0 to 1, where bigger value represent greater dominance.$$dbp = \frac{N_1}{N_{tot}}$$ where \(N_1\) is the absolute abundance of the most dominant species and \(N_{tot}\) is the sum of absolute abundances of all species.

  • 'core_abundance' Core abundance index is related to core species. Core species are species that are most abundant in all samples, i.e., in whole data set. Core species are defined as those species that have prevalence over 50\ species must be prevalent in 50\ calculate the core abundance index. Core abundance index is sum of relative abundances of core species in the sample. Index gives values in interval 0 to 1, where bigger value represent greater dominance.$$core_abundance = \frac{N_{core}}{N_{tot}}$$ where \(N_{core}\) is the sum of absolute abundance of the core species and \(N_{tot}\) is the sum of absolute abundances of all species.

  • 'gini' Gini index is probably best-known from socio-economic contexts (Gini 1921). In economics, it is used to measure, for example, how unevenly income is distributed among population. Here, Gini index is used similarly, but income is replaced with abundance.If there is small group of species that represent large portion of total abundance of microbes, the inequality is large and Gini index closer to 1. If all species has equally large abundances, the equality is perfect and Gini index equals 0. This index should not be confused with Gini-Simpson index, which quantifies diversity.

  • 'dmn' McNaughton’s index is the sum of relative abundances of the two most abundant species of the sample (McNaughton & Wolf, 1970). Index gives values in the unit interval:$$dmn = (N_1 + N_2)/N_tot$$where \(N_1\) and \(N_2\) are the absolute abundances of the two most dominant species and \(N_{tot}\) is the sum of absolute abundances of all species.

  • 'relative' Relative index equals to the relative abundance of the most dominant n species of the sample (specify the number with the argument ntaxa). This index gives values in interval 0 to 1.$$relative = N_1/N_tot$$where \(N_1\) is the absolute abundance of the most dominant species and \(N_{tot}\) is the sum of absolute abundances of all species.

  • 'simpson_lambda' Simpson's (dominance) index or Simpson's lambda is the sum of squared relative abundances. This index gives values in the unit interval. This value equals the probability that two randomly chosen individuals belongs to the same species. The higher the probability, the greater the dominance (See e.g. Simpson 1949).$$lambda = \sum(p^2)$$where p refers to relative abundances.There is also a more advanced Simpson dominance index (Simpson 1949). However, this is not provided and the simpler squared sum of relative abundances is used instead as the alternative index is not in the unit interval and it is highly correlated with the simpler variant implemented here.

References

Berger WH & Parker FL (1970) Diversity of Planktonic Foraminifera in Deep-Sea Sediments. Science 168(3937):1345-1347. doi: 10.1126/science.168.3937.1345

Gini C (1921) Measurement of Inequality of Incomes. The Economic Journal 31(121): 124-126. doi: 10.2307/2223319

McNaughton, SJ and Wolf LL. (1970). Dominance and the niche in ecological systems. Science 167:13, 1--139

Simpson EH (1949) Measurement of Diversity. Nature 163(688). doi: 10.1038/163688a0

Author

Leo Lahti and Tuomas Borman. Contact: microbiome.github.io

Examples

data(esophagus)

# Calculates Simpson's lambda (can be used as a dominance index)
esophagus <- estimateDominance(esophagus, index="simpson_lambda")

# Shows all indices
colData(esophagus)
#> DataFrame with 3 rows and 1 column
#>   simpson_lambda
#>        <numeric>
#> B      0.1686282
#> C      0.0966551
#> D      0.3342507

# Indices must be written correctly (e.g. dbp, not dbp), otherwise an error
# gets thrown
esophagus <- estimateDominance(esophagus, index="dbp")
# Calculates dbp and Core Abundance indices
esophagus <- estimateDominance(esophagus, index=c("dbp", "core_abundance"))
#> Warning: The 'getPrevalentTaxa' function is deprecated. Use 'getPrevalentFeatures' instead.
#> Warning: The following values are already present in `colData` and will be overwritten: 'dbp'. Consider using the 'name' argument to specify alternative names.
# Shows all indices
colData(esophagus)
#> DataFrame with 3 rows and 3 columns
#>   simpson_lambda       dbp core_abundance
#>        <numeric> <numeric>      <numeric>
#> B      0.1686282  0.256158       0.960591
#> C      0.0966551  0.164706       0.898039
#> D      0.3342507  0.566210       0.908676
# Shows dbp index
colData(esophagus)$dbp
#>         B         C         D 
#> 0.2561576 0.1647059 0.5662100 
# Deletes dbp index
colData(esophagus)$dbp <- NULL
# Shows all indices, dbp is deleted
colData(esophagus)
#> DataFrame with 3 rows and 2 columns
#>   simpson_lambda core_abundance
#>        <numeric>      <numeric>
#> B      0.1686282       0.960591
#> C      0.0966551       0.898039
#> D      0.3342507       0.908676
# Deletes all indices
colData(esophagus) <- NULL

# Calculates all indices
esophagus <- estimateDominance(esophagus)
#> Warning: The 'getPrevalentTaxa' function is deprecated. Use 'getPrevalentFeatures' instead.
# Shows all indices
colData(esophagus)
#> DataFrame with 3 rows and 7 columns
#>    absolute       dbp core_abundance      gini       dmn  relative
#>   <numeric> <numeric>      <numeric> <numeric> <numeric> <numeric>
#> B        52  0.256158       0.960591  0.861474  0.502463  0.256158
#> C        42  0.164706       0.898039  0.787086  0.325490  0.164706
#> D       124  0.566210       0.908676  0.834278  0.648402  0.566210
#>   simpson_lambda
#>        <numeric>
#> B      0.1686282
#> C      0.0966551
#> D      0.3342507
# Deletes all indices
colData(esophagus) <- NULL

# Calculates all indices with explicitly specified names
esophagus <- estimateDominance(esophagus,
    index = c("dbp", "dmn", "absolute", "relative",
              "simpson_lambda", "core_abundance", "gini"),
    name  = c("BergerParker", "McNaughton", "Absolute", "Relative",
              "SimpsonLambda", "CoreAbundance", "Gini")
)
#> Warning: The 'getPrevalentTaxa' function is deprecated. Use 'getPrevalentFeatures' instead.
# Shows all indices
colData(esophagus)
#> DataFrame with 3 rows and 7 columns
#>   BergerParker McNaughton  Absolute  Relative SimpsonLambda CoreAbundance
#>      <numeric>  <numeric> <numeric> <numeric>     <numeric>     <numeric>
#> B     0.256158   0.502463        52  0.256158     0.1686282      0.960591
#> C     0.164706   0.325490        42  0.164706     0.0966551      0.898039
#> D     0.566210   0.648402       124  0.566210     0.3342507      0.908676
#>        Gini
#>   <numeric>
#> B  0.861474
#> C  0.787086
#> D  0.834278