Differential abundance analysis (DAA)

Thursday, February 5, 2026

Overview of Day 4

Time     Theme
12-14    Association testing (alpha & beta diversity)
14-16    Differential abundance analysis

Differential abundance

  • Alpha diversity: how diverse is the community?
  • Beta diversity: how similar are the microbial communities?
  • Differential abundance: how do individual taxa differ between conditions?

Differential abundance analysis (DAA)

Goal: identify features (e.g., species) whose abundance varies with external factors, such as groups or continuous gradients.

Approaches:

  • Classical statistical tests
  • Microbiome-specific methods

Challenges: compositionality, sparsity, multiple testing
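
The compositionality challenge can be illustrated with a minimal base R sketch. The taxon names and counts below are hypothetical: only taxon A changes in absolute abundance, yet the relative abundances of B and C change as well.

```r
# Hypothetical absolute abundances for three taxa in two conditions.
# Only taxon A differs between the conditions.
control   <- c(A = 100, B = 100, C = 100)
treatment <- c(A = 400, B = 100, C = 100)

# Convert to relative abundances (each profile sums to 1).
rel_control   <- control / sum(control)
rel_treatment <- treatment / sum(treatment)

# B and C appear to decrease although their absolute abundances are unchanged.
print(round(rbind(rel_control, rel_treatment), 3))
```

A test applied to the relative abundances of B or C could therefore flag a difference that is driven entirely by taxon A.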

Simple example

Significance (p-value) for the selected feature:

[1] 0.002581178

Note: the data has 308 OTUs, so testing every feature requires correction for multiple testing.
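
As a rough sketch of what multiple testing correction does to the p-value above, base R's `p.adjust()` can be used. The Bonferroni step uses only the reported p-value and the number of OTUs; the vector `p_all` for the Benjamini-Hochberg step is hypothetical (simulated under no signal), since the other 307 p-values are not shown here.

```r
# p-value reported above for the selected feature.
p_selected <- 0.002581178
n_features <- 308  # number of OTUs tested

# Bonferroni: multiply by the number of tests (capped at 1).
p_bonferroni <- p.adjust(p_selected, method = "bonferroni", n = n_features)
print(p_bonferroni)

# Benjamini-Hochberg (FDR) needs the full set of p-values.
# Hypothetical example: 307 uniform p-values plus the selected feature.
set.seed(1)
p_all <- c(p_selected, runif(n_features - 1))
q_all <- p.adjust(p_all, method = "BH")
print(q_all[1])
```

Under Bonferroni correction the adjusted value is about 0.795 (0.002581178 × 308), so this feature alone would not remain significant at the 0.05 level.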

Assumptions about the data

Taxonomic profiling data:

  • Sparse
  • Non-Gaussian
  • Zero-inflated
  • Overdispersed
  • etc.

-> Usual statistical tests (Wilcoxon, t-test) are potentially problematic.
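
These properties can be checked on real data with simple summaries. The sketch below uses simulated counts instead: the negative binomial distribution and its parameters are assumptions chosen only to produce count data with many zeros and a variance well above the mean.

```r
set.seed(42)

# Hypothetical counts for one taxon across 1000 samples, drawn from a
# negative binomial distribution (mean 5, small size parameter).
counts <- rnbinom(1000, mu = 5, size = 0.2)

# Fraction of samples in which the taxon is absent (sparsity).
zero_fraction <- mean(counts == 0)

# Variance-to-mean ratio; values above 1 indicate overdispersion
# relative to a Poisson distribution.
dispersion_ratio <- var(counts) / mean(counts)

print(c(zero_fraction = zero_fraction, dispersion_ratio = dispersion_ratio))
```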

Elementary methods provide more replicable results in microbial differential abundance analysis

  • Relative abundances with a Wilcoxon test
  • Log-transformed relative abundances with a t-test
  • Presence/absence of taxa with logistic regression

Pelto et al. (2025) show that these elementary methods often give more replicable results.
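
The three elementary approaches listed above can be sketched for a single taxon in base R. Everything about the data is hypothetical (simulated counts, a fixed library size of 10000, group labels), and the pseudocount used before the log transformation is an assumption of this sketch, not necessarily the choice made by Pelto et al.

```r
set.seed(123)
n <- 40
group <- factor(rep(c("control", "case"), each = n))

# Hypothetical counts for one taxon, with a fixed hypothetical library size.
taxon_counts <- rnbinom(2 * n, mu = ifelse(group == "case", 30, 10), size = 0.5)
rel_abund <- taxon_counts / 10000

# 1. Relative abundances with a Wilcoxon test
#    (exact = FALSE because zeros create ties).
p_wilcoxon <- wilcox.test(rel_abund ~ group, exact = FALSE)$p.value

# 2. Log-transformed relative abundances with a t-test.
#    Assumption: half the smallest non-zero value is added to handle zeros.
pseudo <- min(rel_abund[rel_abund > 0]) / 2
p_ttest <- t.test(log(rel_abund + pseudo) ~ group)$p.value

# 3. Presence/absence of the taxon with logistic regression.
present <- as.integer(taxon_counts > 0)
fit <- glm(present ~ group, family = binomial)
p_logistic <- coef(summary(fit))[2, "Pr(>|z|)"]

print(c(wilcoxon = p_wilcoxon, t_test = p_ttest, logistic = p_logistic))
```

In a full analysis the same test would be repeated for every taxon and the resulting p-values corrected for multiple testing.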

Classical tests are linear models

  • Relative abundances with a Wilcoxon test
    • Non-parametric, but closely approximated by a linear model fitted to ranks

\[ \mathrm{Rank}(\mathrm{Relative\ abundance}) \sim \mathrm{Group} \]

  • Log-transformed relative abundances with a t-test
    • Standard linear model

\[ \log(\mathrm{Relative\ abundance}) \sim \mathrm{Group} \]

  • Presence/absence of taxa with logistic regression
    • Generalized linear model with logit link

\[ \mathrm{Presence} \sim \mathrm{Group}, \quad \mathrm{Presence} \in \{0,1\} \]
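
The correspondence between the classical tests and (generalized) linear models can be checked numerically. The sketch below uses simulated, hypothetical data: the equal-variance t-test and `lm()` give the same p-value, the Wilcoxon test and `lm()` on ranks give similar but not identical p-values, and logistic regression is fitted with `glm()` and a logit link.

```r
set.seed(7)
n <- 30
group <- factor(rep(c("A", "B"), each = n))

# Hypothetical log-normal relative abundances (strictly positive, no zeros).
rel_abund <- exp(rnorm(2 * n, mean = ifelse(group == "B", -4.5, -5), sd = 1))

# t-test with equal variances vs. linear model on log abundances.
p_t  <- t.test(log(rel_abund) ~ group, var.equal = TRUE)$p.value
p_lm <- coef(summary(lm(log(rel_abund) ~ group)))[2, "Pr(>|t|)"]

# Wilcoxon test vs. linear model on ranks (an approximation).
p_w       <- wilcox.test(rel_abund ~ group)$p.value
p_rank_lm <- coef(summary(lm(rank(rel_abund) ~ group)))[2, "Pr(>|t|)"]

# Logistic regression on a hypothetical presence/absence indicator.
present <- rbinom(2 * n, size = 1, prob = ifelse(group == "B", 0.8, 0.5))
fit_glm <- glm(present ~ group, family = binomial(link = "logit"))

print(c(t_test = p_t, lm = p_lm, wilcoxon = p_w, rank_lm = p_rank_lm))
print(coef(fit_glm))
```

Writing the tests as models makes it straightforward to add covariates (for example, `~ Group + Age`), which the plain two-sample tests do not support.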

Wilcoxon vs t-test

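  • Wilcoxon: compares rank distributions (equivalent to comparing medians only when the groups differ by a pure location shift), robust to outliers, non-parametric
  • t-test: compares means, assumes approximately normal data (or large samples), sensitive to outliers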
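
The difference in outlier sensitivity can be seen in a small, fully hypothetical example: two groups of ten values with a clear shift, tested before and after one value is replaced by an extreme outlier.

```r
x <- 1:10                     # group 1
y <- seq(5.5, 14.5, by = 1)   # group 2, shifted upwards (no ties with x)

# Replace the largest value in group 1 with an extreme outlier.
x_outlier <- c(1:9, 1000)

p_values <- c(
    t_clean          = t.test(x, y)$p.value,
    t_outlier        = t.test(x_outlier, y)$p.value,
    wilcoxon_clean   = wilcox.test(x, y)$p.value,
    wilcoxon_outlier = wilcox.test(x_outlier, y)$p.value
)
print(signif(p_values, 3))
```

With the outlier, the t-test no longer detects the shift at the 0.05 level because the outlier inflates both the mean and the variance of group 1, whereas the Wilcoxon test still does, since the outlier changes only one rank.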

Demonstration

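library(maaslin3)

# `tse` is assumed to be a TreeSummarizedExperiment whose sample metadata
# (colData) contains a `patient_status` column.
maaslin3_out <- maaslin3(
    input_data = tse,
    output = "maaslin3_output",   # results directory; the name is only an example
    formula = "~ patient_status",
    normalization = "TSS",        # total sum scaling (relative abundances)
    transform = "LOG"             # log transformation; trailing comma removed
)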

Summary on differential abundance

Goal: identify individual taxa that are significantly associated with variables of interest (e.g., age, diet). DAA methods should:

  • make useful statistical assumptions for taxonomic profiling data
  • control for multiple testing
  • provide automated tools to summarize results (significances, effect sizes, visualizations)

-> Complements community-level beta diversity analyses.

Exercises

From the OMA online book, Chapter 16: Differential abundance

  • All exercises

References

A Critique of Differential Abundance Analysis, and Advocacy for an Alternative. Thomas P. Quinn, Elliott Gordon-Rodriguez, Ionas Erb. arXiv:2104.07266 [stat.ME]

What is a suitable unit for analysis?

  • Individual taxa vs. total community composition?

    -> Consider broader subecosystems as an intermediate between these two extremes.

Visualization techniques

Heatmaps and other visualization techniques

Task: spend a moment running and studying the example in OMA Chapter 9 that visualizes taxonomic abundances as heatmaps.