Differential abundance analysis (DAA)
Thursday, February 5, 2026
Overview of Day 4
12-14: Association testing (alpha & beta diversity)
14-16: Differential abundance analysis
Differential abundance
Alpha diversity: how diverse is the community?
Beta diversity: how similar are the microbial communities?
Differential abundance: how do individual taxa differ between conditions?
Differential abundance analysis (DAA)
Goal: identify features (e.g., species) whose abundance varies with external factors such as groups or gradients.
Approaches:
Classical statistical tests
Microbiome-specific methods
Challenges: compositionality, sparsity, multiple testing
DAA is central in microbiome studies, but requires caution due to unique properties of sequencing data.
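To make the compositionality challenge concrete, here is a minimal sketch in base R. The counts are hypothetical and chosen only for illustration: one taxon changes in absolute terms, yet the relative abundances of the unchanged taxa shift as well.

```r
# Hypothetical absolute counts for three taxa in two samples.
# Only taxon A changes between the samples; B and C stay the same.
absolute <- rbind(
  sample1 = c(A = 100, B = 100, C = 100),
  sample2 = c(A = 400, B = 100, C = 100)
)

# Convert to relative abundances (each row sums to 1).
relative <- absolute / rowSums(absolute)

# B and C appear to decrease although their absolute counts are unchanged.
print(round(relative, 3))
```

Sequencing data typically gives only relative information, so an apparent decrease in one taxon may simply reflect an increase in another.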
Simple example
Significance (p-value) for the selected feature:
Note: the data has 308 OTUs, so testing each of them requires correction for multiple testing.
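The following sketch shows one way to test every feature and then correct for multiple testing. It uses simulated counts as a stand-in for the real data (an assumption, since the actual dataset is not shown here), with 308 features and no true group differences. The variable names are placeholders.

```r
set.seed(1)

# Simulated stand-in for a real abundance table (an assumption for
# illustration): 308 features (rows) x 40 samples (columns).
n_features <- 308
n_samples <- 40
group <- factor(rep(c("control", "case"), each = n_samples / 2))
abundance <- matrix(
  rnbinom(n_features * n_samples, mu = 20, size = 0.5),
  nrow = n_features,
  dimnames = list(paste0("OTU", seq_len(n_features)), NULL)
)

# Relative abundances per sample (columns sum to 1).
rel_abundance <- sweep(abundance, 2, colSums(abundance), "/")

# One Wilcoxon rank-sum test per feature.
p_values <- apply(rel_abundance, 1, function(x) {
  wilcox.test(x ~ group, exact = FALSE)$p.value
})

# Benjamini-Hochberg correction controls the false discovery rate.
q_values <- p.adjust(p_values, method = "BH")

sum(p_values < 0.05)  # nominal hits, expected by chance alone here
sum(q_values < 0.05)  # hits remaining after correction
```

Because the simulated groups do not differ, any nominally significant features are false positives, which is what the correction is meant to guard against.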
Assumptions about the data
Taxonomic profiling data:
Sparse
Non-Gaussian
Zero-inflated
Overdispersed
etc.
-> Usual statistical tests (Wilcoxon, t-test) are potentially problematic.
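These properties can be checked quickly for any single taxon. The sketch below uses simulated counts (hypothetical data, not taken from the course dataset) and base R only; the variable names are illustrative.

```r
set.seed(42)

# Simulated counts for one taxon across 200 samples (an assumption
# for illustration): negative binomial counts with extra zeros.
n <- 200
counts <- rnbinom(n, mu = 10, size = 0.3)
counts[runif(n) < 0.3] <- 0  # additional zeros beyond the count model

# Sparsity / zero inflation: fraction of samples where the taxon is absent.
zero_fraction <- mean(counts == 0)

# Overdispersion: variance-to-mean ratio (about 1 for Poisson data).
dispersion_ratio <- var(counts) / mean(counts)

zero_fraction
dispersion_ratio

# Shapiro-Wilk test: a small p-value indicates departure from normality.
shapiro.test(counts)$p.value
```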
Elementary methods provide more replicable results in microbial differential abundance analysis
Relative abundances with a Wilcoxon test
Log-transformed relative abundances with a t-test
Presence/absence of taxa with logistic regression
Pelto et al. 2025 show that simple methods often give more replicable results.
Instead of complex pipelines, simple classical tests may be more reproducible across studies.
Classical tests are linear models
Relative abundances with a Wilcoxon test
Non-parametric, but can be expressed as a rank-based linear model
\[
\mathrm{Rank}(\mathrm{Relative\ abundance}) \sim \mathrm{Group}
\]
Log-transformed relative abundances with a t-test
\[
\log(\mathrm{Relative\ abundance}) \sim \mathrm{Group}
\]
Presence/absence of taxa with logistic regression
Generalized linear model with logit link
\[
\mathrm{Presence} \sim \mathrm{Group}, \quad \mathrm{Presence} \in \{0,1\}
\]
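The three formulations above can be sketched in base R as follows. The data are simulated and the pseudocount used before the log transformation is an assumption made here to avoid log(0); neither comes from the source. Note that the rank-based linear model only approximates the Wilcoxon test, whereas the equal-variance t-test matches the linear model exactly.

```r
set.seed(7)

# Simulated relative abundances for one taxon in two groups
# (hypothetical data; names are placeholders).
n <- 30
group <- factor(rep(c("control", "case"), each = n))
rel_abund <- c(rlnorm(n, meanlog = -5), rlnorm(n, meanlog = -4.5))
rel_abund[sample(2 * n, 10)] <- 0  # some absences

# 1) Wilcoxon test vs. linear model on ranks (close approximation).
p_wilcox <- wilcox.test(rel_abund ~ group, exact = FALSE)$p.value
p_rank_lm <- summary(lm(rank(rel_abund) ~ group))$coefficients[2, 4]

# 2) Two-sample t-test with equal variances is exactly a linear model.
#    A pseudocount (an assumption here) avoids log(0).
pseudo <- min(rel_abund[rel_abund > 0]) / 2
log_abund <- log(rel_abund + pseudo)
p_ttest <- t.test(log_abund ~ group, var.equal = TRUE)$p.value
p_lm <- summary(lm(log_abund ~ group))$coefficients[2, 4]

# 3) Presence/absence with logistic regression (GLM, logit link).
presence <- as.integer(rel_abund > 0)
fit_logit <- glm(presence ~ group, family = binomial(link = "logit"))
p_logit <- summary(fit_logit)$coefficients[2, 4]

c(wilcoxon = p_wilcox, rank_lm = p_rank_lm)
c(t_test = p_ttest, lm = p_lm)
p_logit
```

Writing the tests as models makes it straightforward to add covariates, for example by extending the formula to `~ group + age`.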
Wilcoxon vs t-test
Wilcoxon: compares ranks (often interpreted as a difference in medians), robust to outliers, non-parametric
t-test: compares means, assumes normality, sensitive to outliers
This plot illustrates how outliers affect the t-test but not the Wilcoxon test. Blue line = mean, red line = median.
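The outlier effect can also be reproduced numerically. The sketch below uses small, made-up vectors (not the data behind the plot): both tests should detect the group difference in the clean data, but after one control value is replaced by an extreme outlier, the t-test is expected to lose the signal while the Wilcoxon test retains it.

```r
# Hypothetical measurements for two groups with a clear shift.
control <- c(1.0, 1.2, 1.4, 1.6, 1.8, 2.0, 2.2, 2.4, 2.6, 2.8)
case <- control + 1.5

# Same data, but the last control value is replaced by an extreme outlier.
control_outlier <- c(control[-10], 50)

# Clean data: compare both tests.
p_t_clean <- t.test(control, case)$p.value
p_w_clean <- wilcox.test(control, case)$p.value

# Data with outlier: the mean is pulled up, the ranks barely change.
p_t_outlier <- t.test(control_outlier, case)$p.value
p_w_outlier <- wilcox.test(control_outlier, case)$p.value

# Mean vs. median of the contaminated control group.
c(mean = mean(control_outlier), median = median(control_outlier))

rbind(
  clean = c(t_test = p_t_clean, wilcoxon = p_w_clean),
  outlier = c(t_test = p_t_outlier, wilcoxon = p_w_outlier)
)
```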
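library(maaslin3)

# Fit MaAsLin 3 models; tse is assumed to be a TreeSummarizedExperiment object.
# Depending on the maaslin3 version, an `output` directory argument
# may also be required.
maaslin3_out <- maaslin3(
    input_data = tse,
    formula = "~ patient_status",
    normalization = "TSS",  # total sum scaling (relative abundances)
    transform = "LOG"       # log transformation
)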
Summary on differential abundance
Goal: identify individual taxa that are significantly associated with variables of interest (age, diet, etc.).
DAA methods should:
make useful statistical assumptions for taxonomic profiling data
control for multiple testing
provide automated tools to summarize results (significances, effect sizes, visualizations)
-> Complements community level beta diversity analyses.
A Critique of Differential Abundance Analysis, and Advocacy for an Alternative. Thomas P. Quinn, Elliott Gordon-Rodriguez, Ionas Erb. arXiv:2104.07266 [stat.ME]
What is a suitable unit for analysis?
Heatmaps and other visualization techniques
Task: spend a moment running and studying the example in OMA Chapter 9 that visualizes taxonomic abundances as heatmaps.