miaTime
is a package in the mia family,
providing tools for time series manipulation using the TreeSummarizedExperiment
data container.
miaTime
is hosted on Bioconductor, and can be installed
using via BiocManager
.
BiocManager::install("miaTime")
Divergence refers to the quantification of dissimilarity or difference between microbiome samples taken at different time points. This concept is crucial for tracking how microbial communities change or evolve over time, either in individuals or experimental conditions.
miaTime
provides several functions for calculating
divergences:
Baseline divergence: Measures the dissimilarity between each sample and a reference (typically the first time point) within a group (such as per subject).
Stepwise (rolling) divergence: Assesses the change between consecutive (or defined interval) time points by comparing each sample to the previous ith sample in the series.
The divergence is typically calculated using a beta diversity metric, such as Bray-Curtis dissimilarity, which quantifies how different two communities are in terms of their composition.
For rolling or stepwise divergences, the calculation measures how much the community shifts from one sampling point to the next.
Quantifying divergence allows researchers to:
Track temporal instability or stability of the microbiome.
Compare change rates between individuals or treatments.
Identify significant shifts or stable periods in the microbiome composition.
For example, in a human gut microbiome time series, calculating divergence helps determine how resistant the microbial ecosystem is to perturbations (such as antibiotics or diet changes) and how long it takes to return (if at all) to its original state.
Divergences can be calculated with *Divergence()
functions. In the example below, for each subject, we calculate the
divergence of their samples by comparing them to the first time
point.
data(hitchip1006)
tse <- hitchip1006
res <- getBaselineDivergence(
tse, time.col = "time", group = "sample",
name = c("baseline", "time_diff", "ref_samples"))
res |> head()
#> DataFrame with 6 rows and 3 columns
#> baseline time_diff ref_samples
#> <numeric> <numeric> <character>
#> Sample-1 0 0 Sample-1
#> Sample-2 0 0 Sample-2
#> Sample-3 0 0 Sample-3
#> Sample-4 0 0 Sample-4
#> Sample-5 0 0 Sample-5
#> Sample-6 0 0 Sample-6
A more convenient and preferred approach is to store the values
directly in colData
using the add*Divergence()
functions. In the example below, we calculate stepwise divergences with
a lag of 1, meaning that for each sample, the divergence is calculated
by comparing it to the previous time point for the same subject.
tse <- addStepwiseDivergence(tse, time.col = "time")
colData(tse)
#> DataFrame with 1151 rows and 13 columns
#> age sex nationality DNA_extraction_method project
#> <integer> <factor> <factor> <factor> <factor>
#> Sample-1 28 male US NA 1
#> Sample-2 24 female US NA 1
#> Sample-3 52 male US NA 1
#> Sample-4 22 female US NA 1
#> Sample-5 25 female US NA 1
#> ... ... ... ... ... ...
#> Sample-1168 50 female Scandinavia r 40
#> Sample-1169 31 female Scandinavia r 40
#> Sample-1170 31 female Scandinavia r 40
#> Sample-1171 52 male Scandinavia r 40
#> Sample-1172 52 male Scandinavia r 40
#> diversity bmi_group subject time sample divergence
#> <numeric> <factor> <factor> <numeric> <character> <numeric>
#> Sample-1 5.76 severeobese 1 0 Sample-1 NA
#> Sample-2 6.06 obese 2 0 Sample-2 NA
#> Sample-3 5.50 lean 3 0 Sample-3 NA
#> Sample-4 5.87 underweight 4 0 Sample-4 NA
#> Sample-5 5.89 lean 5 0 Sample-5 NA
#> ... ... ... ... ... ... ...
#> Sample-1168 5.87 severeobese 244 8.1 Sample-1168 0.479394
#> Sample-1169 5.87 overweight 245 2.3 Sample-1169 0.356508
#> Sample-1170 5.92 overweight 245 8.2 Sample-1170 0.434298
#> Sample-1171 6.04 overweight 246 2.1 Sample-1171 0.403581
#> Sample-1172 5.74 overweight 246 7.9 Sample-1172 0.485427
#> time_diff ref_samples
#> <numeric> <list>
#> Sample-1 NA NA
#> Sample-2 NA NA
#> Sample-3 NA NA
#> Sample-4 NA NA
#> Sample-5 NA NA
#> ... ... ...
#> Sample-1168 0.2 Sample-1172
#> Sample-1169 0.2 Sample-1171
#> Sample-1170 0.1 Sample-1127,Sample-1168
#> Sample-1171 0.7 Sample-1016,Sample-1022,Sample-1027,...
#> Sample-1172 0.1 Sample-1162
We can visualize time series data with miaViz. Below, we visualize 5 most abundant taxa.
library(miaViz)
p <- plotSeries(tse, assay.type = "counts",
time.col = "time",
features = getTop(tse, 5),
colour.by = "rownames")
p
sessionInfo()
#> R version 4.5.1 (2025-06-13)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 24.04.2 LTS
#>
#> Matrix products: default
#> BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
#> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so; LAPACK version 3.12.0
#>
#> locale:
#> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
#> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
#> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
#> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
#> [9] LC_ADDRESS=C LC_TELEPHONE=C
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
#>
#> time zone: UTC
#> tzcode source: system (glibc)
#>
#> attached base packages:
#> [1] stats4 stats graphics grDevices utils datasets methods
#> [8] base
#>
#> other attached packages:
#> [1] miaViz_1.17.9 ggraph_2.2.1
#> [3] ggplot2_3.5.2 miaTime_0.99.9
#> [5] mia_1.17.5 TreeSummarizedExperiment_2.17.1
#> [7] Biostrings_2.77.2 XVector_0.49.0
#> [9] SingleCellExperiment_1.31.1 MultiAssayExperiment_1.35.6
#> [11] SummarizedExperiment_1.39.1 Biobase_2.69.0
#> [13] GenomicRanges_1.61.1 Seqinfo_0.99.2
#> [15] IRanges_2.43.0 S4Vectors_0.47.0
#> [17] BiocGenerics_0.55.1 generics_0.1.4
#> [19] MatrixGenerics_1.21.0 matrixStats_1.5.0
#> [21] knitr_1.50 BiocStyle_2.37.0
#>
#> loaded via a namespace (and not attached):
#> [1] RColorBrewer_1.1-3 jsonlite_2.0.0
#> [3] magrittr_2.0.3 estimability_1.5.1
#> [5] ggbeeswarm_0.7.2 farver_2.1.2
#> [7] rmarkdown_2.29 fs_1.6.6
#> [9] ragg_1.4.0 vctrs_0.6.5
#> [11] memoise_2.0.1 DelayedMatrixStats_1.31.0
#> [13] ggtree_3.17.1 htmltools_0.5.8.1
#> [15] S4Arrays_1.9.1 BiocBaseUtils_1.11.2
#> [17] BiocNeighbors_2.3.1 janeaustenr_1.0.0
#> [19] cellranger_1.1.0 gridGraphics_0.5-1
#> [21] SparseArray_1.9.1 sass_0.4.10
#> [23] parallelly_1.45.1 bslib_0.9.0
#> [25] tokenizers_0.3.0 htmlwidgets_1.6.4
#> [27] desc_1.4.3 plyr_1.8.9
#> [29] DECIPHER_3.5.0 emmeans_1.11.2
#> [31] cachem_1.1.0 igraph_2.1.4
#> [33] lifecycle_1.0.4 pkgconfig_2.0.3
#> [35] rsvd_1.0.5 Matrix_1.7-3
#> [37] R6_2.6.1 fastmap_1.2.0
#> [39] tidytext_0.4.3 aplot_0.2.8
#> [41] digest_0.6.37 ggnewscale_0.5.2
#> [43] patchwork_1.3.1 scater_1.37.0
#> [45] irlba_2.3.5.1 SnowballC_0.7.1
#> [47] textshaping_1.0.1 vegan_2.7-1
#> [49] beachmat_2.25.3 labeling_0.4.3
#> [51] polyclip_1.10-7 abind_1.4-8
#> [53] mgcv_1.9-3 compiler_4.5.1
#> [55] withr_3.0.2 BiocParallel_1.43.4
#> [57] viridis_0.6.5 DBI_1.2.3
#> [59] ggforce_0.5.0 MASS_7.3-65
#> [61] DelayedArray_0.35.2 bluster_1.19.0
#> [63] permute_0.9-8 tools_4.5.1
#> [65] vipor_0.4.7 beeswarm_0.4.0
#> [67] ape_5.8-1 glue_1.8.0
#> [69] nlme_3.1-168 gridtext_0.1.5
#> [71] grid_4.5.1 cluster_2.1.8.1
#> [73] reshape2_1.4.4 gtable_0.3.6
#> [75] fillpattern_1.0.2 tzdb_0.5.0
#> [77] tidyr_1.3.1 hms_1.1.3
#> [79] BiocSingular_1.25.0 tidygraph_1.3.1
#> [81] ScaledMatrix_1.17.0 xml2_1.3.8
#> [83] ggrepel_0.9.6 pillar_1.11.0
#> [85] stringr_1.5.1 yulab.utils_0.2.0
#> [87] splines_4.5.1 tweenr_2.0.3
#> [89] dplyr_1.1.4 ggtext_0.1.2
#> [91] treeio_1.33.0 lattice_0.22-7
#> [93] tidyselect_1.2.1 DirichletMultinomial_1.51.0
#> [95] scuttle_1.19.0 gridExtra_2.3
#> [97] bookdown_0.43 xfun_0.52
#> [99] graphlayouts_1.2.2 rbiom_2.2.1
#> [101] stringi_1.8.7 ggfun_0.2.0
#> [103] lazyeval_0.2.2 yaml_2.3.10
#> [105] evaluate_1.0.4 codetools_0.2-20
#> [107] tibble_3.3.0 BiocManager_1.30.26
#> [109] ggplotify_0.1.2 cli_3.6.5
#> [111] xtable_1.8-4 systemfonts_1.2.3
#> [113] jquerylib_0.1.4 Rcpp_1.1.0
#> [115] readxl_1.4.5 parallel_4.5.1
#> [117] pkgdown_2.1.3 readr_2.1.5
#> [119] sparseMatrixStats_1.21.0 decontam_1.29.0
#> [121] viridisLite_0.4.2 mvtnorm_1.3-3
#> [123] slam_0.1-55 tidytree_0.4.6
#> [125] scales_1.4.0 purrr_1.1.0
#> [127] crayon_1.5.3 rlang_1.1.6