microbiomeDataSets.Rmd
library(microbiomeDataSets)
The data sets are primarily named by the first author of the associated publication, together with a descriptive suffix. Aliases are provided for some of the data sets.
The availableDataSets() function returns a table of the available data sets.
availableDataSets()
#> Dataset
#> 1 GrieneisenTSData
#> 2 LahtiMLData
#> 3 LahtiMData
#> 4 OKeefeDSData
#> 5 SilvermanAGutData
#> 6 SongQAData
#> 7 SprockettTHData
All data are downloaded from ExperimentHub and cached for local re-use. Check the man page of each function for detailed documentation of the data contents and original source.
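As a minimal sketch of how a data set might be retrieved, the example below picks one name from the table above and calls it as a function. It assumes that each entry in the Dataset column corresponds to an exported loader function of the same name (LahtiMData is used here purely for illustration) and that ExperimentHub is reachable on the first call; later calls should be served from the local cache.

```r
library(microbiomeDataSets)

# The table printed above has a single column named Dataset
datasets <- availableDataSets()
print(datasets$Dataset)

# Assumption: the data set name is also the name of its loader function.
# The first call downloads from ExperimentHub; repeated calls reuse the cache.
lahti <- LahtiMData()

# Inspect what kind of container was returned before working with it
class(lahti)
```

The man page for the chosen function (for example ?LahtiMData) remains the authoritative description of what the returned object contains.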
The microbiome data is usually loaded as a TreeSummarizedExperiment. If other associated data tables (metabolomic, biomarker, etc.) are provided, the integrated data collection is provided as a MultiAssayExperiment.
For more information on how to use these objects, please refer to the vignettes of the TreeSummarizedExperiment and MultiAssayExperiment packages.
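To give a rough feel for the two containers without downloading anything, the sketch below builds small toy objects by hand and inspects them with standard accessors. All values, taxon names, sample names, and the "microbiota" and "lipids" labels are made up for illustration and do not come from any data set in this package.

```r
library(TreeSummarizedExperiment)
library(MultiAssayExperiment)

# Toy count matrix: 3 taxa (rows) x 4 samples (columns); values are invented
counts <- matrix(1:12, nrow = 3,
                 dimnames = list(paste0("taxon", 1:3),
                                 paste0("sample", 1:4)))

# A TreeSummarizedExperiment holding one assay plus per-sample metadata
tse <- TreeSummarizedExperiment(
    assays = list(counts = counts),
    colData = DataFrame(group = c("a", "a", "b", "b"),
                        row.names = colnames(counts))
)

dim(tse)               # features x samples
assayNames(tse)        # names of the stored assays
assay(tse, "counts")   # the count matrix itself
colData(tse)           # sample metadata

# A second, invented measurement table on the same samples
lipids <- matrix(seq(0.1, 0.8, by = 0.1), nrow = 2,
                 dimnames = list(c("lipid1", "lipid2"),
                                 paste0("sample", 1:4)))

# A MultiAssayExperiment bundling both tables into one collection
mae <- MultiAssayExperiment(
    experiments = ExperimentList(microbiota = tse, lipids = lipids)
)

names(experiments(mae))  # the experiments in the collection
mae[["microbiota"]]      # extract a single experiment
```

The same accessors should apply to the objects returned by the data set functions, although the assay and experiment names there will differ from the toy names used here.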
#rebook::prettySessionInfo()
sessionInfo()
#> R version 4.4.1 (2024-06-14)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 22.04.4 LTS
#>
#> Matrix products: default
#> BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
#> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so; LAPACK version 3.10.0
#>
#> locale:
#> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
#> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
#> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
#> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
#> [9] LC_ADDRESS=C LC_TELEPHONE=C
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
#>
#> time zone: UTC
#> tzcode source: system (glibc)
#>
#> attached base packages:
#> [1] stats4 stats graphics grDevices utils datasets methods
#> [8] base
#>
#> other attached packages:
#> [1] microbiomeDataSets_1.13.3 MultiAssayExperiment_1.31.3
#> [3] TreeSummarizedExperiment_2.1.4 Biostrings_2.73.1
#> [5] XVector_0.45.0 SingleCellExperiment_1.27.2
#> [7] SummarizedExperiment_1.35.1 Biobase_2.65.0
#> [9] GenomicRanges_1.57.1 GenomeInfoDb_1.41.1
#> [11] IRanges_2.39.0 S4Vectors_0.43.0
#> [13] BiocGenerics_0.51.0 MatrixGenerics_1.17.0
#> [15] matrixStats_1.3.0 BiocStyle_2.33.1
#>
#> loaded via a namespace (and not attached):
#> [1] tidyselect_1.2.1 dplyr_1.1.4 blob_1.2.4
#> [4] filelock_1.0.3 fastmap_1.2.0 lazyeval_0.2.2
#> [7] BiocFileCache_2.13.0 digest_0.6.36 lifecycle_1.0.4
#> [10] KEGGREST_1.45.1 tidytree_0.4.6 RSQLite_2.3.7
#> [13] magrittr_2.0.3 compiler_4.4.1 rlang_1.1.4
#> [16] sass_0.4.9 tools_4.4.1 utf8_1.2.4
#> [19] yaml_2.3.8 knitr_1.47 S4Arrays_1.5.1
#> [22] htmlwidgets_1.6.4 curl_5.2.1 bit_4.0.5
#> [25] DelayedArray_0.31.3 abind_1.4-5 BiocParallel_1.39.0
#> [28] purrr_1.0.2 desc_1.4.3 grid_4.4.1
#> [31] fansi_1.0.6 ExperimentHub_2.13.0 cli_3.6.3
#> [34] rmarkdown_2.27 crayon_1.5.3 ragg_1.3.2
#> [37] treeio_1.29.0 generics_0.1.3 httr_1.4.7
#> [40] DBI_1.2.3 ape_5.8 cachem_1.1.0
#> [43] zlibbioc_1.51.1 parallel_4.4.1 AnnotationDbi_1.67.0
#> [46] BiocManager_1.30.23 vctrs_0.6.5 yulab.utils_0.1.4
#> [49] Matrix_1.7-0 jsonlite_1.8.8 bookdown_0.39
#> [52] bit64_4.0.5 systemfonts_1.1.0 jquerylib_0.1.4
#> [55] tidyr_1.3.1 glue_1.7.0 pkgdown_2.0.9
#> [58] codetools_0.2-20 BiocVersion_3.20.0 UCSC.utils_1.1.0
#> [61] tibble_3.2.1 pillar_1.9.0 rappdirs_0.3.3
#> [64] htmltools_0.5.8.1 GenomeInfoDbData_1.2.12 dbplyr_2.5.0
#> [67] R6_2.5.1 textshaping_0.4.0 evaluate_0.24.0
#> [70] lattice_0.22-6 AnnotationHub_3.13.0 png_0.1-8
#> [73] memoise_2.0.1 bslib_0.7.0 Rcpp_1.0.12
#> [76] SparseArray_1.5.10 nlme_3.1-165 xfun_0.45
#> [79] fs_1.6.4 pkgconfig_2.0.3