library(microbiomeDataSets)

Microbiome example data sets

The data sets are primarily named after the first author of the associated publication, together with a descriptive suffix. Aliases are provided for some of the data sets.

The availableDataSets function returns a table of the available data sets.

availableDataSets()
#>             Dataset
#> 1  GrieneisenTSData
#> 2       LahtiMLData
#> 3        LahtiMData
#> 4      OKeefeDSData
#> 5 SilvermanAGutData
#> 6        SongQAData
#> 7   SprockettTHData

All data are downloaded from ExperimentHub and cached for local re-use. See the help page of each function for detailed documentation of the data contents and the original source.
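As a minimal sketch of the download-and-cache workflow described above, the following loads one of the listed data sets. It assumes that each entry in the Dataset column corresponds to an exported function of the same name (here LahtiMData), that the package is installed, and that ExperimentHub is reachable on the first call.

```r
# Sketch only: assumes microbiomeDataSets is installed and ExperimentHub is reachable.
library(microbiomeDataSets)

# Look up the table of data sets and confirm the one we want is listed.
datasets <- availableDataSets()
stopifnot("LahtiMData" %in% datasets$Dataset)

# Assumption: each data set is loaded by calling the function of the same name.
# The first call downloads from ExperimentHub; later calls reuse the local cache.
tse <- LahtiMData()

# Printing the object gives a compact summary of its contents.
tse
```

The help page for the data set can then be opened with ?LahtiMData to check the data contents and the original source.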

The microbiome data are usually loaded as a TreeSummarizedExperiment. If other associated data tables (for example metabolomic or biomarker data) are available, the integrated data collection is provided as a MultiAssayExperiment.

For more information on how to use these objects, please refer to the vignettes of the TreeSummarizedExperiment and MultiAssayExperiment packages.
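The two container types can be inspected with the standard accessors from SummarizedExperiment and MultiAssayExperiment. The sketch below is illustrative only: it assumes that LahtiMLData is one of the data sets that ships with additional data tables, and it branches on the class of the returned object rather than relying on that assumption. No assay or experiment names are assumed; the first one is taken by position.

```r
# Sketch only: requires the package to be installed and ExperimentHub access.
library(microbiomeDataSets)
library(MultiAssayExperiment)  # also attaches SummarizedExperiment

x <- LahtiMLData()

if (is(x, "MultiAssayExperiment")) {
  # An integrated collection: list the experiments and take the first one.
  print(names(experiments(x)))
  se <- x[[1]]
} else {
  # A single (Tree)SummarizedExperiment.
  se <- x
}

# Basic inspection of the selected experiment.
dim(se)                    # features x samples
assayNames(se)             # names of the available assays
abundances <- assay(se, 1) # first assay, selected by position
head(colData(se))          # sample-level metadata
```

Selecting by position avoids hard-coding names that may differ between data sets; in real analyses it is usually clearer to select experiments and assays by the names printed above.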

#rebook::prettySessionInfo()
sessionInfo()
#> R version 4.4.1 (2024-06-14)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 22.04.4 LTS
#> 
#> Matrix products: default
#> BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
#> LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so;  LAPACK version 3.10.0
#> 
#> locale:
#>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
#>  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
#>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
#>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
#>  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
#> 
#> time zone: UTC
#> tzcode source: system (glibc)
#> 
#> attached base packages:
#> [1] stats4    stats     graphics  grDevices utils     datasets  methods  
#> [8] base     
#> 
#> other attached packages:
#>  [1] microbiomeDataSets_1.13.3      MultiAssayExperiment_1.31.3   
#>  [3] TreeSummarizedExperiment_2.1.4 Biostrings_2.73.1             
#>  [5] XVector_0.45.0                 SingleCellExperiment_1.27.2   
#>  [7] SummarizedExperiment_1.35.1    Biobase_2.65.0                
#>  [9] GenomicRanges_1.57.1           GenomeInfoDb_1.41.1           
#> [11] IRanges_2.39.0                 S4Vectors_0.43.0              
#> [13] BiocGenerics_0.51.0            MatrixGenerics_1.17.0         
#> [15] matrixStats_1.3.0              BiocStyle_2.33.1              
#> 
#> loaded via a namespace (and not attached):
#>  [1] tidyselect_1.2.1        dplyr_1.1.4             blob_1.2.4             
#>  [4] filelock_1.0.3          fastmap_1.2.0           lazyeval_0.2.2         
#>  [7] BiocFileCache_2.13.0    digest_0.6.36           lifecycle_1.0.4        
#> [10] KEGGREST_1.45.1         tidytree_0.4.6          RSQLite_2.3.7          
#> [13] magrittr_2.0.3          compiler_4.4.1          rlang_1.1.4            
#> [16] sass_0.4.9              tools_4.4.1             utf8_1.2.4             
#> [19] yaml_2.3.8              knitr_1.47              S4Arrays_1.5.1         
#> [22] htmlwidgets_1.6.4       curl_5.2.1              bit_4.0.5              
#> [25] DelayedArray_0.31.3     abind_1.4-5             BiocParallel_1.39.0    
#> [28] purrr_1.0.2             desc_1.4.3              grid_4.4.1             
#> [31] fansi_1.0.6             ExperimentHub_2.13.0    cli_3.6.3              
#> [34] rmarkdown_2.27          crayon_1.5.3            ragg_1.3.2             
#> [37] treeio_1.29.0           generics_0.1.3          httr_1.4.7             
#> [40] DBI_1.2.3               ape_5.8                 cachem_1.1.0           
#> [43] zlibbioc_1.51.1         parallel_4.4.1          AnnotationDbi_1.67.0   
#> [46] BiocManager_1.30.23     vctrs_0.6.5             yulab.utils_0.1.4      
#> [49] Matrix_1.7-0            jsonlite_1.8.8          bookdown_0.39          
#> [52] bit64_4.0.5             systemfonts_1.1.0       jquerylib_0.1.4        
#> [55] tidyr_1.3.1             glue_1.7.0              pkgdown_2.0.9          
#> [58] codetools_0.2-20        BiocVersion_3.20.0      UCSC.utils_1.1.0       
#> [61] tibble_3.2.1            pillar_1.9.0            rappdirs_0.3.3         
#> [64] htmltools_0.5.8.1       GenomeInfoDbData_1.2.12 dbplyr_2.5.0           
#> [67] R6_2.5.1                textshaping_0.4.0       evaluate_0.24.0        
#> [70] lattice_0.22-6          AnnotationHub_3.13.0    png_0.1-8              
#> [73] memoise_2.0.1           bslib_0.7.0             Rcpp_1.0.12            
#> [76] SparseArray_1.5.10      nlme_3.1-165            xfun_0.45              
#> [79] fs_1.6.4                pkgconfig_2.0.3