Import data

Monday, August 25, 2025

Microbiome data science workflow

Importers and converters

  • mia includes importers and converters for standard data formats
    • Importer: Import file
    • Converter: Convert R object to other format
  • You can also create a TreeSE object manually
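
As a minimal sketch of the manual route, a TreeSE object can be assembled from an abundance matrix plus row and column annotations. The taxa, samples, and values below are made up for illustration, and the tree slots are left empty.

```r
library(mia)  # also attaches TreeSummarizedExperiment and its dependencies

# Toy abundance table: 3 taxa (rows) x 2 samples (columns); all values are made up
counts <- matrix(
    c(10L, 0L, 5L, 3L, 8L, 1L),
    nrow = 3,
    dimnames = list(
        c("taxon1", "taxon2", "taxon3"),
        c("sampleA", "sampleB")
    )
)

# Hypothetical taxonomy (row) and sample (column) annotations
taxonomy <- S4Vectors::DataFrame(
    Phylum = c("Firmicutes", "Bacteroidetes", "Firmicutes"),
    row.names = rownames(counts)
)
samples <- S4Vectors::DataFrame(
    Group = c("case", "control"),
    row.names = colnames(counts)
)

# Combine the three tables into one container
tse <- TreeSummarizedExperiment(
    assays = S4Vectors::SimpleList(counts = counts),
    rowData = taxonomy,
    colData = samples
)
tse
```

The row names of the taxonomy table and the column names of the abundance matrix must match the sample and taxon identifiers, since the container keeps the three tables aligned.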

Open databases

Data resource | Package | # studies or datasets | # samples
Curated metagenomic data (Pasolli et al. 2017) | curatedMetagenomicData (Pasolli et al. 2017) | 93 | 22,588
HoloFood (Rogers et al. 2025) | HoloFoodR (Borman, Sannikov, and Lahti 2025) | - | 9,990
MGnify (Gurbich et al. 2023) | MGnifyR (Borman, Allen, and Lahti 2025) | 5,129 | 616,138
Microbiome benchmark data (Gamboa-Tuz et al. 2025) | MicrobiomeBenchmarkData (Gamboa-Tuz et al. 2025) | 6 | 1,125
microbiomeDataSets (Lahti, Ernst, and Shetty 2025) | microbiomeDataSets (Lahti, Ernst, and Shetty 2025) | 6 | 19,100

Demonstration

library(mia)

tse <- importBIOM("file_path.biom")
print(tse)
class: TreeSummarizedExperiment 
dim: 19216 26 
metadata(0):
assays(1): counts
rownames(19216): 549322 522457 ... 200359 271582
rowData names(7): Kingdom Phylum ... Genus Species
colnames(26): CL3 CC1 ... Even2 Even3
colData names(7): X.SampleID Primer ... SampleType Description
reducedDimNames(0):
mainExpName: NULL
altExpNames(0):
rowLinks: a LinkDataFrame (19216 rows)
rowTree: 1 phylo tree(s) (19216 leaves)
colLinks: NULL
colTree: NULL
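
Once an object like the one printed above is available, its parts can be pulled out with the standard accessors and passed to a converter. The sketch below is illustrative only: because "file_path.biom" is a placeholder, it uses the GlobalPatterns example dataset shipped with mia as a stand-in, and the conversion step runs only if the optional phyloseq package is installed.

```r
library(mia)

# Stand-in for an imported object: example dataset shipped with mia
data("GlobalPatterns", package = "mia")
tse <- GlobalPatterns

# Abundance table (features x samples)
counts <- assay(tse, "counts")
counts[1:3, 1:3]

# Feature annotations (taxonomy) and sample annotations
rowData(tse)
colData(tse)

# Converter example: TreeSE to phyloseq, guarded because phyloseq is optional
if (requireNamespace("phyloseq", quietly = TRUE)) {
    pseq <- convertToPhyloseq(tse)
    print(pseq)
}
```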

Exercises

From the OMA online book, Chapter 4: Import

  • Exercise 1

References

Borman, Tuomas, Ben Allen, and Leo Lahti. 2025. MGnifyR: R Interface to EBI MGnify Metagenomics Resource. https://github.com/EBI-Metagenomics/MGnifyR.
Borman, Tuomas, Artur Sannikov, and Leo Lahti. 2025. HoloFoodR: R Interface to EBI HoloFood Resource. https://github.com/EBI-Metagenomics/HoloFoodR.
Gamboa-Tuz, Samuel D., Marcel Ramos, Eric Franzosa, Curtis Huttenhower, Nicola Segata, Sehyun Oh, and Levi Waldron. 2025. “Commonly Used Compositional Data Analysis Implementations Are Not Advantageous in Microbial Differential Abundance Analyses Benchmarked Against Biological Ground Truth.” bioRxiv. https://doi.org/10.1101/2025.02.13.638109.
Gurbich, Tatiana A., Alexandre Almeida, Martin Beracochea, Tony Burdett, Josephine Burgin, Guy Cochrane, Shriya Raj, et al. 2023. “MGnify Genomes: A Resource for Biome-Specific Microbial Genome Catalogues.” Journal of Molecular Biology, Computation resources for molecular biology, 435 (14): 168016. https://doi.org/10.1016/j.jmb.2023.168016.
Lahti, Leo, Felix G. M. Ernst, and Sudarshan Shetty. 2025. microbiomeDataSets: Experiment Hub Based Microbiome Datasets. https://doi.org/10.18129/B9.bioc.microbiomeDataSets.
Pasolli, Edoardo, Lucas Schiffer, Paolo Manghi, Audrey Renson, Valerie Obenchain, Duy Tin Truong, Francesco Beghini, et al. 2017. “Accessible, Curated Metagenomic Data Through ExperimentHub.” Nature Methods 14 (11): 1023–24. https://doi.org/10.1038/nmeth.4468.
Rogers, Alexander B, Varsha Kale, Germana Baldi, Antton Alberdi, M Thomas P Gilbert, Dipayan Gupta, Morten T Limborg, et al. 2025. “HoloFood Data Portal: Holo-Omic Datasets for Analysing Host–Microbiota Interactions in Animal Production.” Database (Oxford) 2025: baae112. https://doi.org/10.1093/database/baae112.