Microbiome data science

  • Interactions between microbes and host
  • Sequencing data, with other omics regularly incorporated

Challenges

Lack of…

  • Standardization
  • Scalability, support for multi-table data
  • Interoperability with other fields

Bioconductor

  • Community-driven open-source project
  1. Training programs & workshops
  2. Conferences & community support
  3. Bioinformatics software


Software

  • ~2,300 R packages
  • Review, testing, documentation

Data containers

  • The core of software
  • Structured, standardized way to manage complex data
  • Enables modular, efficient workflows
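
To make the idea of a data container concrete, here is a minimal sketch that builds a small TreeSummarizedExperiment by hand. The count values, taxon names, and sample groups are invented for illustration; real objects would normally come from an importer.

```r
library(TreeSummarizedExperiment)
library(S4Vectors)

# Toy count matrix: 3 taxa (rows) x 4 samples (columns); values are made up
counts <- matrix(
    c(10, 0, 5,
       3, 8, 1,
       0, 2, 7,
       4, 4, 4),
    nrow = 3,
    dimnames = list(c("taxonA", "taxonB", "taxonC"), paste0("S", 1:4))
)

# Annotation tables for the rows (taxa) and columns (samples)
taxa <- DataFrame(phylum = c("P1", "P1", "P2"), row.names = rownames(counts))
samples <- DataFrame(
    group = c("case", "case", "control", "control"),
    row.names = colnames(counts)
)

tse <- TreeSummarizedExperiment(
    assays = SimpleList(counts = counts),
    rowData = taxa,
    colData = samples
)

# Subsetting keeps the assay and both annotation tables aligned
cases <- tse[, tse$group == "case"]
dim(cases)
colnames(cases)
```

Because the abundance table and its annotations travel together, a single subsetting call is enough to keep them consistent, which is what makes modular workflows practical.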

TreeSummarizedExperiment

(Huang et al. 2021)

TreeSummarizedExperiment class

Microbiome Analysis (mia)

  • Microbiome data science ecosystem
  • Distributed through several R packages
  • mia package is in the top 8.2% of Bioconductor packages by downloads


Community-driven ecosystem of tools

Tools shown: mia, MGnifyR, HoloFoodR, iSEE, MAE, SE, SCE, scater, benchdamic, radEmu, DESeq2, Biobakery.

Demonstration

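# Load the mia package
library(mia)

# Import a BIOM file as a TreeSummarizedExperiment
tse <- importBIOM("path_to_your_file.biom")

# Print a summary of the object
print(tse)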
class: TreeSummarizedExperiment 
dim: 308 60 
metadata(0):
assays(1): counts
rownames(308): species:Escherichia coli species:Alistipes putredinis
  ... species:Campylobacter ureolyticus species:Prevotella sp. oral
  taxon 376
rowData names(7): superkingdom phylum ... genus species
colnames(60): GupDM_A_11 GupDM_A_15 ... GupDM_JO GupDM_JP
colData names(27): study_name subject_id ... disease_stage
  disease_location
reducedDimNames(0):
mainExpName: NULL
altExpNames(0):
rowLinks: a LinkDataFrame (308 rows)
rowTree: 1 phylo tree(s) (10430 leaves)
colLinks: NULL
colTree: NULL
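
Each field in the printed summary has a matching accessor. The sketch below shows the most common ones. Since the BIOM path above is a placeholder, it assumes the GlobalPatterns example dataset shipped with mia as a stand-in; the printed dimensions will therefore differ from the summary above.

```r
library(mia)

# GlobalPatterns is an example dataset shipped with mia; it stands in
# for a file imported with importBIOM()
data("GlobalPatterns", package = "mia")
tse <- GlobalPatterns

# Abundance table: taxa in rows, samples in columns
counts <- assay(tse, "counts")
counts[1:3, 1:3]

# Taxonomy table and sample metadata
head(rowData(tse), 3)
head(colData(tse), 3)

# Phylogenetic tree linked to the rows
rowTree(tse)
```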

# Agglomerate data
tse <- agglomerateByRanks(tse)

# Apply transformation
tse <- transformAssay(tse, method = "relabundance")

print(tse)
class: TreeSummarizedExperiment 
dim: 308 60 
metadata(0):
assays(2): counts relabundance
rownames(308): species:Escherichia coli species:Alistipes putredinis
  ... species:Campylobacter ureolyticus species:Prevotella sp. oral
  taxon 376
rowData names(7): superkingdom phylum ... genus species
colnames(60): GupDM_A_11 GupDM_A_15 ... GupDM_JO GupDM_JP
colData names(27): study_name subject_id ... disease_stage
  disease_location
reducedDimNames(0):
mainExpName: NULL
altExpNames(7): superkingdom phylum ... genus species
rowLinks: a LinkDataFrame (308 rows)
rowTree: 1 phylo tree(s) (10430 leaves)
colLinks: NULL
colTree: NULL
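
Conceptually, the two steps above sum counts within each taxonomic group and then rescale every sample to proportions. The base R sketch below illustrates that arithmetic on made-up numbers; it is not how mia implements these functions, only what they compute in the simplest case.

```r
# Toy counts: 3 taxa (rows) x 2 samples (columns); values are made up
counts <- matrix(
    c(10, 0, 5,
       3, 8, 1),
    nrow = 3,
    dimnames = list(c("taxonA", "taxonB", "taxonC"), c("S1", "S2"))
)
phylum <- c("P1", "P1", "P2")

# Agglomeration: sum the counts of all taxa that share a phylum
phylum_counts <- rowsum(counts, group = phylum)

# Relative abundance: divide each sample (column) by its total count
relabundance <- sweep(counts, 2, colSums(counts), "/")

phylum_counts
colSums(relabundance)
```

After the transformation every column sums to one, which is why relative abundances can be compared across samples with different sequencing depths.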

# Load the visualization package
library(miaViz)

# Plot phylum-level relative abundances per sample
plotAbundance(
    tse,
    rank = "phylum",
    assay.type = "relabundance",
    col.var = "disease"
)

# Calculate alpha diversity
tse <- addAlpha(tse)

# Visualize results
library(scater)
plotColData(tse, x = "disease", y = "faith_diversity")
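
To show what an alpha diversity index measures, the sketch below computes the Shannon index by hand in base R. Note the substitution: the slide plots Faith's phylogenetic diversity, which also needs the tree, so Shannon is used here only because it can be written in a few lines. The helper `shannon()` and the counts are hypothetical.

```r
# Hypothetical helper: Shannon diversity of one sample's counts
shannon <- function(x) {
    p <- x[x > 0] / sum(x)  # proportions of the taxa that are present
    -sum(p * log(p))
}

# Toy counts: 4 taxa (rows) x 2 samples (columns); values are made up
counts <- matrix(
    c(25, 25, 25, 25,
      97,  1,  1,  1),
    nrow = 4,
    dimnames = list(paste0("taxon", 1:4), c("even", "dominated"))
)

# One diversity value per sample
alpha <- apply(counts, 2, shannon)
alpha
```

The perfectly even sample reaches log(4), the maximum for four taxa, while the sample dominated by one taxon scores much lower.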

# Calculate MDS
tse <- addMDS(tse, method = "unifrac")

# Visualize results
plotReducedDim(tse, dimred = "MDS", colour_by = "disease")
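
The ordination above has two parts: a sample-by-sample dissimilarity matrix, then classical MDS on that matrix. The base R sketch below walks through both on made-up counts. It swaps UniFrac, which requires a phylogeny, for Bray-Curtis dissimilarity; the helper `bray_curtis()` is hypothetical and written out only for illustration.

```r
# Toy counts: 3 taxa (rows) x 4 samples (columns); values are made up
counts <- matrix(
    c(10, 0, 5,
       9, 1, 6,
       0, 8, 1,
       1, 9, 0),
    nrow = 3,
    dimnames = list(paste0("taxon", 1:3), paste0("S", 1:4))
)

# Hypothetical helper: Bray-Curtis dissimilarity between two samples
bray_curtis <- function(x, y) sum(abs(x - y)) / sum(x + y)

# Pairwise dissimilarities between all samples
n <- ncol(counts)
d <- matrix(0, n, n, dimnames = list(colnames(counts), colnames(counts)))
for (i in seq_len(n)) {
    for (j in seq_len(n)) {
        d[i, j] <- bray_curtis(counts[, i], counts[, j])
    }
}

# Classical MDS (principal coordinates analysis) on the dissimilarities
coords <- cmdscale(as.dist(d), k = 2)
coords
```

Samples with similar composition (S1 with S2, S3 with S4) end up on the same side of the first axis, which is the pattern the coloured scatter plot is meant to reveal.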

Online book

  • Resources and tutorials for microbiome analysis
  • Community-built best practices
  • Open to contributions!

Acknowledgements

Leo Lahti, Felix M. Ernst, Giulio Benedetti, Sudarshan Shetty, Muluh Geraldson, Akewak Jeba, Thomaz Bastiaanssen, Aura Raulo, Levi Waldron, Henrik Eckermann, Chouaib Benchraka, Yağmur Şimşek, Basil Courbayre, Matti Ruuskanen, Stefanie Peschel, Christian L. Müller, Aki Havulinna, Shigdel Rajesh, Artur Sannikov, Himmi Lindgren, Lu Yang, Katariina Pärnänen, Noah de Gunst, Axel Dagnaud, Ely Seraidarian, Théotime Pralas, Jiya Chaudhary, Elina Chiesa, Pande Erawijantari, Shadman Ishraq, Sam Hillman, Matteo Calgaro, Basil Courbayre Dussau, Yang Cao, Eineje Ameh, Domenick J. Braccia, Renuka Potbhare, Hervé Pagès, Moritz E. Beber, Vivian Ikeh, Yu Gao, Daniel Garza, Karoline Faust, Jacques Serizay, Himel Mallick, Yihan Liu, Danielle Callan, Ben Allen, Teo Dallier, Elliot Gaudron-Parry, Inès Benseddik, Jesse Pasanen, Benjamin Valderrama

Logos: University of Turku (UTU), Research Council of Finland, CompLifeSci, Turun yliopistosäätiö.

Thank you for your time!

Orchestrating Microbiome Analysis online book

References

Huang, Ruizhu, Charlotte Soneson, Felix G. M. Ernst, et al. 2021. “TreeSummarizedExperiment: A S4 Class for Data with Hierarchical Structure.” F1000Research 9: 1246. https://doi.org/10.12688/f1000research.26669.2.
Moreno-Indias, Isabel, Leo Lahti, Miroslava Nedyalkova, Ilze Elbere, Gennady V. Roshchupkin, Muhamed Adilovic, Onder Aydemir, et al. 2021. “Statistical and Machine Learning Techniques in Human Microbiome Studies: Contemporary Challenges and Solutions.” Frontiers in Microbiology 12: 277. https://doi.org/10.3389/fmicb.2021.635781.