2  miaverse

This chapter provides an overview of the miaverse ecosystem. Section 2.1 describes the relationships between the data containers used in miaverse. Section 2.2 details the packages involved, while Section 2.3 provides guidance on installing these packages.

miaverse (MIcrobiome Analysis uniVERSE) is an actively developed R/Bioconductor framework for microbiome downstream analysis. It becomes particularly relevant when working with abundance tables derived from sequencing data, whether from shotgun metagenomics or 16S rRNA sequencing. Before utilizing miaverse, sequencing data must undergo preprocessing to convert raw sequence reads into abundance tables.

miaverse consists of multiple R/Bioconductor packages and this online book. The aim is not only to offer tools for microbiome downstream analysis but also to provide guidance on conducting microbiome data analysis and developing effective microbiome data science workflows.

The key concept of miaverse lies in its utilization of SummarizedExperiment-based data containers. This design choice enhances interoperability and versatility within the broader Bioconductor framework, facilitating access to an expanding array of tools. In practice, this approach allows for the integration of promising methods from related fields, such as single-cell sequencing.

miaverse framework.

2.1 Data containers

As discussed, miaverse is built upon the TreeSummarizedExperiment (TreeSE) data container. TreeSummarizedExperiment extends SingleCellExperiment (SCE) with additional slots tailored for microbiome analysis. The SingleCellExperiment class was designed for single-cell sequencing (Lun and Risso 2020), a field for which Bioconductor offers a wide variety of tools, including the online book Orchestrating Single-Cell Analysis with Bioconductor (OSCA) (Amezquita et al. 2020). SingleCellExperiment, in turn, is derived from the SummarizedExperiment (SE) class. Because of this hierarchical relationship, all methods applicable to SingleCellExperiment and SummarizedExperiment objects can also be applied to TreeSummarizedExperiment objects.

  • SummarizedExperiment (SE) (Morgan et al. 2020) is a generic and highly optimized container for complex data structures. It has become a common choice for analyzing various types of biomedical profiling data, such as RNA-seq, ChIP-seq, microarrays, flow cytometry, proteomics, and single-cell sequencing.

  • SingleCellExperiment (SCE) (Lun and Risso 2020) was developed as an extension to store alternative versions of the data (such as reduced dimension results and alternative experiments) in the same data container.

  • TreeSummarizedExperiment (TreeSE) (Huang 2020) was developed as an extension to incorporate hierarchical information (such as phylogenetic trees and sample hierarchies) and reference sequences.

Amezquita, Robert, Aaron Lun, Stephanie Hicks, and Raphael Gottardo. 2020. Orchestrating Single-Cell Analysis with Bioconductor. Bioconductor. https://bioconductor.org/books/release/OSCA/.
Morgan, Martin, Valerie Obenchain, Jim Hester, and Hervé Pagès. 2020. SummarizedExperiment: SummarizedExperiment Container. https://bioconductor.org/packages/SummarizedExperiment.
Lun, Aaron, and Davide Risso. 2020. SingleCellExperiment: S4 Classes for Single Cell Data.
Huang, Ruizhu. 2020. TreeSummarizedExperiment: A S4 Class for Data with Tree Structures.

SummarizedExperiment is extended to SingleCellExperiment and it is further extended to TreeSummarizedExperiment.
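
The inheritance chain described above can be checked directly in R. The following is a minimal sketch, assuming the TreeSummarizedExperiment package is installed; the toy count matrix and its taxon and sample names are made up for illustration.

```r
library(SummarizedExperiment)
library(TreeSummarizedExperiment)

# Toy abundance table: 3 taxa (rows) x 2 samples (columns); values are made up
counts <- matrix(
    c(10, 0, 5, 3, 8, 1),
    nrow = 3,
    dimnames = list(paste0("taxon", 1:3), paste0("sample", 1:2))
)

# Build a TreeSE object holding a single assay named "counts"
tse <- TreeSummarizedExperiment(assays = list(counts = counts))

# A TreeSE object is also an SCE object and an SE object
is(tse, "SingleCellExperiment")
is(tse, "SummarizedExperiment")

# Accessors defined for SummarizedExperiment therefore work on TreeSE
dim(tse)
assayNames(tse)
assay(tse, "counts")
```

Because the object passes both `is()` checks, functions written for SE or SCE objects accept it without conversion.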

MultiAssayExperiment (MAE) (Ramos et al. 2017) provides an organized way to bind several different data containers together in a single object. For example, we can bind microbiome data (in a TreeSE container) with metabolomic profiling data (in an SE container), with (partially) shared sample metadata. This is convenient and robust, for instance, in subsetting and other data manipulation tasks, and it is useful because microbiome data can be part of multiomics experiments and analysis strategies. Throughout this book, we highlight how the methods relate to this data framework by using TreeSE, MAE, and classes beyond.
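
As a rough illustration of this binding, the sketch below combines a toy TreeSE object and a toy SE object into one MAE object and subsets both at once. It assumes the MultiAssayExperiment and TreeSummarizedExperiment packages are installed; all sample names, feature names, values, and the "group" variable are hypothetical.

```r
library(MultiAssayExperiment)
library(TreeSummarizedExperiment)

samples <- paste0("sample", 1:3)

# Toy microbiome counts for all three samples (made-up values)
taxa_counts <- matrix(
    1:6, nrow = 2,
    dimnames = list(c("taxonA", "taxonB"), samples)
)

# Toy metabolite table covering only two of the three samples
metabolites <- matrix(
    c(0.5, 1.2, 3.1, 0.7), nrow = 2,
    dimnames = list(c("metab1", "metab2"), samples[1:2])
)

tse <- TreeSummarizedExperiment(assays = list(counts = taxa_counts))
se <- SummarizedExperiment(assays = list(concentration = metabolites))

# Shared sample metadata; row names link to the column names of each experiment
sample_data <- DataFrame(
    group = c("control", "treated", "treated"),
    row.names = samples
)

mae <- MultiAssayExperiment(
    experiments = ExperimentList(microbiome = tse, metabolites = se),
    colData = sample_data
)

# One subsetting call filters every experiment by the shared metadata
mae_treated <- mae[, mae$group == "treated", ]
sapply(experiments(mae_treated), ncol)
```

Subsetting on the shared metadata keeps the experiments in sync, which is the main practical benefit over handling the two tables separately.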

Ramos, Marcel, Lucas Schiffer, Angela Re, Rimsha Azhar, Azfar Basunia, Carmen Rodriguez Cabrera, Tiffany Chan, et al. 2017. “Software for the Integration of Multiomics Experiments in Bioconductor.” Cancer Research. https://doi.org/10.1158/0008-5472.CAN-17-0344.

2.2 Package ecosystem

Methods for the (Tree)SummarizedExperiment and MultiAssayExperiment data containers are provided by multiple independent developers through R/Bioconductor packages. Some of these are listed below (tips on new packages are welcome).

Bioconductor packages, in particular, include comprehensive manuals, as these are required. Follow the links below to find package vignettes and other materials that show how to use the packages and their methods.

2.2.1 mia package family

The mia package family provides general methods for microbiome data wrangling, analysis and visualization.

Ernst, Felix G. M., Sudarshan Shetty, and Leo Lahti. 2020. Mia: Microbiome Analysis.
Ernst, Felix G. M., Tuomas Borman, and Leo Lahti. 2022. miaViz: Microbiome Analysis Plotting and Visualization.
Simsek, Yagmur, Leo Lahti, Daniel Garza, and Karoline Faust. 2021. “miaSim R Package.” microbiome.github.io/miaSim.
Lahti, L. 2021. miaTime: Time Series Analysis.

2.2.2 SE supporting packages

The following differential abundance (DA) methods support (Tree)SummarizedExperiment.

Lin, Huang, and Shyamal Das Peddada. 2020. “Analysis of Compositions of Microbiomes with Bias Correction.” Nature Communications 11 (1): 1–11. https://doi.org/10.1038/s41467-020-17041-7.
Calgaro, Matteo, Chiara Romualdi, Davide Risso, and Nicola Vitulo. 2022. “Benchdamic: Benchmarking of Differential Abundance Methods for Microbiome Data.” Bioinformatics 39 (1). https://doi.org/10.1093/bioinformatics/btac778.
Gloor, Gregory B., Jean M. Macklaim, and Andrew D. Fernandes. 2016. “Displaying Variation in Large Datasets: Plotting a Visual Summary of Effect Sizes.” Journal of Computational and Graphical Statistics 25 (3): 971–79. https://doi.org/10.1080/10618600.2015.1131161.

2.2.3 Other relevant packages

Zhou, Huijuan, Kejun He, Jun Chen, and Xianyang Zhang. 2022. “LinDA: Linear Models for Differential Abundance Analysis of Microbiome Compositional Data.” Genome Biology 23 (1): 95. https://doi.org/10.1186/s13059-022-02655-5.
Oksanen, Jari, F. Guillaume Blanchet, Michael Friendly, Roeland Kindt, Pierre Legendre, Dan McGlinn, Peter R. Minchin, et al. 2020. Vegan: Community Ecology Package. https://CRAN.R-project.org/package=vegan.
Nguyen QP, Frost HR, Hoen AG. n.d. “CBEA: Competitive Balances for Taxonomic Enrichment Analysis.” PLoS Comput Biol 18 (5). https://doi.org/10.1371/journal.pcbi.1010091.
Sánchez-Sánchez, Pedro, Francisco J Santonja, and Alfonso Benítez-Páez. 2022. “Assessment of Human Microbiota Stability Across Longitudinal Samples Using Iteratively Growing-Partitioned Clustering.” Briefings in Bioinformatics 23 (2): bbac055. https://doi.org/10.1093/bib/bbac055.
Wang, Yiwen, and Kim-Anh Lê Cao. 2023. “PLSDA-batch: A Multivariate Framework to Correct for Batch Effects in Microbiome Data.” Briefings in Bioinformatics 24 (2): bbac622. https://doi.org/10.1093/bib/bbac622.
Huang, Soneson, Ruizhu. 2021. “treeclimbR Pinpoints the Data-Dependent Resolution of Hierarchical Hypotheses.” Genome Biology 22 (2). https://doi.org/10.1186/s13059-021-02368-1.
Silverman, Justin D, Alex D Washburne, Sayan Mukherjee, and Lawrence A David. 2017. “A Phylogenetic Transform Enhances Analysis of Compositional Microbiota Data.” eLife 6. https://doi.org/10.7554/eLife.21887.
Xu, Shuangbin, Li Zhan, Wenli Tang, Qianwen Wang, Zehan Dai, Lang Zhou, Tingze Feng, et al. 2023. “MicrobiotaProcess: A Comprehensive R Package for Deep Mining Microbiome.” The Innovation 4 (2): 100388. https://doi.org/10.1016/j.xinn.2023.100388.

2.2.4 Open microbiome data

Hundreds of published microbiome datasets are readily available in these data containers (see Section 4.2).
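
As one small example of data shipped in these containers, the sketch below loads the GlobalPatterns dataset that is bundled with the mia package. It assumes mia is installed; the larger data collections referred to above are not shown here.

```r
library(mia)

# Load an example dataset distributed with mia as a TreeSE object
data("GlobalPatterns", package = "mia")
tse <- GlobalPatterns

# Print a summary of the container: assays, row data, column data, and trees
tse

# Sample metadata and taxonomy are available through the usual SE accessors
head(colData(tse))
head(rowData(tse))
```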

2.3 Installation

2.3.1 Installing all packages

You can install all packages that are required to run every example in this book via the following command:

remotes::install_github("microbiome/OMA", dependencies = TRUE, upgrade = TRUE)

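Alternatively, the following script installs whichever of these packages are missing and then loads them all. To install only certain packages, subset the packages vector before running the rest of the script.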


# URL of the raw CSV file on GitHub. It includes all packages needed.
url <- "https://raw.githubusercontent.com/microbiome/OMA/devel/oma_packages/oma_packages.csv"

# Read the CSV file directly into R
df <- read.csv(url)
packages <- df[[1]]

# Get packages that are already installed. installed.packages() returns a
# matrix, so compare against its row names (the package names).
packages_already_installed <- packages[ packages %in% rownames(installed.packages()) ]

# Get packages that need to be installed
packages_need_to_install <- setdiff( packages, packages_already_installed )

# Load BiocManager into the session. Install it if it is not already installed.
if( !require("BiocManager") ){
    install.packages("BiocManager")
    library("BiocManager")
}

# If there are packages that need to be installed, installs them with BiocManager
# Updates old packages.
if( length(packages_need_to_install) > 0 ) {
   install(packages_need_to_install, ask = FALSE)
}

# Load all packages into session. Stop if there are packages that were not
# successfully loaded
pkgs_not_loaded <- !sapply(packages, require, character.only = TRUE)
pkgs_not_loaded <- names(pkgs_not_loaded)[ pkgs_not_loaded ]
if( length(pkgs_not_loaded) > 0 ){
    stop("Error in loading the following packages into the session: '", paste0(pkgs_not_loaded, collapse = "', '"), "'")
}

2.3.2 Installing specific packages

You can install R packages of your choice with the following procedures.

The Bioconductor release version is the most stable and tested version but may lack some of the latest methods and updates.

BiocManager::install("mia")

The Bioconductor development version may require a pre-release version of R. It is primarily recommended for those who already have experience with R/Bioconductor and need access to the latest updates.

BiocManager::install("mia", version = "devel")

The GitHub development version provides access to the latest but potentially unstable features. This is useful when you need tools that have not yet reached Bioconductor.

devtools::install_github("microbiome/mia")

2.3.3 Troubleshooting installation

If you encounter installation issues related to package dependencies, please see the troubleshooting page and Chapter 30.
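
Before consulting those resources, it can help to collect basic information about the local setup. The following is a minimal sketch and not part of the book's troubleshooting procedure; installed_version() is a hypothetical helper defined here, and BiocManager::valid() needs internet access.

```r
# Report the R version in use
R.version.string

# Hypothetical helper: installed version of a package, or NA if it is missing
installed_version <- function(pkg) {
    if (requireNamespace(pkg, quietly = TRUE)) {
        as.character(utils::packageVersion(pkg))
    } else {
        NA_character_
    }
}

# Check a few packages of interest
sapply(c("BiocManager", "mia"), installed_version)

# If BiocManager is available, ask it to flag packages that are out of date
# or too new for the Bioconductor version in use (requires internet access)
if (requireNamespace("BiocManager", quietly = TRUE)) {
    status <- tryCatch(
        BiocManager::valid(),
        error = function(e) conditionMessage(e)
    )
    print(status)
}
```

Including this output in a bug report usually makes dependency problems easier to diagnose.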

Summary
  • TreeSummarizedExperiment is derived from the SummarizedExperiment class via SingleCellExperiment.
  • miaverse is based on the TreeSummarizedExperiment data container.
  • We can borrow methods from packages utilizing SingleCellExperiment and SummarizedExperiment.