Chapter 1 Overview

Welcome to the multi-omics data analysis workshop

Figure: ML4microbiome. Source: Moreno-Indias et al. (2021) Statistical and Machine Learning Techniques in Human Microbiome Studies: Contemporary Challenges and Solutions. Frontiers in Microbiology 12:11.

1.1 Introduction

This course is based on miaverse (mia = MIcrobiome Analysis), an R/Bioconductor framework for microbiome data science. It extends another popular framework, phyloseq.

The miaverse consists of an efficient data structure, an associated package ecosystem, demonstration data sets, and open documentation. These are explained in more detail in the online book Orchestrating Microbiome Analysis (OMA).

This workshop material walks you through example workflows for multi-omics data analysis covering data access, exploration, analysis, visualization and reproducible reporting. You can run the workflow by simply copy-pasting the examples. For advanced material, you can test and modify further examples from the OMA book, or try to apply the techniques to your own data.

1.2 Learning goals

This workshop provides an overview of analytical tools for multi-omics studies in R. A particular focus is on multi-omics tools and techniques required to process microbial community data in combination with other omics.

After the workshop, participants should be able to preprocess and manipulate data, perform simple visualizations and statistical analyses, apply unsupervised and supervised machine learning, and produce robust and reproducible results.

Target audience: Advanced students and applied researchers who wish to develop their skills in multi-omics analysis.

Venue: The course is held fully remotely on Zoom. The meeting requires a passcode, which is sent to the participants by email.

1.3 Acknowledgments

Citation: “Introduction to microbiome data science (2021). URL: https://microbiome.github.io.”

Borman et al. (2022)

We thank Felix Ernst, Sudarshan Shetty, and other miaverse developers who have contributed open resources that supported the development of the training material.

Contact: Leo Lahti, University of Turku, Finland

License: All material is released under the open CC BY-NC-SA 3.0 License.

Source code

The repository is fully reproducible and contains the Rmd files with executable code. All files can be rendered in one go by running the file main.R. See that file for details on how to clone the repository and convert it into a gitbook, although this is not necessary for the training.

References

Borman, Tuomas, Henrik Eckermann, Chouaib Benchraka, Matti Ruuskanen, and Leo Lahti. 2022. Introduction to Multi-Omics Data Analysis. microbiome.github.io/course_2022_FindingPheno/.