8  QC & preprocessing

As a first step after importing the data into a TreeSE object, one should explore the data and perform quality control (QC). This matters because data quality affects the final results, and failing to assess it can lead to erroneous interpretations. QC and exploration are discussed in Chapter 9.

Based on the QC results, researchers usually apply sample and feature filtering to improve the robustness of the analysis. To focus on a specific taxonomic rank, data agglomeration is commonly performed. Filtering and agglomeration are discussed in detail in Chapter 10 and Chapter 11.

Data transformations, covered in Chapter 12, are applied after filtering. For more information on preprocessing, see, for instance, Zhou et al. (2023).
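To make the order of these steps concrete, here is a minimal sketch of the workflow on a toy count table. It is written in plain Python rather than with the TreeSE-based tools used in the following chapters, and everything in it is an assumption made for illustration: the counts, the `Genus|species` feature naming, the two thresholds, and the choice of relative abundance as the transformation.

```python
from collections import defaultdict

# Hypothetical toy data: counts[sample][feature], features named "Genus|species".
counts = {
    "S1": {"GenusA|sp1": 120, "GenusA|sp2": 30, "GenusB|sp1": 50, "GenusC|sp1": 0},
    "S2": {"GenusA|sp1": 80, "GenusA|sp2": 20, "GenusB|sp1": 100, "GenusC|sp1": 0},
    "S3": {"GenusA|sp1": 10, "GenusA|sp2": 40, "GenusB|sp1": 0, "GenusC|sp1": 50},
    "S4": {"GenusA|sp1": 2, "GenusA|sp2": 0, "GenusB|sp1": 1, "GenusC|sp1": 0},
}

# Illustrative thresholds only; suitable values depend on the dataset.
MIN_LIBRARY_SIZE = 50  # minimum total count per sample
MIN_PREVALENCE = 0.5   # minimum fraction of samples in which a feature is detected

# 1. QC: compute the library size (total count) of each sample.
library_size = {sample: sum(features.values()) for sample, features in counts.items()}

# 2. Sample filtering: drop samples whose library size is too low.
kept_samples = [s for s in counts if library_size[s] >= MIN_LIBRARY_SIZE]

# 3. Feature filtering: keep features detected in enough of the kept samples.
all_features = sorted({f for s in kept_samples for f in counts[s]})
prevalence = {
    f: sum(counts[s].get(f, 0) > 0 for s in kept_samples) / len(kept_samples)
    for f in all_features
}
kept_features = [f for f in all_features if prevalence[f] >= MIN_PREVALENCE]

# 4. Agglomeration: sum the counts of features that share a genus.
genus_counts = {}
for s in kept_samples:
    totals = defaultdict(int)
    for f in kept_features:
        genus = f.split("|")[0]
        totals[genus] += counts[s].get(f, 0)
    genus_counts[s] = dict(totals)

# 5. Transformation: convert counts to relative abundances within each sample.
# Assumes every kept sample still has a non-zero total after feature filtering.
relative_abundance = {
    s: {g: c / sum(genera.values()) for g, c in genera.items()}
    for s, genera in genus_counts.items()
}

print(relative_abundance)
```

In this toy table, the low-depth sample `S4` and the low-prevalence feature `GenusC|sp1` are removed before the remaining species are summed to genus level. Whether to agglomerate before or after feature filtering is a judgment call that depends on the analysis; the sketch shows only one possible order.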

Zhou, Ruwen, Siu Kin Ng, Joseph Jao Yiu Sung, Wilson Wen Bin Goh, and Sunny Hei Wong. 2023. “Data Pre-Processing for Analyzing Microbiome Data – a Mini Review.” Computational and Structural Biotechnology Journal 21: 4804–15. https://doi.org/10.1016/j.csbj.2023.10.001.