Import HUMAnN results to TreeSummarizedExperiment

Arguments

file

a single character value defining the file path of the HUMAnN file. The file must be in merged HUMAnN format.

colData

a DataFrame-like object with sample names as rownames, or a single character value giving the file path of a sample metadata file. The file must be in TSV format. (default: colData = NULL)

...

additional arguments:

  • assay.type: A single character value naming the assay. (default: assay.type = "counts")

  • removeTaxaPrefixes: TRUE or FALSE: Should taxonomic prefixes (e.g. g__ or s__) be removed? (default: removeTaxaPrefixes = FALSE)

  • remove.suffix: TRUE or FALSE: Should suffixes of sample names be removed? The HUMAnN pipeline appends suffixes, derived from file names, to sample names. Setting remove.suffix = TRUE removes the pattern at the end of the sample names that is shared by all samples. (default: remove.suffix = FALSE)
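To make the idea behind remove.suffix concrete, here is a minimal base R sketch of stripping the longest suffix shared by all sample names. It is an illustration only, not mia's implementation; the helper strip_shared_suffix and the example column names are hypothetical.

```r
# Hypothetical helper: remove the longest suffix shared by all names.
# This illustrates the concept only; it is not the code used by mia.
strip_shared_suffix <- function(x) {
  # With fewer than two names, "shared by all" is not meaningful
  if (length(x) < 2) {
    return(x)
  }
  # Reverse each name so that a shared suffix becomes a shared prefix
  rev_chars <- lapply(strsplit(x, ""), rev)
  min_len <- min(lengths(rev_chars))
  n_shared <- 0
  for (i in seq_len(min_len)) {
    chars_i <- vapply(rev_chars, `[`, character(1), i)
    # Stop at the first position where the names disagree
    if (length(unique(chars_i)) > 1) {
      break
    }
    n_shared <- i
  }
  # Drop the shared trailing characters from every name
  substr(x, 1, nchar(x) - n_shared)
}

# Assumed example names with a suffix added by the pipeline
sample_names <- c(
  "sample1_Abundance-RPKs",
  "sample2_Abundance-RPKs",
  "sample3_Abundance-RPKs"
)
cleaned <- strip_shared_suffix(sample_names)
print(cleaned)
```

Note that a purely character-based approach like this also removes trailing characters that happen to be shared within the sample identifiers themselves (for example, a common final digit), so the result is worth checking.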

Value

A TreeSummarizedExperiment object

Details

Import HUMAnN results of functional predictions based on metagenome composition (e.g. pathways or gene families). Currently, HUMAnN version 3.0 is supported. The input must be in merged HUMAnN format. (See the HUMAnN documentation and the humann_join_tables utility.)

The function parses gene/pathway information along with taxonomy information from the input file. This information is stored in rowData. Abundances are stored in assays.
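As a short sketch of where the parsed information ends up, the following inspects the feature annotations and the abundance table after import. It assumes the example file shipped with mia and the default assay name "counts"; the accessors rowData and assay come from SummarizedExperiment, which is expected to be attached together with mia.

```r
library(mia)

# Example file shipped with the package
file_path <- system.file("extdata", "humann_output.tsv", package = "mia")
tse <- loadFromHumann(file_path)

# Gene/pathway and taxonomy annotations, one row per feature
head(rowData(tse))

# Abundance values, features in rows and samples in columns
# (assumes the default assay.type = "counts")
head(assay(tse, "counts"))
```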

The workflow usually also includes taxonomy data from MetaPhlAn. See loadFromMetaphlan to load those data into a TreeSummarizedExperiment (TreeSE).

References

Beghini F, McIver LJ, Blanco-Míguez A, Dubois L, Asnicar F, Maharjan S, Mailyan A, Manghi P, Scholz M, Thomas AM, Valles-Colomer M, Weingart G, Zhang Y, Zolfo M, Huttenhower C, Franzosa EA, & Segata N (2021) Integrating taxonomic, functional, and strain-level profiling of diverse microbial communities with bioBakery 3. eLife. 10:e65088.

Author

Leo Lahti and Tuomas Borman. Contact: microbiome.github.io

Examples

# File path
file_path <- system.file("extdata", "humann_output.tsv", package = "mia")
# Import data
tse <- loadFromHumann(file_path)
tse
#> class: TreeSummarizedExperiment 
#> dim: 12 3 
#> metadata(0):
#> assays(1): counts
#> rownames(12): UNMAPPED UniRef50_unknown ... UniRef50_O83668:
#>   Fructose-bisphosphate
#>   aldolase|g__Bacteroides.s__Bacteroides_thetaiotaomicron
#>   UniRef50_O83668: Fructose-bisphosphate
#>   aldolase|g__Bacteroides.s__Bacteroides_stercoris
#> rowData names(9): Gene_Family_long Gene_Family ... Genus Species
#> colnames(3): sample1 sample2 sample3
#> colData names(0):
#> reducedDimNames(0):
#> mainExpName: NULL
#> altExpNames(0):
#> rowLinks: NULL
#> rowTree: NULL
#> colLinks: NULL
#> colTree: NULL
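Sample metadata can also be supplied at import time through colData. The sketch below builds a small DataFrame whose rownames match the sample names in the example file; the group column and its values are invented for illustration.

```r
library(mia)

file_path <- system.file("extdata", "humann_output.tsv", package = "mia")

# Hypothetical sample metadata; rownames must match the sample names
# found in the HUMAnN file (sample1, sample2, sample3 in the example file)
sample_data <- S4Vectors::DataFrame(
  group = c("control", "treatment", "control"),
  row.names = c("sample1", "sample2", "sample3")
)

# Import abundances together with the sample metadata
tse <- loadFromHumann(file_path, colData = sample_data)
colData(tse)
```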