Import QIIME2 results to TreeSummarizedExperiment

Results exported from QIMME2 can be imported as a TreeSummarizedExperiment using importQIIME2. Except for the assay.file, the other data types, row.file, refseq.file and tree.file, are optional, but are highly encouraged to be provided.

Import the QIIME2 artifacts to R.

importQIIME2(
  assay.file = featureTableFile,
  featureTableFile,
  row.file = taxonomyTableFile,
  taxonomyTableFile = NULL,
  col.file = sampleMetaFile,
  sampleMetaFile = NULL,
  as.refseq = featureNamesAsRefSeq,
  featureNamesAsRefSeq = TRUE,
  refseq.file = refSeqFile,
  refSeqFile = NULL,
  tree.file = phyTreeFile,
  phyTreeFile = NULL,
  ...
)

importQZA(file, temp.dir = temp, temp = tempdir(), ...)

Arguments

assay.file

Character scalar. Defines the file path of the feature table to be imported.

featureTableFile

Deprecated. use assay.file instead.

row.file

Character scalar or NULL. Defines the file path of the taxonomy table to be imported. (default: NULL).

taxonomyTableFile

Deprecated. use row.file instead.

col.file

Character scalar or NULL. Defines the file path of the sample metadata to be imported. The file has to be in tsv format. (Default: NULL).

sampleMetaFile

Deprecated. Use col.file instead.

as.refseq

Logical scalar or NULL. Should the feature names of the feature table be regarded as reference sequences? This setting will be disregarded, if refseq.file is not NULL. If the feature names do not contain valid DNA characters only, the reference sequences will not be set.

featureNamesAsRefSeq

Deprecated. Use as.refseq instead.

refseq.file

Character scalar or NULL. Defines the file path of the reference sequences for each feature. (Default: NULL).

refSeqFile

Deprecated. Use refseq.file instead.

tree.file

Character scalar. Defines the file path of the phylogenetic tree. (Default: NULL).

phyTreeFile

Deprecated. Use tree.file instead.

...

additional arguments:

temp.dir: the temporary directory used for decompressing the data. (default: tempdir())
prefix.rm: TRUE or FALSE: Should taxonomic prefixes be removed? (default: prefix.rm = FALSE)

file

character, path of the input qza file. Only files in format of BIOMV210DirFmt (feature table), TSVTaxonomyDirectoryFormat (taxonomic table), NewickDirectoryFormat (phylogenetic tree ) and DNASequencesDirectoryFormat (representative sequences) are supported right now.

temp.dir

character, a temporary directory in which the qza file will be decompressed to, default tempdir().

temp

Deprecated. Use temp.dir instead.

Value

A TreeSummarizedExperiment object

matrix object for feature table, DataFrame for taxonomic table, ape::phylo object for phylogenetic tree, Biostrings::DNAStringSet for representative sequences of taxa.

Details

Both arguments as.refseq and refseq.file can be used to define reference sequences of features. as.refseq is only taken into account, if refseq.file is NULL. No reference sequences are tried to be created, if featureNameAsRefSeq is FALSE and refseq.file is NULL.

References

Bolyen E et al. 2019: Reproducible, interactive, scalable and extensible microbiome data science using QIIME 2. Nature Biotechnology 37: 852–857. https://doi.org/10.1038/s41587-019-0209-9

https://qiime2.org

Author

Yang Cao

Examples

assay.file <- system.file("extdata", "table.qza", package = "mia")
row.file <- system.file("extdata", "taxonomy.qza", package = "mia")
col.file <- system.file("extdata", "sample-metadata.tsv", package = "mia")
tree.file <- system.file("extdata", "tree.qza", package = "mia")
refseq.file <- system.file("extdata", "refseq.qza", package = "mia")
tse <- importQIIME2(
  assay.file = assay.file,
  row.file = row.file,
  col.file = col.file,
  refseq.file = refseq.file,
  tree.file = tree.file
)

tse
#> class: TreeSummarizedExperiment 
#> dim: 770 34 
#> metadata(0):
#> assays(1): counts
#> rownames(770): 4b5eeb300368260019c1fbc7a3c718fc
#>   fe30ff0f71a38a39cf1717ec2be3a2fc ... 98d250a339a635f20e26397dafc6ced3
#>   1830c14ead81ad012f1db0e12f8ab6a4
#> rowData names(8): kingdom phylum ... species Confidence
#> colnames(34): L1S105 L1S140 ... L6S68 L6S93
#> colData names(9): sample.id barcode.sequence ...
#>   reported.antibiotic.usage days.since.experiment.start
#> reducedDimNames(0):
#> mainExpName: NULL
#> altExpNames(0):
#> rowLinks: a LinkDataFrame (770 rows)
#> rowTree: 1 phylo tree(s) (770 leaves)
#> colLinks: NULL
#> colTree: NULL
#> referenceSeq: a DNAStringSet (770 sequences)
# Read individual files
assay.file <- system.file("extdata", "table.qza", package = "mia")
row.file <- system.file("extdata", "taxonomy.qza", package = "mia")
col.file <- system.file("extdata", "sample-metadata.tsv", package = "mia")

assay <- importQZA(assay.file)
rowdata <- importQZA(row.file, prefix.rm = TRUE)
coldata <- read.table(col.file, header = TRUE, sep = "\t", comment.char = "")

# Assign rownames 
rownames(coldata) <- coldata[, 1]
coldata[, 1] <- NULL

# Order coldata based on assay
coldata <- coldata[match(colnames(assay), rownames(coldata)), ]

# Create SE from individual files
se <- SummarizedExperiment(
    assays = list(assay), rowData = rowdata, colData = coldata)
se
#> class: SummarizedExperiment 
#> dim: 770 34 
#> metadata(0):
#> assays(1): ''
#> rownames(770): 4b5eeb300368260019c1fbc7a3c718fc
#>   fe30ff0f71a38a39cf1717ec2be3a2fc ... 98d250a339a635f20e26397dafc6ced3
#>   1830c14ead81ad012f1db0e12f8ab6a4
#> rowData names(8): kingdom phylum ... species Confidence
#> colnames(34): L1S105 L1S140 ... L6S68 L6S93
#> colData names(8): barcode.sequence body.site ...
#>   reported.antibiotic.usage days.since.experiment.start

Import QIIME2 results to `TreeSummarizedExperiment`