Merge SE objects into a single SE object.

mergeSEs(x, ...)

# S4 method for SimpleList
mergeSEs(
  x,
  assay.type = "counts",
  assay_name = NULL,
  join = "full",
  missing_values = NA,
  collapse_samples = FALSE,
  collapse_features = TRUE,
  verbose = TRUE,
  ...
)

# S4 method for SummarizedExperiment
mergeSEs(x, y = NULL, ...)

# S4 method for list
mergeSEs(x, ...)

Arguments

x

a SummarizedExperiment object or a list of SummarizedExperiment objects.

...

optional arguments (not used).

assay.type

A character vector specifying the assay(s) to be merged. (By default: assay.type = "counts")

assay_name

(Deprecated) alias for assay.type.

join

A single character value for selecting the joining method. Must be 'full', 'inner', 'left', or 'right'. 'left' and 'right' are disabled when more than two objects are being merged. (By default: join = "full")

missing_values

NA, 0, or a single character value specifying the notation of missing values. (By default: missing_values = NA)

collapse_samples

A boolean value for selecting whether to collapse identically named samples to one. (By default: collapse_samples = FALSE)

collapse_features

A boolean value for selecting whether to collapse identically named features to one. Since all taxonomy information is taken into account, the comparison is done at the rowname level (usually strain level). OTU or ASV identifiers are often just arbitrary number series assigned during sequencing, so they are not comparable between studies. This option specifies whether such features are combined when both their taxonomy information and their OTU number match. (By default: collapse_features = TRUE)

verbose

A single boolean value to choose whether to show messages. (By default: verbose = TRUE)

y

a SummarizedExperiment object when x is a SummarizedExperiment object. Disabled when x is a list.

Value

A single SummarizedExperiment object.

Details

This function merges multiple SummarizedExperiment objects. It combines rowData, assays, and colData so that the output includes every unique row and column. The merging is done based on rownames and colnames. rowTree and colTree are preserved if linkage between rows/cols and the tree is found.

Equally named rows are interpreted as equal. Further matching based on rowData is not done. For samples, collapsing is disabled by default, meaning that equally named samples stored in different objects are interpreted as unique. Collapsing can be enabled with collapse_samples = TRUE when equally named samples describe the same sample.

If, for example, not all rows are shared by the individual objects, the assays will contain missing values. The notation for missing values can be specified with the missing_values argument. If the input consists of TreeSummarizedExperiment objects, rowTree, colTree, and referenceSeq are also preserved if possible. These data are preserved if all the rows or columns can be found in them.
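How missing values arise in a full join can be illustrated with plain matrices. The following is a minimal base R sketch, not the implementation used by mergeSEs; the helper merge_matrices and the matrices m1 and m2 are hypothetical, and the sample names are assumed to be distinct between the two inputs.

```r
# Hypothetical helper (not part of mia): full join of two count matrices
# by row name, filling cells that have no data with 'missing_values'.
merge_matrices <- function(m1, m2, missing_values = NA) {
  all_rows <- union(rownames(m1), rownames(m2))
  all_cols <- c(colnames(m1), colnames(m2))  # assumes distinct sample names
  out <- matrix(missing_values,
                nrow = length(all_rows), ncol = length(all_cols),
                dimnames = list(all_rows, all_cols))
  # Copy each input into its own block of the merged matrix
  out[rownames(m1), colnames(m1)] <- m1
  out[rownames(m2), colnames(m2)] <- m2
  out
}

# Two toy matrices that share only feature "B"
m1 <- matrix(1:4, nrow = 2, dimnames = list(c("A", "B"), c("s1", "s2")))
m2 <- matrix(5:8, nrow = 2, dimnames = list(c("B", "C"), c("s3", "s4")))

merged <- merge_matrices(m1, m2, missing_values = 0)
print(merged)
```

Feature "A" is absent from m2, so its cells for samples s3 and s4 receive the value given in missing_values; the same holds for feature "C" and samples s1 and s2.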

Compared to cbind and rbind, mergeSEs merges more flexibly: cbind expects the rows to match and rbind expects the columns to match.

You can choose joining methods from 'full', 'inner', 'left', and 'right'. In all the methods, all the samples are included in the result object. However, with different methods, it is possible to choose which rows are included.

  • full -- all unique features

  • inner -- all shared features

  • left -- all the features of the first object

  • right -- all the features of the second object
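The join types listed above can be thought of as set operations on the feature names. The sketch below uses base R only and hypothetical feature names (rows_x, rows_y); it illustrates the idea and is not how mergeSEs is implemented.

```r
# Hypothetical feature (row) names of two objects
rows_x <- c("A", "B", "C")
rows_y <- c("B", "C", "D")

# Features retained under each join type
kept_rows <- list(
  full  = union(rows_x, rows_y),      # all unique features
  inner = intersect(rows_x, rows_y),  # features shared by both objects
  left  = rows_x,                     # features of the first object
  right = rows_y                      # features of the second object
)
print(kept_rows)
```

In every case all samples are kept; only the set of features changes.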

The class of the output depends on the input. If the input contains a SummarizedExperiment object, the output is a SummarizedExperiment. When all the input objects are TreeSummarizedExperiment objects, the output is a TreeSummarizedExperiment.

Author

Leo Lahti and Tuomas Borman. Contact: microbiome.github.io

Examples

data(GlobalPatterns)
data(esophagus)
data(enterotype)

# Take only subsets so that it won't take so long
tse1 <- GlobalPatterns[1:100, ]
tse2 <- esophagus
tse3 <- enterotype[1:100, ]

# Merge two TreeSEs
tse <- mergeSEs(tse1, tse2)
#> Merging with full join...
#> 1/2
#> 2/2
#> Adding rowTree(s)...

# Merge a list of TreeSEs
# (named tse_list to avoid masking base::list)
tse_list <- SimpleList(tse1, tse2, tse3)
tse <- mergeSEs(tse_list, assay.type = "counts", missing_values = 0)
#> Merging with full join...
#> 1/3
#> 2/3
#> 3/3
#> Adding rowTree(s)...
#> Warning: rowTree(s) does not match with the data so it is discarded.
tse
#> class: TreeSummarizedExperiment 
#> dim: 258 309 
#> metadata(0):
#> assays(1): counts
#> rownames(258): Abiotrophia Achromobacter ... 9_7_25 Bacteria
#> rowData names(7): Kingdom Phylum ... Genus Species
#> colnames(309): AM.AD.1 AM.AD.2 ... TS99.2_V2 TS9_V2
#> colData names(16): X.SampleID Primer ... Age ClinicalStatus
#> reducedDimNames(0):
#> mainExpName: NULL
#> altExpNames(0):
#> rowLinks: NULL
#> rowTree: NULL
#> colLinks: NULL
#> colTree: NULL

# With 'join', it is possible to specify the merging method. Subsets are used
# here just to show the functionality
tse_temp <- mergeSEs(tse[1:10, 1:10], tse[5:100, 11:20], join = "left")
#> Merging with left join...
#> 1/2
#> 2/2
tse_temp
#> class: TreeSummarizedExperiment 
#> dim: 10 20 
#> metadata(0):
#> assays(1): counts
#> rownames(10): Abiotrophia Achromobacter ... Azospirillum Bartonella
#> rowData names(7): Kingdom Phylum ... Genus Species
#> colnames(20): AQC1cm AQC4cm ... C D
#> colData names(16): X.SampleID Primer ... Age ClinicalStatus
#> reducedDimNames(0):
#> mainExpName: NULL
#> altExpNames(0):
#> rowLinks: NULL
#> rowTree: NULL
#> colLinks: NULL
#> colTree: NULL

# If your objects contain samples that describe one and the same sample,
# you can collapse equally named samples to one by specifying 'collapse_samples'
tse_temp <- mergeSEs(list(tse[1:10, 1], tse[1:20, 1], tse[1:5, 1]),
                     collapse_samples = TRUE,
                     join = "inner")
#> Merging with inner join...
#> 1/3
#> 2/3
#> 3/3
tse_temp
#> class: TreeSummarizedExperiment 
#> dim: 5 1 
#> metadata(0):
#> assays(1): counts
#> rownames(5): Abiotrophia Achromobacter Acidobacteria_Gp1_Gp1 Acidovorax
#>   Aerococcus
#> rowData names(7): Kingdom Phylum ... Genus Species
#> colnames(1): AM.AD.1
#> colData names(16): X.SampleID Primer ... Age ClinicalStatus
#> reducedDimNames(0):
#> mainExpName: NULL
#> altExpNames(0):
#> rowLinks: NULL
#> rowTree: NULL
#> colLinks: NULL
#> colTree: NULL

# Merge all available assays
tse <- transformAssay(tse, method = "relabundance")
tse1 <- transformAssay(tse1, method = "relabundance")
tse_temp <- mergeSEs(tse, tse1, assay.type = assayNames(tse))
#> Merging with full join...
#> 1/2
#> 2/2
#> Adding rowTree(s)...
#> Warning: rowTree(s) does not match with the data so it is discarded.