Split a TreeSummarizedExperiment column-wise or row-wise based on a grouping variable

splitOn(x, ...)

# S4 method for SummarizedExperiment
splitOn(x, f = NULL, ...)

# S4 method for SingleCellExperiment
splitOn(x, f = NULL, ...)

# S4 method for TreeSummarizedExperiment
splitOn(x, f = NULL, update_rowTree = FALSE, ...)

unsplitOn(x, ...)

# S4 method for list
unsplitOn(x, update_rowTree = FALSE, ...)

# S4 method for SimpleList
unsplitOn(x, update_rowTree = FALSE, ...)

# S4 method for SingleCellExperiment
unsplitOn(x, altExpNames = names(altExps(x)), keep_reducedDims = FALSE, ...)



x: A SummarizedExperiment object or a list of SummarizedExperiment objects.


...: Arguments passed to the mergeRows/mergeCols functions for SummarizedExperiment objects and to other functions. See mergeRows for more details.

  • use_names: A single boolean value that selects whether to name the elements of the list by their group names.


f: A single character value selecting the grouping variable from rowData or colData, or a factor or vector with the same length as one of the dimensions. If f matches both dimensions, MARGIN must be specified. Splitting by columns is not encouraged, since the result is not compatible with storage in altExps.
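The matching rule for a vector-valued f can be illustrated with a small base R sketch. The helper resolve_margin below is hypothetical and not part of mia; it only mimics the behaviour described above on a plain matrix, assuming MARGIN = 1 refers to rows and MARGIN = 2 to columns.

```r
# Hypothetical helper (not part of mia): decide which dimension a grouping
# vector refers to, following the rule described above.
resolve_margin <- function(x, f, MARGIN = NULL) {
  matches_rows <- length(f) == nrow(x)
  matches_cols <- length(f) == ncol(x)
  if (!matches_rows && !matches_cols) {
    stop("'f' must have the same length as one of the dimensions of 'x'.")
  }
  if (matches_rows && matches_cols) {
    # Ambiguous case: the caller has to say which dimension is meant.
    if (is.null(MARGIN)) {
      stop("'f' matches both dimensions; 'MARGIN' must be specified.")
    }
    return(MARGIN)
  }
  if (matches_rows) 1L else 2L
}

# 4 x 3 matrix: the length of f identifies the dimension unambiguously.
m <- matrix(seq_len(12), nrow = 4, ncol = 3)
resolve_margin(m, f = c("a", "a", "b", "b"))  # length 4 matches the rows
resolve_margin(m, f = c("a", "b", "b"))       # length 3 matches the columns

# 3 x 3 matrix: f fits both dimensions, so MARGIN is required.
sq <- matrix(seq_len(9), nrow = 3)
resolve_margin(sq, f = c("a", "b", "b"), MARGIN = 1)
```

The same ambiguity arises when a character value such as "group" names a column in both rowData and colData, which is why MARGIN has to be given in that case.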


update_rowTree: TRUE or FALSE. Should the rowTree be updated based on the split data? The option is available when x is a TreeSummarizedExperiment object or a list of such objects. (By default: update_rowTree = FALSE)


altExpNames: A character vector specifying the alternative experiments to be unsplit. (By default: altExpNames = names(altExps(x)))


keep_reducedDims: TRUE or FALSE. Should the reducedDims(x) be transferred to the result? Note that this breaks the link between the reduced dimensions and the data used to calculate them. (By default: keep_reducedDims = FALSE)


For splitOn: SummarizedExperiment objects in a SimpleList.

For unsplitOn: x, with rowData and assay data replaced by the unsplit data. colData of x is kept, and any existing rowTree is dropped, since the existing rowLinks are no longer valid.


splitOn splits data based on a grouping variable. Splitting can be done column-wise or row-wise. The returned value is a list of SummarizedExperiment objects, each element containing the members of one group.
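As a rough illustration of the idea, the sketch below splits a plain matrix row-wise by a grouping factor and then stacks the pieces back together. It uses base R only and stands in for the concept; it is not how mia implements splitOn or unsplitOn, and the object names (mat, group, pieces) are made up for the example.

```r
# Plain matrix standing in for an assay: 6 features (rows) x 2 samples (columns).
mat <- matrix(seq_len(12), nrow = 6,
              dimnames = list(paste0("feature", 1:6), c("s1", "s2")))
group <- factor(c("a", "b", "a", "c", "b", "a"))

# Row-wise split: partition the row indices by group, then subset the matrix.
idx <- split(seq_len(nrow(mat)), group)
pieces <- lapply(idx, function(i) mat[i, , drop = FALSE])

names(pieces)         # one list element per group level
sapply(pieces, nrow)  # number of rows that fell into each group

# Recombining: stack the pieces and restore the original row order.
combined <- do.call(rbind, pieces)
restored <- combined[rownames(mat), , drop = FALSE]
all(restored == mat)
```

A column-wise split follows the same pattern with the column indices, subsetting as mat[, i, drop = FALSE] and recombining with cbind.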


Leo Lahti and Tuomas Borman. Contact: microbiome.github.io


library(mia)
data(GlobalPatterns)
tse <- GlobalPatterns
# Split data based on SampleType. 
se_list <- splitOn(tse, f = "SampleType")

# A list of SE objects is returned.
se_list
#> List of length 9
#> names(9): Soil Feces Skin Tongue ... Ocean Sediment (estuary) Mock

# Create arbitrary groups
rowData(tse)$group <- sample(1:3, nrow(tse), replace = TRUE)
colData(tse)$group <- sample(1:3, ncol(tse), replace = TRUE)

# Split based on rows
# Each element is named after its group. If you don't want to name the
# elements, use use_names = FALSE. Since "group" is found in both rowData and
# colData, you must specify MARGIN.
se_list <- splitOn(tse, f = "group", use_names = FALSE, MARGIN = 1)

# When column names are shared between elements, you can store the list in altExps
altExps(tse) <- se_list
#> Warning: 'names(value)' is NULL, replacing with 'unnamed'

altExps(tse)
#> List of length 3
#> names(3): unnamed1 unnamed2 unnamed3

# If you want to split on columns and update rowTree, you can do
se_list <- splitOn(tse, f = colData(tse)$group, update_rowTree = TRUE)
#> Warning: 'keep.nodes' does specify all the tips from 'tree'. The tree is not agglomerated.
#> Warning: 'keep.nodes' does specify all the tips from 'tree'. The tree is not agglomerated.
#> Warning: 'keep.nodes' does specify all the tips from 'tree'. The tree is not agglomerated.

# If you want to combine groups back together, you can use unsplitOn
unsplitOn(se_list)
#> class: TreeSummarizedExperiment 
#> dim: 19216 26 
#> metadata(0):
#> assays(1): counts
#> rownames(19216): 549322 522457 ... 200359 271582
#> rowData names(8): Kingdom Phylum ... Species group
#> colnames(26): CL3 M11Fcsw ... Even1 Even3
#> colData names(8): X.SampleID Primer ... Description group
#> reducedDimNames(0):
#> mainExpName: NULL
#> altExpNames(0):
#> rowLinks: a LinkDataFrame (19216 rows)
#> rowTree: 1 phylo tree(s) (19216 leaves)
#> colLinks: NULL
#> colTree: NULL