Split TreeSummarizedExperiment column-wise or row-wise based on grouping variable
Source: R/splitOn.R
Split a TreeSummarizedExperiment object column-wise or row-wise based on a grouping variable.
splitOn(x, ...)
# S4 method for SummarizedExperiment
splitOn(x, f = NULL, ...)
# S4 method for SingleCellExperiment
splitOn(x, f = NULL, ...)
# S4 method for TreeSummarizedExperiment
splitOn(x, f = NULL, update_rowTree = FALSE, ...)
unsplitOn(x, ...)
# S4 method for list
unsplitOn(x, update_rowTree = FALSE, ...)
# S4 method for SimpleList
unsplitOn(x, update_rowTree = FALSE, ...)
# S4 method for SingleCellExperiment
unsplitOn(x, altExpNames = names(altExps(x)), keep_reducedDims = FALSE, ...)
x
A SummarizedExperiment object or a list of SummarizedExperiment objects.
...
Arguments passed to the mergeRows/mergeCols functions for SummarizedExperiment objects and to other functions. See mergeRows for more details.
use_names
A single boolean value specifying whether to name the elements of the returned list by their group names.
f
A single character value selecting the grouping variable from rowData or colData, or a factor or vector with the same length as one of the dimensions. If f matches both dimensions, MARGIN must be specified. Splitting by columns is not encouraged, since the result is not compatible with storage in altExps.
update_rowTree
TRUE or FALSE: Should the rowTree be updated based on the split data? This option is available when x is a TreeSummarizedExperiment object or a list of such objects. (By default: update_rowTree = FALSE)
altExpNames
A character vector specifying the alternative experiments to be unsplit. (By default: altExpNames = names(altExps(x)))
keep_reducedDims
TRUE or FALSE: Should reducedDims(x) be transferred to the result? Please note that this breaks the link between the reduced dims and the data used to calculate them. (By default: keep_reducedDims = FALSE)
For splitOn: SummarizedExperiment objects in a SimpleList.
For unsplitOn: x, with rowData and assay data replaced by the unsplit data. The colData of x is kept, and any existing rowTree is dropped, since the existing rowLinks are no longer valid.
splitOn splits data based on a grouping variable. Splitting can be done column-wise or row-wise. The returned value is a list of SummarizedExperiment objects, each element containing the members of one group.
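The split-then-unsplit idea can be illustrated on a plain matrix with base R only. This is a conceptual sketch, not how splitOn or unsplitOn are implemented: the matrix mat, the grouping factor f, and the helper names idx, parts and restored are made up for the illustration.

```r
# Toy data: 4 features (rows) x 3 samples (columns). Hypothetical values.
mat <- matrix(1:12, nrow = 4,
              dimnames = list(paste0("r", 1:4), paste0("c", 1:3)))

# Grouping variable with one entry per row, analogous to a rowData column.
f <- factor(c("a", "b", "a", "b"))

# Row-wise split: collect the row indices of each group, then subset.
idx <- split(seq_len(nrow(mat)), f)
parts <- lapply(idx, function(i) mat[i, , drop = FALSE])

# Each list element holds the members of one group and is named after it.
names(parts)

# "Unsplit": bind the pieces back together and restore the original row order.
restored <- do.call(rbind, unname(parts))[rownames(mat), , drop = FALSE]
identical(restored, mat)
```

Because every piece keeps all columns, the pieces of a row-wise split share their column names. That is the property that makes a row-wise split suitable for storage in altExps, and it is lost when splitting by columns.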
library(mia)
data(GlobalPatterns)
tse <- GlobalPatterns
# Split data based on SampleType.
se_list <- splitOn(tse, f = "SampleType")
# List of SE objects is returned.
se_list
#> List of length 9
#> names(9): Soil Feces Skin Tongue ... Ocean Sediment (estuary) Mock
# Create arbitrary groups
rowData(tse)$group <- sample(1:3, nrow(tse), replace = TRUE)
colData(tse)$group <- sample(1:3, ncol(tse), replace = TRUE)
# Split based on rows
# Each element is named after its group. If you don't want to name the
# elements, use use_names = FALSE. Since "group" can be found in both rowData
# and colData, you must specify MARGIN.
se_list <- splitOn(tse, f = "group", use_names = FALSE, MARGIN = 1)
# When column names are shared between elements, you can store the list in altExps
altExps(tse) <- se_list
#> Warning: 'names(value)' is NULL, replacing with 'unnamed'
altExps(tse)
#> List of length 3
#> names(3): unnamed1 unnamed2 unnamed3
# If you want to split on columns and update rowTree, you can do
se_list <- splitOn(tse, f = colData(tse)$group, update_rowTree = TRUE)
#> Warning: 'keep.nodes' does specify all the tips from 'tree'. The tree is not agglomerated.
#> Warning: 'keep.nodes' does specify all the tips from 'tree'. The tree is not agglomerated.
#> Warning: 'keep.nodes' does specify all the tips from 'tree'. The tree is not agglomerated.
# If you want to combine groups back together, you can use unsplitOn
unsplitOn(se_list)
#> class: TreeSummarizedExperiment
#> dim: 19216 26
#> metadata(0):
#> assays(1): counts
#> rownames(19216): 549322 522457 ... 200359 271582
#> rowData names(8): Kingdom Phylum ... Species group
#> colnames(26): CL3 M11Fcsw ... Even1 Even3
#> colData names(8): X.SampleID Primer ... Description group
#> reducedDimNames(0):
#> mainExpName: NULL
#> altExpNames(0):
#> rowLinks: a LinkDataFrame (19216 rows)
#> rowTree: 1 phylo tree(s) (19216 leaves)
#> colLinks: NULL
#> colTree: NULL