Variety of transformations for abundance data, stored in assay. See details for options.
transformAssay(x, ...)
# S4 method for class 'SummarizedExperiment'
transformAssay(
x,
assay.type = "counts",
assay_name = NULL,
method = c("alr", "chi.square", "clr", "css", "frequency", "hellinger", "log", "log10",
"log2", "max", "normalize", "pa", "range", "rank", "rclr", "relabundance", "rrank",
"standardize", "total", "z"),
MARGIN = "samples",
name = method,
pseudocount = FALSE,
...
)
# S4 method for class 'SingleCellExperiment'
transformAssay(x, altexp = altExpNames(x), ...)
...: Additional arguments passed on, e.g., to vegan::decostand.
reference: Character scalar. Used to fill the reference sample's column in the returned assay when calculating alr. (Default: NA)
ref_vals: Deprecated. Use reference instead.
percentile: Numeric scalar or NULL (css). Sets the percentile used to calculate the scaling factors in the css normalization. If NULL, the percentile is estimated from the data by calculating the portion of samples that exceed the threshold. (Default: NULL)
scaling: Numeric scalar. Adjusts the normalization scale by dividing the calculated scaling factors, effectively changing the magnitude of the normalized counts. (Default: 1000)
threshold: Numeric scalar. Specifies the relative difference threshold and determines the first point where the relative change in differences between consecutive quantiles exceeds this threshold. (Default: 0.1)
assay.type: Character scalar. Specifies which assay to use for calculation. (Default: "counts")
assay_name: Deprecated. Use assay.type instead.
method: Character scalar. Specifies the transformation method.
MARGIN: Character scalar. Determines whether the transformation is applied sample-wise (column-wise) or feature-wise (row-wise). (Default: "samples")
name: Character scalar. A name for the assay where the transformed table will be stored. (Default: method)
pseudocount: Logical scalar or numeric scalar. When TRUE, automatically adds half of the minimum positive value of assay.type (missing values ignored by default: na.rm = TRUE). When FALSE, does not add any pseudocount (pseudocount = 0). Alternatively, a user-specified numeric value can be added as pseudocount. (Default: FALSE)
altexp: Character vector or NULL. Specifies the names of alternative experiments to which the transformation should also be applied. If NULL, the transformation is only applied to the main experiment. (Default: altExpNames(x))
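The pseudocount = TRUE rule described above (half of the minimum positive value, with missing values ignored) can be sketched in base R. This is only an illustration under stated assumptions: half_min_positive is a hypothetical helper, not a function exported by the package, and the toy matrix is made up for the example.

```r
# Hypothetical helper (not part of the package): half of the smallest
# positive value in the table, ignoring missing values.
half_min_positive <- function(mat) {
    positive <- mat[!is.na(mat) & mat > 0]
    min(positive) / 2
}

# Toy abundance table: 3 features (rows) x 2 samples (columns)
counts <- matrix(c(0, 4, 10, 2, NA, 8), nrow = 3)

pc <- half_min_positive(counts)

# Adding the pseudocount keeps the logarithm finite for zero counts
log_counts <- log10(counts + pc)
```

Without the pseudocount, log10(0) would evaluate to -Inf, which is why log-based methods such as clr are usually combined with one.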
transformAssay returns the input object x, with a new transformed abundance table named name added to its assays.
The transformAssay function provides a variety of options for transforming abundance data. The transformed data is calculated and stored in a new assay.
The transformation is applied sample-wise (column-wise) or feature-wise (row-wise) to the abundance table (assay), as specified by MARGIN.
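The difference between the two margins can be illustrated with base R on a toy matrix in which rows are features and columns are samples. This is a sketch, not the package's implementation; the object names and values are made up for the example.

```r
# Toy abundance table: rows are features, columns are samples
counts <- matrix(
    c(10,  0, 30,
      20,  5, 15),
    nrow = 2, byrow = TRUE,
    dimnames = list(c("taxon1", "taxon2"), c("s1", "s2", "s3"))
)

# Sample-wise (column-wise): relative abundance, each column sums to 1
rel_by_sample <- sweep(counts, 2, colSums(counts), "/")

# Feature-wise (row-wise): Z-transformation, each row gets mean 0 and sd 1.
# scale() works on columns, so the matrix is transposed before and after.
z_by_feature <- t(scale(t(counts)))
```

The same operation gives different results depending on the margin, so the margin should match the question: sample-wise for compositional comparisons within a sample, feature-wise for comparing one feature across samples.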
The available transformation methods include:
'alr', 'chi.square', 'clr', 'frequency', 'hellinger', 'log', 'normalize', 'pa', 'rank', 'rclr', 'relabundance', 'rrank', 'standardize', 'total': please refer to decostand for details.
'css': Cumulative Sum Scaling (CSS) can be used to normalize count data by accounting for differences in library sizes. By default, the function determines the normalization percentile for summing and scaling counts. If you want to specify the percentile value yourself, 0.5 might be a good default. The method is inspired by the CSS methods in the metagenomeSeq package.
'log10': log10 transformation can be used for reducing the skewness of the data. $$log10 = \log_{10} x$$ where \(x\) is a single value of data.
'log2': log2 transformation can be used for reducing the skewness of the data. $$log2 = \log_{2} x$$ where \(x\) is a single value of data.
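The idea behind the 'css' method can be sketched in base R for a fixed percentile. This is a simplified illustration and not the package's code: css_sketch is a hypothetical function, the percentile is taken over the positive counts of each sample (an assumption), and the data-driven percentile estimation is left out.

```r
# Hypothetical, simplified CSS: for each sample, sum the counts that do not
# exceed the chosen quantile of its positive counts, then divide the sample
# by that sum (rescaled by 'scaling').
css_sketch <- function(mat, percentile = 0.5, scaling = 1000) {
    scaling_factors <- apply(mat, 2, function(col) {
        q <- quantile(col[col > 0], probs = percentile, names = FALSE)
        sum(col[col <= q])
    })
    sweep(mat, 2, scaling_factors / scaling, "/")
}

# Toy abundance table: 4 features (rows) x 2 samples (columns)
counts <- matrix(c(1, 2, 3, 4, 0, 10, 20, 30), nrow = 4)

# scaling = 1 shows the raw effect of dividing by the scaling factors
css <- css_sketch(counts, percentile = 0.5, scaling = 1)
```

Because only counts up to the quantile enter the scaling factor, a few highly abundant features influence the normalization less than they would under total-sum scaling.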
Paulson, J., Stine, O., Bravo, H. et al. (2013) Differential abundance analysis for microbial marker-gene surveys. Nature Methods 10, 1200–1202. doi:10.1038/nmeth.2658
data(GlobalPatterns)
tse <- GlobalPatterns
# By specifying 'method', it is possible to apply different transformations,
# e.g. compositional transformation.
tse <- transformAssay(tse, method = "relabundance")
# The target of transformation can be specified with "assay.type"
# Pseudocount can be added by specifying 'pseudocount'.
# Perform CLR with smallest positive value as pseudocount
tse <- transformAssay(
    tse, assay.type = "relabundance", method = "clr",
    pseudocount = TRUE
)
#> A pseudocount of 2.12117779669868e-07 was applied.
head(assay(tse, "clr"))
#> CL3 CC1 SV1 M31Fcsw M11Fcsw M31Plmr M11Plmr
#> 549322 -1.503009 -1.629094 -1.246806 -0.4159582 -0.3502969 -0.626809 -0.9297687
#> 522457 -1.503009 -1.629094 -1.246806 -0.4159582 -0.3502969 -0.626809 -0.9297687
#> 951 -1.503009 -1.629094 -1.246806 -0.4159582 -0.3502969 -0.626809 1.5438443
#> 244423 -1.503009 -1.629094 -1.246806 -0.4159582 -0.3502969 -0.626809 -0.9297687
#> 586076 -1.503009 -1.629094 -1.246806 -0.4159582 -0.3502969 -0.626809 -0.9297687
#> 246140 -1.503009 -1.629094 -1.246806 -0.4159582 -0.3502969 -0.626809 -0.9297687
#> F21Plmr M31Tong M11Tong LMEpi24M SLEpi20M AQC1cm
#> 549322 -0.6794681 -0.3541412 -0.524686 -0.4745231 1.0715840 3.607222
#> 522457 -0.6794681 -0.3541412 -0.524686 -0.4745231 -0.5120773 -1.093283
#> 951 -0.6794681 -0.3541412 -0.524686 -0.4745231 -0.5120773 -1.093283
#> 244423 -0.6794681 -0.3541412 -0.524686 -0.4745231 -0.5120773 -1.093283
#> 586076 -0.6794681 -0.3541412 -0.524686 -0.4745231 -0.5120773 -1.093283
#> 246140 -0.6794681 -0.3541412 -0.524686 -0.4745231 -0.5120773 -1.093283
#> AQC4cm AQC7cm NP2 NP3 NP5 TRRsed1
#> 549322 4.27795825 4.8045323 1.8130749 -0.6256967 -0.5083015 -0.8942877
#> 522457 0.58409126 1.7843282 -0.4898264 -0.6256967 -0.5083015 -0.8942877
#> 951 -1.02534666 -1.0861723 -0.4898264 -0.6256967 -0.5083015 -0.8942877
#> 244423 2.78131583 3.3138775 -0.4898264 -0.6256967 -0.5083015 -0.8942877
#> 586076 0.58409126 0.2420447 -0.4898264 -0.6256967 -0.5083015 -0.8942877
#> 246140 0.07326563 1.1463040 -0.4898264 -0.6256967 -0.5083015 -0.8942877
#> TRRsed2 TRRsed3 TS28 TS29 Even1 Even2
#> 549322 -0.9741225 -1.029362 -0.4540262 -0.4062524 -0.5829973 -0.4578325
#> 522457 -0.9741225 -1.029362 -0.4540262 -0.4062524 -0.5829973 -0.4578325
#> 951 -0.9741225 -1.029362 -0.4540262 -0.4062524 -0.5829973 -0.4578325
#> 244423 -0.9741225 -1.029362 -0.4540262 -0.4062524 -0.5829973 -0.4578325
#> 586076 -0.9741225 -1.029362 -0.4540262 -0.4062524 -0.5829973 -0.4578325
#> 246140 -0.9741225 -1.029362 -0.4540262 -0.4062524 -0.5829973 -0.4578325
#> Even3
#> 549322 -0.4074094
#> 522457 -0.4074094
#> 951 -0.4074094
#> 244423 -0.4074094
#> 586076 -0.4074094
#> 246140 -0.4074094
# Perform CSS normalization.
tse <- transformAssay(tse, method = "css")
#> 'percentile' set to: 0.369449147024352
head(assay(tse, "css"))
#> CL3 CC1 SV1 M31Fcsw M11Fcsw M31Plmr M11Plmr F21Plmr M31Tong M11Tong
#> 549322 0 0 0 0 0 0 0.0000000 0 0 0
#> 522457 0 0 0 0 0 0 0.0000000 0 0 0
#> 951 0 0 0 0 0 0 0.4995005 0 0 0
#> 244423 0 0 0 0 0 0 0.0000000 0 0 0
#> 586076 0 0 0 0 0 0 0.0000000 0 0 0
#> 246140 0 0 0 0 0 0 0.0000000 0 0 0
#> LMEpi24M SLEpi20M AQC1cm AQC4cm AQC7cm NP2 NP3 NP5
#> 549322 0 0.4887586 8.110544 22.4870699 31.0781736 0.9823183 0 0
#> 522457 0 0.0000000 0.000000 0.4497414 1.4343772 0.0000000 0 0
#> 951 0 0.0000000 0.000000 0.0000000 0.0000000 0.0000000 0 0
#> 244423 0 0.0000000 0.000000 4.9471554 6.9328233 0.0000000 0 0
#> 586076 0 0.0000000 0.000000 0.4497414 0.2390629 0.0000000 0 0
#> 246140 0 0.0000000 0.000000 0.2248707 0.7171886 0.0000000 0 0
#> TRRsed1 TRRsed2 TRRsed3 TS28 TS29 Even1 Even2 Even3
#> 549322 0 0 0 0 0 0 0 0
#> 522457 0 0 0 0 0 0 0 0
#> 951 0 0 0 0 0 0 0 0
#> 244423 0 0 0 0 0 0 0 0
#> 586076 0 0 0 0 0 0 0 0
#> 246140 0 0 0 0 0 0 0 0
# With MARGIN, you can specify whether the transformation is done for samples
# or for features. Here Z-transformation is done feature-wise.
tse <- transformAssay(tse, method = "standardize", MARGIN = "features")
#> Warning: result contains NaN, perhaps due to impossible mathematical
#> operation
head(assay(tse, "standardize"))
#> CL3 CC1 SV1 M31Fcsw M11Fcsw M31Plmr
#> 549322 -0.3146909 -0.3146909 -0.3146909 -0.3146909 -0.3146909 -0.3146909
#> 522457 -0.2511010 -0.2511010 -0.2511010 -0.2511010 -0.2511010 -0.2511010
#> 951 -0.1961161 -0.1961161 -0.1961161 -0.1961161 -0.1961161 -0.1961161
#> 244423 -0.2802242 -0.2802242 -0.2802242 -0.2802242 -0.2802242 -0.2802242
#> 586076 -0.2674311 -0.2674311 -0.2674311 -0.2674311 -0.2674311 -0.2674311
#> 246140 -0.2511010 -0.2511010 -0.2511010 -0.2511010 -0.2511010 -0.2511010
#> M11Plmr F21Plmr M31Tong M11Tong LMEpi24M SLEpi20M
#> 549322 -0.3146909 -0.3146909 -0.3146909 -0.3146909 -0.3146909 -0.2831003
#> 522457 -0.2511010 -0.2511010 -0.2511010 -0.2511010 -0.2511010 -0.2511010
#> 951 4.9029034 -0.1961161 -0.1961161 -0.1961161 -0.1961161 -0.1961161
#> 244423 -0.2802242 -0.2802242 -0.2802242 -0.2802242 -0.2802242 -0.2802242
#> 586076 -0.2674311 -0.2674311 -0.2674311 -0.2674311 -0.2674311 -0.2674311
#> 246140 -0.2511010 -0.2511010 -0.2511010 -0.2511010 -0.2511010 -0.2511010
#> AQC1cm AQC4cm AQC7cm NP2 NP3 NP5
#> 549322 0.5382551 2.8443686 3.7920864 -0.2831003 -0.3146909 -0.3146909
#> 522457 -0.2511010 1.3810554 4.6453681 -0.2511010 -0.2511010 -0.2511010
#> 951 -0.1961161 -0.1961161 -0.1961161 -0.1961161 -0.1961161 -0.1961161
#> 244423 -0.2802242 2.8626823 3.8626980 -0.2802242 -0.2802242 -0.2802242
#> 586076 -0.2674311 4.3680412 2.0503050 -0.2674311 -0.2674311 -0.2674311
#> 246140 -0.2511010 1.3810554 4.6453681 -0.2511010 -0.2511010 -0.2511010
#> TRRsed1 TRRsed2 TRRsed3 TS28 TS29 Even1
#> 549322 -0.3146909 -0.3146909 -0.3146909 -0.3146909 -0.3146909 -0.3146909
#> 522457 -0.2511010 -0.2511010 -0.2511010 -0.2511010 -0.2511010 -0.2511010
#> 951 -0.1961161 -0.1961161 -0.1961161 -0.1961161 -0.1961161 -0.1961161
#> 244423 -0.2802242 -0.2802242 -0.2802242 -0.2802242 -0.2802242 -0.2802242
#> 586076 -0.2674311 -0.2674311 -0.2674311 -0.2674311 -0.2674311 -0.2674311
#> 246140 -0.2511010 -0.2511010 -0.2511010 -0.2511010 -0.2511010 -0.2511010
#> Even2 Even3
#> 549322 -0.3146909 -0.3146909
#> 522457 -0.2511010 -0.2511010
#> 951 -0.1961161 -0.1961161
#> 244423 -0.2802242 -0.2802242
#> 586076 -0.2674311 -0.2674311
#> 246140 -0.2511010 -0.2511010
# Name of the stored table can be specified.
tse <- transformAssay(tse, method="hellinger", name="test")
head(assay(tse, "test"))
#> CL3 CC1 SV1 M31Fcsw M11Fcsw M31Plmr M11Plmr F21Plmr M31Tong M11Tong
#> 549322 0 0 0 0 0 0 0.000000000 0 0 0
#> 522457 0 0 0 0 0 0 0.000000000 0 0 0
#> 951 0 0 0 0 0 0 0.001518127 0 0 0
#> 244423 0 0 0 0 0 0 0.000000000 0 0 0
#> 586076 0 0 0 0 0 0 0.000000000 0 0 0
#> 246140 0 0 0 0 0 0 0.000000000 0 0 0
#> LMEpi24M SLEpi20M AQC1cm AQC4cm AQC7cm NP2
#> 549322 0 0.0009063565 0.004808474 0.0065133368 0.0087465653 0.00138193
#> 522457 0 0.0000000000 0.000000000 0.0009211249 0.0018790636 0.00000000
#> 951 0 0.0000000000 0.000000000 0.0000000000 0.0000000000 0.00000000
#> 244423 0 0.0000000000 0.000000000 0.0030550257 0.0041310920 0.00000000
#> 586076 0 0.0000000000 0.000000000 0.0009211249 0.0007671245 0.00000000
#> 246140 0 0.0000000000 0.000000000 0.0006513337 0.0013286986 0.00000000
#> NP3 NP5 TRRsed1 TRRsed2 TRRsed3 TS28 TS29 Even1 Even2 Even3
#> 549322 0 0 0 0 0 0 0 0 0 0
#> 522457 0 0 0 0 0 0 0 0 0 0
#> 951 0 0 0 0 0 0 0 0 0 0
#> 244423 0 0 0 0 0 0 0 0 0 0
#> 586076 0 0 0 0 0 0 0 0 0 0
#> 246140 0 0 0 0 0 0 0 0 0 0
# pa returns presence absence table.
tse <- transformAssay(tse, method = "pa")
head(assay(tse, "pa"))
#> CL3 CC1 SV1 M31Fcsw M11Fcsw M31Plmr M11Plmr F21Plmr M31Tong M11Tong
#> 549322 0 0 0 0 0 0 0 0 0 0
#> 522457 0 0 0 0 0 0 0 0 0 0
#> 951 0 0 0 0 0 0 1 0 0 0
#> 244423 0 0 0 0 0 0 0 0 0 0
#> 586076 0 0 0 0 0 0 0 0 0 0
#> 246140 0 0 0 0 0 0 0 0 0 0
#> LMEpi24M SLEpi20M AQC1cm AQC4cm AQC7cm NP2 NP3 NP5 TRRsed1 TRRsed2
#> 549322 0 1 1 1 1 1 0 0 0 0
#> 522457 0 0 0 1 1 0 0 0 0 0
#> 951 0 0 0 0 0 0 0 0 0 0
#> 244423 0 0 0 1 1 0 0 0 0 0
#> 586076 0 0 0 1 1 0 0 0 0 0
#> 246140 0 0 0 1 1 0 0 0 0 0
#> TRRsed3 TS28 TS29 Even1 Even2 Even3
#> 549322 0 0 0 0 0 0
#> 522457 0 0 0 0 0 0
#> 951 0 0 0 0 0 0
#> 244423 0 0 0 0 0 0
#> 586076 0 0 0 0 0 0
#> 246140 0 0 0 0 0 0
# rank returns ranks of taxa.
tse <- transformAssay(tse, method = "rank")
head(assay(tse, "rank"))
#> CL3 CC1 SV1 M31Fcsw M11Fcsw M31Plmr M11Plmr F21Plmr M31Tong M11Tong
#> 549322 0 0 0 0 0 0 0.0 0 0 0
#> 522457 0 0 0 0 0 0 0.0 0 0 0
#> 951 0 0 0 0 0 0 532.5 0 0 0
#> 244423 0 0 0 0 0 0 0.0 0 0 0
#> 586076 0 0 0 0 0 0 0.0 0 0 0
#> 246140 0 0 0 0 0 0 0.0 0 0 0
#> LMEpi24M SLEpi20M AQC1cm AQC4cm AQC7cm NP2 NP3 NP5 TRRsed1 TRRsed2
#> 549322 0 580.5 4941 5710.0 5673.0 509.5 0 0 0 0
#> 522457 0 0.0 0 1978.5 3329.5 0.0 0 0 0 0
#> 951 0 0.0 0 0.0 0.0 0.0 0 0 0 0
#> 244423 0 0.0 0 4691.0 4780.0 0.0 0 0 0 0
#> 586076 0 0.0 0 1978.5 800.0 0.0 0 0 0 0
#> 246140 0 0.0 0 804.5 2479.5 0.0 0 0 0 0
#> TRRsed3 TS28 TS29 Even1 Even2 Even3
#> 549322 0 0 0 0 0 0
#> 522457 0 0 0 0 0 0
#> 951 0 0 0 0 0 0
#> 244423 0 0 0 0 0 0
#> 586076 0 0 0 0 0 0
#> 246140 0 0 0 0 0 0
# In order to use other ranking variants, modify the chosen assay directly:
assay(tse, "rank_average", withDimnames = FALSE) <- colRanks(
    assay(tse, "counts"), ties.method = "average", preserveShape = TRUE)
# Using altexp parameter. First agglomerate the data and then apply
# transformation.
tse <- GlobalPatterns
tse <- agglomerateByRanks(tse)
tse <- transformAssay(tse, method = "relabundance")
# The transformation is applied to all alternative experiments
altExp(tse, "Species")
#> class: TreeSummarizedExperiment
#> dim: 944 26
#> metadata(1): agglomerated_by_rank
#> assays(2): counts relabundance
#> rownames(944): Abiotrophiadefectiva Achromatiumoxaliferum ...
#> proteobacteriumsymbiontofOsedaxsp.MB4 symbiontofNoeetapupillata
#> rowData names(7): Kingdom Phylum ... Genus Species
#> colnames(26): CL3 CC1 ... Even2 Even3
#> colData names(7): X.SampleID Primer ... SampleType Description
#> reducedDimNames(0):
#> mainExpName: NULL
#> altExpNames(0):
#> rowLinks: a LinkDataFrame (944 rows)
#> rowTree: 1 phylo tree(s) (19216 leaves)
#> colLinks: NULL
#> colTree: NULL