Transform assay

Applies one of a variety of transformations to abundance data stored in an assay. See Details for the available options.

transformAssay(x, ...)

# S4 method for class 'SummarizedExperiment'
transformAssay(
  x,
  method,
  assay.type = "counts",
  assay_name = NULL,
  MARGIN = "samples",
  name = method,
  pseudocount = FALSE,
  ...
)

# S4 method for class 'SingleCellExperiment'
transformAssay(x, altexp = NULL, ...)

Arguments

x

TreeSummarizedExperiment.

...

Additional arguments passed on to, e.g., vegan::decostand or philr::philr.

reference: Character scalar. Used to fill the reference sample's column in the returned assay when calculating alr. (Default: NA)
ref_vals: Deprecated. Use reference instead.
percentile: Numeric scalar or NULL (css). Used to set the percentile value that calculates the scaling factors in the css normalization. If NULL, percentile is estimated from the data by calculating the portion of samples that exceed the threshold. (Default: NULL)
scaling: Numeric scalar. Adjusts the normalization scale by dividing the calculated scaling factors, effectively changing the magnitude of the normalized counts. (Default: 1000).
threshold: Numeric scalar. For "css", specifies the relative difference threshold, which determines the first point where the relative change in differences between consecutive quantiles exceeds it. (Default: 0.1) For "cutoff", values less than or equal to the threshold are replaced with NA. (Default: 0)
tree: phylo. Phylogeny used in PhILR transformation. If NULL, the tree is retrieved from x. (Default: NULL).
node.labels: Character vector. Linkages between tree and x. Used in PhILR transformation. (Default: NULL).

method

Character scalar. Specifies the transformation method.

assay.type

Character scalar. Specifies the name of assay used in calculation. (Default: "counts")

assay_name

Deprecated. Use assay.type instead.

MARGIN

Character scalar. Determines whether the transformation is applied sample-wise (column-wise) or feature-wise (row-wise). (Default: "samples")

name

Character scalar. The name for the transformed assay to be stored. (Default: method)

pseudocount

Logical scalar or numeric scalar. When TRUE, automatically adds half of the minimum positive value of assay.type (missing values ignored by default: na.rm = TRUE). When FALSE, does not add any pseudocount (pseudocount = 0). Alternatively, a user-specified numeric value can be added as pseudocount. (Default: FALSE).
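As a rough illustration of the pseudocount = TRUE rule described above, the following base R sketch computes half of the smallest positive value of a toy matrix and adds it to every entry. The matrix counts and the helper half_min_positive are hypothetical names used only here; this is not the package's internal code.

```r
# Toy abundance matrix (features in rows, samples in columns), with one NA.
counts <- matrix(
    c(0, 3, 1, NA, 8, 0), nrow = 3,
    dimnames = list(c("taxonA", "taxonB", "taxonC"), c("sample1", "sample2"))
)

# Hypothetical helper: half of the smallest strictly positive value,
# ignoring missing values (mirrors the documented na.rm = TRUE behaviour).
half_min_positive <- function(x) {
    min(x[x > 0], na.rm = TRUE) / 2
}

pc <- half_min_positive(counts)

# Adding the pseudocount shifts every value, so zeros become positive.
shifted <- counts + pc
```

With a numeric pseudocount, the same shift would simply use the supplied value instead of pc.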

altexp

Character vector or NULL. Specifies the names of alternative experiments to which the transformation should also be applied. If NULL, the transformation is only applied to the main experiment. (Default: NULL).

Value

transformAssay returns the input object x, with a new transformed abundance table named name added in the assays.

Details

The transformAssay function provides a variety of options for transforming abundance data. The transformed data is calculated and stored in a new assay.

Depending on the specified MARGIN, the transformation is applied to the abundance table (assay) either sample-wise (column-wise) or feature-wise (row-wise).
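The difference between the two margins can be sketched in base R on a toy matrix. The object names below are hypothetical, and the two operations (column-wise relative abundance and row-wise Z-scores) are only stand-ins for what a sample-wise and a feature-wise transformation look like; they are not the package's implementation.

```r
# Toy abundance matrix: features in rows, samples in columns.
counts <- matrix(
    c(10, 30, 60, 5, 5, 10), nrow = 3,
    dimnames = list(c("taxonA", "taxonB", "taxonC"), c("sample1", "sample2"))
)

# Sample-wise (column-wise): divide each column by its total,
# so that every sample sums to 1.
rel <- sweep(counts, 2, colSums(counts), "/")

# Feature-wise (row-wise): centre and scale each row,
# so that every feature has mean 0 and standard deviation 1.
z <- t(apply(counts, 1, function(x) (x - mean(x)) / sd(x)))
```

The same input therefore yields results on different scales depending on the margin, which is why MARGIN should be chosen to match the question being asked.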

The available transformation methods include:

'alr', 'chi.square', 'clr', 'frequency', 'hellinger', 'log', 'normalize', 'pa', 'rank', 'rclr', 'relabundance', 'rrank', 'standardize', 'total': please refer to decostand for details.
'philr': please refer to philr for details.
'css': Cumulative Sum Scaling (CSS) can be used to normalize count data by accounting for differences in library sizes. By default, the function determines the normalization percentile for summing and scaling counts. If you want to specify the percentile value yourself, 0.5 might be a good default. The method is inspired by the CSS methods in the metagenomeSeq package.
'log10': log10 transformation can be used for reducing the skewness of the data. $$\mathrm{log10}(x) = \log_{10} x$$ where $x$ is a single value of data.
'log2': log2 transformation can be used for reducing the skewness of the data. $$\mathrm{log2}(x) = \log_{2} x$$ where $x$ is a single value of data.
'pseudocount': Only adds the pseudocount, without any further transformation.
'cutoff': In some ecological studies, only strictly positive values are taken into account. This method keeps only values greater than threshold and replaces all other values with NA.
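To make the CSS description above more concrete, here is a minimal base R sketch of the general idea with a fixed percentile. It assumes that the scaling factor of a sample is the sum of its counts that do not exceed the chosen quantile of its non-zero counts, and that counts are divided by that factor expressed in units of scaling. The function css_sketch is hypothetical, does not estimate the percentile from the data, and is not the code used by this package or by metagenomeSeq.

```r
# Hypothetical sketch of cumulative sum scaling with a fixed percentile.
css_sketch <- function(counts, percentile = 0.5, scaling = 1000) {
    # Per-sample quantile of the non-zero counts at the chosen percentile.
    q <- apply(counts, 2, function(x) {
        quantile(x[x > 0], probs = percentile, names = FALSE)
    })
    # Scaling factor: sum of the counts that do not exceed that quantile.
    sf <- vapply(seq_len(ncol(counts)), function(j) {
        x <- counts[, j]
        sum(x[x <= q[j]])
    }, numeric(1))
    # Divide each column by its scaling factor, rescaled by `scaling`.
    sweep(counts, 2, sf / scaling, "/")
}

# Toy data: sample2 has exactly twice the sequencing depth of sample1.
counts <- matrix(
    c(0, 1, 2, 10, 0, 2, 4, 20), nrow = 4,
    dimnames = list(paste0("taxon", 1:4), c("sample1", "sample2"))
)
css <- css_sketch(counts)
```

In this toy case the two samples differ only in depth, so their normalized columns coincide, which is the behaviour one would hope for from a library-size normalization.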

References

Paulson, J., Stine, O., Bravo, H. et al. (2013) Differential abundance analysis for microbial marker-gene surveys Nature Methods 10, 1200–1202. doi:10.1038/nmeth.2658

Examples

data(GlobalPatterns)
tse <- GlobalPatterns

# By specifying 'method', it is possible to apply different transformations,
# e.g. compositional transformation.
tse <- transformAssay(tse, method = "relabundance")

# The target of transformation can be specified with "assay.type"
# Pseudocount can be added by specifying 'pseudocount'.

# Perform CLR with half of the smallest positive value as pseudocount
tse <- transformAssay(
    tse, assay.type = "counts", method = "clr",
    pseudocount = TRUE
    )
#> A pseudocount of 0.5 was applied.

head(assay(tse, "clr"))
#>              CL3       CC1        SV1    M31Fcsw    M11Fcsw    M31Plmr
#> 549322 -1.168523 -1.361753 -0.9147168 -0.3661082 -0.3363033 -0.4485277
#> 522457 -1.168523 -1.361753 -0.9147168 -0.3661082 -0.3363033 -0.4485277
#> 951    -1.168523 -1.361753 -0.9147168 -0.3661082 -0.3363033 -0.4485277
#> 244423 -1.168523 -1.361753 -0.9147168 -0.3661082 -0.3363033 -0.4485277
#> 586076 -1.168523 -1.361753 -0.9147168 -0.3661082 -0.3363033 -0.4485277
#> 246140 -1.168523 -1.361753 -0.9147168 -0.3661082 -0.3363033 -0.4485277
#>           M11Plmr   F21Plmr    M31Tong    M11Tong   LMEpi24M   SLEpi20M
#> 549322 -0.5934365 -0.341834 -0.3337211 -0.2111913 -0.4580681  0.6839692
#> 522457 -0.5934365 -0.341834 -0.3337211 -0.2111913 -0.4580681 -0.4146431
#> 951     0.5051758 -0.341834 -0.3337211 -0.2111913 -0.4580681 -0.4146431
#> 244423 -0.5934365 -0.341834 -0.3337211 -0.2111913 -0.4580681 -0.4146431
#> 586076 -0.5934365 -0.341834 -0.3337211 -0.2111913 -0.4580681 -0.4146431
#> 246140 -0.5934365 -0.341834 -0.3337211 -0.2111913 -0.4580681 -0.4146431
#>            AQC1cm      AQC4cm     AQC7cm        NP2        NP3        NP5
#> 549322  3.1171586  4.27795825  4.5741075  0.7860696 -0.5442038 -0.4552516
#> 522457 -0.8901746  0.58409126  1.5745365 -0.3125427 -0.5442038 -0.4552516
#> 951    -0.8901746 -1.02534666 -0.9904129 -0.3125427 -0.5442038 -0.4552516
#> 244423 -0.8901746  2.78131583  3.0871246 -0.3125427 -0.5442038 -0.4552516
#> 586076 -0.8901746  0.58409126  0.1081994 -0.3125427 -0.5442038 -0.4552516
#> 246140 -0.8901746  0.07326563  0.9554973 -0.3125427 -0.5442038 -0.4552516
#>           TRRsed1    TRRsed2   TRRsed3       TS28      TS29      Even1
#> 549322 -0.3507832 -0.6188603 -0.565259 -0.3427261 -0.328767 -0.4609279
#> 522457 -0.3507832 -0.6188603 -0.565259 -0.3427261 -0.328767 -0.4609279
#> 951    -0.3507832 -0.6188603 -0.565259 -0.3427261 -0.328767 -0.4609279
#> 244423 -0.3507832 -0.6188603 -0.565259 -0.3427261 -0.328767 -0.4609279
#> 586076 -0.3507832 -0.6188603 -0.565259 -0.3427261 -0.328767 -0.4609279
#> 246140 -0.3507832 -0.6188603 -0.565259 -0.3427261 -0.328767 -0.4609279
#>             Even2     Even3
#> 549322 -0.3357284 -0.312044
#> 522457 -0.3357284 -0.312044
#> 951    -0.3357284 -0.312044
#> 244423 -0.3357284 -0.312044
#> 586076 -0.3357284 -0.312044
#> 246140 -0.3357284 -0.312044
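For readers who want to see what the CLR values represent, the following base R sketch applies the usual definition (the log of each value minus the mean log of its sample) to a toy matrix after adding a pseudocount. The function clr_sketch and the toy data are hypothetical and only illustrate the formula; the package itself delegates this method to decostand.

```r
# Hypothetical sketch of a sample-wise centred log-ratio transformation.
clr_sketch <- function(counts, pseudocount = 0) {
    # Logs are undefined for zeros, hence the optional pseudocount.
    logx <- log(counts + pseudocount)
    # Subtract each sample's mean log value (log of its geometric mean).
    sweep(logx, 2, colMeans(logx), "-")
}

counts <- matrix(
    c(0, 4, 16, 2, 2, 0), nrow = 3,
    dimnames = list(c("taxonA", "taxonB", "taxonC"), c("sample1", "sample2"))
)

clr <- clr_sketch(counts, pseudocount = 0.5)
```

Each column of the result sums to zero, and features that share the same count within a sample (such as the many zeros in the output above) receive the same CLR value.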

# Perform CSS normalization.
tse <- transformAssay(tse, method = "css")
#> 'percentile' set to: 0.369449147024352
head(assay(tse, "css"))
#>        CL3 CC1 SV1 M31Fcsw M11Fcsw M31Plmr   M11Plmr F21Plmr M31Tong M11Tong
#> 549322   0   0   0       0       0       0 0.0000000       0       0       0
#> 522457   0   0   0       0       0       0 0.0000000       0       0       0
#> 951      0   0   0       0       0       0 0.4995005       0       0       0
#> 244423   0   0   0       0       0       0 0.0000000       0       0       0
#> 586076   0   0   0       0       0       0 0.0000000       0       0       0
#> 246140   0   0   0       0       0       0 0.0000000       0       0       0
#>        LMEpi24M  SLEpi20M   AQC1cm     AQC4cm     AQC7cm       NP2 NP3 NP5
#> 549322        0 0.4887586 8.110544 22.4870699 31.0781736 0.9823183   0   0
#> 522457        0 0.0000000 0.000000  0.4497414  1.4343772 0.0000000   0   0
#> 951           0 0.0000000 0.000000  0.0000000  0.0000000 0.0000000   0   0
#> 244423        0 0.0000000 0.000000  4.9471554  6.9328233 0.0000000   0   0
#> 586076        0 0.0000000 0.000000  0.4497414  0.2390629 0.0000000   0   0
#> 246140        0 0.0000000 0.000000  0.2248707  0.7171886 0.0000000   0   0
#>        TRRsed1 TRRsed2 TRRsed3 TS28 TS29 Even1 Even2 Even3
#> 549322       0       0       0    0    0     0     0     0
#> 522457       0       0       0    0    0     0     0     0
#> 951          0       0       0    0    0     0     0     0
#> 244423       0       0       0    0    0     0     0     0
#> 586076       0       0       0    0    0     0     0     0
#> 246140       0       0       0    0    0     0     0     0

# With MARGIN, you can specify whether the transformation is done for samples
# or for features. Here Z-transformation is done feature-wise.
tse <- transformAssay(tse, method = "standardize", MARGIN = "features")
#> Warning: result contains NaN, perhaps due to impossible mathematical
#>                  operation
head(assay(tse, "standardize"))
#>               CL3        CC1        SV1    M31Fcsw    M11Fcsw    M31Plmr
#> 549322 -0.3146909 -0.3146909 -0.3146909 -0.3146909 -0.3146909 -0.3146909
#> 522457 -0.2511010 -0.2511010 -0.2511010 -0.2511010 -0.2511010 -0.2511010
#> 951    -0.1961161 -0.1961161 -0.1961161 -0.1961161 -0.1961161 -0.1961161
#> 244423 -0.2802242 -0.2802242 -0.2802242 -0.2802242 -0.2802242 -0.2802242
#> 586076 -0.2674311 -0.2674311 -0.2674311 -0.2674311 -0.2674311 -0.2674311
#> 246140 -0.2511010 -0.2511010 -0.2511010 -0.2511010 -0.2511010 -0.2511010
#>           M11Plmr    F21Plmr    M31Tong    M11Tong   LMEpi24M   SLEpi20M
#> 549322 -0.3146909 -0.3146909 -0.3146909 -0.3146909 -0.3146909 -0.2831003
#> 522457 -0.2511010 -0.2511010 -0.2511010 -0.2511010 -0.2511010 -0.2511010
#> 951     4.9029034 -0.1961161 -0.1961161 -0.1961161 -0.1961161 -0.1961161
#> 244423 -0.2802242 -0.2802242 -0.2802242 -0.2802242 -0.2802242 -0.2802242
#> 586076 -0.2674311 -0.2674311 -0.2674311 -0.2674311 -0.2674311 -0.2674311
#> 246140 -0.2511010 -0.2511010 -0.2511010 -0.2511010 -0.2511010 -0.2511010
#>            AQC1cm     AQC4cm     AQC7cm        NP2        NP3        NP5
#> 549322  0.5382551  2.8443686  3.7920864 -0.2831003 -0.3146909 -0.3146909
#> 522457 -0.2511010  1.3810554  4.6453681 -0.2511010 -0.2511010 -0.2511010
#> 951    -0.1961161 -0.1961161 -0.1961161 -0.1961161 -0.1961161 -0.1961161
#> 244423 -0.2802242  2.8626823  3.8626980 -0.2802242 -0.2802242 -0.2802242
#> 586076 -0.2674311  4.3680412  2.0503050 -0.2674311 -0.2674311 -0.2674311
#> 246140 -0.2511010  1.3810554  4.6453681 -0.2511010 -0.2511010 -0.2511010
#>           TRRsed1    TRRsed2    TRRsed3       TS28       TS29      Even1
#> 549322 -0.3146909 -0.3146909 -0.3146909 -0.3146909 -0.3146909 -0.3146909
#> 522457 -0.2511010 -0.2511010 -0.2511010 -0.2511010 -0.2511010 -0.2511010
#> 951    -0.1961161 -0.1961161 -0.1961161 -0.1961161 -0.1961161 -0.1961161
#> 244423 -0.2802242 -0.2802242 -0.2802242 -0.2802242 -0.2802242 -0.2802242
#> 586076 -0.2674311 -0.2674311 -0.2674311 -0.2674311 -0.2674311 -0.2674311
#> 246140 -0.2511010 -0.2511010 -0.2511010 -0.2511010 -0.2511010 -0.2511010
#>             Even2      Even3
#> 549322 -0.3146909 -0.3146909
#> 522457 -0.2511010 -0.2511010
#> 951    -0.1961161 -0.1961161
#> 244423 -0.2802242 -0.2802242
#> 586076 -0.2674311 -0.2674311
#> 246140 -0.2511010 -0.2511010

# Name of the stored table can be specified.
tse <- transformAssay(tse, method="hellinger", name="test")
head(assay(tse, "test"))
#>        CL3 CC1 SV1 M31Fcsw M11Fcsw M31Plmr     M11Plmr F21Plmr M31Tong M11Tong
#> 549322   0   0   0       0       0       0 0.000000000       0       0       0
#> 522457   0   0   0       0       0       0 0.000000000       0       0       0
#> 951      0   0   0       0       0       0 0.001518127       0       0       0
#> 244423   0   0   0       0       0       0 0.000000000       0       0       0
#> 586076   0   0   0       0       0       0 0.000000000       0       0       0
#> 246140   0   0   0       0       0       0 0.000000000       0       0       0
#>        LMEpi24M     SLEpi20M      AQC1cm       AQC4cm       AQC7cm        NP2
#> 549322        0 0.0009063565 0.004808474 0.0065133368 0.0087465653 0.00138193
#> 522457        0 0.0000000000 0.000000000 0.0009211249 0.0018790636 0.00000000
#> 951           0 0.0000000000 0.000000000 0.0000000000 0.0000000000 0.00000000
#> 244423        0 0.0000000000 0.000000000 0.0030550257 0.0041310920 0.00000000
#> 586076        0 0.0000000000 0.000000000 0.0009211249 0.0007671245 0.00000000
#> 246140        0 0.0000000000 0.000000000 0.0006513337 0.0013286986 0.00000000
#>        NP3 NP5 TRRsed1 TRRsed2 TRRsed3 TS28 TS29 Even1 Even2 Even3
#> 549322   0   0       0       0       0    0    0     0     0     0
#> 522457   0   0       0       0       0    0    0     0     0     0
#> 951      0   0       0       0       0    0    0     0     0     0
#> 244423   0   0       0       0       0    0    0     0     0     0
#> 586076   0   0       0       0       0    0    0     0     0     0
#> 246140   0   0       0       0       0    0    0     0     0     0

# pa returns presence absence table.
tse <- transformAssay(tse, method = "pa")
head(assay(tse, "pa"))
#>        CL3 CC1 SV1 M31Fcsw M11Fcsw M31Plmr M11Plmr F21Plmr M31Tong M11Tong
#> 549322   0   0   0       0       0       0       0       0       0       0
#> 522457   0   0   0       0       0       0       0       0       0       0
#> 951      0   0   0       0       0       0       1       0       0       0
#> 244423   0   0   0       0       0       0       0       0       0       0
#> 586076   0   0   0       0       0       0       0       0       0       0
#> 246140   0   0   0       0       0       0       0       0       0       0
#>        LMEpi24M SLEpi20M AQC1cm AQC4cm AQC7cm NP2 NP3 NP5 TRRsed1 TRRsed2
#> 549322        0        1      1      1      1   1   0   0       0       0
#> 522457        0        0      0      1      1   0   0   0       0       0
#> 951           0        0      0      0      0   0   0   0       0       0
#> 244423        0        0      0      1      1   0   0   0       0       0
#> 586076        0        0      0      1      1   0   0   0       0       0
#> 246140        0        0      0      1      1   0   0   0       0       0
#>        TRRsed3 TS28 TS29 Even1 Even2 Even3
#> 549322       0    0    0     0     0     0
#> 522457       0    0    0     0     0     0
#> 951          0    0    0     0     0     0
#> 244423       0    0    0     0     0     0
#> 586076       0    0    0     0     0     0
#> 246140       0    0    0     0     0     0

# rank returns ranks of taxa.
tse <- transformAssay(tse, method = "rank")
head(assay(tse, "rank"))
#>        CL3 CC1 SV1 M31Fcsw M11Fcsw M31Plmr M11Plmr F21Plmr M31Tong M11Tong
#> 549322   0   0   0       0       0       0     0.0       0       0       0
#> 522457   0   0   0       0       0       0     0.0       0       0       0
#> 951      0   0   0       0       0       0   532.5       0       0       0
#> 244423   0   0   0       0       0       0     0.0       0       0       0
#> 586076   0   0   0       0       0       0     0.0       0       0       0
#> 246140   0   0   0       0       0       0     0.0       0       0       0
#>        LMEpi24M SLEpi20M AQC1cm AQC4cm AQC7cm   NP2 NP3 NP5 TRRsed1 TRRsed2
#> 549322        0    580.5   4941 5710.0 5673.0 509.5   0   0       0       0
#> 522457        0      0.0      0 1978.5 3329.5   0.0   0   0       0       0
#> 951           0      0.0      0    0.0    0.0   0.0   0   0       0       0
#> 244423        0      0.0      0 4691.0 4780.0   0.0   0   0       0       0
#> 586076        0      0.0      0 1978.5  800.0   0.0   0   0       0       0
#> 246140        0      0.0      0  804.5 2479.5   0.0   0   0       0       0
#>        TRRsed3 TS28 TS29 Even1 Even2 Even3
#> 549322       0    0    0     0     0     0
#> 522457       0    0    0     0     0     0
#> 951          0    0    0     0     0     0
#> 244423       0    0    0     0     0     0
#> 586076       0    0    0     0     0     0
#> 246140       0    0    0     0     0     0

# In order to use other ranking variants, modify the chosen assay directly:
assay(tse, "rank_average", withDimnames = FALSE) <- colRanks(
    assay(tse, "counts"), ties.method = "average", preserveShape = TRUE)

# Using altexp parameter. First agglomerate the data and then apply
# transformation.
tse <- GlobalPatterns
tse <- agglomerateByRanks(tse)
tse <- transformAssay(
    tse, method = "relabundance", altexp = altExpNames(tse))
# The transformation is applied to all alternative experiments
altExp(tse, "Species")
#> class: TreeSummarizedExperiment 
#> dim: 944 26 
#> metadata(1): agglomerated_by_rank
#> assays(2): counts relabundance
#> rownames(944): Abiotrophiadefectiva Achromatiumoxaliferum ...
#>   proteobacteriumsymbiontofOsedaxsp.MB4 symbiontofNoeetapupillata
#> rowData names(7): Kingdom Phylum ... Genus Species
#> colnames(26): CL3 CC1 ... Even2 Even3
#> colData names(7): X.SampleID Primer ... SampleType Description
#> reducedDimNames(0):
#> mainExpName: NULL
#> altExpNames(0):
#> rowLinks: a LinkDataFrame (944 rows)
#> rowTree: 1 phylo tree(s) (944 leaves)
#> colLinks: NULL
#> colTree: NULL

if (FALSE) { # \dontrun{
# philr transformation can be applied if the philr package is installed.
# Subset data by taking only prevalent taxa
tse <- subsetByPrevalent(tse)
# Apply transformation
tse <- transformAssay(tse, method = "philr", pseudocount = 1, MARGIN = 1L)
# The transformed data is added to altExp
altExp(tse, "philr")
} # }
