To query a SummarizedExperiment
for interesting features, several
functions are available.
getTop(
  x,
  top = 5L,
  method = c("mean", "sum", "median"),
  assay.type = assay_name,
  assay_name = "counts",
  na.rm = TRUE,
  ...
)

# S4 method for class 'SummarizedExperiment'
getTop(
  x,
  top = 5L,
  method = c("mean", "sum", "median", "prevalence"),
  assay.type = assay_name,
  assay_name = "counts",
  na.rm = TRUE,
  ...
)

getUnique(x, ...)

# S4 method for class 'SummarizedExperiment'
getUnique(x, rank = NULL, ...)

summarizeDominance(x, group = NULL, name = "dominant_taxa", ...)

# S4 method for class 'SummarizedExperiment'
summarizeDominance(x, group = NULL, name = "dominant_taxa", ...)

# S4 method for class 'SummarizedExperiment'
summary(object, assay.type = assay_name, assay_name = "counts")
top: Numeric scalar. Determines how many top taxa to return. (Default: 5)
method: Character scalar. Specifies the method used to determine top taxa. One of "sum", "mean", "median" or "prevalence". (Default: "mean")
assay.type: Character scalar. Specifies the name of the assay used in the calculation. (Default: "counts")
assay_name: Deprecated. Use assay.type instead.
na.rm: Logical scalar. Should NA values be omitted? (Default: TRUE)
...: Additional arguments passed on to agglomerateByRank() when rank is specified for summarizeDominance.
rank: Character scalar. Defines a taxonomic rank. Must be one of the values returned by taxonomyRanks(). (Default: NULL)
group: Groups the observations in the overview. Must be one of the column names of colData.
name: Character scalar. A name for the column of the colData where results will be stored. (Default: "dominant_taxa")
x, object: A SummarizedExperiment object.
getTop returns a vector of the most abundant “FeatureID”s.
getUnique returns a vector of the unique taxa present at a particular rank.
summarizeDominance returns an overview in a tibble. It contains the dominant taxa, in a column named after the name argument, and their abundance in the data set.
summary returns a list with two tibbles.
getTop extracts the most abundant “FeatureID”s in a SummarizedExperiment object.
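To make the ranking idea concrete, here is a minimal base-R sketch on a made-up count matrix. The helper top_features() is hypothetical and is not part of mia. It only assumes that "top" means the largest row-wise mean, sum or median, which is how the method argument reads.

```r
# Toy count matrix: rows are features, columns are samples (values are made up).
counts <- matrix(
    c(10,  0,  5,
       3,  8,  9,
       0,  0,  1,
      20, 15, 30),
    nrow = 4, byrow = TRUE,
    dimnames = list(c("f1", "f2", "f3", "f4"), c("s1", "s2", "s3"))
)

# Hypothetical helper, NOT mia's implementation: rank features by a
# row-wise statistic and return the names of the highest-scoring ones.
top_features <- function(mat, top = 5L, method = c("mean", "sum", "median"),
                         na.rm = TRUE) {
    method <- match.arg(method)
    score <- switch(method,
        mean   = rowMeans(mat, na.rm = na.rm),
        sum    = rowSums(mat, na.rm = na.rm),
        median = apply(mat, 1, median, na.rm = na.rm)
    )
    # Sort scores from largest to smallest and keep the first `top` names.
    head(names(sort(score, decreasing = TRUE)), top)
}

top2 <- top_features(counts, top = 2L, method = "mean")
top2
```

Because the result is a plain character vector of feature names, it can be used directly to subset the rows of the original object.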
getUnique is a basic function for accessing the different taxa at a particular taxonomic rank.
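The idea can be sketched in base R with a small, made-up taxonomy table standing in for the row data. The helper unique_taxa() is hypothetical and is not mia's implementation. Its sort and na.rm arguments are modelled on the options mentioned in the examples.

```r
# Toy taxonomy table standing in for rowData (values are made up).
taxonomy <- data.frame(
    Phylum = c("Firmicutes", "Bacteroidetes", "Firmicutes", NA),
    Genus  = c("Streptococcus", "Bacteroides", "Veillonella", NA),
    stringsAsFactors = FALSE
)

# Hypothetical helper, NOT part of mia: distinct values of one rank column.
unique_taxa <- function(tax, rank, sort = FALSE, na.rm = FALSE) {
    stopifnot(rank %in% colnames(tax))
    out <- unique(tax[[rank]])
    # Optionally drop missing taxonomy assignments.
    if (na.rm) out <- out[!is.na(out)]
    # Optionally sort alphabetically, keeping any NA at the end.
    if (sort) out <- sort(out, na.last = TRUE)
    out
}

phyla <- unique_taxa(taxonomy, "Phylum", sort = TRUE)
phyla
```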
summarizeDominance returns information about the most dominant taxa in a tibble: how many samples each taxon dominates and the corresponding relative frequency (the n and rel_freq columns in the examples below).
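As a rough illustration of what such an overview contains, the base-R sketch below finds the taxon with the largest count in each sample of a made-up matrix and then tabulates how often each taxon wins. It is an assumption about the general approach, inferred from the n and rel_freq columns in the example output, and not mia's actual code.

```r
# Toy count matrix: rows are taxa, columns are samples (values are made up).
counts <- matrix(
    c(10,  0,  5, 1,
       3,  8,  9, 0,
      20,  2, 30, 0),
    nrow = 3, byrow = TRUE,
    dimnames = list(c("taxA", "taxB", "taxC"), c("s1", "s2", "s3", "s4"))
)

# Dominant taxon per sample: the row holding the largest count in each column.
dominant <- rownames(counts)[apply(counts, 2, which.max)]

# Number of samples dominated by each taxon, most frequent first.
n <- sort(table(dominant), decreasing = TRUE)

overview <- data.frame(
    dominant_taxa = names(n),
    n = as.integer(n),
    # Share of all samples in which the taxon is dominant.
    rel_freq = as.numeric(n) / ncol(counts)
)
overview
```

With a grouping variable, the same tabulation would be repeated within each group, so the relative frequencies sum to one per group rather than over the whole data set.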
summary returns a summary of counts for all samples and features in a SummarizedExperiment object.
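The sample-level part of such a summary can be approximated in base R as statistics over per-sample totals. The sketch below uses a made-up matrix and assumes the columns are computed from column sums, which is consistent with the example output (total_counts divided by the number of samples gives mean_counts) but is not taken from mia's source.

```r
# Toy count matrix: rows are features, columns are samples (values are made up).
counts <- matrix(
    c(10, 0, 5,
       3, 8, 9,
       0, 1, 0),
    nrow = 3, byrow = TRUE,
    dimnames = list(c("f1", "f2", "f3"), c("s1", "s2", "s3"))
)

# Per-sample library sizes (column totals).
lib <- colSums(counts)

# One-row overview of the library sizes; column names mirror the example output.
sample_summary <- data.frame(
    total_counts  = sum(lib),
    min_counts    = min(lib),
    max_counts    = max(lib),
    median_counts = median(lib),
    mean_counts   = mean(lib),
    stdev_counts  = sd(lib)
)
sample_summary
```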
data(GlobalPatterns)
top_taxa <- getTop(GlobalPatterns,
                   method = "mean",
                   top = 5,
                   assay.type = "counts")
top_taxa
#> [1] "549656" "331820" "279599" "360229" "317182"
# Use 'detection' to select detection threshold when using prevalence method
top_taxa <- getTop(GlobalPatterns,
                   method = "prevalence",
                   top = 5,
                   assay.type = "counts",
                   detection = 100)
top_taxa
#> [1] "549656" "331820" "94166" "317182" "279599"
# Top taxa of a specific rank
getTop(agglomerateByRank(GlobalPatterns,
                         rank = "Genus",
                         na.rm = TRUE))
#> [1] "Bacteroides" "Dolichospermum" "Faecalibacterium" "Neisseria"
#> [5] "Haemophilus"
# Gets the overview of dominant taxa
dominant_taxa <- summarizeDominance(GlobalPatterns,
                                    rank = "Genus")
dominant_taxa
#> # A tibble: 17 × 3
#> dominant_taxa n rel_freq
#> <chr> <int> <dbl>
#> 1 Bacteroides 5 0.192
#> 2 Crenothrix 3 0.115
#> 3 Faecalibacterium 2 0.0769
#> 4 Prochlorococcus 2 0.0769
#> 5 Streptococcus 2 0.0769
#> 6 CandidatusNitrososphaera 1 0.0385
#> 7 CandidatusPortiera 1 0.0385
#> 8 CandidatusSolibacter 1 0.0385
#> 9 Corynebacterium 1 0.0385
#> 10 Desulfuromonas 1 0.0385
#> 11 Dolichospermum 1 0.0385
#> 12 Luteolibacter 1 0.0385
#> 13 MC18 1 0.0385
#> 14 Neisseria 1 0.0385
#> 15 Nitrosopumilus 1 0.0385
#> 16 Polaribacter 1 0.0385
#> 17 Veillonella 1 0.0385
# With group, observations can be grouped by a column of colData
# Gets the overview of dominant taxa per group
dominant_taxa <- summarizeDominance(GlobalPatterns,
                                    rank = "Genus",
                                    group = "SampleType",
                                    na.rm = TRUE)
dominant_taxa
#> # A tibble: 20 × 4
#> # Groups: SampleType [9]
#> SampleType dominant_taxa n rel_freq
#> <fct> <chr> <int> <dbl>
#> 1 Mock Bacteroides 3 1
#> 2 Feces Bacteroides 2 0.5
#> 3 Feces Faecalibacterium 2 0.5
#> 4 Freshwater (creek) Crenothrix 2 0.667
#> 5 Skin Streptococcus 2 0.667
#> 6 Freshwater Dolichospermum 1 0.5
#> 7 Freshwater Prochlorococcus 1 0.5
#> 8 Freshwater (creek) Luteolibacter 1 0.333
#> 9 Ocean CandidatusPortiera 1 0.333
#> 10 Ocean Polaribacter 1 0.333
#> 11 Ocean Prochlorococcus 1 0.333
#> 12 Sediment (estuary) Crenothrix 1 0.333
#> 13 Sediment (estuary) Desulfuromonas 1 0.333
#> 14 Sediment (estuary) Nitrosopumilus 1 0.333
#> 15 Skin Corynebacterium 1 0.333
#> 16 Soil CandidatusNitrososphaera 1 0.333
#> 17 Soil CandidatusSolibacter 1 0.333
#> 18 Soil MC18 1 0.333
#> 19 Tongue Neisseria 1 0.5
#> 20 Tongue Veillonella 1 0.5
# Get an overview of sample and taxa counts
summary(GlobalPatterns, assay.type = "counts")
#> $samples
#> # A tibble: 1 × 6
#> total_counts min_counts max_counts median_counts mean_counts stdev_counts
#> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 28216678 58688 2357181 1106849 1085257. 650145.
#>
#> $features
#> # A tibble: 1 × 3
#> total singletons per_sample_avg
#> <int> <int> <dbl>
#> 1 19216 2134 4022.
#>
# Get unique taxa at a particular taxonomic rank
# sort = TRUE means that output is sorted in alphabetical order
# With na.rm = TRUE, it is possible to remove NAs
# sort and na.rm can also be used in function getTop
getUnique(GlobalPatterns, "Phylum", sort = TRUE)
#> [1] "ABY1_OD1" "AC1" "AD3" "Acidobacteria"
#> [5] "Actinobacteria" "Armatimonadetes" "BRC1" "Bacteroidetes"
#> [9] "CCM11b" "Caldiserica" "Caldithrix" "Chlamydiae"
#> [13] "Chlorobi" "Chloroflexi" "Crenarchaeota" "Cyanobacteria"
#> [17] "Elusimicrobia" "Euryarchaeota" "Fibrobacteres" "Firmicutes"
#> [21] "Fusobacteria" "GAL15" "GN02" "GN04"
#> [25] "GN06" "GN12" "GOUTA4" "Gemmatimonadetes"
#> [29] "Hyd24-12" "KSB1" "LCP-89" "LD1"
#> [33] "Lentisphaerae" "MVP-15" "NC10" "NKB19"
#> [37] "Nitrospirae" "OP11" "OP3" "OP8"
#> [41] "OP9" "PAUC34f" "Planctomycetes" "Proteobacteria"
#> [45] "SAR406" "SBR1093" "SC3" "SC4"
#> [49] "SM2F11" "SPAM" "SR1" "Spirochaetes"
#> [53] "Synergistetes" "TG3" "TM6" "TM7"
#> [57] "Tenericutes" "Thermi" "Thermotogae" "Verrucomicrobia"
#> [61] "WPS-2" "WS1" "WS2" "WS3"
#> [65] "ZB2" "ZB3" NA