Basic Exploration

Overview

The following packages are needed to successfully run the examples in this notebook:

  • mia: tools for microbiome data analysis

  • scater: plotting data from TreeSummarizedExperiments
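As a minimal sketch, both packages can be attached at the top of the notebook as shown here. The installation hint in the comment assumes the packages are installed from Bioconductor through BiocManager; adjust it to your own setup.

```r
# Attach the packages used in this notebook.
# If they are missing, they can typically be installed from Bioconductor
# (assumption: BiocManager is available):
#   BiocManager::install(c("mia", "scater"))
library(mia)
library(scater)
```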

Importing Data as TreeSE

Mia datasets

The mia package comes with several built-in datasets. In this course, we will use Tengeler2020, a study of the effects of the gut microbiome on ADHD in humanised mice (see this presentation for further details about the study).

To get started, we load Tengeler2020 from the mia package and store it in a variable, tse, which we will work with for the rest of the tutorial.

# load the dataset and store it in tse
data("Tengeler2020", package = "mia")
tse <- Tengeler2020

Exploring TreeSE

# print a summary of the object
tse
class: TreeSummarizedExperiment 
dim: 151 27 
metadata(0):
assays(1): counts
rownames(151): 1726470 1726471 ... 17264756 17264757
rowData names(6): Kingdom Phylum ... Family Genus
colnames(27): A110 A12 ... A35 A38
colData names(4): patient_status cohort patient_status_vs_cohort
  sample_name
reducedDimNames(0):
mainExpName: NULL
altExpNames(0):
rowLinks: a LinkDataFrame (151 rows)
rowTree: 1 phylo tree(s) (151 leaves)
colLinks: NULL
colTree: NULL
# number of features (rows) and samples (columns)
dim(tse)
[1] 151  27
# sample names
colnames(tse)
 [1] "A110" "A12"  "A15"  "A19"  "A21"  "A23"  "A25"  "A28"  "A29"  "A34" 
[11] "A36"  "A37"  "A39"  "A111" "A13"  "A14"  "A16"  "A17"  "A18"  "A210"
[21] "A22"  "A24"  "A26"  "A27"  "A33"  "A35"  "A38" 
# feature names
rownames(tse)
  [1] "1726470"   "1726471"   "17264731"  "17264726"  "1726472"   "17264724" 
  [7] "17264747"  "17264725"  "17264727"  "17264748"  "17264729"  "172647189"
 [13] "17264753"  "172647167" "172647166" "17264734"  "172647190" "17264719" 
 [19] "1726478"   "172647145" "17264761"  "17264722"  "17264740"  "172647132"
 [25] "17264718"  "172647213" "1726479"   "17264715"  "172647170" "17264738" 
 [31] "172647108" "1726475"   "17264728"  "17264771"  "17264710"  "17264733" 
 [37] "17264744"  "1726476"   "172647169" "172647171" "172647113" "17264766" 
 [43] "17264732"  "172647156" "17264723"  "172647172" "17264769"  "17264745" 
 [49] "172647111" "17264730"  "17264741"  "172647230" "17264759"  "17264772" 
 [55] "17264770"  "17264762"  "17264784"  "1726473"   "172647157" "17264746" 
 [61] "172647147" "17264778"  "17264788"  "172647173" "17264750"  "17264767" 
 [67] "172647220" "17264775"  "172647117" "17264735"  "17264737"  "17264749" 
 [73] "17264736"  "17264711"  "17264712"  "17264743"  "17264780"  "1726474"  
 [79] "172647180" "172647133" "172647211" "172647192" "17264751"  "172647201"
 [85] "172647168" "172647128" "17264768"  "172647204" "17264782"  "172647208"
 [91] "172647214" "172647177" "172647142" "172647120" "17264781"  "172647219"
 [97] "172647195" "172647114" "1726477"   "172647100" "17264779"  "172647267"
[103] "172647216" "172647126" "17264798"  "172647228" "17264777"  "172647175"
[109] "172647139" "17264739"  "17264752"  "172647215" "172647223" "17264721" 
[115] "17264799"  "17264717"  "172647137" "172647146" "17264792"  "172647116"
[121] "17264786"  "172647136" "172647222" "17264774"  "17264760"  "172647412"
[127] "17264794"  "172647181" "172647176" "172647243" "172647138" "172647206"
[133] "172647266" "172647140" "172647198" "172647179" "17264754"  "17264716" 
[139] "17264720"  "172647289" "172647135" "172647283" "172647303" "17264755" 
[145] "17264714"  "172647217" "17264742"  "172647407" "172647186" "17264756" 
[151] "17264757" 

Assays

# list the assays stored in the object
assays(tse)
List of length 1
names(1): counts
# first rows of the counts assay (features in rows, samples in columns)
head(assay(tse, "counts"))
          A110   A12  A15  A19  A21  A23  A25  A28 A29  A34  A36   A37  A39
1726470  17722 11630    0 8806 1740 1791 2368 1316 252 5702 2889 12036    0
1726471  12052     0 2679 2776  540  229    0    0   0 6347 2977     0    0
17264731     0   970    0  549  145    0  109  119  31    0    0  3326 9477
17264726     0  1911    0 5497  659    0  588  542 141    0  219 10430    0
1726472   1143  1891 1212  584   84  700  440  244  25 1611  399   835 1178
17264724     0  6498    0 4455  610    0  522  511 352    0    0     0    0
         A111  A13  A14  A16  A17  A18 A210  A22  A24  A26  A27  A33  A35  A38
1726470  9933 1217 3478 5351 4738 8425 4052 1838 3085 1570 3621 4464  719 3250
1726471  7871    0  876    0    0 4879 1762    0 2190    0 1480  599    0 2606
17264731    0 7454    0 2321 1426    0    0 5415    0 3531    0    0 3421    0
17264726  560  449    0 2106 2304    0    0  796   84  135  293  580  314  557
1726472   278 1159 1422 2069 2231  626 2456  976  316 2420 1129  337  931  726
17264724    0    0    0    0    0    0    0    0   70    0  322  435    0  252
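Because the counts assay has features in rows and samples in columns, per-sample summaries reduce to column-wise operations. The following is a minimal sketch using base R functions on the extracted assay; the variable names counts, lib_size and n_detected are illustrative and do not come from the original material.

```r
library(mia)

# load the dataset as in the earlier steps
data("Tengeler2020", package = "mia")
tse <- Tengeler2020

# extract the counts assay (features x samples)
counts <- assay(tse, "counts")

# total number of reads per sample (one value per column)
lib_size <- colSums(counts)
head(lib_size)

# number of features with a non-zero count in each sample
n_detected <- colSums(counts > 0)
head(n_detected)
```

Both results are named vectors with one entry per sample, so they can be compared directly against the sample names returned by colnames(tse).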

colData

# names of the sample metadata columns
names(colData(tse))
[1] "patient_status"           "cohort"                  
[3] "patient_status_vs_cohort" "sample_name"             
# first values of one sample metadata column
head(colData(tse)$patient_status)
[1] "ADHD" "ADHD" "ADHD" "ADHD" "ADHD" "ADHD"
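Sample metadata is commonly used to summarise or subset samples. The sketch below counts the samples per patient_status group and keeps only the samples labelled "ADHD", the value visible in the output above. The name tse_adhd is illustrative only.

```r
library(mia)

# load the dataset as in the earlier steps
data("Tengeler2020", package = "mia")
tse <- Tengeler2020

# number of samples in each patient_status group
table(colData(tse)$patient_status)

# tse$patient_status is shorthand for colData(tse)$patient_status;
# samples are columns, so the filter goes in the second index
tse_adhd <- tse[, tse$patient_status == "ADHD"]
dim(tse_adhd)
```

Subsetting by column keeps the assays, colData and column names in sync, so tse_adhd can be used wherever tse is used.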

rowData

# names of the feature metadata columns (taxonomic ranks)
names(rowData(tse))
[1] "Kingdom" "Phylum"  "Class"   "Order"   "Family"  "Genus"  
# genus assigned to the first features
head(rowData(tse)$Genus)
[1] "Bacteroides"     "Bacteroides"     "Parabacteroides" "Bacteroides"    
[5] "Akkermansia"     "Bacteroides"    
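Feature metadata can be used in the same way along the rows. As a minimal sketch, the code below tabulates features per phylum and keeps only the features assigned to the genus Bacteroides, which appears in the output above. The names is_bacteroides and tse_bact are illustrative.

```r
library(mia)

# load the dataset as in the earlier steps
data("Tengeler2020", package = "mia")
tse <- Tengeler2020

# number of features per phylum, most frequent first
sort(table(rowData(tse)$Phylum), decreasing = TRUE)

# logical index of features assigned to genus Bacteroides;
# %in% treats missing genus labels as FALSE instead of NA
is_bacteroides <- rowData(tse)$Genus %in% "Bacteroides"

# features are rows, so the filter goes in the first index
tse_bact <- tse[is_bacteroides, ]
dim(tse_bact)
```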

Other elements

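# store the first 10 features as an alternative experiment,
# then list all alternative experiments
altExp(tse, "my_alt_exp") <- tse[1:10, ]
altExps(tse)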
List of length 1
names(1): my_alt_exp
# phylogenetic tree linking the features
rowTree(tse)

Phylogenetic tree with 151 tips and 149 internal nodes.

Tip labels:
  172647198, 1726478, 1726479, 172647201, 172647222, 17264798, ...
Node labels:
  , 0.789, 0.810, 0.844, 0.973, 0.685, ...

Unrooted; includes branch lengths.
# dimensionality reduction results (none stored yet)
reducedDims(tse)
List of length 0
names(0): 
# free-form metadata attached to the object (empty here)
metadata(tse)
list()
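An alternative experiment stored with altExp can be retrieved by the same name. The following sketch repeats the assignment from this section and then reads the object back; the variable name alt is illustrative.

```r
library(mia)

# load the dataset as in the earlier steps
data("Tengeler2020", package = "mia")
tse <- Tengeler2020

# store the first 10 features as an alternative experiment
altExp(tse, "my_alt_exp") <- tse[1:10, ]

# retrieve it by name; it holds 10 features for the same samples
alt <- altExp(tse, "my_alt_exp")
dim(alt)

# names of all stored alternative experiments
altExpNames(tse)
```

The alternative experiment shares its samples with the main object, which is why only the number of rows differs between tse and alt.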