1CL, command line
Figure 1. Primer map of the rRNA operon internal transcribed
spacer (ITS) region. Primers developed for metabarcoding studies are in
bold; fungi-specific primers are underlined. Primers used for global
mycobiome projects are indicated in red (all fungi), purple
(Glomeromycota) and blue (Earth Microbiome Project).
<I>, common intron sites. Updated from Nilsson et
al. (2018).
Figure 2. Common and prospective library preparation strategies
for Illumina sequencing: (a) adding indexes and adapters using
amplification with fusion primers; (b) adding indexes with amplification
and adapters by ligation; (c) amplification and then adding indexes and
adapters in second PCR step; (d) indexing samples with combinations of
Illumina indices (Holm et al., 2020); and (e) incorporating unique
molecular identifiers (UMIs) in the first amplification step
(modification of Karst et al., 2021). Libraries for other HTS platforms
require more specific protocols.
Figure 3. Non-metric multidimensional scaling (NMDS) graphs
illustrating relative performance of various methods and dissimilarity
(B-C, Bray-Curtis or Aitchison) measures in recovering trends in
microbial eukaryote composition using untransformed and
Hellinger-transformed data matrices in plant roots (filled circles) and
leaves (open circles) in terrestrial (orange) and aquatic (blue)
habitats: (a, b) non-rarefied data; (c, d) rarefied data; (e, f) data
normalised by scaling with ranked subsampling (SRS); (g, h) centered
log-ratio (CLR) transformed data. Numbers on symbols indicate plant species
(separate numbering for terrestrial and aquatic plants); ellipses depict
95% CI for tissue and habitat combinations. Explained variation (%) as
revealed by PERMANOVA+ analysis is indicated (t × h, tissue and
habitat interaction; seqs, sequencing depth). Plant species effects are
not analysed here for simplicity. Data from A. Azadnia, V. Mikryukov, L.
Tedersoo (unpublished).
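The transformations and dissimilarity measures compared in Figure 3 can be sketched in a few lines. The following is a minimal illustration only, not the analysis pipeline behind the figure: the function names, the two example count vectors and the pseudocount of 1 used to handle zeros before the CLR transformation are all assumptions made for this sketch.

```python
import math

def hellinger(counts):
    """Hellinger transformation: square root of relative abundances."""
    total = sum(counts)
    return [math.sqrt(c / total) for c in counts]

def clr(counts, pseudocount=1.0):
    """Centered log-ratio transformation.
    The pseudocount (an assumption of this sketch) avoids log(0)."""
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

def bray_curtis(x, y):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    numerator = sum(abs(a - b) for a, b in zip(x, y))
    denominator = sum(a + b for a, b in zip(x, y))
    return numerator / denominator

def aitchison(x, y, pseudocount=1.0):
    """Aitchison distance: Euclidean distance between CLR-transformed vectors."""
    cx, cy = clr(x, pseudocount), clr(y, pseudocount)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(cx, cy)))

# Hypothetical read counts for four taxa in two samples.
root_sample = [120, 30, 0, 50]
leaf_sample = [10, 80, 5, 5]

bc_raw = bray_curtis(root_sample, leaf_sample)
bc_hellinger = bray_curtis(hellinger(root_sample), hellinger(leaf_sample))
ait = aitchison(root_sample, leaf_sample)
```

Bray-Curtis on raw counts is sensitive to differences in sequencing depth between samples, whereas the Hellinger and CLR transformations operate on proportions or log-ratios; this is one reason the panels in Figure 3 can differ.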
Box 1. Trade-offs in Sample Pooling
To improve representativeness of the samples at minimum extra cost,
pooling statistically non-independent subsamples is a widely used
option. The number and spatial distance of subsamples may be of great
importance to provide a representative view of the microbial diversity
in heterogeneous habitats; less inclusive subsampling designs are likely
to result in underestimating diversity (Figure Box1). The number of
subsamples to be pooled depends on the research question and the size of
the area, with 7-25 being optimal in most cases (Schwarzenbach et al.,
2007). Both physical and analytical pooling improve richness and
composition assessments of soil fungi (Schwarzenbach et al., 2007; Song
et al., 2015) and reduce the variance of estimates (Dickie et al., 2018).
However, pooling of physical samples may result in the loss of patchily
occurring rare taxa (e.g., in extremely dilute fish eDNA samples with a
detection threshold of 0.05% of total relative abundance at deep
sequencing; Sato et al., 2017). These results may be relevant for fungal
groups with relatively low DNA content and/or rRNA copy numbers, e.g.
Glomeromycota and unicellular taxa. It is likely that the pooling
effect depends on habitat heterogeneity, including pH, organic matter
content, salinity and plant species present, all of which are factors
known to affect fungal composition in different environments (Amend et
al., 2019; Grossart et al., 2019; Nilsson et al., 2019; U’Ren et al.,
2019). Therefore, pooling samples with potentially different microbial
composition (e.g., leaves of different plant species) is not
recommended. Theoretically, pooling does not work optimally in
situations where the samples contain different amounts of DNA and where
the low-DNA samples feature unique, rare species. Given the greater
overall richness, pooled samples also require deeper sequencing to
detect rare taxa. Furthermore, pooling is unsuited for co-occurrence
analyses assessing biotic interactions (Bahram et al., 2014). Pooling
individual samples at the site level (at the stage of DNA extraction,
PCR, library preparation or sequence data analysis) may be most useful
when these samples cannot be used as independent replicates because of
local- or landscape-scale spatial autocorrelation, e.g. in regional- to
global-scale analyses.
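The dilution effect described in this box can be illustrated with simple arithmetic. The sketch below is a toy model under stated assumptions, not a method from the cited studies: the subsample DNA amounts, the taxon names and their relative abundances are hypothetical, and detection is reduced to a fixed relative-abundance threshold (0.05%, the value quoted above from Sato et al., 2017). Analytical pooling is modelled as sequencing each subsample separately and merging the detections.

```python
DETECTION_THRESHOLD = 0.0005  # 0.05% of total relative abundance

def physical_pool(subsamples):
    """Mix subsamples before sequencing.
    subsamples: list of (dna_amount, {taxon: relative_abundance}).
    Each taxon's pooled share is weighted by the DNA amount of its subsample."""
    total_dna = sum(dna for dna, _ in subsamples)
    pooled = {}
    for dna, profile in subsamples:
        for taxon, rel in profile.items():
            pooled[taxon] = pooled.get(taxon, 0.0) + dna * rel / total_dna
    return pooled

def detected(profile, threshold=DETECTION_THRESHOLD):
    """Taxa whose relative abundance reaches the detection threshold."""
    return {taxon for taxon, rel in profile.items() if rel >= threshold}

def analytical_pool(subsamples, threshold=DETECTION_THRESHOLD):
    """Sequence each subsample separately, then merge the detected taxa."""
    found = set()
    for _, profile in subsamples:
        found |= detected(profile, threshold)
    return found

# Hypothetical subsamples: two DNA-rich ones and one low-DNA subsample
# that carries a unique rare taxon.
subsamples = [
    (100.0, {"common_A": 0.70, "common_B": 0.30}),
    (100.0, {"common_A": 0.60, "common_B": 0.40}),
    (1.0, {"common_A": 0.95, "rare_C": 0.05}),
]

physically_detected = detected(physical_pool(subsamples))
analytically_detected = analytical_pool(subsamples)
```

In this toy case the rare taxon makes up 5% of the low-DNA subsample but only about 0.025% of the physical pool, so it falls below the threshold after mixing while remaining detectable when subsamples are sequenced individually. Real detection also depends on sequencing depth and stochastic sampling, which this sketch ignores.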