¹ CL, command line
Figure 1. Primer map of the rRNA operon internal transcribed spacer (ITS) region. Primers developed for metabarcoding studies are in bold; fungi-specific primers are underlined. Primers used for global mycobiome projects are indicated in red (all fungi), purple (Glomeromycota) and blue (Earth Microbiome Project). <I>, common intron sites. Updated from Nilsson et al. (2018).
Figure 2. Common and prospective library preparation strategies for Illumina sequencing: (a) adding indexes and adapters using amplification with fusion primers; (b) adding indexes with amplification and adapters by ligation; (c) amplification and then adding indexes and adapters in a second PCR step; (d) indexing samples with combinations of Illumina indices (Holm et al., 2020); and (e) incorporating unique molecular identifiers (UMIs) in the first amplification step (modification of Karst et al., 2021). Libraries for other HTS platforms require more specific protocols.
Figure 3. Non-metric multidimensional scaling (NMDS) graphs illustrating the relative performance of various methods and dissimilarity measures (B-C, Bray-Curtis; or Aitchison) in recovering trends in microbial eukaryote composition using untransformed and Hellinger-transformed data matrices in plant roots (filled circles) and leaves (open circles) in terrestrial (orange) and aquatic (blue) habitats: (a, b) non-rarefied data; (c, d) rarefied data; (e, f) scaling with ranked subsampling (SRS) normalised data; (g, h) centered log-ratio (CLR) transformed data. Numbers on symbols indicate plant species (separate numbering for terrestrial and aquatic plants); ellipses depict 95% CI for tissue and habitat combinations. Explained variation (%) as revealed by PERMANOVA+ analysis is indicated (t × h, tissue and habitat interaction; seqs, sequencing depth). Plant species effects are not analysed here for simplicity. Data from A. Azadnia, V. Mikryukov, L. Tedersoo (unpublished).
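The transformations and dissimilarity measures named in the Figure 3 caption can be sketched in a few lines. This is a minimal illustration using common textbook definitions, not necessarily the implementation behind the figure: the function names are hypothetical, the pseudocount added before taking logarithms is an assumption for handling zero counts, and each sample is assumed to contain at least one read.

```python
import math

def hellinger(row):
    """Hellinger transformation: square root of relative abundances."""
    total = sum(row)  # assumes the sample contains at least one read
    return [math.sqrt(x / total) for x in row]

def clr(row, pseudocount=1.0):
    """Centered log-ratio transformation.

    The pseudocount is an assumption made here to avoid log(0);
    other zero-handling strategies exist.
    """
    logs = [math.log(x + pseudocount) for x in row]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    numerator = sum(abs(a - b) for a, b in zip(u, v))
    denominator = sum(a + b for a, b in zip(u, v))
    return numerator / denominator

def aitchison(u, v, pseudocount=1.0):
    """Aitchison distance: Euclidean distance between CLR-transformed vectors."""
    cu, cv = clr(u, pseudocount), clr(v, pseudocount)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(cu, cv)))

# Hypothetical read counts for two samples across four OTUs.
sample_a = [120, 30, 0, 5]
sample_b = [10, 80, 45, 0]

print(bray_curtis(sample_a, sample_b))                        # raw counts
print(bray_curtis(hellinger(sample_a), hellinger(sample_b)))  # Hellinger-transformed
print(aitchison(sample_a, sample_b))                          # CLR-based distance
```

Bray-Curtis on raw counts is sensitive to differences in sequencing depth, which is one reason rarefaction, SRS, or compositional approaches such as CLR are compared in the figure.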
Box 1. Trade-offs in Sample Pooling
To improve the representativeness of samples at minimal extra cost, pooling statistically non-independent subsamples is a widely used option. The number and spatial distance of subsamples can strongly affect how well the microbial diversity of heterogeneous habitats is represented; less inclusive subsampling designs are likely to underestimate diversity (Figure Box1). The number of subsamples to be pooled depends on the research question and the size of the area, with 7–25 being optimal in most cases (Schwarzenbach et al., 2007). Both physical and analytical pooling improve richness and composition assessments of soil fungi (Schwarzenbach et al., 2007; Song et al., 2015) and reduce estimate variance (Dickie et al., 2018). However, pooling of physical samples may result in the loss of patchily occurring rare taxa (e.g., in extremely dilute fish eDNA samples with a detection threshold of 0.05% of total relative abundance at deep sequencing; Sato et al., 2017). These results may be relevant for fungal groups with relatively low DNA content and/or rRNA copy numbers, e.g. Glomeromycota and unicellular taxa. The pooling effect probably depends on habitat heterogeneity, including pH, organic matter content, salinity and the plant species present, all of which are known to affect fungal composition in different environments (Amend et al., 2019; Grossart et al., 2019; Nilsson et al., 2019; U’Ren et al., 2019). Therefore, pooling samples with potentially different microbial composition (e.g., leaves of different plant species) is not recommended. In theory, pooling performs suboptimally when samples contain different amounts of DNA and the low-DNA samples harbour unique, rare species. Given their greater overall richness, pooled samples also require deeper sequencing to detect rare taxa. Furthermore, pooling is unsuited for co-occurrence analyses assessing biotic interactions (Bahram et al., 2014).
Pooling individual samples at the site level (at the stage of DNA extraction, PCR, library preparation or sequence data) may be most useful when these samples cannot be used as independent replicates because of local- or landscape-scale spatial autocorrelation, e.g. for regional- to global-scale analyses.
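The dilution argument in this box can be made concrete with a little arithmetic. The sketch below is illustrative only: the function names, the ten-subsample design and the 0.2% starting abundance are hypothetical, and the 0.05% detection threshold is borrowed from the fish eDNA example (Sato et al., 2017) rather than established for fungi. It assumes that physical pooling mixes DNA in proportion to each subsample's DNA amount, and that analytical pooling means sequencing each subsample separately and merging the results afterwards.

```python
def pooled_relative_abundance(abundances, dna_amounts=None):
    """Relative abundance of one taxon after physically pooling subsamples.

    abundances  : relative abundance of the taxon in each subsample (0..1)
    dna_amounts : DNA contributed by each subsample (equal if omitted)
    """
    if dna_amounts is None:
        dna_amounts = [1.0] * len(abundances)
    weighted = sum(a * d for a, d in zip(abundances, dna_amounts))
    return weighted / sum(dna_amounts)

def detected(rel_abundance, threshold=0.0005):
    """True if the taxon reaches the assumed detection threshold (0.05%)."""
    return rel_abundance >= threshold

def analytical_pool_detected(abundances, threshold=0.0005):
    """Analytical pooling: the taxon counts as present if any separately
    sequenced subsample detects it."""
    return any(detected(a, threshold) for a in abundances)

# A patchy rare taxon: 0.2% in one of ten subsamples, absent elsewhere.
subsamples = [0.002] + [0.0] * 9

physical = pooled_relative_abundance(subsamples)
print(detected(physical))                    # False: diluted tenfold, below 0.05%
print(analytical_pool_detected(subsamples))  # True: found in the first subsample

# If the subsample holding the rare taxon also contributes little DNA,
# the dilution in the physical pool is even stronger.
low_dna = pooled_relative_abundance(subsamples, [0.1] + [1.0] * 9)
print(low_dna < physical)                    # True
```

Under these assumptions the taxon survives analytical pooling but drops below the threshold in the physical pool, at the price of sequencing every subsample separately.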