Figure Box1. Potential underestimation of biodiversity and high variance at low numbers of (sub)samples: (a) OTU richness, (b) Shannon index of diversity and (c) effective number of species (here, the exponential of the Shannon index). Rarefied datasets from six sites (Zhou et al., 2016) were randomized 100 times to generate subsets of various sample sizes representing composite, analytically pooled samples. These samples were further rarefied to the same depth before calculating diversity values. Note that the differences would be greater without both rarefaction steps. Symbols represent means of analytically pooled samples, and error bars indicate 95% CI. The differences between a single sample and 21 randomly selected samples average 18.7% (±3.6%), 4.4% (±0.9%) and 35.7% (±7.9%) for (a), (b) and (c), respectively.
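The resampling scheme described in this caption can be illustrated with a minimal Python sketch. It is not the code used to produce the figure: the function names, the representation of a sample as a dictionary of OTU read counts, and the use of natural logarithms for the Shannon index are all assumptions made here for illustration.

```python
import math
import random
from collections import Counter


def rarefy(counts, depth, rng):
    """Subsample `depth` reads without replacement from an OTU -> reads mapping."""
    reads = [otu for otu, n in counts.items() for _ in range(n)]
    if depth > len(reads):
        raise ValueError("depth exceeds the number of reads in the sample")
    return Counter(rng.sample(reads, depth))


def richness(counts):
    """Number of OTUs with at least one read."""
    return sum(1 for n in counts.values() if n > 0)


def shannon(counts):
    """Shannon index H' = -sum(p_i * ln p_i), natural logarithm (assumed)."""
    total = sum(counts.values())
    return -sum((n / total) * math.log(n / total)
                for n in counts.values() if n > 0)


def effective_species(counts):
    """Effective number of species, taken as exp(H') as in the caption."""
    return math.exp(shannon(counts))


def pooled_diversity(samples, k, depth, rng):
    """Pool k randomly chosen samples, rarefy the pool, return the three metrics."""
    pooled = Counter()
    for sample in rng.sample(samples, k):
        pooled.update(sample)  # Counter.update adds read counts per OTU
    sub = rarefy(pooled, depth, rng)
    return richness(sub), shannon(sub), effective_species(sub)
```

Calling `pooled_diversity` repeatedly (for example 100 times) for each pool size `k` and averaging the results would give one point per pool size, analogous to the symbols in the figure. For a perfectly even community of two OTUs, `shannon` returns ln 2 and `effective_species` returns 2.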
Box 2. Amplicon sequence variant (ASV) methods
ASV approaches represent a specific type of greedy de novo clustering, and several alternative methods, notably DADA2 (Callahan et al., 2016), UNOISE (Edgar, 2016) and deblur (Amir et al., 2017), have recently become popular in microbiology, including fungal ecology (Glassman et al., 2018). In DADA2, ASVs correspond to 100%-similarity OTUs, where sample-wise rare variants are assigned to dominant haplotypes based on an error model with stringent settings (Callahan et al., 2016). Deblur is less conservative (Amir et al., 2017) and relatively inefficient at removing rare haplotypes (Li et al., 2021). The ASV approaches are certainly useful for separating as many taxa as possible based on conserved genes, but their utility for ITS and protein-encoding genes is unclear. They may outperform traditional approaches in distinguishing the aforementioned saprotrophic and pathogenic Ascomycota with haploid genomes. However, they severely bias diversity estimates of metazoans based on the cytochrome oxidase 1 (CO1) gene (Brandt et al., 2021) and are expected to perform poorly for fungal groups with dikaryotic (Basidiomycota), diploid (most unicellular groups) and polyploid (Glomeromycota) genomes that commonly exhibit two or more different rRNA gene and ITS copies per genome or even within haploid nuclei (Lindner et al., 2013; Egan et al., 2018). Estensmo et al. (2021) demonstrated that, in polypores, ASVs significantly overestimated species richness. Using a re-analysis of a dataset from Furneaux et al. (2021), we show that the ASV approach reduces phylogenetic richness by disproportionately eliminating rare members of the unicellular fungal groups, Glomeromycota and non-fungal eukaryotes (Figure Box2). In terms of beta diversity, ASV- and OTU-based approaches yield similar results (Glassman et al., 2018).
If one is interested in using the denoising algorithm of DADA2, subsequent post-clustering of ASVs (implemented in pipelines such as LotuS2 and AMPtk) may be a solution (Furneaux et al., 2021; Estensmo et al., 2021), but this approach does not remedy the loss of genuinely unique species.
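The idea of post-clustering ASVs can be sketched as a simple abundance-ranked greedy clustering. This is an illustrative sketch only and does not reproduce how LotuS2 or AMPtk implement the step: the 97% threshold, the edit-distance-based identity measure and all function names are assumptions introduced here.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # match or substitution
        prev = cur
    return prev[-1]


def identity(a, b):
    """Crude identity: 1 - edit distance / longer length (assumed definition)."""
    return 1.0 - edit_distance(a, b) / max(len(a), len(b))


def post_cluster(asv_counts, threshold=0.97):
    """Greedily merge ASVs into OTUs, most abundant ASVs first.

    asv_counts: mapping of ASV sequence -> read count.
    Returns (otu_counts, members), both keyed by centroid sequence.
    """
    otu_counts, members = {}, {}
    ranked = sorted(asv_counts.items(), key=lambda kv: (-kv[1], kv[0]))
    for seq, n in ranked:
        for centroid in otu_counts:
            if identity(seq, centroid) >= threshold:
                otu_counts[centroid] += n   # absorb into the first matching OTU
                members[centroid].append(seq)
                break
        else:
            otu_counts[seq] = n             # no match: seq becomes a new centroid
            members[seq] = [seq]
    return otu_counts, members
```

Because the procedure only merges variants that survived denoising, it can reduce the over-splitting of multi-copy taxa but, as noted above, cannot restore rare sequences that the denoiser has already discarded.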