Introduction
Metabarcoding of fungal communities using high-throughput technologies is a powerful tool for investigating fungal ecology. The internal transcribed spacer (ITS) region of the rDNA operon has been used extensively as the DNA barcode for fungi, particularly in environmental sequencing studies (Gardes & Bruns, 1996; Schoch et al., 2012). Because the ITS region has been used as a barcode for approximately two decades, the bioinformatic tools for processing amplicon data are well developed and the ITS reference databases (UNITE and INSD) are consistently curated and updated (Abarenkov et al., 2020; Nilsson et al., 2019). However, even with the widespread usage of these databases there are problems, including poor representation of some taxonomic groups, low quality sequences, and incorrect taxonomic annotation (Abarenkov et al., 2018; Hofstetter et al., 2019; Nilsson et al., 2012). These problems mean that results from taxonomic assignments of operational taxonomic units (OTUs) must be interpreted with caution (Nilsson et al., 2006; Yahr et al., 2016). Despite its widespread use, the ITS is not a suitable barcode for all taxonomic groups. There are known issues with sequencing ITS in some fungal lineages, including high variability in ITS length between groups (resulting in favored PCR and sequencing for shorter fragments) (Castaño et al., 2020; Engelbrektson et al., 2010; Manter & Vivanco, 2007), lack of interspecific discrimination (and therefore inability to use the marker for species-level determinations) (e.g. Gazis et al., 2011), interspecific rDNA copy number variation (Lindner & Banik, 2011), and primer biases which exclude some groups (Bellemain et al., 2010; Li et al., 2020; Tedersoo & Lindahl, 2016). For example, among arbuscular mycorrhizal fungi (AMF) the ITS is hypervariable and has high intraspecific and intra-spore variation compared to the small (SSU) and large (LSU) rDNA subunits (Egan et al., 2018; Thiéry et al., 2012). One study found up to 6% divergence among sequences from a single spore (Lloyd-MacGilp et al., 1996). A recent study also found wide rDNA copy number variation across kingdom Fungi that was uncorrelated with trophic mode (Lofgren et al., 2019), making such variation unpredictable in environmental samples. Interspecific rDNA variation can lead to the formation of multiple OTUs derived from a single individual and individuals with more rDNA copies could potentially dominate during PCR amplification and sequencing from mixed templates. Among early diverging fungal (EDF) lineages, direct comparisons of markers for metabarcoding have not been performed for many groups. An exception are the AMF for which the SSU, LSU, and ITS have been evaluated and some combination of two markers is commonly used (Hart et al., 2015; Öpik et al., 2014). Additionally, the LSU was suggested to perform better than ITS as a barcode for EDF due to greater PCR success and a larger barcode gap (i.e. difference between inter- and intraspecific variation) than SSU (Schoch et al., 2012). Another critical advantage of the LSU and SSU is the ability to perform phylogenetic reconstruction with the OTUs to provide preliminary placement of unidentified sequences.
The majority of fungal metabarcoding studies have focused on the Dikarya. In contrast, EDF such as chytrids and zygomycetes are often overlooked or ignored in environmental sequencing studies. EDF are also generally missing or underrepresented during attempts to develop “universal” fungal primers for metabarcoding. For example, mock communities used to validate primer performance often contain none or a few EDF, despite the fact that these fungi often have highly divergent target sequences relative to Dikarya (e.g. Ihrmark et al., 2012; Pérez-Izquierdo et al., 2017; Stielow et al., 2015; Tedersoo et al., 2015). Early diverging lineages are among the least studied fungi and are generally challenging to collect and manipulate in the lab. Many EDF are not culturable using standard techniques, are obligate symbionts, and have limited or no sequence data available (e.g. Benny et al. 2016; Corsaro et al. 2014, 2017; Lazarus & James, 2015; Letcher & Powell, 2019; Malar et al. 2021). Available data suggest that a large portion of undescribed taxa belong to EDF lineages (Tedersoo et al. 2017, 2020; Torres-Cruz et al., 2017; Walsh et al., 2020). Metabarcoding methods can provide an essential tool for learning more about these “dark matter fungi” (Grossart et al., 2016), especially if taxonomy can be reliably assigned to the OTUs. Among the most understudied groups of EDF is the Zoopagomycota. The placement of Zoopagomycota is still unresolved but phylogenomic studies indicate this lineage is either sister to all other terrestrial fungi (Dikarya + Mucoromycota – Spatafora et al., 2016) or sister to the Mucoromycota (Li et al., 2021). This phylum is ecologically diverse and includes fungal parasites (mycoparasites) as well as parasites of small animals. Specialized enrichment methods indicate that some taxa are diverse and widespread in soils, leaf litter, and dung (e.g. Benjamin, 1958; Benny et al., 2016; Drechsler, 1938; Duddington, 1955). Despite these findings, Zoopagomycota species are absent or found in low abundance in most metabarcoding studies (Lazarus et al., 2017; Reynolds et al., 2019).
We focused on some species of Zoopagomycota as a test case for evaluating methodological biases because: 1) these fungi are routinely found in soil during culture-based studies (e.g. Benjamin 1958; Benny et al., 2016) and yet they are not readily detected with metabarcoding, 2) there are limited reference sequence data available, 3) among species for which sequence data are available there is large ITS sequence length variation (Lazarus et al., 2017; Reynolds et al., 2019), and 4) they are difficult to isolate, culture, and work with in the lab, making environmental sampling a particularly important investigative tool. Out of the three subphyla (Entomophthoromycotina, Kickxellomycotina, and Zoopagomycotina) we focused on taxa that have been isolated from soil samples. This includes members of the Zoopagomycotina and Kickxellomycotina that have been regularly isolated from our study sites and for which we have a large culture collection. Zoopagomycotina species are mycoparasites that primarily attack host fungi in the Mucoromycota (and sometimes Ascomycota) or are parasites of microinvertebrates (e.g. amoebae, nematodes, rotifers) (Zoopagales). The Kickxellomycotina is the most diverse subphylum and includes mycoparasites (Dimargaritales) that are parasitic on Mucoromycota and Ascomycota, commensalistic arthropod gut-dwelling species (Asellariales, Harpellales, Orphellales), and putative saprotrophs (Kickxellales). We collected soil (Dimargaritales, Kickxellales, Zoopagales), freshwater sediments and water (Harpellales, Orphellales), and microinvertebrate (Zoopagales) samples from multiple sites in California (CA) and Florida (FL), two geographically distant states with divergent climate and soils. Both locations have been heavily sampled for Zoopagomycota fungi using selective culturing methods (Benjamin 1958, 1959, 1961; Benny et al., 2016; Lazarus et al., 2017; Reynolds et al., 2019). We did not include Entomophthoromycotina because they are obligate arthropod pathogens and likely to be rare or absent from many environmental samples.
In order to detect fungal communities, we compared the ITS1 marker region to two different regions of the LSU rDNA, using two primer pairs for LSU (LROR/LR3 and LR3 paired with the newly designed primer LR22F). Because relatively few studies have used LSU as a sole marker for profiling fungal communities, we also compared the efficacy of two methods for LSU OTU taxonomy assignment: the RDP Naïve Bayesian Classifier (hereafter RDP classifier) (Wang et al., 2007) and UTAX (Edgar, 2010) to search a manually curated reference database. We created the new LSU database by combining the RDP (Cole et al., 2014) and SILVA (Quast et al., 2013) reference databases along with additional sequences from GenBank and our lab. Finally, we used a subset of LSU OTUs generated from each primer pair that were identified as Zoopagomycota or only to kingdom Fungi and included them in phylogenetic reconstructions. Analyzing amplicons in this way allowed us to validate the taxonomic assignments made by our pipeline as well as identify putative errors, a step that is not possible with ITS data. We created a Zoopagomycota mock community that was evaluated alongside the environmental samples to investigate how the metabarcoding pipeline affected that group specifically. We aimed to address the following questions: 1) Are Zoopagomycota truly rare in the environment?, 2) Are methodological biases (such as marker choice and fragment length) inhibiting the detection of Zoopagomycota?, and 3) Are both factors contributing to the absence of Zoopagomycota in metabarcoding studies? We hypothesized that: 1) fragment length amplification and sequencing bias would decrease the detection of Zoopagomycota species, 2) the LR22F/LR3 primer pair would outperform the LROR/LR3 primer pair in terms of OTU clustering and phylogenetic reconstruction, and 3) taxonomy assignment methods would strongly differ in the identification of Zoopagomycota and other EDF OTUs.