Introduction
Metabarcoding of fungal communities using high-throughput technologies
is a powerful tool for investigating fungal ecology. The internal
transcribed spacer (ITS) region of the rDNA operon has been used
extensively as the DNA barcode for fungi, particularly in environmental
sequencing studies (Gardes & Bruns, 1996; Schoch et al., 2012). Because
the ITS region has been used as a barcode for approximately two decades,
the bioinformatic tools for processing amplicon data are well developed
and the ITS reference databases (UNITE and INSD) are consistently
curated and updated (Abarenkov et al., 2020; Nilsson et al., 2019).
However, even with the widespread usage of these databases there are
problems, including poor representation of some taxonomic groups, low
quality sequences, and incorrect taxonomic annotation (Abarenkov et al.,
2018; Hofstetter et al., 2019; Nilsson et al., 2012). These problems
mean that results from taxonomic assignments of operational taxonomic
units (OTUs) must be interpreted with caution (Nilsson et al., 2006;
Yahr et al., 2016). Despite its widespread use, the ITS is not a
suitable barcode for all taxonomic groups. There are known issues with
sequencing ITS in some fungal lineages, including high variability in
ITS length between groups (resulting in favored PCR and sequencing for
shorter fragments) (Castaño et al., 2020; Engelbrektson et al., 2010;
Manter & Vivanco, 2007), lack of interspecific discrimination (and
therefore inability to use the marker for species-level determinations)
(e.g. Gazis et al., 2011), interspecific rDNA copy number variation
(Lindner & Banik, 2011), and primer biases which exclude some groups
(Bellemain et al., 2010; Li et al., 2020; Tedersoo & Lindahl, 2016).
For example, among arbuscular mycorrhizal fungi (AMF) the ITS is
hypervariable and has high intraspecific and intra-spore variation
compared to the small (SSU) and large (LSU) rDNA subunits (Egan et al.,
2018; Thiéry et al., 2012). One study found up to 6% divergence among
sequences from a single spore (Lloyd-MacGilp et al., 1996). A recent
study also found wide rDNA copy number variation across kingdom Fungi
that was uncorrelated with trophic mode (Lofgren et al., 2019), making
such variation unpredictable in environmental samples. Interspecific
rDNA variation can lead to the formation of multiple OTUs derived from a
single individual and individuals with more rDNA copies could
potentially dominate during PCR amplification and sequencing from mixed
templates. Among early diverging fungal (EDF) lineages, direct
comparisons of markers for metabarcoding have not been performed for
many groups. An exception are the AMF for which the SSU, LSU, and ITS
have been evaluated and some combination of two markers is commonly used
(Hart et al., 2015; Öpik et al., 2014). Additionally, the LSU was
suggested to perform better than ITS as a barcode for EDF due to greater
PCR success and a larger barcode gap (i.e. difference between inter- and
intraspecific variation) than SSU (Schoch et al., 2012). Another
critical advantage of the LSU and SSU is the ability to perform
phylogenetic reconstruction with the OTUs to provide preliminary
placement of unidentified sequences.
The majority of fungal metabarcoding studies have focused on the
Dikarya. In contrast, EDF such as chytrids and zygomycetes are often
overlooked or ignored in environmental sequencing studies. EDF are also
generally missing or underrepresented during attempts to develop
“universal” fungal primers for metabarcoding. For example, mock
communities used to validate primer performance often contain none or a
few EDF, despite the fact that these fungi often have highly divergent
target sequences relative to Dikarya (e.g. Ihrmark et al., 2012;
Pérez-Izquierdo et al., 2017; Stielow et al., 2015; Tedersoo et al.,
2015). Early diverging lineages are among the least studied fungi and
are generally challenging to collect and manipulate in the lab. Many EDF
are not culturable using standard techniques, are obligate symbionts,
and have limited or no sequence data available (e.g. Benny et al. 2016;
Corsaro et al. 2014, 2017; Lazarus & James, 2015; Letcher & Powell,
2019; Malar et al. 2021). Available data suggest that a large portion of
undescribed taxa belong to EDF lineages (Tedersoo et al. 2017, 2020;
Torres-Cruz et al., 2017; Walsh et al., 2020). Metabarcoding methods can
provide an essential tool for learning more about these “dark matter
fungi” (Grossart et al., 2016), especially if taxonomy can be reliably
assigned to the OTUs. Among the most understudied groups of EDF is the
Zoopagomycota. The placement of Zoopagomycota is still unresolved but
phylogenomic studies indicate this lineage is either sister to all other
terrestrial fungi (Dikarya + Mucoromycota – Spatafora et al., 2016) or
sister to the Mucoromycota (Li et al., 2021). This phylum is
ecologically diverse and includes fungal parasites (mycoparasites) as
well as parasites of small animals. Specialized enrichment methods
indicate that some taxa are diverse and widespread in soils, leaf
litter, and dung (e.g. Benjamin, 1958; Benny et al., 2016; Drechsler,
1938; Duddington, 1955). Despite these findings, Zoopagomycota species
are absent or found in low abundance in most metabarcoding studies
(Lazarus et al., 2017; Reynolds et al., 2019).
We focused on some species of Zoopagomycota as a test case for
evaluating methodological biases because: 1) these fungi are routinely
found in soil during culture-based studies (e.g. Benjamin 1958; Benny et
al., 2016) and yet they are not readily detected with metabarcoding, 2)
there are limited reference sequence data available, 3) among species
for which sequence data are available there is large ITS sequence length
variation (Lazarus et al., 2017; Reynolds et al., 2019), and 4) they are
difficult to isolate, culture, and work with in the lab, making
environmental sampling a particularly important investigative tool. Out
of the three subphyla (Entomophthoromycotina, Kickxellomycotina, and
Zoopagomycotina) we focused on taxa that have been isolated from soil
samples. This includes members of the Zoopagomycotina and
Kickxellomycotina that have been regularly isolated from our study sites
and for which we have a large culture collection. Zoopagomycotina
species are mycoparasites that primarily attack host fungi in the
Mucoromycota (and sometimes Ascomycota) or are parasites of
microinvertebrates (e.g. amoebae, nematodes, rotifers) (Zoopagales). The
Kickxellomycotina is the most diverse subphylum and includes
mycoparasites (Dimargaritales) that are parasitic on Mucoromycota and
Ascomycota, commensalistic arthropod gut-dwelling species (Asellariales,
Harpellales, Orphellales), and putative saprotrophs (Kickxellales). We
collected soil (Dimargaritales, Kickxellales, Zoopagales), freshwater
sediments and water (Harpellales, Orphellales), and microinvertebrate
(Zoopagales) samples from multiple sites in California (CA) and Florida
(FL), two geographically distant states with divergent climate and
soils. Both locations have been heavily sampled for Zoopagomycota fungi
using selective culturing methods (Benjamin 1958, 1959, 1961; Benny et
al., 2016; Lazarus et al., 2017; Reynolds et al., 2019). We did not
include Entomophthoromycotina because they are obligate arthropod
pathogens and likely to be rare or absent from many environmental
samples.
In order to detect fungal communities, we compared the ITS1 marker
region to two different regions of the LSU rDNA, using two primer pairs
for LSU (LROR/LR3 and LR3 paired with the newly designed primer LR22F).
Because relatively few studies have used LSU as a sole marker for
profiling fungal communities, we also compared the efficacy of two
methods for LSU OTU taxonomy assignment: the RDP Naïve Bayesian
Classifier (hereafter RDP classifier) (Wang et al., 2007) and UTAX
(Edgar, 2010) to search a manually curated reference database. We
created the new LSU database by combining the RDP (Cole et al., 2014)
and SILVA (Quast et al., 2013) reference databases along with additional
sequences from GenBank and our lab. Finally, we used a subset of LSU
OTUs generated from each primer pair that were identified as
Zoopagomycota or only to kingdom Fungi and included them in phylogenetic
reconstructions. Analyzing amplicons in this way allowed us to validate
the taxonomic assignments made by our pipeline as well as identify
putative errors, a step that is not possible with ITS data. We created a
Zoopagomycota mock community that was evaluated alongside the
environmental samples to investigate how the metabarcoding pipeline
affected that group specifically. We aimed to address the following
questions: 1) Are Zoopagomycota truly rare in the environment?, 2) Are
methodological biases (such as marker choice and fragment length)
inhibiting the detection of Zoopagomycota?, and 3) Are both factors
contributing to the absence of Zoopagomycota in metabarcoding studies?
We hypothesized that: 1) fragment length amplification and sequencing
bias would decrease the detection of Zoopagomycota species, 2) the
LR22F/LR3 primer pair would outperform the LROR/LR3 primer pair in terms
of OTU clustering and phylogenetic reconstruction, and 3) taxonomy
assignment methods would strongly differ in the identification of
Zoopagomycota and other EDF OTUs.