1. Introduction
Morphological differences have historically been, and remain, the basis
for species identification, taxonomic keys, and efforts in species
delimitation. Yet, reliable classification of specimens can be complex
for many reasons, for example, when species are morphologically
extremely similar or when morphological characters are not expressed at
a given life-history stage (e.g., juveniles). In the last decade, the
increasing affordability of
reduced-representation data (e.g., restriction-site-associated DNA
sequencing or target enrichment) or whole genome re-sequencing has
provided new possibilities to assign species based not only on
morphological or meristic characters, but also on genomic information.
In some instances, this has even greatly contributed to the discovery
and description of new (i.e., previously cryptic) species (Fennessy et
al., 2016; Nater et al., 2017). Genetic species assignment approaches
also promise novel tools to aid conservation efforts for endangered
species, but practical implementations often fail (Campbell et al.,
2019; Piertney, 2016; Shafer et al., 2015). A major disadvantage of
high-throughput sequencing techniques is the cost and time needed to
generate libraries, sequence them, and analyze the data. Importantly,
however, genomic data also allow for the identification of a
suite of informative, diagnostic genetic markers for species or
population assignment that can be genotyped using cheaper and faster
methods (Shafer et al., 2015).
Among all genetic variants, single-nucleotide polymorphisms (SNPs) are
clearly the most abundant (in the human population, for example, more
than 95% of all genetic variants are SNPs; Auton et al., 2015) and are
therefore powerful genetic markers for population or species assignment.
Over the past 30 years, many methods have been developed to
cost-effectively genotype SNPs. One widely used and fast method is PCR
restriction fragment length polymorphism (PCR-RFLP) genotyping (McKeown,
Robin, & Shaw, 2015; Ota, Fukushima, Kulski, & Inoko, 2007). In this
approach, a particular DNA fragment is first amplified by PCR. The
resulting amplicon is then digested using a restriction enzyme that cuts
only one allele at a diagnostic SNP (resulting in two fragments) but not
the other one (one fragment), owing to an ideally species-specific
polymorphism in the enzyme’s recognition site. Individuals homozygous
for either allele, as well as heterozygous individuals (three
fragments), can be easily distinguished from each other by gel
electrophoresis (see detailed description of the method in Ota et al.,
2007). Therefore, PCR-RFLP is an excellent method that can be used for
fast, cheap, and reliable genotyping of diagnostic markers.
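The fragment patterns described above can be illustrated with a minimal in silico digest. This is only a sketch under stated assumptions, not the workflow used in this study: the 26-bp sequences are invented toy amplicons, the function names `digest` and `banding_pattern` are hypothetical, and only top-strand fragment lengths are considered. EcoRI (recognition site GAATTC, cutting after the first G) serves merely as a familiar example enzyme.

```python
def digest(amplicon: str, site: str, cut_offset: int) -> list[int]:
    """Return fragment lengths after cutting at every occurrence of `site`.

    `cut_offset` is the cut position within the recognition site on the
    top strand (1 for EcoRI, which cuts G^AATTC).
    """
    cuts = []
    start = 0
    while True:
        idx = amplicon.find(site, start)
        if idx == -1:
            break
        cuts.append(idx + cut_offset)
        start = idx + 1  # keep scanning for further sites
    bounds = [0] + cuts + [len(amplicon)]
    # Drop zero-length pieces that arise if a cut falls on an amplicon end.
    return [b - a for a, b in zip(bounds, bounds[1:]) if b > a]


def banding_pattern(allele_1: str, allele_2: str,
                    site: str, cut_offset: int) -> list[int]:
    """Distinct band sizes expected on a gel for a diploid genotype."""
    bands = set(digest(allele_1, site, cut_offset))
    bands |= set(digest(allele_2, site, cut_offset))
    return sorted(bands, reverse=True)


# Toy alleles (hypothetical): they differ by one SNP inside the site.
CUT_ALLELE = "ACGTACGTAC" + "GAATTC" + "TTGGCCAATT"    # site intact
UNCUT_ALLELE = "ACGTACGTAC" + "GAGTTC" + "TTGGCCAATT"  # A>G destroys site

ECORI_SITE, ECORI_OFFSET = "GAATTC", 1

hom_cut = banding_pattern(CUT_ALLELE, CUT_ALLELE, ECORI_SITE, ECORI_OFFSET)
hom_uncut = banding_pattern(UNCUT_ALLELE, UNCUT_ALLELE, ECORI_SITE, ECORI_OFFSET)
het = banding_pattern(CUT_ALLELE, UNCUT_ALLELE, ECORI_SITE, ECORI_OFFSET)

print(hom_cut, hom_uncut, het)
```

In this toy case, the homozygote for the cut allele yields two bands, the homozygote for the uncut allele a single full-length band, and the heterozygote all three, which is the pattern that makes the genotypes distinguishable on a gel.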
Recently, we sequenced 453 genomes of a very young species flock of
Nicaraguan Midas cichlid fishes (Amphilophus cf. citrinellus; Kautt et
al., 2020). This species complex includes, so far, 13 described species
(Torres-Dowdall & Meyer, in press). Two species (A. citrinellus and
A. labiatus) can be found in both
Great Lakes Managua and Nicaragua (Barluenga, Stölting, Salzburger,
Muschick, & Meyer, 2006). From there, seven crater lakes (Apoyeque,
Apoyo, As. León, As. Managua, Masaya, Tiscapa and Xiloá) have been
colonized (Elmer et al., 2014; Elmer, Lehtonen, Fan, & Meyer, 2013;
Elmer, Lehtonen, & Meyer, 2009). In two of the
crater lakes, Apoyo and Xiloá, six and four endemic species have been
described, respectively (Barlow & Munsey, 1976; Geiger, McCrary, &
Stauffer Jr, 2010; Recknagel, Kusche, Elmer, & Meyer, 2013; Stauffer
Jr, McCrary, & Black, 2008; Stauffer Jr & McKaye, 2002). In Crater
Lake As. Managua, another endemic species, A. tolteca, has been
formally described (Recknagel et al., 2013), while the species of the
other crater lakes await formal description (which is why we refer to
them here as ‘populations’).
Crater lake populations and the sympatric species within them clearly
form separate genetic clusters based on both RAD-sequencing data (Kautt,
Machado-Schiaffino, & Meyer, 2018) and whole-genome data (Kautt et al.,
2020). While all crater lake populations and species differ
morphologically (Elmer, Kusche, Lehtonen, & Meyer, 2010;
Kautt et al., 2018), species assignment can be difficult, especially
when specimens are young, and particularly for the sympatric species
from crater lakes Apoyo and Xiloá. Therefore, methods to quickly
genotype fish using genetic markers would add confidence to species
assignments and would also allow juvenile fish to be identified to
species. This is important for certain research questions, including,
for example, cohort analyses and unbiased frequency estimations.
Moreover, several of these species are protected or live in
protected environments where illegal fishing occurs. Cheap genotyping
assays with a fast turnaround time might contribute to conservation
monitoring.
The objectives of this study were therefore to (1) design a workflow to
screen for suitable PCR-RFLP markers for species and population
assignment, (2) test in silico whether those markers would allow
unambiguous assignment, and (3) perform PCR-RFLP assays on independent
samples (i.e., samples that were not used for the design of the markers
in (1)) to test whether the markers are suitable for assigning species
and populations (i.e., lakes of origin).