3.1 Species delimitation
The complete data set consisted of 1,492 barcodes, ranging from 312 to 658 bp in length. In total, there were 319 variable sites (48.5%), of which 299 (93.7%) were parsimony informative. Most parsimony-informative sites occurred in the third codon position (Table 1). The sequences were heavily AT-biased, especially in the third codon position, which exhibited a combined average AT composition of 89.3% (Table 1). Average intraspecific and interspecific K2P distances for all analyzed Polypedilum species were 1.3% and 15.2%, respectively. The barcoding gap is an important concept in barcoding studies (Puillandre et al., 2012). Barcoding works well when intraspecific divergence is much smaller than interspecific divergence; when this condition is met, a ‘barcoding gap’ exists (Meyer & Paulay, 2005). In general, our data showed clearly larger interspecific than intraspecific divergences, but we still could not observe the expected ‘barcoding gap’ in the pairwise K2P distances. Instead, the intraspecific and interspecific distances overlapped, which may be attributable to cryptic species diversity and a few misidentifications. The lack of a gap is usually associated with recently diverged species with little genetic differentiation, frequently coupled with incomplete lineage sorting and introgression (Wiemers & Fiedler, 2007; Dupuis et al., 2012).
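The barcoding-gap criterion described above can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the function names (`k2p_distance`, `barcode_gap`) and the toy sequences are hypothetical, and the gap is checked in its strictest form, assumed here to mean that the largest intraspecific distance is smaller than the smallest interspecific distance. The K2P distance follows the standard Kimura two-parameter formula, d = -0.5 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitions and transversions.

```python
import math
from itertools import combinations

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a, seq_b):
    """Kimura 2-parameter distance between two aligned DNA sequences."""
    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1  # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    term_1 = 1 - 2 * p - q
    term_2 = 1 - 2 * q
    if term_1 <= 0 or term_2 <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(term_1 * math.sqrt(term_2))


def barcode_gap(sequences):
    """sequences: list of (species_label, aligned_sequence) pairs.

    Returns (max intraspecific, min interspecific, gap present?).
    """
    intra, inter = [], []
    for (sp_a, s_a), (sp_b, s_b) in combinations(sequences, 2):
        d = k2p_distance(s_a, s_b)
        (intra if sp_a == sp_b else inter).append(d)
    if not intra or not inter:
        raise ValueError("need both intra- and interspecific pairs")
    return max(intra), min(inter), max(intra) < min(inter)


# Hypothetical toy alignment: two species, two sequences each.
toy_sequences = [
    ("sp_A", "ATGCATGCATGCATGCATGC"),
    ("sp_A", "ATGCATGCATGCATGCATGT"),
    ("sp_B", "TTGCTTGCTTGCTTGCATGC"),
    ("sp_B", "TTGCTTGCTTGCTTGCATGT"),
]
max_intra, min_inter, has_gap = barcode_gap(toy_sequences)
```

In the toy alignment the two distance classes do not overlap, so `has_gap` is true. An overlap of the kind reported above corresponds to the case in which `max_intra` equals or exceeds `min_inter`.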
Overall, most of the tested methods recovered similar groupings of molecular operational taxonomic units (MOTUs) (Figures 1-4), with the mPTP method being the most conservative, lumping the sequences into fewer MOTUs, and the bPTP algorithm the most relaxed, splitting the sequences into more MOTUs (Table 2). Two of the three distance-based methods, ABGD and ASAP, yielded unreliable delimitations with wide confidence intervals: several clusters did not reflect relationships as understood from the geographical sampling localities, and others were divided into numerous lineages despite low divergence between them. ABGD and ASAP results were therefore not included in Figures 1-4. The BIN analysis returned a total of 415 MOTUs, of which 174 were singleton BINs, 222 concordant BINs, and 19 discordant BINs. In total, 615 sequences of 143 morphospecies were assigned to 179 BINs, including 72 singleton BINs, 519 concordant BINs, and 24 discordant BINs. The 877 unidentified specimens, without binomial names, were assigned to 236 BIN-species, including 102 singleton BINs, 118 concordant BINs, and 16 discordant BINs.
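The three BIN categories reported above can be tallied with a short sketch. It assumes the usual reading of the terms: a singleton BIN contains one sequence, a concordant BIN contains several sequences that all carry the same name, and a discordant BIN contains sequences with different names. The function `classify_bins`, the BIN identifiers, and the species labels are hypothetical placeholders, not data from this study.

```python
from collections import Counter, defaultdict


def classify_bins(records):
    """records: iterable of (bin_id, species_name) pairs, one per sequence.

    Returns a dict mapping each bin_id to 'singleton', 'concordant',
    or 'discordant'.
    """
    members = defaultdict(list)
    for bin_id, species in records:
        members[bin_id].append(species)

    categories = {}
    for bin_id, names in members.items():
        if len(names) == 1:
            categories[bin_id] = "singleton"   # only one sequence in the BIN
        elif len(set(names)) == 1:
            categories[bin_id] = "concordant"  # all members share one name
        else:
            categories[bin_id] = "discordant"  # members carry different names
    return categories


# Hypothetical records: (BIN identifier, morphospecies label).
toy_records = [
    ("BIN-1", "sp_A"),
    ("BIN-2", "sp_B"),
    ("BIN-2", "sp_B"),
    ("BIN-3", "sp_C"),
    ("BIN-3", "sp_D"),
]
bin_categories = classify_bins(toy_records)
bin_tally = Counter(bin_categories.values())
```

Here `bin_tally` counts BINs per category, which is the form in which the totals above are reported.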
DNA-based species delimitation applying bPTP, mPTP, sPTP, and sGMYC resulted in divergent numbers of clusters. The single-threshold general mixed Yule-coalescent calculations (sGMYC) recovered 370 MOTUs, while the sPTP model produced a more conservative number of MOTUs (411) than the bPTP method, which yielded 520 MOTUs (Table 2). The results from analyses using the multi-rate PTP (mPTP) model were comparable to those of the other models, but revealed larger clusters, occasionally joining lineages belonging to different species in a single MOTU (Figure 1). Divergences in the number of clusters generated by the different species delimitation algorithms are caused by erroneously inferred splitting or lumping events (i.e., specimens of one morphospecies were divided into two or more MOTUs, or specimens of different morphospecies were joined into a single MOTU). However, regardless of the method applied, the total number of species delimited in Polypedilum in this study (267–520) is roughly 1.9 to 3.6 times as high as the number of included morphospecies (143, see above).
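Splitting and lumping events of the kind described above can be detected by cross-tabulating morphospecies against MOTU assignments. The following Python sketch is illustrative only: the function `split_and_lumped`, the specimen identifiers, and the MOTU labels are hypothetical, and it flags disagreement between the two classifications without judging which of them is erroneous.

```python
from collections import defaultdict


def split_and_lumped(assignments):
    """assignments: iterable of (specimen_id, morphospecies, motu) triples.

    Returns two dicts:
      split  - morphospecies divided among two or more MOTUs
      lumped - MOTUs joining two or more morphospecies
    """
    motus_per_species = defaultdict(set)
    species_per_motu = defaultdict(set)
    for _specimen, species, motu in assignments:
        motus_per_species[species].add(motu)
        species_per_motu[motu].add(species)

    split = {sp: sorted(m) for sp, m in motus_per_species.items() if len(m) > 1}
    lumped = {m: sorted(s) for m, s in species_per_motu.items() if len(s) > 1}
    return split, lumped


# Hypothetical assignments from one delimitation method.
toy_assignments = [
    ("s1", "sp_A", "M1"),
    ("s2", "sp_A", "M2"),  # sp_A is split across M1 and M2
    ("s3", "sp_B", "M3"),
    ("s4", "sp_C", "M3"),  # sp_B and sp_C are lumped in M3
    ("s5", "sp_D", "M4"),
]
split, lumped = split_and_lumped(toy_assignments)
```

Running the same comparison for each delimitation method would show which morphospecies account for the differences in MOTU counts between methods.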