Phylogenetic patterns
We observed some phylogenetic structure within the data, such as the highest similarities occurring between certain congener pairs and an overall resemblance among congeneric species. Consistent with this, only very small deviations in proteomic patterns have been reported in cryptic species complexes (Müller et al., 2013, Dieme et al., 2014), specifically in those with only recent speciation (Maasz et al., 2020, Paulus et al., 2022). However, when all six calanoid genera with congeners were included in the analysis, similarity was not consistently related to phylogenetic distance and was in some cases higher between non-congeners than between congeners. Phylogenetic relationships have been identified successfully using proteomic composition (Telleria et al., 2010, Maltseva et al., 2020), and it has been suggested that proteomic fingerprints may describe phylogenetic relationships (Zurita et al., 2019). Our data, however, indicate that proteomic fingerprints are not suitable for addressing phylogenetic questions in calanoid copepods. This is plausible because proteomic fingerprints are a mixture of around 300 mainly cytosolic molecules whose underlying genes differ considerably in mutation rate, and the fingerprints are additionally influenced by various physiological processes.
For most genera, the higher similarity between congeners did not influence species identification success and is therefore probably not of practical relevance. However, the most pronounced misidentification when testing library robustness against regionality (i.e., the library did not include specimens from the respective region, but only from other regions) involved the highly similar congener pairs from the same sub-genus, A. danae and A. negligens, as well as C. typicus and C. chierchiae. This misidentification could not be resolved by the post-hoc test, which has been shown to detect false positives quite reliably (Rossel & Martinez Arbizu, 2018a). Since it occurred only when a non-region-specific library was used, this problem may be relevant only to monitoring studies in which a rare species in the habitat or a neobiotic species that belongs to a very similar congener pair is not included in the library used. We have demonstrated here that the composition of the reference library can have a significant impact on the identification of closely related species and therefore needs to be thoroughly tested.