Materials and Methods

The research objects –

As an ideal stress test for evaluating the repeatability of morphometric studies in insect systematics, we selected ten specimens from each species of a cryptic species pair, Nesomyrmex devius (Csősz & Fisher, 2016) and N. hirtellus (Csősz & Fisher, 2016), for a total of twenty ant specimens. Every trait under observation shows overlapping ranges (Seifert 2009); thus, these species can be classified only by multivariate methods. Today, cryptic species pairs are considered the most difficult cases and pose extraordinary challenges to systematic biology.
The material is deposited in the California Academy of Sciences, San Francisco, California, U.S.A. The full list of material examined morphometrically in this work is given in Supplementary Table S1 (available on Dryad at https://doi.org/10.5061/dryad.q83bk3jfq). Because two specimens were damaged during consecutive postal shipments over the course of the project and could not be measured by subsequent gaugers, the final analyses were done on only 18 individuals. The ant specimens used in this study comply with the regulations for export and exchange of research samples outlined in the Convention on Biological Diversity and the Convention on International Trade in Endangered Species of Wild Fauna and Flora. For field work conducted in Madagascar, permits to research, collect, and export ants were obtained from the Ministry of Environment and Forest as part of an ongoing collaboration between the California Academy of Sciences and the Ministry of Environment and Forest, Madagascar National Parks and Parc Botanique et Zoologique de Tsimbazaza (Approval Numbers: N° 0142N/EA03/MG02, N° 340N-EV10/MG04, N° 69 du 07/04/06, N° 065N-EA05/MG11, N° 047N-EA05/MG11, N° 083N-A03/MG05, N° 206 MINENVEF/SG/DGEF/DPB/SCBLF, N° 0324N/EA12/MG03, N° 100 l\fEF/SG/DGEF/DADF/SCBF, N° 0379N/EA11/MG02, N° 200N/EA05/MG02). Authorization for export was provided by the Director of Natural Resources.

Gaugers –

We asked whether the morphometric measurements performed by eleven gaugers (“measurers”) could be considered repeatable based on statistical thresholds. Eleven volunteers from three continents and six countries, with different levels of taxonomic training and skill, were asked to measure the same set of ant specimens twice with their own equipment. Eight of the volunteers are myrmecologists and three are non-myrmecologists (two are wasp specialists and one is a dipterologist). The wide range of the observers’ morphometric skills and the different levels of laboratory facilities and equipment, especially the types of microscopes used, provided an overview of morphometric reproducibility as it works in daily practice. Gaugers' data appear anonymously in this paper. To convey the most important information about their skills and the quality of their equipment, each gauger is coded as a triad of three fields separated by underscores: field of expertise, estimated total number of specimens measured in their career, and maximum magnification of the microscope used in the present study (e.g. MYRM_9000_100x).
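The triad coding can be unpacked mechanically. The following is a minimal sketch, assuming that the three fields are always separated by underscores and that the magnification field ends in "x"; the class, function, and field names are illustrative and are not part of the study's protocol.

```python
from typing import NamedTuple


class GaugerCode(NamedTuple):
    expertise: str           # field of expertise, e.g. "MYRM" (assumed label)
    specimens_measured: int  # estimated career total of specimens measured
    max_magnification: int   # maximum magnification used in the present study


def parse_gauger_code(code: str) -> GaugerCode:
    """Split a triad such as 'MYRM_9000_100x' into its three fields."""
    expertise, specimens, magnification = code.split("_")
    if not magnification.endswith("x"):
        raise ValueError(f"magnification must end in 'x': {code!r}")
    return GaugerCode(expertise, int(specimens), int(magnification[:-1]))


print(parse_gauger_code("MYRM_9000_100x"))
```

A structured form like this makes it straightforward to group or sort gaugers by experience or by microscope magnification when comparing error rates.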

The morphometric character recording protocol –

Gaugers were asked to measure 21 continuous morphometric characters in each specimen twice, in order to collect data for testing both the intra-gauger error, equivalent to repeatability, and the inter-gauger error rate, equivalent to reproducibility. Every gauger received the same measurement protocol, including visual and verbatim trait definitions (Fig. 2 and Table 1). The protocol was assembled from an existing set of characters used in published papers (Seifert, 2006, 2018; Csősz & Fisher, 2016; Schlick-Steiner et al., 2006; Wagner et al., 2017). In the current work, we asked to what extent random and systematic errors affect reproducibility. Therefore, all gaugers were encouraged to eliminate extraordinary differences due to gross error (arising from misreading, mistyping, or an erroneously set magnification) by comparing the values of the repeated observations.
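The comparison of repeated observations used to catch gross errors could be sketched as follows. This is only an illustration under stated assumptions: the protocol does not specify a numerical criterion, so the 10% relative tolerance, the function name, and the example values are all hypothetical.

```python
def flag_gross_errors(first, second, rel_tol=0.10):
    """Return indices of characters whose two readings differ by more than
    rel_tol relative to their mean (rel_tol is an assumed, illustrative value)."""
    flagged = []
    for i, (a, b) in enumerate(zip(first, second)):
        mean = (a + b) / 2
        if mean > 0 and abs(a - b) / mean > rel_tol:
            flagged.append(i)
    return flagged


# Hypothetical readings of three characters from one specimen.
# The second value of the repeat looks like a typing error (30 instead of 300).
first_round = [512.0, 300.0, 145.0]
second_round = [515.0, 30.0, 147.0]

print(flag_gross_errors(first_round, second_round))
```

Flagged characters would then be re-measured or re-checked by the gauger, while small differences between rounds are retained as the random error that the repeatability analysis is meant to quantify.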
Table 1. Verbatim trait definitions for morphometric character recording.