Calling workflow comparisons
Our final GT-seq dataset, called using the published GT-seq pipeline (Campbell et al., 2015), included 325 autosomal SNPs and 2 sex-linked markers. An additional 3 loci were removed from the BCFTOOLS workflow datasets after filtering for minimum depth (a minimum depth of 6 or 10, hereafter BCF-6 and BCF-10), leaving each with 322 autosomal loci and 2 sex-linked markers. We removed the same 3 loci from our GT-seq pipeline dataset to enable direct comparison of genotypes and missing data by locus across the calling methods. Across all 457 samples, the GT-seq calling pipeline averaged 25.4% missing data, compared with 23.9% and 21.3% for the BCF-10 and BCF-6 calling workflows, respectively. Genotype mismatch with the GT-seq pipeline averaged 1.1% for both BCFTOOLS calling workflows. Based on these results and the potential for easy comparison with existing ddRADseq data, we chose the dataset generated from the BCF-6 calling workflow to assess genotyping error and analyze population structure.
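The comparison metrics reported above can be illustrated with a minimal sketch. It is not the code used in this study: the function names, the in-memory genotype layout (one row per sample, one column per locus, `None` for a missing call), and the toy genotypes are all hypothetical. It also assumes that mismatch is scored only at sample-by-locus cells called in both datasets and that allele order within a genotype is ignored, neither of which is stated in the text.

```python
from typing import List, Optional

# Hypothetical layout: rows are samples, columns are loci.
# A genotype is a two-character string such as "AG"; None marks a missing call.
Genotype = Optional[str]
CallSet = List[List[Genotype]]


def missing_rate(calls: CallSet) -> float:
    """Proportion of all sample-by-locus cells with no genotype call."""
    total = sum(len(row) for row in calls)
    missing = sum(g is None for row in calls for g in row)
    return missing / total


def missing_rate_by_locus(calls: CallSet) -> List[float]:
    """Proportion of samples with no call, computed separately per locus."""
    return [sum(g is None for g in col) / len(col) for col in zip(*calls)]


def mismatch_rate(a: CallSet, b: CallSet) -> float:
    """Proportion of discordant genotypes among cells called in BOTH sets.

    Assumption: cells missing in either set are skipped, and "AG" is
    treated as identical to "GA".
    """
    compared = 0
    mismatched = 0
    for row_a, row_b in zip(a, b):
        for ga, gb in zip(row_a, row_b):
            if ga is None or gb is None:
                continue
            compared += 1
            if sorted(ga) != sorted(gb):
                mismatched += 1
    return mismatched / compared if compared else float("nan")


# Toy data (2 samples x 3 loci), invented purely for illustration.
gtseq_calls: CallSet = [["AA", "AG", None], ["GG", None, "CT"]]
bcf6_calls: CallSet = [["AA", "GG", "CC"], ["GG", "AT", "TC"]]

print(missing_rate(gtseq_calls), missing_rate(bcf6_calls))
print(missing_rate_by_locus(gtseq_calls))
print(mismatch_rate(gtseq_calls, bcf6_calls))
```

Restricting the mismatch denominator to jointly called cells keeps differences in missing data from inflating the apparent discordance between workflows; a different convention would give different values.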