Identifying sex-linked DArT SNPs for Macquarie perch and golden perch
We used DArTseq (Kilian et al., 2012), a reduced-representation sequencing method, to genotype 93 female and 78 male Macquarie perch from the Dartmouth and Yarra populations, and 41 female and 25 male golden perch from the Macquarie, Murray and Murrumbidgee populations (samples in Supplementary Material S1). The DArTseq approach is similar to double-digest restriction-site-associated DNA sequencing, except that it ensures high SNP quality by including ~25% of technical replicates from independent libraries in each sequencing lane and rejecting SNPs with low reproducibility between technical replicates. Sequencing libraries were prepared at Diversity Arrays Technology Pty Ltd (Canberra, Australia) following Kilian et al. (2012). DNA samples were digested using a combination of the restriction enzymes Pst I and Sph I, which target low-copy genomic regions. Pooled libraries were sequenced using single-read technology on an Illumina HiSeq2500 (94 samples per lane; details in Appendix A). SNP discovery and genotyping were performed using DArT P/L’s proprietary analytical pipeline (detailed in Nguyen, Premachandra, Kilian, & Knibb, 2018). Briefly, the primary pipeline removed poor-quality sequences, applying more stringent criteria to the barcode region than to the rest of the sequence, and corrected low-quality bases in singleton tags using collapsed tags with multiple members as a template. The secondary pipeline (DArTsoft14) then parsed clusters, comprising 69-bp sequenced tags differing by no more than 3 bases, into separate SNP loci while checking the balance of read counts between the two alleles of each locus: loci with a 5-fold or higher difference in read counts between alleles were rejected. SNPs with reproducibility <95% were removed, but no other filtering was performed. DArT loci for each perch species were aligned to their respective newly assembled reference genomes (Table 1) using BLAST, retaining hits with e-value ≤5e-5 and sequence identity ≥90%.
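The three numeric thresholds stated above (allelic read-count balance, reproducibility, and BLAST hit quality) can be illustrated with a minimal Python sketch. This is not the proprietary DArT pipeline or DArTsoft14. The `Locus` class, its field names, and the function names are hypothetical and introduced only for illustration, and treating a locus with zero reads for one allele as unbalanced is our assumption.

```python
from dataclasses import dataclass


@dataclass
class Locus:
    """Hypothetical per-locus summary; field names are illustrative only."""
    name: str
    ref_reads: float        # read count for the first allele
    alt_reads: float        # read count for the second allele
    reproducibility: float  # concordance between technical replicates, 0 to 1


def passes_read_balance(locus: Locus, max_fold: float = 5.0) -> bool:
    """Reject loci with a 5-fold or higher difference in allelic read counts."""
    low, high = sorted((locus.ref_reads, locus.alt_reads))
    if low <= 0:
        # Assumption: an allele with no reads is treated as unbalanced.
        return False
    return high / low < max_fold


def passes_reproducibility(locus: Locus, min_rep: float = 0.95) -> bool:
    """Remove SNPs with reproducibility below 95%."""
    return locus.reproducibility >= min_rep


def keep_blast_hit(evalue: float, pct_identity: float,
                   max_evalue: float = 5e-5,
                   min_identity: float = 90.0) -> bool:
    """Keep alignments with e-value <= 5e-5 and sequence identity >= 90%."""
    return evalue <= max_evalue and pct_identity >= min_identity
```

Under these rules a locus with 10 and 50 reads for its two alleles sits exactly at the 5-fold boundary and is rejected, whereas a locus with reproducibility of exactly 0.95 is retained.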
We tested whether each DArT SNP locus belonged to one of four types: (i) Y-linked: homozygous in males and absent in females; (ii) XY-gametologs: homozygous in females and heterozygous in all males; (iii) loci with a male-specific allele: homozygous in females and heterozygous in >10% of males; and (iv) loci with a female-specific allele: heterozygous in >10% of females and homozygous in males. The tests were performed using the gl.sexlinkage function of the dartR package (Gruber, Unmack, Berry, & Georges, 2018) in R (R Core Team, 2020) (details in Appendix B). To reduce the number of false positives due to small sample size, only loci successfully scored in >75% (>58) of male Macquarie perch and >95% (>23) of male golden perch were considered; the thresholds were set on males because male sample sizes were smaller than female sample sizes in both species.
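The four locus types and the male call-rate threshold can be written out as explicit rules. The sketch below is in Python and is not the gl.sexlinkage implementation in dartR. The genotype coding (0 and 2 for the two homozygotes, 1 for heterozygotes, `None` for not scored), the function names, and the order in which the rules are checked are our assumptions, as is reading "homozygous" as "every scored individual is homozygous" and "absent in females" as "not scored in any female".

```python
from typing import List, Optional, Sequence

Genotype = Optional[int]  # assumed coding: 0/2 homozygous, 1 heterozygous, None not scored


def scored(genos: Sequence[Genotype]) -> List[int]:
    """Return only the genotypes that were successfully scored."""
    return [g for g in genos if g is not None]


def call_rate(genos: Sequence[Genotype]) -> float:
    """Proportion of individuals with a scored genotype."""
    return len(scored(genos)) / len(genos) if genos else 0.0


def het_fraction(genos: Sequence[Genotype]) -> float:
    """Proportion of scored individuals that are heterozygous."""
    s = scored(genos)
    return sum(g == 1 for g in s) / len(s) if s else 0.0


def all_homozygous(genos: Sequence[Genotype]) -> bool:
    """True if at least one individual is scored and none is heterozygous."""
    s = scored(genos)
    return bool(s) and all(g in (0, 2) for g in s)


def classify_locus(females: Sequence[Genotype], males: Sequence[Genotype],
                   min_male_call_rate: float,
                   min_het: float = 0.10) -> Optional[str]:
    """Assign a locus to one of the four types, or None if it fits none."""
    # Only loci scored in more than the threshold proportion of males are considered.
    if call_rate(males) <= min_male_call_rate:
        return None
    if not scored(females):
        # (i) absent in females, homozygous in males
        return "Y-linked" if all_homozygous(males) else None
    if all_homozygous(females):
        frac = het_fraction(males)
        if frac == 1.0:
            return "XY-gametolog"          # (ii) heterozygous in all males
        if frac > min_het:
            return "male-specific allele"  # (iii) heterozygous in >10% of males
        return None
    if all_homozygous(males) and het_fraction(females) > min_het:
        return "female-specific allele"    # (iv) heterozygous in >10% of females
    return None
```

For example, `classify_locus([0, 0, 0, 0], [1, 1, 1, 1], 0.75)` falls under rule (ii), while the same females with males `[1, 0, 0, 0]` fall under rule (iii). With 78 male Macquarie perch, a call rate above 0.75 corresponds to more than 58.5 individuals, that is, at least 59 scored males; with 25 male golden perch, a call rate above 0.95 corresponds to at least 24.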