Association analyses suggest colour is regulated via an epistatic interaction between two candidate loci
Mapping RADseq reads against reference genome resulted in 95% (± 0.6 SD) mapping rate, where an average of 9 million reads (± 5mio SD) were utilized per individual. Catalogues construction resulted in the discovery of 1165325 loci, with an average size of 185.7 sites and an average coverage of 46.2x (± 20.9x SD). Filtering for minor allelic frequency, presence in 80% of the dataset, and collecting one random SNP per RAD-locus resulted in a panel of 19212 bi-allelic SNPs. The distribution of RADseq reads across genomic regions of candidate loci for coloration implied that our strategy indeed covered not only flanking regions but also the coding sequence, despite not a single locus in the coding sequence has survived the variant calling pipeline (Table S5A and B). The search for structure at a meta-population level identified two clusters, though neither were associated with colour, family, or temporal stamp (Fig. S2-S3). Numbers varied between clusters, with cluster 1 containing most of the samples. Association studies jointly revealed a total of 2 markers concordantly associated with the colour phenotype, which we will henceforth refer to as 602-G/C(log(p) = 6.32) and 3657-C/T (log(p) = 4.43) (Fig. 2; Table 2). Exploring the effect size of those associations revealed a high predictive power for the colour “grey” either when genotypes are analysed independently for each locus or jointly. Specifically, independent genotype combinations showed that heterozygotes and homozygotes for the least common allele to have between 70% and 90% probability of being grey and between to 55% to 100% when genotypes of both loci are interpreted together (Fig. 2). Notably, the probability increase is explained by the number of copies of the less common allele in each genotype (w-test= C in 602-G/C , w = 20.28, 2, p = 1.0e-05; T in 3657-C/T w = 20.29, p = 1.0e-05) (Table S6-S11)