Environmental Association Analysis
Our intra-population randomization approach showed that the predictive power of GEAM was much larger than expected by chance. This high predictive power was based on the genetic diversity found in five closely located populations whose geographical distances were not correlated with either genetic or environmental distances (Llanos-Garrido et al. 2019). A small number of inter-population randomizations and randomizations by neutral loci also yielded significant models, as expected given a certain degree of environmental pseudoreplication and genetic aggregation in our data. However, the rate of significance was close to 5%, i.e. the conventional type I error rate in statistical tests. On the other hand, our complete randomization approach, which included the critical outlier selection step, produced a relatively large number of significant models (25%, still much lower than the 100% obtained by the ‘correct’ intra-population approach). This confirmed that outlier analyses were able to sort through the randomized SNP databases and identify those SNPs that explain the greatest variance among arbitrary subgroups, in such a way that projecting that genetic variance onto the environmental PC-space around the five sampled populations resulted in significant association models. However, the environmental signal of these randomly genotyped SNPs was significantly smaller than that of the real data. This supports the idea that the particular SNPs selected by our EAA could be good proxies for the genetic variability involved in local adaptation to the different environmental conditions at each population. In addition, given the low standard error of the parameter estimates (Table 1), our final genotype-environment association model should be regarded as robust.
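The logic of comparing an observed association against randomized data can be illustrated with a generic permutation test. The sketch below is not the procedure used in this study: it assumes a single genetic score and a single environmental score per individual, uses the squared Pearson correlation as a stand-in for model fit, and runs on invented toy data. The function names `r_squared` and `permutation_p_value` are hypothetical.

```python
import random


def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


def permutation_p_value(genetic, environment, n_perm=999, seed=0):
    """Empirical p-value of the observed R^2 against shuffled genetic scores."""
    rng = random.Random(seed)
    observed = r_squared(genetic, environment)
    shuffled = list(genetic)
    exceed = 0
    for _ in range(n_perm):
        # Shuffling breaks any real genotype-environment link.
        rng.shuffle(shuffled)
        if r_squared(shuffled, environment) >= observed:
            exceed += 1
    # Add one to numerator and denominator so the p-value is never zero.
    return observed, (1 + exceed) / (n_perm + 1)


# Toy data, invented for illustration only (not from the study).
toy_rng = random.Random(1)
environment = [toy_rng.gauss(0, 1) for _ in range(100)]
genetic = [0.8 * e + toy_rng.gauss(0, 0.5) for e in environment]

observed_r2, p_value = permutation_p_value(genetic, environment)
print(f"R^2 = {observed_r2:.2f}, empirical p = {p_value:.3f}")
```

If the data carry no real association, roughly 5% of such tests would be expected to fall below the 0.05 threshold, which is the baseline against which the rates of significant randomized models reported above are compared.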