2.3 Genetic diversity and population genetic structure
For each population, we evaluated genetic diversity and population genetic structure using standard metrics in POPGENE 1.32 (Wang, 1996; Yeh, Yang & Boyle, 1999). These metrics included the number of individuals (N), percentage of polymorphic loci (PPL), observed number of alleles (N_a), effective number of alleles (N_e), Shannon’s information index (I; Lewontin, 1972), expected heterozygosity (H_e; Kimura & Crow, 1964), Nei’s genetic diversity (h), total gene diversity (H_t), the average gene diversity within populations (H_s), and Nei’s standard genetic distance (GD). We also calculated the degree of genetic differentiation between populations as G_ST = (H_t - H_s)/H_t (Nei, 1973) and the gene flow parameter as N_m = 0.5(1 - G_ST)/G_ST (McDermott & McDonald, 1993). Similarly, we estimated genetic diversity parameters, coefficients of gene differentiation, and gene flow for the eight AFLP primer pairs.
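As a minimal illustration of the two formulas above, the following Python sketch computes G_ST and N_m for a single biallelic locus. It is not POPGENE's implementation: the function names and the toy allele frequencies are hypothetical, populations are weighted equally, and allele frequencies are assumed to be already estimated.

```python
def diversity_partition(freqs):
    """Return (H_t, H_s, G_ST) for one biallelic locus.

    freqs: frequency of one allele in each population (hypothetical input).
    Assumes equal population weights.
    """
    n = len(freqs)
    # H_s: mean within-population gene diversity, 2p(1 - p) per population.
    h_s = sum(2.0 * p * (1.0 - p) for p in freqs) / n
    # H_t: gene diversity of the pooled (mean) allele frequency.
    p_bar = sum(freqs) / n
    h_t = 2.0 * p_bar * (1.0 - p_bar)
    if h_t == 0.0:
        raise ValueError("locus is monomorphic overall; G_ST is undefined")
    return h_t, h_s, (h_t - h_s) / h_t


def gene_flow(g_st):
    """N_m = 0.5 (1 - G_ST) / G_ST, as given in the text."""
    if g_st <= 0.0:
        raise ValueError("N_m is undefined for G_ST <= 0")
    return 0.5 * (1.0 - g_st) / g_st


if __name__ == "__main__":
    # Toy data: two populations with strongly diverged allele frequencies.
    h_t, h_s, g_st = diversity_partition([0.2, 0.8])
    print(round(h_t, 4), round(h_s, 4), round(g_st, 4), round(gene_flow(g_st), 4))
```

In this formula, a larger G_ST gives a smaller N_m, so strongly differentiated populations correspond to a low estimate of gene exchange.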
To identify partitions of sampling sites that are genetically homogeneous but maximally differentiated from each other, we conducted a spatial analysis of molecular variance in SAMOVA 1.0 (Dupanloup, Schneider, & Excoffier, 2002) based on the AFLP datasets. Within SAMOVA, we used a K-means method to select the best clustering among groups of populations based on the coefficient of genetic variation among groups (F_CT) (Li et al., 2020). For values of K from two to ten, we ran 100 simulated annealing processes of 10,000 steps each. We selected the value of K that maximized F_CT as the optimal grouping of populations. Using the optimal grouping identified in SAMOVA 1.0, we evaluated the genetic variation between groups and between populations within groups via an analysis of molecular variance (AMOVA; Excoffier, Smouse, & Quattro, 1992) in ARLEQUIN v3.01 (Excoffier, Laval, & Schneider, 2005). Neutrality tests, such as Tajima’s D and Fu’s Fs, were also calculated in ARLEQUIN. Subsequently, we tested the correlation between pairwise F_ST, inferred from the binary matrix of scored AFLPs, and the geographic distance between populations via a Mantel test (Mantel, 1967) in GenAlEx 6.5 (Peakall & Smouse, 2012), with 9999 permutations to evaluate significance.
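The logic of a permutation-based Mantel test can be sketched as follows. This is an illustrative, one-sided version written with the Python standard library only, not GenAlEx's implementation; the function names, the seed, and the toy matrices are hypothetical.

```python
import random
from itertools import combinations
from math import sqrt


def _upper(matrix, order):
    """Upper-triangle entries of a square matrix after reordering its rows/columns."""
    return [matrix[order[i]][order[j]]
            for i, j in combinations(range(len(order)), 2)]


def _pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mantel(genetic, geographic, permutations=9999, seed=0):
    """Return (r, p) for the matrix correlation of two distance matrices.

    Rows and columns of the genetic matrix are permuted together; p is the
    share of permutations (plus the observed one) with r >= observed r.
    """
    n = len(genetic)
    identity = list(range(n))
    geo_vec = _upper(geographic, identity)
    r_obs = _pearson(_upper(genetic, identity), geo_vec)
    rng = random.Random(seed)
    hits = 0
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)
        if _pearson(_upper(genetic, order), geo_vec) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (permutations + 1)


if __name__ == "__main__":
    # Toy data: five sites on a line, genetic distance proportional to geography.
    coords = [0.0, 1.0, 2.0, 3.0, 4.0]
    geo = [[abs(a - b) for b in coords] for a in coords]
    gen = [[0.1 * d for d in row] for row in geo]
    print(mantel(gen, geo))
```

Permuting whole rows and columns, rather than individual matrix cells, preserves the dependence among pairwise distances that share a population, which is why the test works on population labels.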
To further investigate the genetic associations among the 43 populations of P. villosa, we used the SAHN module in NTSYS-pc 2.10e (Rohlf, 1997) to generate a UPGMA tree from the genetic distance matrix derived from the binary AFLP dataset, and also carried out a principal coordinate analysis (PCoA) based on the same distance matrix. Moreover, we constructed a similarity-based network in SplitsTree 4.13 (Huson & Bryant, 2006) to infer the relationships between individuals and populations by applying the Neighbor-Net algorithm with Jaccard’s distance measure.
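To make the distance-based clustering concrete, the sketch below computes Jaccard distances between binary band profiles and builds a UPGMA topology from them. It is an assumption-laden illustration, not the NTSYS-pc or SplitsTree procedure: in the analysis described above, Jaccard's measure was used for the Neighbor-Net network, whereas here it simply supplies a convenient distance matrix for UPGMA. The function names and the four toy profiles are hypothetical.

```python
def jaccard_distance(a, b):
    """1 - (shared bands / bands present in either profile) for 0/1 vectors."""
    shared = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    return 0.0 if either == 0 else 1.0 - shared / either


def upgma(labels, dist):
    """Return the UPGMA topology as nested tuples (branch lengths omitted)."""
    n = len(labels)
    clusters = {i: (labels[i], 1) for i in range(n)}  # id -> (subtree, size)
    d = {(i, j): dist[i][j] for i in range(n) for j in range(i + 1, n)}
    next_id = n
    while len(clusters) > 1:
        # Join the closest pair of clusters.
        (i, j), _ = min(d.items(), key=lambda kv: kv[1])
        (tree_i, size_i), (tree_j, size_j) = clusters.pop(i), clusters.pop(j)
        # Distance to the new cluster is the size-weighted average.
        new = {}
        for k in clusters:
            d_ik = d[(min(i, k), max(i, k))]
            d_jk = d[(min(j, k), max(j, k))]
            new[(k, next_id)] = (size_i * d_ik + size_j * d_jk) / (size_i + size_j)
        d = {pair: v for pair, v in d.items() if i not in pair and j not in pair}
        d.update(new)
        clusters[next_id] = ((tree_i, tree_j), size_i + size_j)
        next_id += 1
    return next(iter(clusters.values()))[0]


if __name__ == "__main__":
    # Toy band profiles (1 = band present, 0 = absent).
    profiles = {
        "A": [1, 1, 0, 0, 1],
        "B": [1, 1, 0, 0, 0],
        "C": [0, 0, 1, 1, 1],
        "D": [0, 0, 1, 1, 0],
    }
    names = list(profiles)
    matrix = [[jaccard_distance(profiles[p], profiles[q]) for q in names]
              for p in names]
    print(upgma(names, matrix))
```

Jaccard's measure ignores shared band absences, which is often preferred for dominant markers because a shared absence need not reflect shared ancestry.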
In addition, we inferred groupings and genetic structure of populations of P. villosa using STRUCTURE v2.2 (Pritchard, Stephens, & Donnelly, 2000; Falush, Stephens, & Pritchard, 2007; Hubisz, Falush, Stephens, & Pritchard, 2009), which, unlike SAMOVA, does not require groups to be geographically adjacent. In STRUCTURE, we used an admixture model with independent allele frequencies and performed 90 independent runs for numbers of clusters (K) ranging from one to ten. We applied 2 × 10^5 repetitions of the Markov chain Monte Carlo with a burn-in of 25%. To determine the best value of K for the STRUCTURE analyses, we used the ΔK method (Evanno, Regnaut, & Goudet, 2005).
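The ΔK criterion can be sketched in a few lines. Evanno et al. (2005) write the statistic as ΔK = m(|L''(K)|)/s[L(K)], where L(K) is the log-likelihood reported by a run at K clusters. The version below is one simple reading that takes the absolute second difference of the per-K mean log-likelihoods and divides it by the standard deviation across runs at K; the function name and the toy log-likelihoods are hypothetical, and this is not the code used for the analyses described above.

```python
from statistics import mean, stdev


def delta_k(lnp_by_k):
    """Return {K: ΔK} from {K: [log-likelihood of each replicate run]}.

    ΔK is only defined where runs exist for K - 1, K, and K + 1, so the
    smallest and largest K tested never receive a value.
    """
    means = {k: mean(runs) for k, runs in lnp_by_k.items()}
    result = {}
    for k in sorted(lnp_by_k):
        if k - 1 not in means or k + 1 not in means:
            continue
        sd = stdev(lnp_by_k[k])
        if sd == 0:
            continue  # undefined when replicate runs are identical
        second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
        result[k] = second_diff / sd
    return result


if __name__ == "__main__":
    # Toy log-likelihoods for three replicate runs at each K.
    toy = {
        1: [-1000.0, -1002.0, -998.0],
        2: [-800.0, -801.0, -799.0],
        3: [-790.0, -795.0, -785.0],
        4: [-785.0, -790.0, -780.0],
    }
    scores = delta_k(toy)
    print(scores, "best K:", max(scores, key=scores.get))
```

Because ΔK needs both neighbours of K, it cannot support K = 1; a single-cluster solution has to be judged from the log-likelihoods directly.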