Population structure with individual-based clustering
A multivariate statistical approach was used to infer the genetic relatedness of individuals within the study plots in small patches (1- and 10-ha) and large patches (100-ha and continuous forests). Discriminant Analysis of Principal Components (DAPC) partitions the genetic variance into between-group and within-group components and maximizes discrimination between groups without assuming panmixia (Jombart et al., 2010). This makes the approach well suited to populations that are presumably partially clonal and genetically related owing to relatively recent isolation events. DAPC first transforms the data with principal component analysis (PCA) and then applies discriminant analysis (DA) to the retained principal components (PCs) to infer the number of clusters in the metapopulation. To select the number of PCs to retain in the DAPC, stratified cross-validation was performed by sampling a subset of the observations in each population and varying the number of PCs retained (while the number of discriminant functions remained fixed). The number of PCs with the best cross-validation score was taken as the optimal number for capturing most sources of variation. A DAPC was then run using the population IDs (n = 12), which correspond to geographical sites, as prior population clusters, with the optimal number of PCs and the first five discriminant axes retained. Following the same workflow, a second DAPC was run with the twelve populations grouped by fragment size (1-, 10-, and 100-ha fragments and continuous forests) as prior population clusters.
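The partition of variance into between-group and within-group components can be illustrated with a minimal sketch. It is not the authors' R workflow: the function name `partition_sum_of_squares`, the scores, and the group labels are hypothetical, and a single synthetic axis stands in for a retained PC. The ratio it computes is the kind of between-to-within contrast that a discriminant function is chosen to maximize.

```python
from collections import defaultdict
from statistics import fmean


def partition_sum_of_squares(scores, groups):
    """Split the total sum of squares of `scores` into between- and within-group parts."""
    grand_mean = fmean(scores)

    # Collect the scores belonging to each group label.
    by_group = defaultdict(list)
    for score, group in zip(scores, groups):
        by_group[group].append(score)

    # Between-group part: squared distance of each group mean from the
    # grand mean, weighted by group size.
    ss_between = sum(
        len(values) * (fmean(values) - grand_mean) ** 2
        for values in by_group.values()
    )
    # Within-group part: squared distance of each score from its own group mean.
    ss_within = sum(
        (value - fmean(values)) ** 2
        for values in by_group.values()
        for value in values
    )
    return ss_between, ss_within


# Hypothetical scores of six individuals on one synthetic axis (assumed data).
scores = [1.0, 2.0, 3.0, 7.0, 8.0, 9.0]
groups = ["small", "small", "small", "large", "large", "large"]

ss_between, ss_within = partition_sum_of_squares(scores, groups)
ss_total = sum((s - fmean(scores)) ** 2 for s in scores)

# The two components add up to the total; their ratio measures group separation.
ratio = ss_between / ss_within
```

In a full DAPC the same contrast is evaluated over linear combinations of several PCs rather than over a single axis.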
To test for significant genetic differentiation within and between populations in small and large patches, the pairwise genetic differentiation index (FST; Weir and Cockerham, 1984) was computed for each population pair using the package hierfstat (Goudet & Jombart, 2022). In addition, a hierarchical analysis of molecular variance (AMOVA) was used to partition the total genetic variation observed among forest habitat sizes (small vs. large patches) and among populations within small patches and within large patches separately. Significance was assessed using 999 permutations. All calculations were done in R with the packages hierfstat version 0.5-11 (Goudet & Jombart, 2022), poppr (Kamvar et al., 2014, 2015), and pegas (Paradis, 2010).
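The logic of a 999-permutation significance test can be sketched as follows. This is an illustration under stated assumptions, not the implementation in hierfstat, poppr, or pegas: the statistic is a simple between-group fraction of the total sum of squares, used here as a stand-in for FST or an AMOVA variance component, and the function names, data, and random seed are hypothetical.

```python
import random
from statistics import fmean


def between_group_fraction(values, labels):
    """Fraction of the total sum of squares that lies between groups."""
    grand_mean = fmean(values)
    ss_total = sum((v - grand_mean) ** 2 for v in values)
    ss_between = 0.0
    for label in set(labels):
        members = [v for v, l in zip(values, labels) if l == label]
        ss_between += len(members) * (fmean(members) - grand_mean) ** 2
    return ss_between / ss_total


def permutation_p_value(values, labels, n_permutations=999, seed=1):
    """One-sided permutation p-value for the between-group fraction."""
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatability
    observed = between_group_fraction(values, labels)
    shuffled = list(labels)
    n_as_extreme = 0
    for _ in range(n_permutations):
        # Reassign group labels at random, keeping group sizes fixed.
        rng.shuffle(shuffled)
        if between_group_fraction(values, shuffled) >= observed - 1e-12:
            n_as_extreme += 1
    # The observed arrangement is counted as one of the permutations.
    return (n_as_extreme + 1) / (n_permutations + 1)


# Hypothetical values for 20 individuals in two patch-size classes (assumed data).
values = [float(v) for v in range(10)] + [float(v) for v in range(20, 30)]
labels = ["small"] * 10 + ["large"] * 10

p_value = permutation_p_value(values, labels)
```

With 999 permutations the smallest attainable p-value is 1/1000, because the observed arrangement is included in the count.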