Population structure with individual-based clustering
A multivariate statistical approach was used to infer the genetic
relatedness of the individuals within the study plots in small patches
(1- and 10-ha) and large patches (100-ha and continuous forests).
Discriminant Analysis of Principal Components (DAPC) partitions the
genetic variance into between-group and within-group components, to
maximize discrimination between groups without making assumptions of
panmixia (Jombart et al., 2010). This approach is well suited to
populations that are presumed to be partially clonal and genetically
related due to relatively recent isolation events. DAPC first transforms
the data with principal component analysis (PCA) and then applies
discriminant analysis (DA) to the retained principal components (PCs) to
infer the clusters in the metapopulation. To select the optimal number
of PCs to retain in the subsequent DAPC, stratified cross-validation of
DAPC was performed by retaining variable numbers of PCs computed from a
subset of the observations in each population (while the number of
discriminant functions remained fixed). The number of PCs with the best
cross-validation score was considered the optimal number, assumed to
include most sources of variation. A DAPC was run
using population IDs (n = 12), corresponding to their geographical
sites, as prior population clusters, with the optimal number of PCs
and the first five discriminant axes retained in the DA. Following the
same workflow, a DAPC was run with the twelve populations grouped by
their corresponding fragment sizes (1-, 10-, and 100-ha fragments and
continuous forests) as prior population clusters.
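The stratified sampling step of the cross-validation can be illustrated with a minimal Python sketch. It is not the R routine used in the study; it only shows how a training subset might be drawn from every population in equal proportion, so that each population is represented when DAPC is fitted with different numbers of PCs. The function name `stratified_split`, the 90% training fraction, and the layout of 10 individuals per population are assumptions made for illustration and do not come from the study.

```python
import random
from collections import defaultdict


def stratified_split(pop_labels, train_fraction=0.9, seed=0):
    """Split individual indices into training and validation sets,
    sampling the same fraction from every population (stratum).

    train_fraction=0.9 is an assumed value, not one reported in the text.
    """
    rng = random.Random(seed)
    by_pop = defaultdict(list)
    for idx, pop in enumerate(pop_labels):
        by_pop[pop].append(idx)

    train, valid = [], []
    for pop in sorted(by_pop):
        members = by_pop[pop][:]
        rng.shuffle(members)
        n_train = round(train_fraction * len(members))
        if len(members) > 1:
            # Keep at least one individual on each side of the split.
            n_train = max(1, min(len(members) - 1, n_train))
        else:
            n_train = len(members)
        train.extend(members[:n_train])
        valid.extend(members[n_train:])
    return sorted(train), sorted(valid)


# Hypothetical layout: 12 populations with 10 individuals each.
labels = [f"pop{p:02d}" for p in range(1, 13) for _ in range(10)]
train_idx, valid_idx = stratified_split(labels)
print(len(train_idx), len(valid_idx))  # 108 12
```

In a full cross-validation, the model would be fitted on the training indices for each candidate number of PCs and scored on how well it assigns the held-out individuals to their populations, repeated over several random splits.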
To assess whether populations in small and large patches show
significant genetic differentiation within and between populations, the
pairwise genetic differentiation index (FST; Weir
and Cockerham, 1984) was computed for each population pair using the
package hierfstat (Goudet & Jombart, 2022). Further,
a hierarchical analysis of molecular variance (AMOVA) was applied to the
FST calculations of the total genetic variation observed among forest
habitat sizes (small vs. large patches) and among populations within
small patches and within large patches separately.
Significance was assessed using 999 permutations. All calculations
were done in R with the packages hierfstat version 0.5-11
(Goudet & Jombart, 2022), poppr (Kamvar et al.,
2014, 2015), and pegas (Paradis, 2010).
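The logic of a permutation test with 999 permutations can be sketched in a few lines of Python. This is an illustration only, not the procedure implemented in the R packages cited above: the test statistic here is a placeholder (the absolute difference between two group means), whereas the actual analyses permute genetic data and recompute FST or AMOVA variance components. The function name `permutation_p_value` and the toy input values are assumptions.

```python
import random


def mean(values):
    return sum(values) / len(values)


def permutation_p_value(group_a, group_b, n_perm=999, seed=1):
    """Permutation p-value for the absolute difference in group means.

    The statistic is a stand-in; a real analysis would recompute FST or
    an AMOVA variance component after each shuffle.
    """
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)

    as_extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # break the link between values and group labels
        permuted = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if permuted >= observed:
            as_extreme += 1
    # The observed arrangement counts as one permutation, so p is never 0.
    return (as_extreme + 1) / (n_perm + 1)


# Toy values: identical groups give p = 1.0 because every shuffle ties.
print(permutation_p_value([0.5] * 20, [0.5] * 20))  # 1.0
```

With 999 permutations, the smallest attainable p-value is 1/1000 = 0.001, which is reached when no shuffled data set yields a statistic as large as the observed one.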