The exact germplasm sources of the HB and LN plantation populations
By combining the distribution of haplotypes among populations (Figures
1, 2, 6; Supplementary Tables S7, S8) with the phylogenetic
relationships of all individuals and the population-level clustering
(Figures 4, 5), the germplasm sources of the plantations could be
inferred.
The ancestral order among the five groups was fixed as
SX*-HB*-HB-LN*-LN (Figure 4A), showing that the SX* group holds the
core position in northern China, consistent with previous results
(Zhou et al., 2022). Twenty-one haplotypes (nad5-1) could provide
valuable information for tracing the germplasm of plantations, whereas
the other seven haplotypes were each private to a single group
(Supplementary Table S8-2).
Among HB plantation populations, there was little difference in
genetic structure (Figures 4, 5) and a similar haplotype distribution
pattern (Figures 1, 2, 6), suggesting that the germplasm background of
the HB plantations was relatively simple. SHB2 and SHB1* had almost the
same haplotype distribution pattern (Supplementary Table S8). They were
always placed in the same topology at a short genetic distance
(Figures 4B, 5B) and always fell in the same subgroup (Figures 4E, 5E).
The two populations were also geographically close (Figures 1, 2;
Supplementary Table S1). On this basis, it is very likely that the
germplasm of SHB2 came from SHB1* (Figure 7), which also supports the
accuracy of determining the germplasm sources of plantations from
haplotype distribution, genetic distance, and ancestral composition.
Hap-6 was the dominant
haplotype of HTLZ1 and was also well represented in GDS* (9/33), and
the genetic distance between HTLZ1 and GDS* was short (Figure 5B),
suggesting that the germplasm of HTLZ1 may have come from GDS*
(Figure 7). Hap-4 was the dominant haplotype of HTLZ2-4, GDS* (17/33),
ZTS* (7/21), LLS2* (5/12), SHB1* (17/33), DWP1* (9/17), DWP2* (9/23),
HLH* (5/21), and BDS* (10/34), and was also well represented in GCS*
(8/31) (Figure 2; Supplementary Table S8), suggesting that HTLZ2-4
might have come from these natural populations. According to our field
investigation, the SHB1* and BDS* populations were too small to provide
enough seeds for the plantations. The stand age of DWP1-2* is about the
same as that of HTLZ2-4, and DWP1-2* may have been mistaken for natural
forest, which means that DWP1-2* was unlikely to have provided seeds
for HTLZ2-4. Hap-6 was the secondary haplotype of HTLZ2-4 and was also
well represented in GDS* (9/33) (Figure 2; Supplementary Table S8),
suggesting that the germplasm of HTLZ2-4 most likely came from GDS*.
GDS* and HTLZ2-4 were placed in the same topology, with GDS* occupying
the ancestral position (Figure 5B), which provided strong evidence that
the germplasm of HTLZ2-4 came from GDS*. Combining the above evidence,
we concluded that the germplasm of HTLZ1-4 most likely came from GDS*
(Figure 7). HTLZ1-4 are geographically close to one another, so their
germplasm would be expected to come from the same place, which supports
the accuracy of our traceability method. Based on hap-4 and hap-11
(Figure 2; Supplementary Table S8), it was inferred that DWP3 most
likely came from ZTS* (Figure 7), and the short genetic distance
between them provided strong evidence for this (Figure 4B). Hap-4,
hap-6, and hap-11 were the main haplotypes of DWP4, consistent with
ZTS*. In addition, GDS* was adjacent to DWP4 (Figure 5B) and was the
ancestral population of the topology in which DWP4 was located
(Figure 4B).
It was therefore speculated that both ZTS* and GDS* might be germplasm
sources of DWP4. Because the germplasm sources of DWP4 and DWP3 should
be the same, the most likely source of DWP4 is also ZTS* (Figure 7).
The haplotype distribution pattern of MJB was almost the same as that
of ZTS* and THS* (Supplementary Table S8); the three populations were
placed in the same topology at a short genetic distance (Figures 4B,
5B) and fell in the same subgroup (Figures 4E, 5E). Because it was hard
to determine which of the two natural forests was more similar to MJB
in genetic background, we suggest that both ZTS* and THS* might be
germplasm sources of MJB (Figure 7).
XF, LH, and QG fell in the same subgroup with a consistent proportion
of the dominant ancestral component, were placed in the same topology
at a short genetic distance, and shared a similar haplotype
distribution pattern, in which the main haplotypes were, in order,
hap-4, hap-11, hap-6, hap-13, and hap-7. GCS* showed a similar
haplotype distribution pattern to these three populations. With genetic
distance and ancestral components as auxiliary information, GCS*
occupied the ancestral position of the topology and subgroup of QG, XF,
and LH. It was therefore inferred that the germplasm of XF, LH, and QG
most likely came from GCS* (Figure 7).
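The haplotype-frequency comparison used above can be illustrated with a small worked example. The sketch below is not the analysis pipeline of this study; it only converts the hap-4 counts quoted in the text for the natural populations into frequencies and ranks them. The names `hap4_counts`, `haplotype_frequencies`, and `rank_by_frequency` are hypothetical and introduced for illustration only.

```python
from fractions import Fraction

# Hap-4 counts as quoted in the text for the natural populations:
# (individuals carrying hap-4, individuals sampled). No other values
# are assumed.
hap4_counts = {
    "GDS*": (17, 33),
    "ZTS*": (7, 21),
    "LLS2*": (5, 12),
    "SHB1*": (17, 33),
    "DWP1*": (9, 17),
    "DWP2*": (9, 23),
    "HLH*": (5, 21),
    "BDS*": (10, 34),
    "GCS*": (8, 31),
}


def haplotype_frequencies(counts):
    """Return the exact frequency of the haplotype in each population."""
    return {pop: Fraction(k, n) for pop, (k, n) in counts.items()}


def rank_by_frequency(counts):
    """Return population names sorted from highest to lowest frequency."""
    freqs = haplotype_frequencies(counts)
    return sorted(freqs, key=lambda pop: freqs[pop], reverse=True)


ranking = rank_by_frequency(hap4_counts)
for pop in ranking:
    k, n = hap4_counts[pop]
    print(f"{pop}: {k}/{n} = {k / n:.2f}")
```

On these counts alone, DWP1* (9/17), GDS* (17/33), and SHB1* (17/33) carry hap-4 at the highest frequencies. A frequency ranking of this kind cannot settle the source by itself, which is why the inference above also weighs population size, stand age, genetic distance, and ancestral position.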
Among LN plantation populations, there was a large difference in
genetic structure (Figures 3, 4) and a diverse haplotype distribution
pattern (Figures 1, 2, 6), suggesting that the germplasm background of
the LN plantations was complicated. The haplotype distribution patterns
of HD1, HD2, WD1, WD2, and DCY were the same, with hap-7 as the
dominant haplotype, and only WF* was consistent with them (Figure 2;
Supplementary Table S8). The close genetic relationship (Figures 4B,
5B) and similar genetic lineage between them (Figures 4E, 5E) provided
strong evidence that the germplasm of the five plantations came from
WF*. Given that the WF* population is large, has tall trees, is
conveniently located, and has been under long-term artificial
management, we further speculated that WF* provided germplasm for these
five populations (HD1, HD2, WD1, WD2, and DCY) (Figure 7). The close
genetic
relationship of DB1, DB2, and ZJS* in the NJ tree (Figure 5B) and the
consistency of their genetic lineages (Figure 5E) provided strong
evidence for the hypothesis that the germplasm of these two plantations
came from ZJS*. Owing to the limited sample size of the ZJS*
population, we did not find clear evidence in the haplotype
distribution pattern. Based on the distribution of haplotypes, we could
not rule out that GDS* and GCS* provided germplasm for DB1 and DB2, but
we found no support for this in the genetic distance or genetic
lineage. DB and ZJS* are geographically close (Figures 1, 2;
Supplementary Table S1), which makes germplasm allocation convenient.
The convenient location and large size of ZJS* make it well suited to
serve as a germplasm source population. We speculated that the
germplasm of DB1 and DB2 came from the local area and was provided by
ZJS* (Figure 7). ZZD1, ZZD2, LJG, and ZGT were located in the same
topology at a short genetic distance (Figures 4B, 5B) and fell in the
same cluster with a similar lineage (Figure 5E), showing that there was
little difference in their genetic background.
GDS* and WTG* were placed in the topology containing ZZD1, ZZD2, LJG,
and ZGT, and occupied the ancestral position (Figure 5B). ZZD1, ZZD2,
LJG, and ZGT shared a similar haplotype distribution pattern with GDS*
and WTG*, suggesting that GDS* and WTG* might be the germplasm sources
of these four plantations. According to our investigation, WTG* is
remote, small, and poorly grown and produces few seeds, indicating that
WTG* is not in a position to provide germplasm for plantation
establishment. LJG shared a similar haplotype structure with WF*;
however, the large genetic distance between them weakens the
possibility that WF* provided germplasm for LJG. We tentatively judged
that their germplasm came from GDS* according to genetic structure and
genetic lineage. Based on the above inference, we speculated that the
germplasm of these four populations (ZZD1, ZZD2, LJG, ZGT) all came
from GDS* (Figure 7). HQ occupied the foremost position in the NJ tree
(Figures 4B, 5B), and it was difficult to judge the origin of its
germplasm from its haplotype structure, genetic distance, or genetic
lineage. We consulted the afforestation archives of HQ and determined
that most of its germplasm came from the Xingcheng seed orchard (a seed
orchard of Chinese pine in northern China), whose material was selected
from superior trees all over Liaoning Province. The complex germplasm
background of HQ made tracing difficult, and we could determine its
germplasm sources here only with considerable effort (Figure 7).
In summary, we suggest that almost all HB populations came from SX*
(GDS*, ZTS*, GCS*, and THS*), which led to the homogeneous genetic
background of the HB populations. Shanxi and Hebei Provinces are
geographically close, which makes germplasm allocation convenient. Most
of the germplasm of the LN plantations came from LN* (ZJS*, WF*), and
the remainder came from GDS* (SX*), which resulted in large differences
in genetic structure within the LN group.