Discussion
Although haploid induction was not successful, we obtained a more juvenile, easily regenerable and transformable individual, GM15, which appears to be extremely similar to its parent tree LM50 based on ploidy, genotype and genome size evidence, and was therefore considered suitable for sequencing. Here, we integrated single-molecule real-time (SMRT) sequencing (PacBio), Illumina-based error correction and chromosome conformation capture (Hi-C) to assemble a high-quality, haplotype-resolved genome. In comparison to several published poplar genomes, including P. trichocarpa (Tuskan et al., 2006), P. euphratica (T. Ma et al., 2013), P. pruinosa (Yang et al., 2017), and P. alba var. pyramidalis (J. Ma et al., 2019), the P. tomentosa assembly was of higher or comparable quality. Only the P. alba genome had a longer contig N50 than P. tomentosa (1.18 Mb vs. 0.96 Mb); however, its contigs have not yet been assigned to specific chromosomes (Y. J. Liu et al., 2019) (Table S7). The whole genome size of P. tomentosa is 740.2 Mb, the sum of subgenome A (P. alba var. pyramidalis) and subgenome D (P. adenopoda). This is markedly larger than the assemblies of P. trichocarpa (422.9 Mb), P. euphratica (497.0 Mb), P. pruinosa (479.3 Mb), P. alba var. pyramidalis (464.0 Mb) and P. alba (416.0 Mb), each of which consists of 19 chromosomes because the allelic diversity in these diploids was subsumed into a single haploid genome rather than into two subgenomes (Y. J. Liu et al., 2019; J. Ma et al., 2019; T. Ma et al., 2013; Tuskan et al., 2006; Yang et al., 2017). However, it is very similar to the recently published genome of a hybrid poplar (84K), which was subdivided into two subgenomes (P. alba and P. tremula var. glandulosa) with a total genome size of 747.5 Mb (Qiu et al., 2019) (Table S7).
We presented evidence for divergence and duplication events in Populus, as well as within the P. tomentosa lineage. Like many other flowering plants (Otto, 2007), Salicaceae species underwent a common palaeohexaploidy event, followed by a palaeotetraploidy event before the divergence of Salix and Populus (Lin et al., 2018; Y. J. Liu et al., 2019; Tuskan et al., 2006). Subsequently, poplar speciation occurred gradually. Section Populus and P. trichocarpa differentiated from each other approximately 13.44 Mya (Ks ≈ 0.035). The ancestors of P. tomentosa, P. adenopoda and P. alba var. pyramidalis successively diverged from section Populus approximately 9.3 Mya and 4.8 Mya. Populus tomentosa emerged from a hybridization event approximately 3.9 Mya. This finding differs from previous proposals on the origin of P. tomentosa (Z. Wang et al., 2014). Unlike most other sequenced poplars (T. Ma et al., 2013; Tuskan et al., 2006; Yang et al., 2017), the P. tomentosa genome consists of subgenome A (P. alba var. pyramidalis) and subgenome D (P. adenopoda) (Fig. 3 and Table 1). Hi-C, a chromosome conformation capture-based method, has become a mainstream technique for studying the 3D organization of genomes (W. Ma et al., 2018). Based on both Hi-C analysis (Fig. 2) and phylogenetic analysis with P. adenopoda and P. alba var. pyramidalis, we were able to partition the P. tomentosa genome into two subgenomes. Phylogenetic analysis clearly revealed the relationships among the three white poplars (Fig. 4d, Fig. S6). Further, all 19 chromosome-by-chromosome phylogenetic trees supported the same hybrid origin hypothesis (Fig. S7). Phylogenetic analyses of the chloroplast genomes showed that the maternal parent of P. tomentosa was P. adenopoda (Fig. S8); thus, it appears that P. alba var. pyramidalis was the paternal parent. There also appears to be variation within P. tomentosa with respect to its hybrid origin. Based on a small number of marker genes, D. Wang et al. (2019) suggested that P. alba acted as the male parental species, but that the maternal parent could be either P. adenopoda or P. davidiana (for P. tomentosa types mb1 and mb2, respectively). However, their experimental materials did not include P. tomentosa of Shandong provenance, whereas the elite P. tomentosa clone LM50 used in our study originates from Shandong and differs from types mb1 and mb2. Thus, P. tomentosa may have a more complex evolutionary history than is currently understood, possibly including multiple independent origins.
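The pairing of a Ks value with a divergence date above can be related through the standard molecular-clock relation T = Ks / (2r). The short Python sketch below back-calculates the substitution rate r implied by the figures quoted in the text (Ks ≈ 0.035, 13.44 Mya). The relation itself and the function names are assumptions made for illustration; the dates reported here may have been calibrated differently, so the result should be read only as a consistency check.

```python
# Illustrative molecular-clock arithmetic; not necessarily the dating
# method used in this study. Function names are hypothetical.

def implied_substitution_rate(ks: float, divergence_years: float) -> float:
    """Rate r (synonymous substitutions per site per year) from T = Ks / (2r)."""
    return ks / (2.0 * divergence_years)


def divergence_time_years(ks: float, rate: float) -> float:
    """Divergence time T in years from T = Ks / (2r)."""
    return ks / (2.0 * rate)


KS_POPULUS_VS_TRICHOCARPA = 0.035   # Ks value quoted in the text
DIVERGENCE_YEARS = 13.44e6          # 13.44 Mya, quoted in the text

rate = implied_substitution_rate(KS_POPULUS_VS_TRICHOCARPA, DIVERGENCE_YEARS)
print(f"implied rate: {rate:.3e} substitutions/site/year")
```

Under this assumed relation, the same rate could be passed to `divergence_time_years` together with another Ks value to obtain a date on the same clock.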
Our analysis of recombination events within genes showed that the P. tomentosa subgenomes have largely remained independent, despite sharing the same nucleus for approximately 3.93 million years. Comparison of 5,345 single-copy orthologs from P. tomentosa, P. alba var. pyramidalis and P. adenopoda showed that recombination had occurred in only 0.87% of the genes studied (Fig. S5, Table S13). To assess whether this low rate of recombination would be expected given the time since the species’ origin, we used recombination data from a recent study of the closely related European aspen (P. tremula) to generate an expected rate of recombination, assuming that this non-hybrid species shows recombination rates typical of Populus. That study estimated the recombination rate to be 15.6-16.1 cM/Mbp/generation (Apuli et al., 2020). P. tomentosa has a long life cycle: seedlings begin flowering after at least 7–8 years, and flowering occurs annually thereafter during the reproductive phase (Zhu, 1992). Assuming a generation time of about 20 years, about 31 recombination events per 1 kb gene would be expected, several orders of magnitude more than we observed. This suggests that the two subgenomes of P. tomentosa have been maintained largely intact over many thousands of generations, despite ample opportunity for recombination events to have occurred within the studied genes. The subgenome integrity of P. tomentosa, where there appears to be a low rate of normal meiotic products, is congruent with observations of very low fertility in the species. In a study of elite P. tomentosa tree resources, most trees showed weak fertility, with low rates of seed setting, germination and seedling survival (Bai, 2015). Such characteristics, together with recent genetic analysis of P. tomentosa (D. Wang et al., 2019), suggest that P. tomentosa acts like the F1 generation of a wide cross, with quite limited but not zero fertility.
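The expectation of roughly 31 recombination events per 1 kb gene can be reproduced with simple arithmetic. The Python sketch below uses only the figures quoted in the text (15.6-16.1 cM/Mbp/generation, 3.93 million years, a 20-year generation time, a 1 kb gene) and assumes that 1 cM corresponds to 0.01 expected crossovers per meiosis, which is reasonable for short intervals. The function name and the final fold-difference comparison are illustrative assumptions, not part of the original analysis.

```python
# Back-of-envelope expectation for crossovers per gene since hybrid origin.
# All inputs are figures quoted in the text; names are hypothetical.

CM_PER_MB_LOW, CM_PER_MB_HIGH = 15.6, 16.1   # cM/Mbp/generation (Apuli et al., 2020)
YEARS_SINCE_ORIGIN = 3.93e6                  # time the subgenomes have shared a nucleus
GENERATION_YEARS = 20                        # assumed generation time
GENE_LENGTH_BP = 1_000                       # a 1 kb gene


def expected_crossovers(cm_per_mb: float, gene_length_bp: int,
                        years: float, generation_years: float) -> float:
    """Expected number of crossovers within one gene over the whole period."""
    # 1 cM is taken as 0.01 expected crossovers per meiosis (short-interval approximation).
    crossovers_per_bp_per_generation = cm_per_mb * 0.01 / 1e6
    generations = years / generation_years
    return crossovers_per_bp_per_generation * gene_length_bp * generations


low = expected_crossovers(CM_PER_MB_LOW, GENE_LENGTH_BP, YEARS_SINCE_ORIGIN, GENERATION_YEARS)
high = expected_crossovers(CM_PER_MB_HIGH, GENE_LENGTH_BP, YEARS_SINCE_ORIGIN, GENERATION_YEARS)
print(f"expected crossovers per 1 kb gene: {low:.1f} to {high:.1f}")

# Rough comparison only: treats the observed 0.87% of recombinant genes as
# about 0.0087 events per gene, ignoring differences in gene length.
OBSERVED_FRACTION = 0.0087
fold_gap = low / OBSERVED_FRACTION
print(f"expectation exceeds observation by roughly {fold_gap:,.0f}-fold")
```

The comparison at the end is deliberately crude, since the studied orthologs are not all 1 kb long, but it illustrates why the observed rate falls several orders of magnitude short of the expectation.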
SVs are increasingly being recognized as major factors underlying phenotypic variation in eukaryotic organisms (Gabur, Chawla, Snowdon, & Parkin, 2019). In plants, SVs have been shown to be closely associated with phenotypic variation in traits such as plant height (Zhou et al., 2015) and biotic stress resistance (Cook et al., 2012). In our study, we detected 15,480 SVs across the genome of GM15, of which 12,885 (83%) were INDELs. GO analysis indicated that INDELs are highly represented within genes with roles in plant-pathogen interaction and carbohydrate metabolism. They may therefore contribute to characteristics such as disease resistance and fast growth, for which P. tomentosa is well known. A few INDELs are also enriched in genes associated with meiotic DNA double-strand break processing and repair, as well as inactivation of chromatin and histone methylation in telomeres. Such SVs may contribute to retaining the independence of the two subgenomes and maintaining karyotype stability in P. tomentosa, and may thus play a role in maintaining its putative “fixed heterosis,” as discussed further below. We also found 299 CNVs, and GO analysis suggested an association with plant hormone signal transduction, plant-pathogen interaction, and sugar metabolism. In sum, the many SVs identified in P. tomentosa provide logical focal points for studying their biological roles and phenotypic effects in relation to heterosis, evolution, breeding and biotechnology.
The mechanisms underlying the low recombination among subgenomes are unknown. P. tomentosa is well known for having low sexual fertility (K. Ma et al., 2013), likely a reflection of meiotic difficulties that give rise to abnormal gametes. As suggested for Cucurbita subgenomes (Sun et al., 2017), the low recombination rate in the P. tomentosa genome could be due to the rapid divergence between the two parental species in their repetitive DNA composition, which may have inhibited meiotic pairing of homologous chromosomes and subsequent exchanges; as shown above, the transposon compositions of the two genomes differ significantly. In addition, TE activity can cause CNVs, INSs, TRANSs and DELs due to the capacity of TEs to mobilize and recombine gene sequences within and between chromosomes (Morgante, De Paoli, & Radovic, 2007), both in the wild and in breeding processes (Lisch, 2013). These SVs may further inhibit normal meiosis. Karyotype stability and rare recombination among subgenomes have been observed in paleo-allotetraploid Cucurbita genomes (Sun et al., 2017) and in a newly synthesized allotetraploid wheat genome (H. Zhang et al., 2013). However, their functional connection to recombination rate suppression is unclear. The maintenance of subgenomes that we found in P. tomentosa may be advantageous in providing a degree of “fixed heterosis”. This may help to explain the high productivity and wide distribution of P. tomentosa in spite of its low sexual fertility.