RNA-seq library construction and sequencing
Total RNA was extracted from the GM15 plantlet using the Qiagen RNeasy Plant Mini Kit (Qiagen, Valencia, CA, USA). RNA quality was evaluated by agarose gel electrophoresis, and RNA quantity was determined using a NanoDrop spectrophotometer (Thermo Fisher Scientific, Waltham, MA, USA). To assist gene prediction and annotation, an RNA-seq library was constructed following the manufacturer's protocol (New England Biolabs, Ipswich, MA, USA) and sequenced on the Illumina HiSeq X Ten platform. A detailed description is provided in File S1.
Genome assembly and estimation of genome size
A total of 6.24 M post-filtered PacBio reads were generated, yielding ~54 Gb (~70× coverage) of single-molecule sequencing data. De novo assembly was performed with the overlap-layout-consensus method implemented in Canu (Koren et al., 2017). The primary draft assembly was then polished with Arrow (https://github.com/PacificBiosciences/GenomicConsensus) to improve consensus accuracy. The genome sizes of GM15 and LM50 were estimated by 17-mer analysis of PCR-free Illumina short reads using the Genome Characteristic Estimation (GCE) program (B. Liu et al., 2013). A detailed description is provided in File S1.
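The arithmetic behind the read statistics and the k-mer-based size estimate can be illustrated with a minimal Python sketch. It is not the GCE implementation: GCE fits a more detailed model of the k-mer depth distribution, whereas the sketch uses the simple approximation "genome size ≈ total k-mers / peak k-mer depth". The function names, the `min_depth` cutoff for discarding likely error k-mers, and the toy histogram are all hypothetical and chosen only for illustration. The mean read length and implied genome size are back-calculated from the approximate figures quoted above and are not reported results.

```python
def mean_read_length(total_bases, n_reads):
    """Average read length in bases."""
    return total_bases / n_reads


def implied_genome_size(total_bases, coverage):
    """Genome size implied by a yield and a fold-coverage figure."""
    return total_bases / coverage


def kmer_genome_size(histogram, min_depth=2):
    """Simple peak-based genome size estimate from a k-mer histogram.

    histogram maps k-mer depth -> number of distinct k-mers at that depth.
    Depths below min_depth are treated as sequencing errors and ignored
    (an assumption of this sketch, not a GCE parameter).
    Returns (estimated genome size, peak depth).
    """
    usable = {d: c for d, c in histogram.items() if d >= min_depth}
    peak_depth = max(usable, key=usable.get)          # main (homozygous) peak
    total_kmers = sum(d * c for d, c in usable.items())
    return total_kmers / peak_depth, peak_depth


# Figures quoted in the text: 6.24 M reads, ~54 Gb, ~70x coverage.
read_len = mean_read_length(54e9, 6.24e6)      # roughly 8.65 kb per read
genome_bp = implied_genome_size(54e9, 70)      # roughly 0.77 Gb

# Toy 17-mer histogram (hypothetical values, not data from this study).
toy_hist = {1: 1000, 9: 10, 10: 100, 11: 10}
size, peak = kmer_genome_size(toy_hist)

print(f"mean read length: {read_len:.0f} bp")
print(f"implied genome size: {genome_bp / 1e9:.2f} Gb")
print(f"toy k-mer estimate: {size:.0f} bp at peak depth {peak}")
```

In practice the histogram would come from a k-mer counter run on the PCR-free Illumina reads, and heterozygous or repeat peaks would need explicit handling, which is part of what a dedicated tool such as GCE provides.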