RNA-seq library construction and sequencing
Total RNA of the plantlet GM15 was extracted using the Qiagen RNeasy
Plant Mini Kit (Qiagen, Valencia, CA, USA). RNA quality was evaluated by
agarose gel electrophoresis and its quantity determined using a NanoDrop
spectrophotometer (Thermo Fisher Scientific, Waltham, MA, USA). To
assist prediction and annotation of genes, the RNA-seq library was
constructed and sequenced on the Illumina HiSeq X Ten platform following
the manufacturer's protocol (New England Biolabs, Ipswich, MA, USA); a
detailed description is provided in File S1.
Genome assembly and estimation of genome size
A total of 6.24 M post-filtered PacBio reads were generated, yielding
~54 Gb (~70× coverage) of single-molecule sequencing data. De novo
assembly was conducted
using an overlap-layout-consensus method in CANU
(Koren et al., 2017). Subsequently, the
primary draft assembly was polished using Arrow
(https://github.com/PacificBiosciences/GenomicConsensus) to improve
its accuracy. Using the Genome Characteristic Estimation (GCE) program
(B. Liu et al., 2013), the genome sizes
of GM15 and LM50 were estimated by 17-mer analysis based on PCR-free
Illumina short reads. A detailed description is provided in File S1.
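The arithmetic behind a k-mer-based size estimate and a coverage figure can be illustrated with a minimal Python sketch. It is not the GCE algorithm, which additionally models sequencing errors, heterozygosity and repeats. It shows only the basic relation that genome size is approximately the total number of k-mers divided by the depth of the main peak in the k-mer depth histogram. The function names, the min_depth cutoff and the toy histogram are assumptions introduced here for illustration and do not come from the study.

```python
def estimate_genome_size(kmer_hist, min_depth=5):
    """Estimate genome size from a k-mer depth histogram.

    kmer_hist maps depth -> number of distinct k-mers observed at that depth.
    k-mers below min_depth (a hypothetical cutoff) are treated as sequencing
    errors and ignored. Returns (estimated_size, peak_depth).
    """
    solid = {d: n for d, n in kmer_hist.items() if d >= min_depth}
    if not solid:
        raise ValueError("no k-mers at or above min_depth")
    # Main peak: the depth shared by the largest number of distinct k-mers.
    peak_depth = max(solid, key=solid.get)
    # Total k-mer occurrences = sum over depths of depth * distinct k-mers.
    total_kmers = sum(d * n for d, n in solid.items())
    return total_kmers / peak_depth, peak_depth


def implied_genome_size(total_bases, coverage):
    """Genome size implied by a sequencing yield and a stated coverage."""
    return total_bases / coverage


if __name__ == "__main__":
    # Invented toy histogram: an error peak at low depth, a main peak at 30.
    toy_hist = {1: 1000, 2: 50, 29: 100, 30: 500, 31: 100}
    size, peak = estimate_genome_size(toy_hist)
    print(f"peak depth: {peak}, estimated size: {size:.0f} bp")

    # ~54 Gb of PacBio data at ~70x coverage, as reported in the text.
    print(f"implied genome size: {implied_genome_size(54e9, 70) / 1e9:.2f} Gb")
```

Applied to the rounded figures reported above, ~54 Gb at ~70× coverage corresponds to a genome of roughly 0.77 Gb. This is a consistency check on the stated numbers rather than an independent estimate.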