Note: N50, length of the shortest sequence among the longest sequences
that together cover 50% of the genome; N90, the corresponding length at
90% of the genome. The dashed line indicates data not available.
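The N50 and N90 definitions in the note can be made concrete with a minimal sketch. It assumes the percentages are taken relative to the total length of the sequences supplied; the function name `nx` and the toy lengths are hypothetical illustrations, not values from this assembly.

```python
def nx(lengths, fraction):
    """Return the Nx statistic for a list of sequence lengths.

    Sequences are sorted from longest to shortest and accumulated until
    they cover `fraction` of the total length; the length of the last
    sequence added is the Nx value (fraction=0.5 gives N50).
    """
    if not lengths:
        raise ValueError("no sequence lengths given")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    threshold = sum(lengths) * fraction
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= threshold:
            return length
    return min(lengths)  # only reached through floating-point rounding


# Hypothetical toy contig lengths (total = 300), for illustration only.
toy_lengths = [100, 80, 60, 40, 20]
n50 = nx(toy_lengths, 0.5)  # 100 + 80 = 180 covers 50% (150)
n90 = nx(toy_lengths, 0.9)  # 100 + 80 + 60 + 40 = 280 covers 90% (270)
```

With these toy lengths, N50 is 80 and N90 is 40: the larger the fraction, the shorter the sequence at which the cumulative threshold is reached.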
The completeness of the genome assembly was validated using several
approaches. More than 97.31% of the complete single-copy BUSCOs were
found in the genome assembly, and only 2.18% of the BUSCOs were missing
(Fig. S2; Table S5). CEGMA assessment retrieved 241 (97.18%) of the 248
core eukaryotic genes (CEGs) (Table S6).
Furthermore, Illumina short reads (65.3 Gb) were aligned to the
assembled genome using BWA, with a mapping rate of ~98.15% and a
coverage of ~95.44%, indicating high consistency between the Illumina
reads and the assembled genome (Table S7). Together, these results show
that the assembled T. polyphylla genome sequence was highly complete and
had a low error rate.
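The CEGMA percentage quoted above follows directly from the reported counts. As a small arithmetic check, assuming the percentage is simply the number of recovered CEGs divided by the size of the core set (the helper name `percent` is hypothetical):

```python
def percent(part, total):
    """Return part/total as a percentage rounded to two decimals."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * part / total, 2)


# Counts as reported in the text: 241 of the 248 CEGs were retrieved.
ceg_recovered = 241
ceg_total = 248
ceg_pct = percent(ceg_recovered, ceg_total)
```

Here `ceg_pct` evaluates to 97.18, matching the reported figure. The same helper could be applied to mapped versus total read counts, although those raw counts are not given in this section.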