Structural variation (SV) identification and distribution
SVs were identified using two complementary methods: “read mapping” and “assembly comparison”. For the “read-based” method, we used two software programs, NanoSV [60] and Vulcan [61], to identify SV signals. NanoSV uses split- and gapped-aligned reads to define the breakpoint junctions of SVs, after long reads are mapped to the reference genome (Mmul_10) with LAST v1256 [114] and the alignments are processed with Sambamba v0.8.2 [115]. Vulcan integrates several steps, including dual-mode alignment of long reads with the aligners minimap2 [109] and NGMLR v0.2.7 [116], followed by SV calling with Sniffles2 [116]. The “assembly-based” method relied on SyRI v1.6 [53]. We compared the resulting SV call sets with BEDTools v2.30 [117]. Consensus SVs with shared regions (reciprocally covering at least 80% of the length of each SV) were taken as the lower bound of a reliable call set. To reveal potentially consistent patterns among SVs from different algorithms, the SVs from these methods were compared and defined as the upper bound of a reliable call set.
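The 80% mutual-coverage criterion can be illustrated with a minimal Python sketch. It is not the pipeline used here, which relied on BEDTools; the `SV` record, the function names, the half-open coordinate convention, and the requirement that both calls share the same SV type are all assumptions made for illustration. The check is roughly what `bedtools intersect -f 0.8 -r` computes for a pair of intervals.

```python
from typing import List, NamedTuple, Tuple


class SV(NamedTuple):
    """Hypothetical SV record; coordinates are 0-based, half-open."""
    chrom: str
    start: int
    end: int
    svtype: str


def reciprocal_overlap(a: SV, b: SV) -> float:
    """Return the smaller of the two overlap fractions (0.0 if disjoint)."""
    if a.chrom != b.chrom:
        return 0.0
    overlap = min(a.end, b.end) - max(a.start, b.start)
    if overlap <= 0:
        return 0.0
    # The overlap must cover the required fraction of BOTH calls,
    # so the limiting value is the smaller of the two fractions.
    return min(overlap / (a.end - a.start), overlap / (b.end - b.start))


def consensus_pairs(
    calls_a: List[SV], calls_b: List[SV], min_frac: float = 0.8
) -> List[Tuple[SV, SV]]:
    """Pair calls from two callers that mutually cover >= min_frac.

    Assumption: only calls of the same SV type are paired.
    The all-against-all loop is quadratic and meant for illustration only.
    """
    return [
        (a, b)
        for a in calls_a
        for b in calls_b
        if a.svtype == b.svtype and reciprocal_overlap(a, b) >= min_frac
    ]


if __name__ == "__main__":
    read_based = [SV("chr1", 100, 200, "DEL")]
    assembly_based = [SV("chr1", 110, 200, "DEL"), SV("chr1", 150, 400, "DEL")]
    for a, b in consensus_pairs(read_based, assembly_based):
        print(a, b, round(reciprocal_overlap(a, b), 2))
```

Taking the minimum of the two fractions enforces the mutual requirement: a short call nested inside a much longer one is fully covered itself but covers only a small part of the longer call, so the pair is rejected.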