3.2 Chromosomal-level genome assembly
Based on the clean data, we estimated the genome size to be 871 Mb using 17-mer analysis. The 17-mer depth distribution showed a dominant peak, corresponding to the homozygous peak (Figure 2a), and the heterozygosity was estimated to be 0.635%.
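The k-mer-based estimate above rests on a simple relation: genome size is approximately the total number of k-mers divided by the depth of the main (homozygous) peak. The sketch below illustrates that relation on a toy histogram. The function name, the `min_depth` cutoff for discarding low-depth error k-mers, and the input format are illustrative assumptions, not the pipeline used in this study, which may apply additional corrections.

```python
def estimate_genome_size(hist, min_depth=5):
    """Estimate genome size from a k-mer depth histogram.

    hist maps k-mer depth -> number of distinct k-mers seen at that depth.
    min_depth is an assumed cutoff below which k-mers are treated as
    sequencing errors and ignored.
    Returns (estimated_size, peak_depth).
    """
    usable = {d: c for d, c in hist.items() if d >= min_depth}
    if not usable:
        raise ValueError("no k-mers at or above min_depth")
    # The main peak is the depth with the most distinct k-mers.
    peak_depth = max(usable, key=usable.get)
    # Total k-mer occurrences = sum over depths of depth * count.
    total_kmers = sum(d * c for d, c in usable.items())
    return total_kmers / peak_depth, peak_depth


# Toy histogram (not real data): error k-mers at depth 1-2, main peak at 20.
toy_hist = {1: 1000, 2: 200, 18: 50, 19: 80, 20: 100, 21: 80, 22: 50}
size, peak = estimate_genome_size(toy_hist)
print(f"peak depth = {peak}, estimated size = {size:.0f} bp")
```

In practice the histogram would come from a k-mer counter run on the cleaned reads with k = 17, and heterozygosity would be estimated separately from the shape of the distribution.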
The 10× Genomics short reads were de novo assembled with Supernova software (v1.2) (Weisenfeld et al., 2017). The contigs and scaffolds in the draft assembly were then anchored and oriented into a chromosomal-scale assembly using the Hi-C scaffolding approach (Figure 2b). As a result, we obtained a draft genome assembly of 902.46 Mb in length, with a contig N50 of 33.60 Kb. To further improve the completeness and accuracy of the genome assembly, we used PacBio long reads, at a depth of ~20×, to close the gaps in the assembly using TAG-Gapcloser. The final P. leopardus genome assembly had a total length of 881.55 Mb, with a contig N50 of 855.69 Kb and a scaffold N50 of 34.14 Mb (Table 2). The assembly comprised 24 pseudo-chromosomes, ranging in length from 15.72 Mb to 41.71 Mb (Supplementary Table S1).
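For readers unfamiliar with the N50 statistic reported above, the following minimal sketch shows how it is conventionally computed: sequences are sorted from longest to shortest, and N50 is the length of the sequence at which the running total first reaches half of the total assembly length. The function name and the toy lengths are illustrative assumptions; this is not the script used to produce Table 2.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half the total is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


# Toy example (not real data): total = 54, half = 27, reached at length 8.
print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))  # prints 8
```

Applying the same function to contig lengths and to scaffold lengths gives the contig N50 and scaffold N50, respectively.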
BUSCO analysis showed that the assembly recovered 97.2% of the conserved single-copy orthologous genes, comprising 94.0% complete and 3.2% fragmented genes (Table 3). The distributions of GC content and sequencing depth were relatively concentrated, with an average GC content of 39.65%.
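As an illustration of the GC-content figure above, GC content is the percentage of G and C bases among all unambiguous bases (A, C, G, T). The sketch below computes it for a single sequence string. The function name and the choice to skip ambiguous bases such as N are assumptions made for illustration, not a description of the software used here.

```python
def gc_content(seq):
    """Return GC content (%) of a DNA sequence, ignoring non-ACGT characters."""
    seq = seq.upper()
    gc = sum(seq.count(base) for base in "GC")
    at = sum(seq.count(base) for base in "AT")
    if gc + at == 0:
        raise ValueError("sequence contains no A, C, G or T bases")
    return 100.0 * gc / (gc + at)


# Toy example (not real data): 2 of 4 unambiguous bases are G or C.
print(gc_content("ATGCNN"))  # prints 50.0
```

A genome-wide average would apply the same ratio across all assembled sequences, and a per-window version of it gives the kind of GC distribution described above.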