3.7 Genome re-sequencing for SNP calling
The genome re-sequencing of 54 P. leopardus individuals produced a total of 2,232,057,448 raw sequencing reads. After quality filtering, we obtained 668.90 Gb of clean data, averaging 12.39 Gb per fish, with a mean sequencing depth of 14.0× (Supplementary Table S5). The clean reads of each individual were aligned to the assembled genome, and 99.63% of the total reads mapped to the genome. Based on these alignments, we identified a total of 5,178,453 SNPs after quality filtering. The locations and effects of these SNPs were also annotated, showing that 132,709 synonymous and 57,082 nonsynonymous SNPs were located in coding regions (Table 6). These SNPs provide an important genomic resource for genetic studies, such as population structure analysis, dissection of economically important traits, identification of selective sweeps, and genomic selection for breeding superior strains. In future work, we will use these genomic variations together with the recorded phenotypes of the corresponding individuals to dissect genotype-phenotype associations and to identify key genes underlying the phenotypic differences.
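The summary figures above can be cross-checked with simple arithmetic. The following is a minimal sketch, not part of the analysis pipeline: it uses only the numbers quoted in this section and assumes that mean depth is approximately the clean bases per individual divided by the genome size. The genome size it prints is therefore an implied value derived from that assumption, not a figure reported here, and the function name summarize is illustrative.

```python
# Figures quoted in the text of this section.
TOTAL_RAW_READS = 2_232_057_448   # raw reads across all individuals
TOTAL_CLEAN_GB = 668.90           # clean data after quality filtering (Gb)
N_INDIVIDUALS = 54
MEAN_DEPTH = 14.0                 # mean sequencing depth (x)
SYNONYMOUS_SNPS = 132_709
NONSYNONYMOUS_SNPS = 57_082


def summarize():
    """Recompute per-individual averages and simple ratios."""
    mean_clean_gb = TOTAL_CLEAN_GB / N_INDIVIDUALS
    mean_raw_reads = TOTAL_RAW_READS / N_INDIVIDUALS
    # Assumption: depth ~= clean bases per individual / genome size,
    # so the implied genome size is bases / depth.
    implied_genome_gb = mean_clean_gb / MEAN_DEPTH
    # Ratio of nonsynonymous to synonymous coding SNPs.
    ns_to_s_ratio = NONSYNONYMOUS_SNPS / SYNONYMOUS_SNPS
    return {
        "mean_clean_gb": mean_clean_gb,
        "mean_raw_reads": mean_raw_reads,
        "implied_genome_gb": implied_genome_gb,
        "ns_to_s_ratio": ns_to_s_ratio,
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value:,.4f}")
```

The recomputed mean of clean data per fish agrees with the 12.39 Gb stated above. The implied genome size should be read only as a rough consistency check, because the reported depth may have been calculated from mapped rather than clean bases.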
Conclusion
Here we provide a chromosome-scale genome assembly of P. leopardus generated by integrating 10× Genomics, Hi-C and PacBio long-read sequencing technologies. The assembly and annotation provide the first genome for the genus Plectropomus and complement the available Epinephelidae genomes of E. lanceolatus and E. akaara, thus supplying important genomic data for whole-genome analyses of population genetics and evolution, and for dissecting the genetic diversity underlying phenotypic traits and adaptations. The genomic variations, together with their functional annotations, will promote accurate genetic analysis and accelerate genomic breeding programs in P. leopardus aquaculture.