Introduction
The rhesus macaque (Macaca mulatta) has been extensively studied in biomedical research on human diseases [2]. This nonhuman primate model has greatly expanded and deepened our understanding of both infectious and genetic diseases, such as AIDS [3, 4], Ebola [5], SARS-CoV-2 [6, 7], autism [8-10], Parkinson’s disease [11, 12], cataract [13, 14], cardiovascular diseases [15], and infertility [16, 17]. The comparative advantages of this animal model lie in several biological, evolutionary, and ecological features, including adaptive flexibility, widespread distribution, population abundance, closer genetic relatedness to humans than other major model animals (e.g., mouse, fruit fly, and zebrafish), and highly diverse genomic variants [18, 19]. As one of the most evolutionarily and ecologically successful nonhuman primates [20, 21], it occupies a wide geographical range, extending from Pakistan and Afghanistan in the west across South Asia to Southeast Asia and the eastern coast of China [22]. Taxonomically, the best-characterized rhesus macaques belong to two well-differentiated subspecies, the Indian subspecies and the Chinese subspecies, which diverged ~162,000 years ago [23].
Genomic SNPs derived from next-generation sequencing (NGS) data have greatly advanced our understanding of the evolutionary history and biomedical relevance of the macaque species [24-26]. However, studies focused on structural variants (SVs) that clarify their basic and evolutionary patterns are still lacking. In this study, we are particularly interested in whether genomic SVs are shaped by chromosomal distribution, methylation frequencies, recombination rates, and evolutionary forces. The distribution of SVs and the related evolutionary forces could provide new empirical evidence for the “faster-X effect”, an evolutionary pattern caused by differences between autosomes and hemizygous sex chromosomes [27-30]. The relationship between SVs and DNA methylation is still unclear, and it could be addressed with nanopore long-read sequencing, which provides direct signals of epigenome-wide methylation [31, 32]. In addition, NGS data could be used to infer fine-scale recombination rates [33], which may help address questions about the recombination bias of SVs. Thus, combining NGS data with nanopore long-read sequences would allow us to achieve a more balanced and nuanced perspective on the genomic features of SVs, using the rhesus macaque as a model.
Here, we used both long read-based and assembly-based methods to identify SVs. We also estimated methylation frequencies, recombination rates, and selective sweep signals. We found an excess of long SVs (1Kb) on the X chromosome compared to autosomes. We further found lower methylation frequencies but higher recombination rates for duplications than for other SVs (deletions, insertions, and inversions). Finally, we found that some SVs were located within regions under selective sweeps, which suggests that positive Darwinian selection may affect the evolutionary fate of these SVs at the population level. We also found that the X chromosome contributes disproportionately to the positively selected SV-involved genes, which suggests the existence of the “faster-X effect” at the subspecies level. Our analyses provide insights into the patterns of SVs and their diverse levels of methylation, recombination, and selective forces.