Introduction
The rhesus macaque (Macaca mulatta) has been extensively studied as a
biomedical model of human diseases [2]. This nonhuman primate model
has greatly expanded and deepened our understanding of both infectious
and genetic diseases, such as AIDS [3, 4], Ebola [5], SARS-CoV-2
[6, 7], autism [8-10], Parkinson’s disease [11, 12],
cataract [13, 14], cardiovascular diseases [15], and infertility
[16, 17]. The comparative advantages of this animal model lie
in several biological, evolutionary, and ecological features, including
adaptive flexibility, widespread distribution, population abundance,
genetic closeness to humans in comparison to other major model animals
(e.g., mouse, fruit fly, and zebrafish), and highly diverse genomic
variants [18, 19]. As one of the most evolutionarily and
ecologically successful nonhuman primates [20, 21], it occupies a
wide geographical range, extending from Pakistan and Afghanistan in the
west across South Asia to Southeast Asia and the eastern coast of China
[22]. Taxonomically, the best-characterized rhesus macaques are from
two well-differentiated subspecies: the Indian subspecies and the
Chinese subspecies, which diverged ~162,000 years ago
[23].
Genomic SNPs derived from next-generation sequencing (NGS) data have
greatly advanced our understanding of the evolutionary history and
biomedical relevance of the macaque species [24-26]. However,
studies focused on structural variants (SVs) are still lacking, leaving
the basic and evolutionary patterns of SVs poorly understood. In this study, we are particularly
interested in whether genomic SVs could be shaped by chromosomal
distribution, methylation frequencies, recombination rates, and
evolutionary forces. The distribution of SVs and the related
evolutionary forces could provide new empirical evidence for the
“faster-X effect”, an evolutionary pattern caused by differences
between autosomes and hemizygous sex chromosomes [27-30]. The
relationship between SVs and DNA methylation is still unclear, but it
could be addressed with nanopore long-read sequencing, which provides
direct signals of epigenome-wide methylation [31, 32]. In addition,
NGS data could be used to infer fine-scale recombination
rates [33], which may help clarify the recombination bias of
SVs. Thus, combining NGS data with nanopore long-read sequences
would allow us to achieve a more balanced and nuanced perspective on the
genomic features of SVs using the rhesus macaque as a model.
Here, we used both long-read-based and assembly-based methods to
identify SVs. We also estimated methylation frequencies, recombination
rates, and selective sweep signals. We found an excess of
long SVs (1Kb) on the X chromosome compared to autosomes. We further
found lower methylation frequencies, but higher recombination rates for
duplications than for other SVs (deletions, insertions, and inversions).
Finally, we found that some SVs were located within regions under
selective sweeps, which suggests that positive Darwinian selection
may impact the evolutionary fate of these SVs at the population level.
We also found that the X chromosome contributes disproportionately to
positively selected genes involving SVs, which suggests the existence
of the “faster-X effect” at the subspecies level. Our analyses provide
insights into the patterns of SVs and their diverse levels of
methylation, recombination, and selective forces.