3. Results and discussion
3.1 Genome organization
After quality filtering of the raw reads, a total of 1,578,672 high-quality clean reads were obtained and used to assemble the N. andersoni mitochondrial genome. The complete mitochondrial genome sequence of N. andersoni was deposited in NCBI GenBank under accession number MW030174. The mitogenome of N. andersoni is a circular DNA molecule 16,291 bp in length. As shown in Fig. 1, its organization is similar to that of most other rodents [26]. Thirty-seven typical mitochondrial genes were identified, comprising 13 PCGs, 22 tRNAs and 2 rRNAs (Table 2). Most genes were encoded on the heavy (H)-strand, whereas ND6 and eight tRNAs were encoded on the light (L)-strand.
The overall base composition of the N. andersoni mitochondrial genome was estimated to be 33.7% A, 25.8% C, 12.1% G and 30.0% T, giving AT and GC contents of 61.6% and 38.4%, respectively, which indicates that the mitochondrial genome is biased towards AT (Table 3). Such base composition biases have been reported to play a vital role in the replication and transcription of the mitochondrial genome [27]. The genome also showed a negative GC skew (−0.347), indicating that C is more common than G, whereas the AT skew was positive (0.092), indicating that A occurs more frequently than T in the N. andersoni mitochondrial genome (Table 3).
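For readers who wish to check such values, the skews are conventionally defined as AT skew = (A − T)/(A + T) and GC skew = (G − C)/(G + C). The following is a minimal sketch of that calculation in Python, not the pipeline used in this study; the function names and the toy sequence are illustrative assumptions.

```python
from collections import Counter


def base_composition(seq):
    """Return the percentage of A, C, G and T in a DNA sequence.

    Ambiguous characters (e.g. N) are ignored.
    """
    counts = Counter(base for base in seq.upper() if base in "ACGT")
    total = sum(counts.values())
    return {base: 100.0 * counts[base] / total for base in "ACGT"}


def skews(seq):
    """Return (AT skew, GC skew) using the conventional definitions."""
    comp = base_composition(seq)
    at_skew = (comp["A"] - comp["T"]) / (comp["A"] + comp["T"])
    gc_skew = (comp["G"] - comp["C"]) / (comp["G"] + comp["C"])
    return at_skew, gc_skew


# Toy sequence (hypothetical, not taken from MW030174): A=4, T=3, C=4, G=1
toy_sequence = "AAAATTTCCCCG"
toy_at_skew, toy_gc_skew = skews(toy_sequence)
print(f"AT skew = {toy_at_skew:.3f}, GC skew = {toy_gc_skew:.3f}")
```

A positive AT skew means A exceeds T, and a negative GC skew means C exceeds G, which is the pattern reported for the N. andersoni mitogenome.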
3.2 Protein-coding genes (PCGs)
The total length of the 13 PCGs was 11,420 bp, accounting for 70.1% of the mitogenome. All PCGs in the mitogenome of N. andersoni used a typical ATN initiation codon, except for ND1, which started with GTG. All PCGs terminated with a complete (TAA) or truncated (T) stop codon, except for ND2, which terminated with CAT (Table 2). The relative synonymous codon usage (RSCU) values of the PCGs are given in Table 4, which also shows that the protein-coding region comprises 3,805 codons. According to the RSCU analysis, CUA (L), AUU (I) and AUA (M) were the three most frequently used codons. Leucine, isoleucine and threonine were the most frequent amino acids in the PCGs (Fig. 2). This may explain the negative GC skew and positive AT skew of the PCGs.
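RSCU is commonly computed as the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The sketch below illustrates this under stated assumptions and is not the software used in this study: it uses the vertebrate mitochondrial genetic code (NCBI translation table 2), treats all codons of an amino acid as a single synonymous family, skips stop codons, and runs on a hypothetical toy sequence. Codons are written in DNA form (T rather than U).

```python
from collections import Counter, defaultdict
from itertools import product

# Vertebrate mitochondrial genetic code (NCBI translation table 2),
# with codons enumerated in TCAG order at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSS**VVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def rscu(cds):
    """Return {codon: RSCU} for a coding sequence.

    RSCU = codon count * family size / total count of the family.
    Trailing partial codons (e.g. a truncated T stop) are dropped.
    """
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(c for c in codons if c in CODON_TABLE)

    # Group synonymous codons by amino acid, skipping stop codons.
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families[aa].append(codon)

    values = {}
    for family in families.values():
        total = sum(counts[c] for c in family)
        if total == 0:
            continue  # amino acid absent from this sequence
        for codon in family:
            values[codon] = counts[codon] * len(family) / total
    return values


# Toy sequence (hypothetical): CTA CTA CTA TTA ATT ATC
toy_values = rscu("CTACTACTATTAATTATC")
for codon in ("CTA", "TTA", "ATT", "ATC"):
    print(codon, CODON_TABLE[codon], toy_values[codon])
```

An RSCU above 1 indicates a codon used more often than expected under equal synonymous usage. For real data, the 13 PCG sequences would be concatenated or analysed gene by gene.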
3.3 Ribosomal RNA and Transfer RNA genes
The mitogenome of N. andersoni contained the typical 22 tRNA genes, which were distributed throughout the genome, ranged in length from 59 bp to 75 bp and appeared to be highly A+T biased. Eight of these tRNA genes were encoded on the L-strand and the remaining 14 on the H-strand (Table 2). All tRNA genes exhibited a typical cloverleaf structure except trnS1, in which the dihydrouridine (DHU) arm was absent and reduced to a simple loop. Loss of the DHU arm is common in the mitogenomes of many mammals [28].
The two rRNA genes (lrRNA and srRNA), encoding the large and small ribosomal subunit RNAs, respectively, were identified on the H-strand of N. andersoni, consistent with the strand assignment in Section 3.1, and were located between tRNAPhe and tRNALeu. The lrRNA and srRNA were 1,567 and 957 bp in length, respectively. The A+T content of the rRNA genes was 63.43%, and their AT skew (0.204) and GC skew (−0.099) showed that As and Cs were more abundant than Ts and Gs (Table 3).
3.4 Phylogenetic analysis
Based on the 13 PCGs of 13 rat species, we constructed a phylogenetic tree using the maximum likelihood method with 1,000 bootstrap replicates and Mus musculus as the outgroup (Fig. 3A). Some researchers have suggested that ND6 gene sequences should be excluded from phylogenetic analysis because of their high heterogeneity and consistently poor phylogenetic performance [29]. We therefore constructed a second phylogenetic tree based on the PCGs excluding ND6 (Fig. 3B). The two analyses yielded almost identical results. Compared with the other rat species, N. andersoni was most closely related to N. excelsior and clustered within the genus Niviventer.
To further investigate the phylogenetic position of N. andersoni, a maximum likelihood tree was reconstructed from the complete mitochondrial genomes of the 13 species listed in Table 1 (Fig. 4). The D-loop region was excluded because of its rapid mutation rate. The topologies of the maximum likelihood trees based on the complete sequence (excluding the D-loop) and on the PCGs of the mitochondrial genome were identical. Our results were generally congruent with those of previous studies that used only the cytb gene, except for the phylogenetic position of N. confucianus. Single-gene cytb trees in previous studies showed that N. confucianus was closer to N. fulvescens and N. cremoriventer than to N. andersoni and N. excelsior [6,30,31]. In contrast, our results suggest that N. andersoni and N. excelsior cluster together, then group with N. confucianus, and that this clade is sister to N. fulvescens and N. cremoriventer. Because each gene evolves under different evolutionary pressures and on a different time scale, one gene tree for a population may differ from other gene trees for the same population, depending on which genes are selected [9]. The conflict between the single mitochondrial gene tree and the complete mitogenome tree suggests that phylogenetic analysis based on complete mitochondrial genomes is warranted.