Effective number of codons
Effective number of codons (ENC) is a measure for assessing codon usage bias. The ENC value ranges from 20 (maximum bias, i.e. only one codon is used to encode an amino acid) to 61 (minimum bias, i.e. codons are used randomly)36,37. In contrast to the RSCU value, the ENC value characterizes the overall codon usage bias for a gene/genome.
Codon homozygosity (\(F_k\)), which is required to calculate the ENC value, was computed for every amino acid that has synonymous codons according to equation (5):
\(S=\sum_{i=1}^{k}\left(\frac{n_{i}}{n}\right)^{2};\qquad F_{k}=\frac{nS-1}{n-1}\) (5),
where \(n_i\) is the number of occurrences of the i-th synonymous codon of a given amino acid; n is the total number of codons encoding that amino acid; k is the number of synonymous codons for that amino acid (k = 2, 3, 4, 6). In this work, \(n_i\) and n were represented by the average values for Hsp60 sequences of 17 phyla, excluding Viruses and Platyhelminthes (Supplementary, GC-content and ENC).
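As an illustration of equation (5), the homozygosity of a single amino acid can be sketched in Python as follows. The function name `codon_homozygosity` and the input format (a list of per-codon counts, which may be averaged and therefore non-integer) are assumptions made for this example and are not taken from the original analysis.

```python
from typing import Sequence


def codon_homozygosity(counts: Sequence[float]) -> float:
    """Return F_k from equation (5) for one amino acid.

    counts holds n_i, the (possibly averaged) number of occurrences of
    each of the k synonymous codons of the amino acid.
    """
    n = float(sum(counts))  # total number of codons for this amino acid
    if n <= 1.0:
        # (n - 1) in the denominator would be zero or negative
        raise ValueError("F_k is undefined when n <= 1")
    # S is the sum of squared codon frequencies
    s = sum((n_i / n) ** 2 for n_i in counts)
    return (n * s - 1.0) / (n - 1.0)


# Hypothetical counts for a two-codon amino acid (illustrative only)
f_even = codon_homozygosity([10, 10])  # both codons used equally
f_single = codon_homozygosity([20, 0])  # only one codon used
```

When a single codon is used exclusively, S equals 1 and \(F_k\) equals 1; the more evenly the synonymous codons are used, the smaller \(F_k\) becomes.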
The ENC values were calculated for 17 phyla according to equations (6):
\(\text{ENC}=2+\frac{9}{F_{2}}+\frac{1}{F_{3}}+\frac{5}{F_{4}}+\frac{3}{F_{6}};\qquad \text{ENC}_{\exp}=2.5+s+\frac{29.5}{s^{2}+(1-s)^{2}}\) (6),
where the constant 2 is the contribution of Met and Trp, since these amino acid residues have no synonymous codons; \(F_2\), \(F_3\), \(F_4\) and \(F_6\) are the average \(F_k\) values for those amino acid residues that have k synonymous codons. The \(\text{ENC}_{\exp}\) value is the expected ENC value for a given GC content at the third synonymous codon position (GC3 content) in the Hsp60 gene for 17 phyla. Here, s is the given value of GC3 content (Supplementary, GC-content and ENC). The data obtained were used to construct the Nc-plot, where the ENC values are presented as a scatter plot and the \(\text{ENC}_{\exp}\) values are presented as a solid curve. Codon usage in a gene is considered unbiased if the corresponding point lies on the expected curve38.
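The two expressions in equation (6) can be sketched in Python as shown below. The coefficients 9, 1, 5 and 3 are the numbers of amino acids with two, three, four and six synonymous codons in the standard genetic code. The function names, the keyword arguments `offset` and `numerator`, and the treatment of s as a fraction between 0 and 1 are assumptions made for this example; the constants 2.5 and 29.5 are copied from equation (6) as printed and are exposed as parameters in case a different parameterization is preferred.

```python
from typing import List, Tuple


def enc_from_homozygosity(f2: float, f3: float, f4: float, f6: float) -> float:
    """ENC from the average F_k of each degeneracy class, equation (6)."""
    for name, value in (("F2", f2), ("F3", f3), ("F4", f4), ("F6", f6)):
        if value <= 0.0:
            raise ValueError(f"{name} must be positive")
    # The leading 2 covers Met and Trp, which have no synonymous codons
    return 2.0 + 9.0 / f2 + 1.0 / f3 + 5.0 / f4 + 3.0 / f6


def enc_expected(s: float, offset: float = 2.5, numerator: float = 29.5) -> float:
    """Expected ENC for GC3 content s (as a fraction), equation (6)."""
    if not 0.0 <= s <= 1.0:
        raise ValueError("s must be a fraction between 0 and 1")
    return offset + s + numerator / (s ** 2 + (1.0 - s) ** 2)


def nc_plot_curve(steps: int = 101) -> List[Tuple[float, float]]:
    """(GC3, ENC_exp) points for drawing the solid curve of an Nc-plot."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    grid = [i / (steps - 1) for i in range(steps)]
    return [(s, enc_expected(s)) for s in grid]


# Hypothetical class averages: every amino acid encoded by a single codon
enc_max_bias = enc_from_homozygosity(1.0, 1.0, 1.0, 1.0)
curve = nc_plot_curve()
```

The observed ENC of each phylum can then be plotted against its GC3 content on top of `curve`, and the vertical distance between a point and the curve indicates how far its codon usage departs from the expectation.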