Possible evolutionary development of the family of bacterial 30S ribosomal S1 proteins
The nature of protein repeats, the function of each repeat, and their evolution are still poorly understood. These repeats evolved from a common ancestor, which necessarily contained a single repeat 50. Some authors have suggested that the common ancestor of the family was indeed a single repeat that formed homo-oligomers for effective functional activity 51. The homo-oligomeric structure of an ancestor may reflect the intrachain repeating structure of its modern homologue, with the exception of its multi-chain character. However, there are examples of homologous multiple repeats that are formed both from oligomers of single-repeat chains and from a single chain containing several repeats (Andrade et al., 2001).
For the investigated bacterial proteins, the maximum number of S1 domain repeats (six) is sufficient to perform all the necessary functions. The third domain in this group has the highest identity (68%) among the domains. In addition, this domain has the highest identity with the S1 domain from PNPase (E. coli) and with the S1 domains from single-domain S1 proteins (Tenericutes, Mollicutes) 9. Its RNA binding site is formed by five residues (F19, L22, H34, N64, and R68), which once again confirms the uniqueness of this repeat and allows us to consider it the strongest RNA binding site. Thus, the central part of these proteins (the third and fourth domains) appears to be vital for their activity and functionality. This suggestion is consistent with experimental data. One of the well-studied proteins with six repeats of the S1 domain is the bacterial 30S ribosomal protein S1 from E. coli. It was shown that removing one S1 domain from the C-terminus or two S1 domains from the N-terminus reduces only the efficiency of the protein's functions, but does not abolish its functionality 14,41.
As mentioned above, Proteobacteria account for 55% of all S1 proteins (Figure 1b). Within this phylum, the phylogenetic classes are represented by different numbers of sequences and of structural S1 domains (Figure 4). Thus, Acidithiobacillia and Epsilonproteobacteria have six S1 domains, while Alpha- and Deltaproteobacteria have five or six S1 domains. Note that Epsilonproteobacteria is considered to be the oldest class in this phylum 29,52. The Oligoflexia class is characterized by the presence of four or six S1 domains; for Beta- and Gammaproteobacteria, the number of S1 domains ranges from one to six. Betaproteobacteria are evolutionarily most closely related to Gammaproteobacteria and Acidithiobacillia, and together they make up a taxon called Chromatibacteria 53. However, the Acidithiobacillales class was previously classified as part of the Gammaproteobacteria 54. Our data also support treating this class as a separate one, given its constant number of structural S1 domains (Figure 4). Phylogenetic analyses of various proteins suggest that Betaproteobacteria and Gammaproteobacteria branched out later than most other phyla of Bacteria along with Proteobacteria 55,56.
Alphaproteobacteria branched out at the same time as Deltaproteobacteria 55,56. Note that these classes have five or six domains, whereas Betaproteobacteria and Gammaproteobacteria have varying numbers of S1 domains. According to our data, these classes within Proteobacteria (in addition to the Actinobacteria, Bacteroidetes, and Firmicutes phyla) have the greatest diversity in the number of S1 domains in comparison with other phyla, where this number is constant or varies only slightly. The specific relationship of the phylum Aquificae to the Epsilonproteobacteria is supported by a conserved indel signature in inorganic pyrophosphatase, which is found only in species of these two groups 57. In 58, the authors also suggested that Aquificae are closely related to Proteobacteria. This closeness is attributed to frequent horizontal gene transfer resulting from shared ecological niches. According to our data, bacteria from the phylum Aquificae and the class Epsilonproteobacteria have strictly six S1 domains. The evolutionary development of representatives of the Acidobacteria phylum is often considered to be associated with Alphaproteobacteria 59,60 because bacteria belonging to both groups were associated with a copiotrophic lifestyle 61. According to our data, the phylum Acidobacteria and the class Alphaproteobacteria have six S1 domains. The evolutionarily independent development of such phyla as Caldiserica, Deferribacteres, Fusobacteria, Spirochaetes, Nitrospirae, and Nitrospinae/Tectomicrobia is apparently reflected in the constant number of structural S1 domains in these bacteria. Moreover, the phylum Spirochaetes is considered in the literature to be a phylogenetically ancient and distinct group of microorganisms 62. The S1 proteins of this phylum contain six S1 domains (Figure 4).
As mentioned above, the analysis of 16S rRNA and of characteristic conserved indels in some proteins is used to group the phyla Planctomycetes, Verrucomicrobia, and Chlamydiae into the PVC clan 28. Bacteria of the Chlamydiae and Verrucomicrobia phyla generally contain six S1 domains, while Planctomycetes can have four, five, or six S1 domains (Figure 4). According to some published data, the genome of organisms of the phylum Planctomycetes is the largest among the phyla of the PVC superphylum and the most susceptible to evolutionary changes 63. The phyla Chlamydiae and Verrucomicrobia are considered evolutionarily closer to each other 64.
The FCB group is a superphylum of bacteria named after its main member phyla: Fibrobacteres, Chlorobi, and Bacteroidetes. Some authors also include the phyla Gemmatimonadetes and Ignavibacteriae in this group 27. It should be noted that these phyla are often placed at the same level on phylogenetic trees, while the phylum Fibrobacteres is considered a phylogenetically more ancient group. Our data show that the ribosomal S1 protein in this group almost always contains six S1 domains (a constant number for the phyla Gemmatimonadetes, Ignavibacteriae, Fibrobacteres, and Chlorobi and for the class Bacteroidia). Within the phylum Bacteroidetes, the class Cytophagia has one, four, or six domains (Figure 4).
The phylum Bacteroidetes, along with Proteobacteria, Firmicutes, and Actinobacteria, is also among the most common bacterial groups in the rhizosphere 65. These bacteria have been found in soil samples from various locations, including cultivated fields, greenhouse soils, and unexploited areas 66. Note that for these phyla, the number of structural S1 domains can vary from one to six (Figure 4).
Terrabacteria are a supergroup containing the Actinobacteria, Tenericutes, and Firmicutes phyla, as well as the Cyanobacteria, Chloroflexi, and Deinococcus-Thermus phyla 29,52. It is widely accepted that oxygenic photosynthesis developed in ancient lineages of Cyanobacteria 67, but very little is known about the nature and evolutionary history of anoxygenic phototrophy, and much of the current understanding rests on assumptions and hypotheses drawn from the few existing bacterial taxa in which this metabolism occurs. However, a number of studies have argued that one of the earliest forms of anoxygenic photosynthesis arose in the Chloroflexi phylum during the Archean Eon, before the invention of oxygenic photosynthesis 68,69. Our data revealed three S1 domains in the phylum Cyanobacteria and four S1 domains in the phylum Chloroflexi. According to another version, the phyla Actinobacteria and Chloroflexi are evolutionarily closer to each other 32. Note that Actinobacteria predominantly have four S1 domains. The phylum Firmicutes, which is evolutionarily close to the phyla Actinobacteria, Cyanobacteria, Chloroflexi, and Deinococcus-Thermus, also predominantly has four S1 domains according to our data 70,71. Meanwhile, according to 32,70, the phylum Deinococcus-Thermus (five S1 domains) is more ancient than the other phyla in the supergroup Terrabacteria.
Note that the bacterial 30S ribosomal S1 protein from the parasitic bacteria Mollicutes (phylum Tenericutes) effectively performs the basic functions of RNA binding 40. There is an assumption in the literature that mycoplasmas (Mollicutes) are a regressive branch of the evolution of some Gram-positive bacteria or Firmicutes 72. This hypothesis was confirmed experimentally and is considered in two possible variants: all mycoplasmas originate either from a common ancestor with Gram-positive bacteria or from different bacteria 72. Based on a comparison of the 16S rRNA oligonucleotide sequences of several species of mycoplasmas and Gram-positive bacteria from the genera Clostridium, Bacillus, Lactobacillus, and Streptococcus, a reasonable assumption was made about their evolutionary relationship with the phylum Firmicutes 73,74. A more detailed analysis of 16S rRNA sequences showed that mycoplasmas are phylogenetically closest to clostridia 75. In turn, the most likely ancestors of clostridia (Firmicutes) are Gram-positive bacteria with a low G+C content in their DNA. According to our data, the 30S ribosomal S1 protein from the phylum Tenericutes has one S1 domain.
Summarizing the above, it can be argued that, firstly, the number of structural S1 domains in bacteria of different phyla may coincide when they share a symbiotic lifestyle and, secondly, phylogenetically more ancient divisions have a greater number of structural domains (mostly six). Moreover, the earlier the microorganism in phylogenetic terms, the greater the likelihood of a decrease and variation in the number of its structural S1 domains.