Introduction
The discovery of specific signatures of the evolution of bacterial ribosomal proteins is a topical task that offers new insight into the emergence and evolution not only of the protein component of ribosomes but also of bacteria themselves 1–5. The family of ribosomal S1 proteins is unique in that its members contain different numbers of S1 structural domains, each of which has a number of specific characteristics 6,7. Ribosomal S1 proteins make up about 20% of all bacterial proteins containing the S1 domain 8. The number of structural S1 domains in bacteria varies within a strictly limited range, from one to six 9. Proteins of this family interact with mRNAs, participate in the initiation and translation of mRNAs in vivo, and interact with the mRNA-like part of the tmRNA molecule 10,11. Like some other ribosomal proteins, ribosomal S1 protein is an autogenous repressor of its own synthesis 12. In addition, S1 can function outside the ribosome.
The role of the individual S1 domains is also being actively studied. Many studies have identified a partial functional specialization of the S1 protein domains of Escherichia coli. The first two domains are known to be responsible for binding to ribosomes 13, while the next four are involved in interactions with mRNA. The sixth domain has been shown to be optional for translation initiation 14. In addition, the first two domains are responsible for the binding of S1 to the RNA replicase of the Qβ phage, while the sixth domain is not required for its activity in phage replication 15. The S1 fragment formed by the third, fourth and fifth domains increases the activity of ribonuclease RegB of the T4 phage as efficiently as the whole protein 16. Accordingly, S1 appears to consist of three main regions: the N-terminal region, formed by the first and second domains and involved in interactions with other S1 partners in the cell (ribosome, Qβ replicase); the intermediate region, formed by the third, fourth and fifth domains and involved in interactions with RNAs (translation or replication initiation region, RegB substrates); and, finally, the sixth domain, the role of which remains to be elucidated 9,17.
Several attempts have been made to classify ribosomal S1 proteins using datasets of different sizes. Salah et al. studied 13 bacterial phyla on the basis of 26 bacterial sequences 17. The authors used the number and pairwise alignment of S1 domains in the family of ribosomal S1 proteins to investigate the relationship between Gram-positive and Gram-negative bacteria. In another study, 12 phyla were identified among 273 S1 sequences 18. The authors of a further work 19 used the rpsA gene, which encodes the ribosomal protein S1, as a biomarker for the eight main types of mycobacteria, the differences between which were not revealed by analysis of 16S rDNA.
We have recently shown that the number of domains in S1 is a distinctive characteristic of the phylogenetic grouping of bacteria in the main phyla. The dataset studied, comprising 1453 S1 sequences, made it possible to identify bacterial ribosomal S1 proteins in 25 different phyla according to the List of Prokaryotic Names with Standing in Nomenclature. In addition, we searched for a conserved domain in the family of 30S ribosomal S1 proteins and proposed a hypothesis for the possible evolutionary development of this family. The data obtained made it possible to group some bacterial phyla into superphyla according to the number of S1 domains 9.
Here we collect and structure data on the features of the family of ribosomal S1 proteins, and extend and analyze them with data on percentage identity, amino acid composition and logo motifs, as well as dN/dS ratios. The presented data are integrated into a server, which can be accessed at http://oka.protres.ru:4200.
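As an illustration of one of the quantities mentioned above, percentage identity between two pre-aligned domain sequences could be computed roughly as follows. This is a minimal sketch, not the procedure used by the server: the function name `percent_identity`, the choice to skip gapped columns, and the toy sequences are all assumptions made for the example.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percentage of identical residues over the ungapped aligned columns.

    Assumption: both sequences come from the same alignment and therefore
    have equal length; columns with a gap in either sequence are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0  # columns where both sequences have a residue
    matches = 0   # columns where those residues are identical
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1

    # No comparable columns: report 0.0 rather than dividing by zero.
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Hypothetical aligned fragments, used only to exercise the function.
    print(percent_identity("ACDE", "ACDF"))
```

Other conventions exist for the denominator (for example, the full alignment length or the length of the shorter sequence), so identity values are comparable only when the same convention is used throughout.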