Biodiversity of bacterial ribosomal S1 proteins
The obtained data make it possible to estimate how prevalent groups with different numbers of structural S1 domains are within the family of bacterial S1 proteins. One-, two-, three-, and five-domain S1 proteins account for 1%, 0.8%, 2%, and 1.2% of all studied sequences, respectively. Four- and six-domain proteins are the most abundant, at 33% and 62%, respectively (Fig. 1a). At the same time, as we showed above, 55% of all studied bacterial S1 sequences belong to Proteobacteria, 16% and 17% to Firmicutes and Actinobacteria, respectively, and 6% to Bacteroidetes (Fig. 1b).
Numerous studies have shown that >88% of all bacterial isolates belong to four bacterial phyla (the Big Four): Proteobacteria, Firmicutes, Actinobacteria, and Bacteroidetes 33,34 (Fig. 1c). In fact, obtaining isolates that do not belong to the Big Four is challenging, and these four phyla therefore dominate our current understanding of microbiology 34. Moreover, most studies of microbial diversity, whether based on isolation or on microbiome analysis, report that Proteobacteria account for 40 to 90% of the microorganisms identified 35–37. Overall, the phylum composition of our dataset is consistent with these proportions: the Big Four together account for 94% of the studied sequences, so the dataset can be considered representative. The most represented groups are the six-domain S1 proteins from Proteobacteria and Bacteroidetes and the four-domain S1 proteins from Actinobacteria and Firmicutes.
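The proportions reported in this section can be tallied as a quick consistency check. The following is a minimal sketch in Python that uses only the percentages stated in the text; the variable names (`domain_share`, `phylum_share`) are illustrative assumptions and are not part of the original analysis pipeline.

```python
import math

# Share of all studied S1 sequences (in %) by number of S1 domains,
# as reported in the text (Fig. 1a).
domain_share = {1: 1.0, 2: 0.8, 3: 2.0, 4: 33.0, 5: 1.2, 6: 62.0}

# Share of all studied S1 sequences (in %) by phylum,
# as reported in the text (Fig. 1b). Only the Big Four are listed.
phylum_share = {
    "Proteobacteria": 55,
    "Firmicutes": 16,
    "Actinobacteria": 17,
    "Bacteroidetes": 6,
}

# The domain-count groups should cover the whole dataset.
domain_total = sum(domain_share.values())
assert math.isclose(domain_total, 100.0)

# Combined share of the two dominant groups (four- and six-domain proteins).
four_and_six = domain_share[4] + domain_share[6]

# Combined share of the Big Four phyla in the dataset.
big_four_total = sum(phylum_share.values())

print(f"Four- and six-domain proteins: {four_and_six:.0f}%")  # 95%
print(f"Big Four phyla: {big_four_total}%")                   # 94%
```

Under these figures, four- and six-domain proteins together make up 95% of the studied sequences, and the Big Four phyla make up 94%, which is in line with the >88% reported for bacterial isolates.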