Positive selection analysis and cavities
As a first step in the sequence analysis, we identified protein residues evolving under positive selection. Such residues are thought to be involved in functional adaptation35,36. To this end, we built a multiple sequence alignment of 46 albumin proteins from 38 different species (see Supplementary Table 1). Of these, 6 representatives have known crystallographic structures. Using the MEME algorithm, we detected a total of 38 sites evolving under positive selection in the alignment (Supplementary Table 2).
To better understand the biological relevance of these positions, we mapped them onto the cavities predicted by Fpocket in all conformers of the proteins with known structures. We complemented this mapping with information on functionally important residues and cavities collected from the literature17,37–39. We observed that approximately 50 to 70% of the positively selected residues overlap with biologically relevant cavities (Figure 3 and Supplementary Table 2), with the exact fraction varying according to the species and conformations used. Among these cavities we found the IIA drug binding site, previously described40 as containing residues involved in promiscuous reactions in HSA and BSA. This cavity is one of the major cavities predicted by Fpocket and contains the residues that support the aldol condensation41. Interestingly, Arg 222 is evolving under positive selection, suggesting a putative functional role (Supplementary Table 2).
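The overlap reported above can be illustrated with a minimal sketch. It is not the pipeline used in this work: the function names, the conformer and pocket labels, and all residue numbers other than 222 are hypothetical. The sketch assumes that the positively selected alignment columns have already been translated to the residue numbering of each structure, and that the residues lining each cavity have already been extracted from the Fpocket output.

```python
from typing import Dict, Set


def overlap_fraction(selected: Set[int], cavities: Dict[str, Set[int]]) -> float:
    """Fraction of positively selected residues found in at least one cavity."""
    if not selected:
        return 0.0
    # Pool the residues lining every cavity of this conformer.
    cavity_residues = set().union(*cavities.values())
    return len(selected & cavity_residues) / len(selected)


def overlap_by_conformer(
    selected: Set[int], conformers: Dict[str, Dict[str, Set[int]]]
) -> Dict[str, float]:
    """Apply overlap_fraction to every conformer independently."""
    return {name: overlap_fraction(selected, cavs) for name, cavs in conformers.items()}


# Hypothetical toy data (illustrative residue numbers, not results of this study).
selected_sites = {114, 195, 222, 340}
conformers = {
    "conformer_A": {"pocket_1": {195, 199, 222, 242}, "pocket_2": {10, 11}},
    "conformer_B": {"pocket_1": {114, 195, 222}, "pocket_2": {500}},
}

fractions = overlap_by_conformer(selected_sites, conformers)
for name, value in fractions.items():
    print(f"{name}: {value:.0%} of selected residues lie in a cavity")
```

Computing the fraction per conformer, rather than pooling all structures, mirrors the observation that the overlap fluctuates with the species and conformation considered.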