RESULTS AND DISCUSSION
Cofactor type is the primary determinant of redox energetics. Redox potentials included in ProtReDox span almost 2 V, ranging from -675 mV for the 2[4Fe-4S]-binding bacterial ferredoxin of E. coli [55] to +1301 mV for chlorophyll A in PS II of T. elongatus [56]. Within this broad range, the cofactor type is the primary determinant of redox potential (Fig. 1). Cofactor types are designated based primarily on the PDB-derived nomenclature. Ordered from most reducing to most oxidizing, the cofactors were 4Fe-4S (SF4), 2Fe-2S (FES), flavins, mononuclear iron sites (Fe), iron-bound hemes (HEM) and copper sites (Cu). These ranges are consistent with previous analyses of protein electron carriers [8].
Molecular features that determine energetics. Protein redox potential is a complex property that is affected by features of the first- and second-shell environment of the redox site: solvation, hydrogen bonding, ligand interactions, metal coordination, electrostatic interactions [57, 58] and the corresponding enthalpic and entropic energy terms [59]. Redox potential can be calculated directly from first-principles quantum mechanical calculations [60, 61]; however, these calculations are computationally expensive and are not practical for protein design. To better understand the protein features that determine redox potential, we calculated the correlation between 433 physicochemical features (including energy and geometry features) and reduction potentials (Fig. 2) for copper proteins with ReProDox.
The categories of features with the best correlations tended to be those related to electrostatics and solvation. The solvation features describe Lazaridis-Karplus solvation energies, including both isotropic and anisotropic contributions, for various distance cutoffs within 9 Å. The significant electrostatic features include the Coulombic electrostatic potential as well as features describing the theoretical titration curves of surrounding residues. In contrast, other categories of features are more statistical. For example, eight of the nine significant “amino acid angle” features are Dunbrack rotamer energies of residues within 5 Å, indicating the use of some more common and some less common rotamers. In addition to further evaluating the significant features that correlate with protein redox potentials found in ProtReDox, we expect that these features can be used to train models [37, 53] for high-throughput design of redox-active proteins.
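The feature screen described above can be illustrated with a minimal sketch. It assumes the screen is a per-feature Pearson correlation against the measured potentials; the function names `pearson` and `rank_features` and the dictionary layout of the feature table are hypothetical and are not taken from the analysis pipeline used here.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return float("nan")  # a constant column has no defined correlation
    return sxy / math.sqrt(sxx * syy)

def rank_features(features, potentials):
    """Rank features by |r| against the potentials, strongest first.

    features:   dict mapping a feature name to one value per protein
                (hypothetical layout, assumed for this sketch).
    potentials: one reduction potential per protein, in the same order.
    """
    scored = [(name, pearson(vals, potentials)) for name, vals in features.items()]
    scored = [(name, r) for name, r in scored if not math.isnan(r)]
    return sorted(scored, key=lambda item: abs(item[1]), reverse=True)
```

Ranking by absolute value keeps strongly negative correlations alongside strongly positive ones; whether a significance filter is applied afterwards is left open in this sketch.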
Coupling redox energetics to pH. Comparing protein redox potentials is challenging due to the numerous experimental conditions under which redox potentials are measured. Experimental pH is known to be a significant factor affecting redox processes accompanied by protonation/deprotonation events [62], which are commonly observed among Cu redox proteins [62-65]. To compare experimental redox potentials, values are normalized to a reference pH (7.0) using the Nernst equation,
Eq. 1: \(E_{\text{red}} = E + 59.16\ \text{mV} \cdot n \cdot (\text{pH} - 7)\)
where 59.16 mV is the Nernst constant relating pH to redox potential, Ered is the normalized reduction potential of each protein at pH 7, and E is the reduction potential measured at the literature pH. The variable n, assumed to be one, is the ratio of protons to electrons involved in the redox reaction.
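As a worked illustration of Eq. 1, the normalization can be written as a short function. This is a sketch under the assumptions stated in the text (n = 1 by default); the function name `normalize_to_ph7` and the optional `ph_ref` argument are introduced here for illustration only.

```python
NERNST_SLOPE_MV = 59.16  # mV per pH unit, the constant used in Eq. 1

def normalize_to_ph7(e_measured_mv, ph_measured, n=1.0, ph_ref=7.0):
    """Apply Eq. 1: E_red = E + 59.16 mV * n * (pH - 7).

    e_measured_mv: reduction potential measured at the literature pH (mV).
    ph_measured:   pH at which the measurement was made.
    n:             proton-to-electron ratio, assumed to be one as in the text.
    ph_ref:        reference pH; 7.0 in Eq. 1 (exposed here only as a convenience).
    """
    return e_measured_mv + NERNST_SLOPE_MV * n * (ph_measured - ph_ref)
```

For example, a potential of 300 mV measured at pH 8 is shifted up by one Nernst unit to 359.16 mV, while a value measured at pH 7 is unchanged.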
For copper proteins with an azurin fold, we observed a correlation between pH and redox potential with a slope of -51 mV/pH unit (Fig. 3A), close to the -59 mV/pH unit expected if the reactions followed Nernstian behavior, assuming one electron transfer per reaction. Normalization removes the slope of this correlation (Fig. 3B). Experimental pH conditions showed the strongest positive Pearson coefficient with redox potential, above the computed factors from structure described earlier. However, there is a large variance in the observed potentials, indicating that no single parameter can fully explain redox energetics.
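The effect of normalization on the fitted slope can be sketched with an ordinary least-squares fit. The values below are synthetic and idealized (an exactly Nernstian series built around an arbitrary 350 mV), chosen only to show the mechanics; they are not measurements from the database, and the helper name `least_squares_slope` is hypothetical.

```python
def least_squares_slope(xs, ys):
    """Slope of the ordinary least-squares line through the points (xs, ys)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx

# Synthetic, idealized series: a perfectly Nernstian one-proton, one-electron couple.
ph_values = [4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
e_values = [350.0 - 59.16 * (ph - 7.0) for ph in ph_values]

raw_slope = least_squares_slope(ph_values, e_values)

# Normalize each value to pH 7 with Eq. 1 (n = 1), then refit.
e_norm = [e + 59.16 * (ph - 7.0) for ph, e in zip(ph_values, e_values)]
norm_slope = least_squares_slope(ph_values, e_norm)
```

For this ideal series the raw slope equals -59.16 mV/pH unit and the normalized slope is zero to within floating-point error. Real data, with a fitted slope of -51 mV/pH unit and substantial scatter, would leave a small residual slope after applying Eq. 1.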
Redox gradients and oxidoreductase evolution. Many of the ProtReDox entries are associated with an experimentally determined three-dimensional structure deposited in the PDB. This allowed us to map the redox energetics onto the SpAN, an existing network that maps electron transfer pathways in oxidoreductases of known structure [16, 17]. Nodes in the SpAN correspond to redox-cofactor-binding protein microenvironments, termed modules. Edges reflect the spatial proximity of cofactor atoms in a pair of modules in one or many structures, providing a pathway for electron transfer. Cofactor edge-to-edge distances of less than 14.0 Å were considered electron-transfer competent [7].
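The edge criterion can be sketched as follows, assuming that the edge-to-edge distance is the minimum interatomic distance between the atoms of two cofactors and that each module is represented by a single cofactor in a single structure. The names `edge_to_edge_distance` and `build_edges` and the input layout are hypothetical; the actual SpAN construction [16, 17] aggregates over many structures and is not reproduced here.

```python
import math
from itertools import combinations

CUTOFF_ANGSTROM = 14.0  # electron-transfer-competent cutoff quoted in the text

def edge_to_edge_distance(atoms_a, atoms_b):
    """Minimum distance (Å) between any atom of one cofactor and any atom of another."""
    return min(math.dist(p, q) for p in atoms_a for q in atoms_b)

def build_edges(cofactors, cutoff=CUTOFF_ANGSTROM):
    """Return the set of module pairs whose cofactors lie within the cutoff.

    cofactors: dict mapping a module identifier to a list of (x, y, z)
               atom coordinates in Å (hypothetical layout for this sketch).
    """
    edges = set()
    for (mod_a, atoms_a), (mod_b, atoms_b) in combinations(sorted(cofactors.items()), 2):
        if edge_to_edge_distance(atoms_a, atoms_b) < cutoff:
            edges.add((mod_a, mod_b))
    return edges
```

The all-pairs comparison is quadratic in the number of atoms, which is adequate for the handful of cofactors in a single structure; a spatial index would be a reasonable substitute for larger inputs.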
The full SpAN contains 133 modules [17]. We identified 18 modules with specified redox energetics (Fig. 4, Table 1). With the exception of the heme-binding cytochrome-C fold module 140, these modules formed a connected sub-graph within the SpAN. Within this network, there is a clear downhill redox energetic gradient, starting from 4Fe-4S-coordinating ferredoxin folds (module 85) with an average potential of -430 mV and ending with more oxidizing hemes (modules 1737 at +168 mV; 1746 at +70 mV), the molybdenum-containing module 16 (+204 mV) and copper module 72 at +325 mV. One can envision electrons percolating from the center of this network to the periphery, driving redox-coupled reactions along a metabolic pathway.
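Two properties discussed above, connectivity of the annotated sub-graph and the downhill direction of each edge, can be checked with a short sketch. It assumes an undirected edge list and one average potential per module; the function names are hypothetical and no actual SpAN topology is encoded here.

```python
from collections import deque

def is_connected(nodes, edges):
    """True if every node can be reached from every other node via the edges."""
    nodes = set(nodes)
    if not nodes:
        return True
    adjacency = {node: set() for node in nodes}
    for a, b in edges:
        if a in nodes and b in nodes:  # ignore edges that leave the sub-graph
            adjacency[a].add(b)
            adjacency[b].add(a)
    start = next(iter(nodes))
    seen = {start}
    queue = deque([start])
    while queue:  # breadth-first search from an arbitrary start node
        current = queue.popleft()
        for neighbor in adjacency[current] - seen:
            seen.add(neighbor)
            queue.append(neighbor)
    return seen == nodes

def downhill_edges(edges, potential_mv):
    """Orient each edge from the more reducing module to the more oxidizing one."""
    return [(a, b) if potential_mv[a] <= potential_mv[b] else (b, a) for a, b in edges]
```

Applied to the annotated modules, the first function would report whether a module such as 140 is disconnected from the rest, and the second would give the direction in which electrons flow energetically downhill along each edge.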
Multiple features of the SpAN suggest that its structure provides insight into the evolution of oxidoreductases in addition to their metabolic function. Network models of growing systems indicate that nodes with high centrality and connectivity are the first to arise [66-68]. In the ProtReDox-annotated sub-graph of the SpAN, flavin module 7 and 4Fe-4S module 85 are reducing, such that they are energetically matched with the early Earth redox environment. It is informative that the annotated modules form a connected sub-graph within the SpAN. Most of these modules correspond both to isolated protein electron carriers [45] and to domains within larger oxidoreductases. Assignment of potentials is easier within an isolated domain than within a larger, multi-cofactor enzyme. Small, isolated modules would be useful building blocks of larger enzymes, forming multi-domain structures through duplication and diversification. Metal utilization for central versus peripheral modules is largely consistent with metal availability through geologic history [21, 69, 70], with early folds incorporating iron-containing cofactors and later folds binding molybdenum and copper.