3.5 | Intrinsically disordered microproteins
Microproteins are much shorter than annotated proteins, and they tend to exhibit limited conservation to protein domains of known function. As a result, it is challenging to perform bioinformatic analyses of microproteins, such as prediction of structure or intrinsic disorder, with confidence, particularly because many predictive algorithms rely, at least in part, on homology to the structures of known, larger proteins on which they were trained. Nonetheless, some studies have suggested that microproteins may be enriched in intrinsic disorder relative to canonical proteins, though an alternative analysis suggests that evolutionarily young microproteins are de-enriched in intrinsic disorder. If the enrichment holds, some microproteins could carry out cellular functions associated with intrinsically disordered proteins, such as regulating signaling and other processes by binding to protein partners via short linear interaction motifs (SLIMs). In this section we discuss two human microproteins that have been experimentally confirmed to be predominantly intrinsically disordered.
MRI (Modulator of retroviral infection) was first identified in a cDNA library screen for host proteins that could complement resistance to retroviral infection of human cells, but it remained annotated as a predicted or uncharacterized protein-coding gene (C7ORF49) in the early 2010s. While the long isoform of MRI (MRI-1 hereafter) is 157 amino acids long and therefore not a microprotein, a 2013 peptidomics study identified an unannotated, sORF-encoded isoform (MRI-2) of 69 amino acids. Follow-up work demonstrated that both the long MRI-1 and the short MRI-2 proteins could interact with a complex of proteins essential for the non-homologous end joining (NHEJ) pathway, which is required for repairing DNA double-strand breaks in G1 phase of the cell cycle, as well as for B and T cell receptor gene diversification via V(D)J recombination. Specifically, MRI-1 interacts with the double-strand break binding adaptor proteins Ku70/80 (Ku) and DNA-PKcs (DNA-dependent protein kinase catalytic subunit), while MRI-2 binds to Ku. Both MRI isoforms contain an N-terminal Ku-binding motif, explaining their association with Ku, while MRI-1 also contains a C-terminal XLF-like motif (XLM) that associates with additional, distinct NHEJ factors. The XLM of MRI-1 is absent in the frameshifted, truncated MRI-2 isoform. One study suggests that MRI inhibits aberrant NHEJ at telomeres during S phase, while two studies to date are consistent with a positive role for MRI in NHEJ during most phases of the cell cycle, suggesting that the activity of MRI may be context-dependent. Purified MRI-2 was shown to promote NHEJ in vitro. However, abrogating all isoforms via knockout of the MRI gene in vivo and in pre-B cells increases sensitivity to ionizing radiation and inhibits NHEJ when coupled with knockout of the NHEJ “sentinel” gene XLF.
Purified MRI-1 was shown to be predominantly intrinsically disordered via hydrogen-deuterium exchange; while MRI-2 was not directly investigated in this study, it is likely to have a similar degree of intrinsic disorder because the two proteins share substantial sequence identity up to the frameshift that truncates MRI-2. Interestingly, the N-terminal and C-terminal motifs of MRI-1 can each nucleate separate complexes of NHEJ factors, and MRI-1 can recruit NHEJ factors to chromatin in the presence of DNA double-strand breaks. One may therefore speculate that MRI-2 is able to serve the same nucleating function in NHEJ via its Ku-binding motif, even in the absence of the C-terminal XLM. Sleckman and colleagues proposed that MRI-1 serves as an adaptor protein for NHEJ, promoting stable association of active NHEJ complexes at sites of double-strand breaks as a result of its (1) intrinsic disorder, (2) independent linear interaction motifs, and (3) potential to multimerize. While a better understanding of the contributions of individual MRI isoforms to MRI function in vivo is required, MRI-1 and MRI-2 appear to be paradigmatic examples of intrinsically disordered (micro)proteins that promote assembly of a functional protein interaction network.
Another example of an experimentally validated, intrinsically disordered microprotein is NBDY. NBDY is a 68-amino acid microprotein expressed from a previously misannotated lncRNA (LOC550643). NBDY associates with members of the cytoplasmic mRNA decapping complex. The interaction partners of NBDY, EDC4 and DCP1A, are coactivators required for allosteric activation of DCP2, which catalyzes the first step in 5′-to-3′ mRNA decay (removal of the 7-methylguanosine cap), thus regulating the stability of thousands of specific mRNA substrates. Genetic ablation or silencing of NBDY stabilizes a majority of DCP2 substrates, including transcripts encoding proteins involved in immune responses (a pathway previously reported to be regulated by DCP2), consistent with a requirement for NBDY in their effective decapping. At the same time, a number of DCP2 substrates are destabilized by NBDY ablation, suggesting that the microprotein may act as a specificity factor for recruitment of mRNA targets to the decapping complex. In particular, in the presence of NBDY, DCP2 substrate mRNAs with shorter 5′UTRs decay more rapidly, suggesting that NBDY may be required for efficient recognition of transcripts with short leader sequences by DCP2. While the molecular mechanism by which NBDY regulates the mRNA decapping complex is not yet known, mRNA decapping proteins have previously been reported to associate via SLIMs within disordered regions, and it is likely that NBDY participates in this network. NMR experiments indicated that NBDY is largely intrinsically disordered in solution, consistent with its ability to phase-separate in the presence of RNA to form liquid droplets in vitro. Within the intrinsically disordered NBDY sequence, two independent SLIMs interact with the WD40 domain of EDC4 and the EVH1 domain of DCP1A.
Mutagenesis experiments indicate that the interaction with EDC4 is the more important of the two for NBDY function. However, given the relatively low affinity of NBDY for EDC4 (KD ~ 1 µM), the interaction with DCP1A may, speculatively, increase the avidity of NBDY for the mRNA decapping complex and thereby retain it at interaction sites. Importantly, NBDY also partially localizes to and regulates phase-separated RNA granules termed P-bodies in cells, consistent with a role for intrinsically disordered microproteins in biological phase separation. NBDY is phosphorylated downstream of EGFR and cyclin-dependent kinase signaling, and this phosphorylation is required for dissociation of P-bodies, likely via electrostatic repulsion of negatively charged P-body components that promotes liquid-phase remixing and cell proliferation. Taken together, these findings indicate that NBDY’s intrinsic disorder enables its SLIM-mediated protein-protein interactions, phase separation and regulation of P-bodies, providing a well-defined example of the functional significance of intrinsic disorder in a microprotein.