3. NON-CANONICAL CODING RNA MODIFICATIONS: DISTRIBUTION,
DYNAMISM AND FUNCTION.
3.1. N6-Methyladenosine. N6-methyladenosine (m6A) is the most
abundant internal modification detected in mammalian mRNAs (0.2%–0.6%
of all adenosines) (Śledź and Jinek, 2016). Its abundance together with
the development of robust detection methods led to an intense research
interest, and nowadays, m6A is the best characterized RNA modification.
It consists of the addition of a methyl group at the nitrogen-6 position
of adenosine (Figure 1). The methyltransferase-like 3
(METTL3)–METTL14 heterodimer is involved in the methylation process,
where METTL3 is the catalytic subunit and METTL14 acts as the
RNA-binding scaffold for substrate recognition (Śledź and Jinek, 2016).
Another m6A writer protein is METTL16, a U6 snRNA m6A methyltransferase.
METTL16 is involved in the regulation of the cellular levels of
S-adenosylmethionine (SAM), the methyl donor for methylation, as well as
in the mRNA splicing process (Pendleton et al., 2017). Apart from
passive m6A demethylation of the transcriptome, this modification is
actively removed by the activity of the fat mass and obesity-associated
protein (FTO) (Jia et al., 2011) and AlkB homologue 5 (ALKBH5) (Zheng et
al., 2013) demethylases. FTO and ALKBH5 proteins are dioxygenases known
to demethylate N-methylated nucleic acids. m6A readers have also been
identified, including m6A-binding proteins belonging to the YTH family
(YTHDF and YTHDC proteins) (Xiao et al., 2016), IGF2BP proteins (Huang
et al., 2018), and some heterogeneous nuclear ribonucleoproteins (hnRNP)
(Alarcón et al., 2015a).
Generally, m6A deposition on mRNA occurs in a sequence-dependent
manner, mainly in the coding regions (CDS) and 3’ untranslated regions
(UTR) with a significant enrichment just upstream of the stop codon
(Dominissini et al., 2012; Meyer et al., 2012). Interestingly, it has
been described that trimethylation of histone H3 at Lys36 (H3K36me3)
influences m6A deposition at specific genomic sequences by recruiting
the METTL14 complex (Huang et al., 2019a). Chromatin immunoprecipitation
(ChIP)-sequencing studies demonstrated that approximately 70% of m6A
peaks overlapped with H3K36me3 sites (Huang et al., 2019a). Altogether,
the association between histone H3K36me3 and m6A RNA methylation adds a
new layer of complexity in the control of gene expression. Research
integrating epigenetic and epitranscriptomic signals to explain gene
control is expected in the near future.
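The overlap figure quoted above (roughly 70% of m6A peaks coinciding with H3K36me3 sites) is, at its core, an interval-intersection calculation. The following is a minimal sketch of that calculation, not the pipeline used by Huang et al. (2019a): the function name `fraction_overlapping`, the half-open coordinate convention, and the toy coordinates are all assumptions made for illustration.

```python
def overlaps(a, b):
    """True if half-open intervals a=(start, end) and b=(start, end) intersect."""
    return a[0] < b[1] and b[0] < a[1]


def fraction_overlapping(peaks, marks):
    """Fraction of `peaks` that overlap at least one interval in `marks`.

    Both arguments are lists of (start, end) tuples on the same chromosome.
    This is a simple quadratic scan, adequate for a small illustration.
    """
    if not peaks:
        raise ValueError("no peaks supplied")
    hits = sum(1 for peak in peaks if any(overlaps(peak, mark) for mark in marks))
    return hits / len(peaks)


if __name__ == "__main__":
    # Hypothetical coordinates, invented for this example only.
    m6a_peaks = [(100, 200), (500, 600), (900, 950), (1500, 1600)]
    h3k36me3_sites = [(150, 400), (580, 700), (940, 1000)]
    print(fraction_overlapping(m6a_peaks, h3k36me3_sites))
```

With these toy intervals three of the four peaks intersect a histone-mark interval. A real analysis would work per chromosome and per strand, and would typically use an indexed interval structure rather than a full scan.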
The wide range of readers could explain why m6A is involved in almost all
aspects of post-transcriptional gene regulation and the mRNA life cycle,
including mRNA stability, splicing and translation. For instance, the
m6A readers YTHDF1 and YTHDF2 control mRNA stability during
stem cell differentiation and modulate processes such as haematopoietic
stem and progenitor cell specification (Zhang et al., 2017a; Li et al.,
2018b), neural induction from induced pluripotent stem cells (Heck et
al., 2020), mammalian spermatogenesis (Hsu et al., 2017) or circadian
regulation of downstream genes involved in lipid metabolism (Zhong et
al., 2018). By recognizing m6A on pre-mRNA, YTHDC1, hnRNPC, hnRNPG, and
hnRNPA2B1 could also modulate mRNA splicing (Alarcón et al., 2015a; Liu
et al., 2015; Xiao et al., 2016). YTHDC1 could also mediate nuclear
export of processed RNAs to the cytoplasm (Roundtree et al., 2017b). In
addition to regulating RNA stability and splicing, m6A reader proteins,
including YTHDF1, YTHDF3, IGF2BP1/2/3 and YTHDC2, regulate RNA
translation and RNA decay (Shi et al., 2017; Huang et al.,
2018). Strikingly, the deposition of m6A in 3’ UTRs suggests that m6A
could be incorporated into specific miRNA target sequences to modulate
miRNA binding (Alarcón et al., 2015b). Conversely, it has
been recently described that microRNAs regulate m6A modification via a
sequence pairing mechanism and influence cell reprogramming to
pluripotency (Chen et al., 2015). This finding reinforces the crosstalk
between the epigenome and epitranscriptome in the control of gene
regulation.
3.2. N1-Methyladenosine. The N1-methyladenosine modification
(m1A), or the addition of a methyl group at the nitrogen-1 position of
adenosine (Figure 1), was described decades ago and occurs in all
classes of RNAs (Barbieri and Kouzarides, 2020). It is
predominant in tRNA and rRNA, but it was recently determined that it
also exists in mRNA (Boccaletto et al., 2018). Currently, very little is
known about its frequency, the key players involved in m1A
regulation and its consequences in mRNA. Although its frequency in
cytosolic mRNA is controversial, it is accepted that m1A is about ten
times less abundant than m6A (Dominissini et al., 2016; Safra et
al., 2017). The m1A modification maps uniquely to GC-rich 5’ UTR
positions in coding transcripts (Safra et al., 2017). Notably, unlike
m6A, m1A occurs on the Watson-Crick interface and carries a positive
charge at this position (Roundtree et al., 2017a). Alterations in
protein-RNA interactions and RNA secondary/tertiary structures could
therefore be expected. The role of the m1A modification is still being
elucidated; however, some recent studies described
a function in the initiation of mRNA translation (Dominissini et al.,
2016; Li et al., 2016b) by facilitating non-canonical binding of the
exon-exon junction complex at 5’ UTRs devoid of 5’ proximal introns
(Cenik et al., 2017). Its regulatory role is supported
by its high conservation in mouse and human cells (Cenik et al., 2017).
The only known m1A writer of cytosolic mRNA is the TRM6-TRM61 complex;
however, its activity also covers m1A in mitochondrial-encoded
transcripts (Li et al., 2017a; Safra et al., 2017). The m1A modification can
be removed from mRNA by ALKBH3, an m1A demethylase acting on both mRNA and tRNA
(Dominissini et al., 2016; Li et al., 2016b; Esteve-Puig et al., 2020).
The YTH protein family of m6A readers could also interpret the m1A signal.
Specifically, YTHDF1-3 and YTHDC1 were shown to bind directly to m1A in
mRNA in human cancer cells (Dai et al., 2018). New insights into the
functions of m1A in RNA biology are needed; so far, only a role in the
response to various types of cellular stress has been proposed
(Dominissini et al., 2016; Li et al., 2016b).
3.3. 5-Methylcytosine. Like DNA, all types of RNA molecules can
be methylated at carbon 5 of cytosine, giving rise to 5-methylcytosine
(m5C) (Figure 1), which serves diverse functions depending on the
RNA species (Trixl and Lusser, 2019). The abundance of m5C in mRNA is
under strong debate, and discrepancies stem from the technical
difficulty of mapping m5C transcriptome-wide, mainly due
to incomplete conversion of cytidine and m5C during bisulfite treatment.
It is estimated that about 62-70% of cytosine sites have low
methylation levels (<20% methylation), while 8-10% of the
sites are moderately or highly methylated (>40%
methylation) (Huang et al., 2019b). m5C modifications map
primarily to the CDS, although an enrichment has also been observed in
the 5’ UTR and 3’ UTR regions (Huang et al., 2019b).
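The methylation levels quoted above are per-site ratios derived from bisulfite sequencing, in which an unmethylated cytosine is converted and read as T while m5C is protected and still read as C. The sketch below shows how such a ratio could be computed and binned using the two thresholds given in the text (<20% and >40%). It is an illustrative assumption rather than the procedure of Huang et al. (2019b): the function names are hypothetical, and the label "intermediate" for the 20-40% band is introduced here because the text does not name that range.

```python
def methylation_level(c_reads, t_reads):
    """Per-site methylation level after bisulfite treatment.

    c_reads: reads still showing C at the site (protected, i.e. called methylated)
    t_reads: reads showing T at the site (converted, i.e. called unmethylated)
    """
    total = c_reads + t_reads
    if total == 0:
        raise ValueError("site has no read coverage")
    return c_reads / total


def classify_site(level):
    """Bin a methylation level using the thresholds quoted in the text."""
    if level < 0.20:
        return "low"
    if level > 0.40:
        return "moderate_or_high"
    # Assumption: the 20-40% band is not named in the text.
    return "intermediate"


if __name__ == "__main__":
    # Hypothetical read counts, invented for this example only.
    for c, t in [(3, 17), (6, 14), (12, 8)]:
        level = methylation_level(c, t)
        print(c, t, round(level, 2), classify_site(level))
```

Because incomplete conversion leaves unmethylated cytosines reading as C, it inflates `c_reads` and therefore the apparent level, which is one reason abundance estimates differ between studies.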
The writers of RNA m5C modifications in mammals include seven members of
the NOL1/NOP2/SUN domain (NSUN) family (NSUN1-7), and DNA
methyltransferase-like 2 (DNMT2). However, so far only NSUN2 has been
shown to methylate mRNA (Yang et al., 2017b). In this regard, only
overexpression/suppression of NSUN2 but not of any other NSUN enzyme,
affected overall m5C levels in mRNA from HeLa cells (Yang et al.,
2017b). To date, enzymes that remove m5C from RNA species have not
yet been identified.
As we are only beginning to uncover the biology of m5C in mRNA, not much
is known about the potential functional consequences. A role for m5C in
the regulation of nuclear export has been discovered (Yang et al.,
2017b). Specifically, the activity of the nuclear export factor
ALYREF/THOC4 is strongly affected by the m5C level of its target mRNAs
(Yang et al., 2017b). m5C deposition is not a random event, since m5C
accumulates at translational start codons and in a CG sequence context.
In addition, m5C can act as a modulator of protein translation. Examples
include m5C accumulation in the 5’ UTR of the cyclin-dependent kinase
inhibitor p27KIP1 during replicative senescence (Tang
et al., 2015), or m5C deposition in the 3’ UTRs of the cell cycle
regulators CDK1 and p21 during the cell division cycle (Xing et al.,
2015).
Physiologically, NSUN2 is involved in multiple biological pathways. It
has been identified as a direct target gene of the transcription factor
Myc, and its activation is relevant for the differentiation of primary
human keratinocytes (Frye and Watt, 2006). Mouse models with Nsun2
knockdown exhibit additional developmental defects, including
impaired cerebral cortex organization and an immature skeleton
(Tuorto et al., 2012). Nsun2 was also implicated in testis
differentiation (Hussain et al., 2013). The molecular mechanisms
connecting NSUN2 deficiencies and impaired cell differentiation have not
been identified.
3.4. Pseudouridine. Pseudouridylation is the isomerization of
the uridine base via breakage of the glycosidic bond, 180°
base-rotation, and bond reformation (Hamma and Ferré-D’Amaré, 2006)
(Figure 1). It is the most frequent modification in total human
RNA; however, the mapping of pseudouridine (ψ) in mRNAs has only recently
been addressed (Penzo et al., 2017). Methodological limitations introduce
serious controversy on the distribution and abundance of ψ, but the
general consensus is that ψ sites in mRNA are much less abundant than
m6A (Schwartz et al., 2014). Besides mRNAs, non-coding RNAs (ncRNAs)
have emerged as highly interesting targets with ψ sites (Rintala-Dempsey
and Kothe, 2017). The enzymology associated with pseudouridylation is
very complex. In eukaryotes, uridine is transformed into ψ by a class of
enzymes known as pseudouridylases. Pseudouridylases are represented in
humans by pseudouridine synthases (PUS) encoded by 13 genes. Human PUS
enzymes are far less studied than their counterparts in other organisms
but recent discoveries allow a better identification of PUS enzymes,
including those acting on mRNA (PUS1, PUS3, PUS4, PUS6, PUS7 and PUS9)
(Penzo et al., 2017). Their mode of action or potential redundancy in
their functions has not yet been completely resolved (Carlile et al.,
2014a; Penzo et al., 2017). Currently, no specific eraser or reader
associated with ψ modifications has been identified (Barbieri and
Kouzarides, 2020).
It is well known that ψ enhances the function of tRNA and rRNA by
stabilizing the RNA structure as well as regulating the splicing process
by modifying specific snRNAs (Carlile et al., 2014b; Barbieri and
Kouzarides, 2020). The physiological relevance of ψ in mRNA is less
clear, with only limited evidence of its role. Mutations in genes
encoding human PUS enzymes cause inherited diseases affecting muscle and
brain function, which reinforces their emerging role as regulators of
gene expression (Shaheen et al., 2019). Notably, ψ content in the 3’ UTRs
of mRNA is regulated in response to environmental signals, such as serum
starvation in human cells, suggesting a function in the flexible
adaptation of the genetic code through inducible mRNA modifications
(Carlile et al., 2014b). A role in mRNA translation through the
control of ribosome pausing and RNA localization has also been suggested
(Carlile et al., 2014b; Schwartz et al., 2014).
3.5. Adenosine-to-inosine editing. Another RNA modification in
mammals is the irreversible deamination of adenosine to inosine, a
process also known as A-to-I editing (Figure 1). A-to-I editing
occurs in multiple genomic sequences, ranging from coding regions of
mRNAs to non-coding regions (e.g., Alu repeats, pre-miRNAs or
pri-miRNAs) (Nishikura, 2016a). Inosine is interpreted at the cellular level
as guanosine and, consequently, A-to-I editing could alter the
biogenesis and/or function of miRNAs or mRNAs as well as proteins
(Nishikura, 2016b). However, a comparative study among animal A-to-I
modifications revealed that non-coding parts of the genome were the main
targets for the editing process. A role in protecting against activation
of innate immunity by self-transcripts has been proposed (Eisenberg and
Levanon, 2018). A second type of A-to-I editing is hyper-editing, which
could be understood as editing-enriched regions (Porath et
al., 2014). A large proportion of adenosines in close proximity to each
other within the same transcript is a requisite for hyper-editing. In
mammals, this class of editing is mostly associated with regions of
repetitive sequences, intronic regions and 3′ UTRs (Porath et al.,
2017).
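Because inosine is read as guanosine, an edit inside a coding region can change the encoded amino acid. The sketch below illustrates that consequence on a toy sequence: it marks an edited adenosine as I, substitutes G for I as the decoding machinery effectively does, and translates with the standard genetic code. The function names and the nine-nucleotide transcript are assumptions made for illustration and do not correspond to a specific edited gene.

```python
from itertools import product

# Standard genetic code, with codons enumerated in U, C, A, G order.
BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def edit_a_to_i(rna, positions):
    """Return `rna` with the adenosines at the given 0-based positions set to I."""
    seq = list(rna)
    for pos in positions:
        if seq[pos] != "A":
            raise ValueError(f"position {pos} is {seq[pos]}, not A")
        seq[pos] = "I"
    return "".join(seq)


def read_inosine_as_guanosine(rna):
    """Model decoding: every inosine is interpreted as guanosine."""
    return rna.replace("I", "G")


def translate(rna):
    """Translate complete codons from the first base; '*' marks a stop codon."""
    usable = len(rna) - len(rna) % 3
    return "".join(CODON_TABLE[rna[i:i + 3]] for i in range(0, usable, 3))


if __name__ == "__main__":
    transcript = "AUGCAGUAA"  # hypothetical toy transcript
    edited = edit_a_to_i(transcript, [4])
    print(translate(transcript), translate(read_inosine_as_guanosine(edited)))
```

In this toy case, editing the middle adenosine of the CAG codon makes it read as CGG, so glutamine is replaced by arginine. Edits in non-coding regions, which the text identifies as the main targets, would leave the protein sequence unchanged.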
A-to-I editing is catalysed by the adenosine deaminase acting on
dsRNA (ADAR) family of proteins. ADAR1 and ADAR2 are the catalytically
active proteins, whereas ADAR3 lacks editing activity and may act as a
negative regulator of ADAR1 and ADAR2 activity (Nishikura, 2016b). Both
ADAR1 and ADAR2 proteins have essential roles in cellular
differentiation. In mammals, ADAR1 is widely expressed, especially in
the myeloid component of the blood system, and plays a prominent role in
promiscuous editing of long dsRNA (Zipeto et al., 2016). Additional
studies indicate that ADAR1 forms a complex with Dicer to promote miRNA
processing (Ota et al., 2013). ADAR2 is more highly expressed in the brain
and is primarily required for site-specific editing of key transcripts
for central nervous system development (Behm et al., 2017). A role for
ADAR2 in the control of the circadian clock has been revealed (Terajima
et al., 2017).