MINIREVIEW
The RNA recognition motif, a plastic RNA-binding platform
to regulate post-transcriptional gene expression
Christophe Maris*, Cyril Dominguez* and Frédéric H.-T. Allain
Institute for Molecular Biology and Biophysics, Swiss Federal Institute of Technology Zurich, ETH-Hönggerberg, Zürich, Switzerland
History – what deﬁnes an RRM?
The RNA recognition motif (RRM), also known as
the RNA-binding domain (RBD) or ribonucleopro-
tein domain (RNP), was ﬁrst identiﬁed in the late
1980s when it was demonstrated that mRNA precur-
sors (pre-mRNA) and heterogeneous nuclear RNAs
(hnRNAs) are always found in complex with proteins
(reviewed in [1]). Biochemical characterizations of the
mRNA polyadenylate binding protein (PABP) and
the hnRNP protein C shed light on a consensus
RNA-binding domain of approximately 90 amino
acids containing a central sequence of eight con-
served residues that are mainly aromatic and posi-
tively charged [2,3]. This sequence, termed the RNP
consensus sequence, was thought to be involved in
RNA interaction and was deﬁned as Lys ⁄ Arg-
Gly-Phe ⁄ Tyr-Gly ⁄ Ala-Phe ⁄ Tyr-Val ⁄ Ile ⁄ Leu-X-Phe ⁄ Tyr,
where X can be any amino acid. Later, a second

and classiﬁed the different structural elements of the RRM that are import-
ant for binding a multitude of RNA sequences and proteins. Common
structural aspects were extracted that allowed us to deﬁne a structural leit-
motif of the RRM–nucleic acid interface with its variations. Outside of the
two conserved RNP motifs that lie in the center of the RRM β-sheet, the
two external β-strands, the loops, the C- and N-termini, or even a second
RRM domain allow high RNA-binding afﬁnity and speciﬁc recognition.
Protein–RRM interactions that have been found in several structures rein-
force the notion of an extreme structural versatility of this domain support-
ing the numerous biological functions of the RRM-containing proteins.
Abbreviations
ACF, APOBEC-1 complementary factor; CBP, cap binding protein; CstF, cleavage stimulation factor; hnRNP, heterogeneous nuclear
ribonucleoprotein; HuD, Hu protein D; LRR, leucine rich repeat; MIF4G, middle domain of the translation initiation factor 4 G; PABP,
polyadenylate binding protein; PIE, polyadenylation inhibition element; PTB, polypyrimidine tract binding protein; RBD, RNA-binding domain;
RNP, ribonucleoprotein; RRM, RNA recognition motif; SR, serine/arginine rich proteins; TLS, translocated in liposarcoma; U1A, U2A′, U2B″:
U1 snRNP proteins A, A′, B″; U2AF, U2 snRNP auxiliary factor; UHM, U2AF homology motif; UPF, up-frameshift protein.
2118 FEBS Journal 272 (2005) 2118–2131 © 2005 FEBS
was deﬁned as Ile ⁄ Val ⁄ Leu-Phe⁄ Tyr-Ile ⁄ Val ⁄ Leu-X-
Asn-Leu. The first consensus sequence was therefore
referred to as RNP 1 and the second as RNP 2 (Fig. 1).
It was then shown that this protein domain was
necessary and sufﬁcient for binding RNA molecules
with a wide range of speciﬁcities and afﬁnities
(reviewed in [4–6]).
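As an illustration, the RNP 1 and RNP 2 consensus sequences given above can be transcribed into regular expressions over one-letter amino acid codes. The sketch below is only a strict, literal reading of the two consensus definitions. The function name `find_rnp_motifs` is hypothetical, and the two test fragments are short stretches copied from the hnRNP A1 and sex-lethal rows of the Fig. 1 alignment. Many genuine RRMs deviate from the consensus at one or more positions, so the absence of a match does not rule out an RRM.

```python
import re

# RNP 1: Lys/Arg-Gly-Phe/Tyr-Gly/Ala-Phe/Tyr-Val/Ile/Leu-X-Phe/Tyr
RNP1 = re.compile(r"[KR]G[FY][GA][FY][VIL].[FY]")
# RNP 2: Ile/Val/Leu-Phe/Tyr-Ile/Val/Leu-X-Asn-Leu
RNP2 = re.compile(r"[IVL][FY][IVL].NL")


def find_rnp_motifs(seq):
    """Return (motif name, 0-based start, matched text) for every strict
    consensus match in a one-letter protein sequence, ordered by position."""
    hits = []
    for name, pattern in (("RNP2", RNP2), ("RNP1", RNP1)):
        for match in pattern.finditer(seq):
            hits.append((name, match.start(), match.group()))
    return sorted(hits, key=lambda hit: hit[1])


# Fragments taken from the Fig. 1 alignment rows (illustrative only).
hnrnp_a1_fragment = "KRSRGFGFVTYATVEEV"   # hnRNP A1 (1UP1) row
sex_lethal_fragment = "NLYVTNLPRTITDDQLD"  # SXL (1SXL) row

for fragment in (hnrnp_a1_fragment, sex_lethal_fragment):
    print(fragment, find_rnp_motifs(fragment))
```

A strict pattern such as this is mainly useful as a first-pass filter; profile-based searches are better suited to detecting divergent RRMs.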
Here we review the structural properties of the
RRM domain in its isolated form and in complex with
RNAs and ⁄ or proteins. This review shows how such a
simple domain can modulate its fold to recognize
many RNAs and proteins in order to achieve a multi-
tude of biological functions often associated with post-

ded RNA [9,10]. The PABP and the WW domains [11]
are protein–protein interaction domains involved in
translation [12,13] and pre-spliceosome formation,
respectively [14]. By association with different types of
protein domains, the RRM domain can modulate its
[Fig. 1 alignment header: RNP 2 and RNP 1 motifs; secondary structure elements β1, α1, β2, β3, α2, β4; loops L1–L5; alignment positions 10–80]
PTB (1SJQ) 60 VIHIRKLPIDVTEGEVISLGLP FGKVTNL LMLKG KNQAFIEMNTEEAANTMVNYYTSVTPVLRGQPIYIQ 147
PTB (1SRJ) 183 RIIVENLFYPVTLDVLH-QIFSK FGTVLKI ITFTKNN QFQALLQYADPVSAQHAKLSLDGQNIYNACCTLRID 282
PTB (1QM9) 338 VLLVSNLNPERVTPQSLFILFGV YGDVQRV KILFNK KENALVQMADGNQAQLAMSHLNGHKLH GKPIRIT 407
PTB (1QM9) 455 TLHLSNIPPSVSEEDLK-VLFSS NGGVVKG FKFFQKD RKMALIQMGSVEEAVQALIDLHNHDLG-ENHHLRVS 531
Cstf-64 (1P1T) 17 SVFVGNIPYEATEEQLK-DIFSE VGPVVSF RLVYDRETGKPKGYGFCEYQDQETALSAMRNLNGREFS GRALRVD 90
LA (1OWX) 244 LKFSGDLDDQTCREDLHILFSNH GEIK WIDFVRGA KEGIILFKEKAKEALGKAKDANNGNLQLRNKEVTWEV 305
TAP (1FO1) 121 KITIPYGRKYDK-AWLLSMIQSKCSVPFTPIEFHYENTRAQFFVEDASTASALKAVNYKILDRENRRISIIINSSAP PHS 290
ALY (1NO8) 106 KLLVSNLDFGVSDADIQ-ELFAE FGTLKKA AVHYDRSGR-SLGTADVHFERKADALKAMKQYNGVPLD GRPMNIQ 178
hnRNP A1 (1UP1) 15 KLFIGGLSFETTDESLR-SHFEQ WGTLTDC VVMRDPNTKRSRGFGFVTYATVEEVDAAMNARP-HKVD GRVVEPK 87
hnRNP A1 (1HA1) 105 KIFVGGIKEDTEEHHLR-DYFEQ YGKIEVI EIMTDRGSGKKRGFAFVTFDDHDSVDKIVIQKY-HTVN GHNCEVR 177
HUD (1FXL) 47 NLIVNYLPQNMTQEEFR-SLFGS IGEIESC KLVRDKITGQSLGYGFVNYIDPKDAEKAINTLNGLRLQ TKTIKV 119
HUD (1FXL) 133 NLYVSGLPKTMTQKELE-QLFSQ YGRIITS RILVDQVTGVSRGVGFIRFDKRIEAEEAIKGLNGQKPSGATEPITVK 206
SXL (2SXL) 126 NLIVNYLPQDMTDRELY-ALFRA IGPINTC RIMRDYKTGYSYGYAFVDFTSEMDSQRAIKVLNGITVR NKRLKV 199
SXL (1SXL) 212 NLYVTNLPRTITDDQLD-TIFGK YGSIVQK NILRDKLTGRPRGVAFVRYNKREEAQEAISALNNVIPEGGSQPLSVR 290
PABP (1CVJ) 12 SLYVGDLHPDVTEAMLY-EKFSP AGPILSI RVCRDMITRRSLGYAYVNFQQPADAERALDTMNFDVIK GKPVRI 84
PABP (1CVJ) 99 NIFIKNLDKSIDNKALYDTFSAF GNILSCK VVCDENGSKGYGFVHFETQEAAERAIEKMNGMLLNDRKVFVGRFKS 175
Nucleolin (1FJE) 309 NLFIGNLNPNKSVAELKVAISEL FAKND LAVVDVRTGTNRKFGYVDFESAEDLEKAL-ELTGLKVF GNEIKLE 380
Nucleolin (1FJE) 396 LLAKNLSFNITEDELKEVFEDAL EIRLVSQ DGKSKGIAYIEFKS EADAEKNLEEKQGAEID GRSVSLY 463
U1A (1DZ5) 11 TIYINNLNEKIKKDELKKSLYAI FSQFGQI LDILVSRSLKMRGQAFVIFKEVSSATNALRSMQGFPFY DKPMRIQ 85
U2B" (1A9N) 8 TIYINNMNDKIKKEELKRSLYAL FSQFGHV VDIVALKTMKMRGQAFVIFKELGSSTNALRQLQGFPFY GKPMRI 81

plants, RRM proteins are present in chloroplasts and
are involved in 3′ end processing of chloroplast mRNA
[15]. They have also been discovered in plant mito-
chondria. Their functions, however, remain unclear
[16]. Similarly, their roles in bacteria and viruses are
still unknown. The numerous three-dimensional struc-
tures of the RRM in isolation, and in complex with
RNA or other proteins, shed light on the function of
RRM proteins, as shown below.
The structure of the RRM, a βαββαβ fold
with some variations and extensions
The RRM folds into an αβ sandwich structure with a
β1α1β2β3α2β4 topology (Figs 1 and 2) as demonstrated
by the first structure of an RNA recognition motif,
the N-terminal RRM of U1A [17]. The fold is composed
of one four-stranded antiparallel β-sheet spatially
arranged in the order β4

The loops between the secondary structure elements
(loops 1–5 as indicated in Figs 1 and 2) can have
different lengths and are often disordered in the free
form. An exception to this is loop 5 that often forms
a small two-stranded β-sheet (β3′ and β3″) (Fig. 2).
The N- and C-terminal regions, outside the RRM,
are usually poorly ordered in the isolated domains
with a few exceptions where they can adopt a secon-
dary structure (Fig. 2, PTB-RRM 3, La C-terminal
RRM and CstF-64). In the structures of La C-ter-
minal RRM [20], U1A N-terminal RRM [21] and
CstF-64 RRM [22], the C-terminus forms an α-helix
that lies on the β-sheet surface, while in PTB-
RRM 2 and 3 it extends the size of the β-sheet by
forming an extra β-strand (β5) antiparallel to β2
[23,24]. CstF-64 RRM also has an additional short
α-helix in its N-terminal region (Fig. 2) [22]. Finally,
secondary structure elements of the domain can be
modified; for example, α-helix 1 in the U2AF35 RRM
is three times longer than in a canonical RRM
(Fig. 2). This unusual helix 1 is involved in protein–
protein interactions [25] (see the RRM–protein com-
plexes section).

rings located on β1 (Phe108, RNP 2 position 2) and β3
(Phe150, RNP 1 position 5) strands, respectively
(Fig. 3A). The contacts with these two RNP positions
result in a characteristic arrangement of the nucleic
acid strand on the β-sheet surface in which the 5′ end
is located on the first half of the β-sheet (β4β1) and
the 3′ end on the second half (β3β2) (Fig. 3B). A third
aromatic residue located on β3 (Phe148, RNP 1
position 3) interacts hydrophobically with the sugar
rings of A209 and G210. Finally, a positively charged
side chain (Arg146, RNP 1 position 1) forms a salt
bridge with the phosphate between A209 and G210.
This small set of RRM–nucleic acid interactions, in
the center of the domain, involving four conserved
protein side chains of the RRM consensus sequence
and two nucleotides, illustrates the perfect adaptation
of the RRM for effectively binding single-stranded

each RNP sequence numbering. The con-
served aromatic residues are highlighted by
green circles [34]. (B) Structural arrangement
of the DNA strand on the β-sheet of
hnRNPA1–RRM 2. (C) Hydrogen bond and
van der Waals interaction network conferring
base-binding specificity (hnRNPA1–
RRM 2 complex). This figure was generated
with the program MOLMOL [56].
C. Maris et al. The RRM domain, a plastic RNA-binding platform
interactions involving RNP 2 position 2 (always pre-
sent except in nucleolin RRM 2 [37]) and RNP 1 posi-
tion 5 (always present except in CBP 20 [36]). The
contacts between the sugars and RNP 1 position 3 are
present in ﬁve RRM–RNA complexes (CBP20, PABP
RRM 1, nucleolin RRM 1 and RRM 2 and sex-lethal
RRM 1). The RNP 1 position 1 residue does not
necessarily interact with the phosphate between the
dinucleotide because in all structures apart from
hnRNPA1 it contacts an RNA base or a phosphate
oxygen of other nucleotides. Also, the RRM inter-
actions with the sugar–phosphate backbone are fairly
Fig. 4. The RRM domain, a highly plastic platform for nucleic acid binding. (A) Nucleolin RRM 2-sNRE complex [28]. (B) Sex-lethal
RRM 1–polyU–Tra mRNA [31]. (C) Sex-lethal RRM 2–Tra mRNA precursor complex [31]. (D) hnRNPA1 RRM 1–telomeric DNA complex [34].
(E) Poly(A)-binding protein RRM 1–polyadenylate RNA complex [33]. (F) Heterodimeric nuclear cap binding complex 5′ capped polymerase II
transcripts [36]. In all ﬁgures, the RNA is shown in yellow and the protein side chain in green. The ribbon of the RRM is shown in grey. The

A highly plastic domain to achieve high
RNA-binding afﬁnity and speciﬁcity
Many RRMs bind RNA with high affinity (in the nM
range) and high sequence-specificity, in particular all
those whose structures have been determined to date.
Nevertheless, sequence-specificity does not necessarily
imply high affinity; for example, PTB specifically recognizes
pyrimidine tracts but does not provide sufficient
binding enthalpy to reach nM affinity (F. C. Oberstrass,
S. D. Auweter and F. H.-T. Allain, unpublished
data). To achieve higher affinity, some RRM proteins
use the two external β4 and β2 strands, while others
use loops 1, 3 or 5, or the C- and N-termini [39].
In many proteins, multiple RRMs associate to bind
longer nucleotide stretches. In these cases, the interdo-
main linker is an essential component of RNA recogni-
tion. In addition, the RNA secondary structure can be
an important determinant of the protein binding afﬁn-
ity. All of these aspects are presented in detail below.
Role of the two external β-strands and the loops
The β-sheet surface of an RRM can be modulated by
using only one or up to four β-strands for RNA binding.
Figure 4 clearly illustrates that the β-sheet surface
is not used to the same extent in each RRM–nucleic
acid complex. Exceptionally, in hnRNPA1 RRM 1,
each β-strand binds one nucleotide, the DNA being

hnRNPA1 RRM 1 [34] (Fig. 4D). The C-terminus of
hnRNPA1 RRM 1 is particularly interesting because it
is unstructured in the free form and becomes ordered
upon DNA binding, forming a 3₁₀ helix. This structural
rearrangement reinforces the concept of binding by
induced ﬁt, initially proposed with the structure of the
U1A–RNA complex [27]. Side chain residues of this
helix, His101 and Arg92, stack over A203 and G204,
respectively (Fig. 4D) [34].
The C-terminus can also contribute to differentiating
RNA from DNA by interacting with the 2′OH group
of the sugar ring as shown in Fig. 4B,E. The hydroxyl
group can act as a hydrogen bond acceptor interacting
with protein side chains (Fig. 4E, Arg94; Fig. 4B
Arg202) as well as with the backbone amide (Fig. 4B,
Gly205) and ⁄ or as a hydrogen bond donor interacting
with the carbonyl oxygen of the protein backbone [38].
Other parts of the RRM domain, such as the β2-strand
and the loops, also interact with the 2′OHs and help
to discriminate RNA from DNA [26,31,33,35].
The C-terminal region does not always enhance, but
can also inhibit RNA binding as shown in the struc-
ture of CBP20 [36] (Fig. 4F). Two residues (Asn116
and Arg123) of the C-terminus form a salt bridge
located above the RNP 1 residue at position 5 (Phe85)
preventing any RNA binding at this key position.

tide sequence, AUUGCA, as U1A but within a different
stem loop (U2 snRNA hairpin IV) and only when
in complex with U2A′ (Fig. 5B). The adaptability of
the RRM domain is further illustrated here, as the key
residue Arg52 still interacts with the RNA stem
although the closing base pair is a UU base pair in
U2 snRNA SLIV instead of a GC in U1 snRNA SLII.
While both U1A and U2B″ recognize the bases at
the top of the stem through numerous hydrogen
bonds, nucleolin contacts the nucleolin recognition ele-
ment (sNRE) RNA stem essentially by van der Waals
interactions [28] (Fig. 5C). The two RRMs of nucleolin
sandwich the seven nucleotide loop and RRM 1 and
its C-terminal part recognize the unusual loop E struc-
ture [28]. The substitution of the loop E by two GC
base pairs separated by a bulge increases the dissoci-
ation constant more than 100-fold (from 5 nM to
0.8 µM) [30] and, as shown in Fig. 5D, this substitution
abolishes all van der Waals interactions (only one
hydrogen bond from Lys95 is retained). The double-
stranded stem is important for two reasons: ﬁrst, it
restricts the conformation of the RNA loop and redu-
ces the entropy loss accompanying protein binding;
and second, some structural features of the RNA such
as the base pair (U1A and U2B″) or loop E (nucleolin)
that closes the RNA loop, are crucial for positioning
the RRM onto the RNA. It was postulated that the
RNA structure is essential because it induces conform-
ational changes in order to reach the bound state
[27,40].
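The loss of affinity caused by the loop E substitution discussed above can be put in energetic terms with the standard relation ΔΔG = RT ln(Kd,substituted / Kd,wild-type). The following sketch converts the two dissociation constants quoted in the text (5 nM and 0.8 µM) into a fold change and an approximate loss of binding free energy. The temperature of 298.15 K is an assumption, since the measurement conditions are not given here, and the function names are hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def fold_change(kd_ref_molar, kd_mut_molar):
    """Ratio of the two dissociation constants (mutant over reference)."""
    return kd_mut_molar / kd_ref_molar


def ddg_binding_kcal(kd_ref_molar, kd_mut_molar, temperature_k=298.15):
    """Change in binding free energy, RT * ln(Kd_mut / Kd_ref), in kcal/mol.

    A positive value means weaker binding for the mutant.
    The default temperature is an assumed value, not taken from the text.
    """
    return R_KCAL * temperature_k * math.log(kd_mut_molar / kd_ref_molar)


kd_wild_type = 5e-9      # 5 nM, sNRE with the loop E motif
kd_substituted = 0.8e-6  # 0.8 µM, loop E replaced by two GC pairs and a bulge

print(f"fold change: {fold_change(kd_wild_type, kd_substituted):.0f}")
print(f"ddG: {ddg_binding_kcal(kd_wild_type, kd_substituted):.2f} kcal/mol")
```

Under these assumptions the 160-fold increase in the dissociation constant corresponds to roughly 3 kcal/mol of lost binding free energy, a value comparable to the contribution of a few hydrogen bonds or van der Waals contacts.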

(S. D. Auweter and F. H.-T. Allain, unpublished
data). Thus, recognition of a longer single-stranded
DNA or RNA requires more than one RRM to form
a larger binding platform. Four structures of two con-
secutive RRMs in complex with RNA (sex-lethal [31],
HuD [35], PABP [33] and nucleolin [28,30]) and one
with DNA (hnRNPA1 [34]) have been determined. In
all ﬁve cases, the two RRMs and the interdomain lin-
ker cooperatively bind RNA providing high afﬁnity
and speciﬁcity. In the free forms of sex-lethal and
nucleolin, the linkers are disordered and the two RRM
domains tumble independently [37,41]. In some cases
(PABP, nucleolin), the interdomain linker (that is the
C-terminal region of the N-terminal RRM as described
above) acts as a bridge, mediating the cooperative
binding of two RRM domains with the RNA. More
interesting is the range of new possible conformations
provided by the association of two RRMs (Fig. 6). In
PABP, a large binding platform is created for the
RNA; in sex-lethal and HuD, the two RRMs form a
cleft in which the RNA lies; and in nucleolin the RNA
is sandwiched between the RRMs. As a consequence
of the relative arrangement of the two domains in sex-
lethal, HuD and nucleolin, several intra-RNA inter-
actions are created upon RNA binding that contribute
to the overall enthalpy of the complex, while in PABP
almost no intra-RNA interactions are present. On the
contrary, hnRNPA1 RRMs 1–2 and PTB RRMs 3–4
(F. C. Oberstrass, S. D. Auweter and F. H.-T. Allain,
unpublished results) are arranged in such a way that

Fig. 6. The RRM–RRM interactions. Several
protein structures either free or in a com-
plex in which two RRM domains interact
are shown. Structures of (A) UP1 in the free
form [53] (pdb:1 lp1), (B) nucleolin in com-
plex with RNA [28] (pdb:1fje), (C) sex-lethal
in complex with RNA [31] (pdb:1b7f),
(D) PABP in complex with RNA [33]
(pdb:1cvj), and (E) U1A homodimer in com-
plex with RNA [29] (pdb:1dz5). The RNA
backbone is shown in yellow (A–E), the
N-terminal RRM domain is displayed green,
C-terminal domain blue, and linker region
red. (F) One monomer of U1A is displayed
green and the other blue. In all cases,
important residues for the protein–protein
interaction are displayed as balls and sticks.
This ﬁgure was generated using the pro-
grams
MOLSCRIPT and RASTER3D [57,58].

and the two RRM domains are independent [28,41].
However, upon RNA binding, the two RRM domains
adopt a ﬁxed orientation and contact each other. In
the nucleolin structure, the RRMs interact via two salt
bridges located in the loops (Fig. 6B) and in the struc-
ture of hnRNPA1, the RRMs interact by salt bridges
located in the α2-helix. Other examples of RNA inducing
RRM–RRM interactions have also been described
in the case of sex-lethal [31], PABP [33], and HuD
[35]. In sex-lethal and HuD, the interdomain inter-
action is mainly governed by two hydrogen bonds
between residues located in β1 and β4 of RRM 1 and
in β2 of RRM 2 (Fig. 6C). Furthermore, additional
contacts between RRM 2 and the linker region are
observed. In the case of PABP, the interdomain inter-
actions are mediated through many salt bridges and
van der Waals contacts between α2 and β4 of RRM 1
and β

Both U2B″ and CBP20 need a cofactor, U2A′ and
CBP80, respectively, to recognize RNA. Ternary
structures of these complexes have been solved that
partially explain the importance of a cofactor in
RNA–RRM binding [32,43–45]. U2A′ consists of five
consecutive leucine-rich repeats, and CBP80 of three
helical hairpin repeats very similar to the fold of the
middle domain of the translation initiation factor 4G
(MIF4G) domain. In both cases, the RRM domains of
U2B″ and CBP20 interact with the leucine rich repeat
(LRR) motif or the MIF4G domain through their
α-helices and loop 4, keeping the β-sheet accessible for
RNA-binding (Fig. 7). The interactions, however, are
different as they are governed mainly by hydrophobic
contacts in the U2B″–U2A′ complex, and salt bridges
and hydrogen bonds in the CBP20–CBP80 complex.
Furthermore, in the case of CBP20, the N- and C-terminal
extensions flanking the RRM domain become
structured only when in complex with both RNA and
CBP80. As for RRM–RRM interactions, these RRM–
protein interactions contribute to RNA-binding specificity,
U2A′ contacting the RNA and CBP80 stabilizing
both the N- and C-termini of CBP20 RRM, two key
components of CBP20–RNA recognition (Fig. 4) [44].
RRM domains involved only in protein
recognition
Some proteins containing RRM domains are involved
in protein–protein but not in protein–RNA interactions.

[46–48] (Fig. 8). This particular complex formation of
the RRM neatly explains why some RRM domains
do not have RNA-binding activities. Similarly, in the
structure of the UPF2–UPF3 complex involved in
nonsense-mediated mRNA decay, the β-sheet of the
N-terminal RRM domain of UPF3 binds UPF2 [50].
Although the two RRM proteins both interact through
their β-sheet, their interacting proteins, Magoh and
UPF2, adopt completely different folds. UPF2 has a
totally α-helical MIF4G fold very similar to CBP80,
while Magoh has an αβ fold (Fig. 8). Also striking is
the fact that both UPF2 and CBP80 adopt a MIF4G
fold, but recognize RRM in a totally different manner,
UPF2 recognizing the RRM β-sheet and CBP80 the
RRM α-helices.
The structures of the splicing factors U2AF35–
U2AF65 and U2AF65–SF1 are another example of
the diversity encountered in protein–RRM recognition.
U2AF65 contains three RRM domains, the two

in complex with the N-ter-
minal domain of SF1 have been solved [46,51].
Surprisingly, in this case, the β-sheet of the RRM
domain is not implicated in protein interaction as for
other non-RNA-binding RRM domains; instead, the
interaction involves the two α-helices. Analysis of the
RRM fold in these two structures shows striking
differences from the canonical RRM domains, mainly
consisting of a longer helix α1 (Fig. 2) and the absence
of aromatic
residues in the RNP 1 and 2 motifs. The authors
therefore proposed a novel class of protein recogni-
tion motif that they named U2AF homology motif
(UHM) [25].
The examples described above deﬁne a novel class
of RRM domains that are involved in protein but
not RNA interactions, suggesting that RRM
domains might have evolved from RNA to protein
recognition. Although these RRM proteins do not
bind RNA, they are all implicated in RNA-related
functions such as recognition of the exon junction
(Y14), mRNA decay (UPF3) or pre-mRNA splicing
(U2AF35 and U2AF65). This evolutionary process
can be accompanied by amino acid substitutions in
the RNA-binding regions, namely RNP 1 and 2, as

each complex and the sequence-speciﬁcity cannot eas-
ily be predicted. Thus, more structures of RRM–
RNA complexes are needed to fully understand the
determinants of this speciﬁcity. Second, RRM
domains are able to bind RNA with afﬁnities ran-
ging from very high to weak, and the structural and
thermodynamic determinants of the RNA-binding
afﬁnity still need to be elucidated. Third, as it is
now demonstrated that some RRM domains are spe-
ciﬁc to protein recognition rather than RNA binding,
which of the identiﬁed RRM domains are true
RNA-binding domains and which ones are not? In
some cases, the primary sequence can differentiate
between these behaviors, as for the novel UHM
domain, but in other cases, such as Y14 and UPF3,
structural determinants other than the amino acid
sequence must be present but are still unknown and
need to be identiﬁed. Fourth, it is established that a
high number of proteins contain both RRM and
auxiliary domains, such as zinc ﬁngers, also involved
in nucleic acid binding. No structural studies, how-
ever, indicate if these two RNA-binding domains
within the same protein inﬂuence each other for
RNA binding. Finally, it has recently been discov-
ered that the RRM domain, for a long time thought
to belong exclusively to the eukaryotic world, is also
present in bacteria, viruses and mitochondria. From
an evolutionary point of view, it would be very
interesting to investigate the function of this domain
in such organisms and maybe discover their common

7, 1731–1739.
4 Bandziulis RJ, Swanson MS & Dreyfuss G (1989)
RNA-binding proteins as developmental regulators.
Genes Dev 3, 431–437.
5 Kenan DJ, Query CC & Keene JD (1991) RNA recog-
nition: towards identifying determinants of speciﬁcity.
Trends Biochem Sci 16, 214–220.
6 Birney E, Kumar S & Krainer AR (1993) Analysis of
the RNA-recognition motif and RS and RGG domains:
conservation in metazoan pre-mRNA splicing factors.
Nucleic Acids Res 21, 5803–5816.
7 Maruyama K, Sato N & Ohta N (1999) Conservation
of structure and cold-regulation of RNA-binding pro-
teins in cyanobacteria: probable convergent evolution
with eukaryotic glycine-rich RNA-binding proteins.
Nucleic Acids Res 27, 2029–2036.
8 Bateman A, Birney E, Cerruti L, Durbin R, Etwiller L,
Eddy SR, Grifﬁths-Jones S, Howe KL, Marshall M &
Sonnhammer EL (2002) The Pfam protein families data-
base. Nucleic Acids Res 30, 276–280.
9 Hudson BP, Martinez-Yamout MA, Dyson HJ &
Wright PE (2004) Recognition of the mRNA AU-rich
element by the zinc ﬁnger domain of TIS11d. Nat Struct
Mol Biol 11, 257–264.
10 De Guzman RN, Wu ZR, Stalling CC, Pappalardo L,
Borer PN & Summers MF (1998) Structure of the HIV-
1 nucleocapsid protein bound to the SL3 psi-RNA
recognition element. Science 279, 384–388.
11 Sudol M, Sliwa K & Russo T (2001) Functions of WW
domains in the nucleus. FEBS Lett 490, 190–195.

of the nuclear factor ALY: insights into post-transcrip-
tional regulatory and mRNA nuclear export processes.
Biochemistry 42, 7348–7357.
20 Jacks A, Babon J, Kelly G, Manolaridis I, Cary PD,
Curry S & Conte MR (2003) Structure of the C-term-
inal domain of human La protein reveals a novel RNA
recognition motif coupled to a helical nuclear retention
element. Structure (Camb) 11, 833–843.
21 Avis JM, Allain FH, Howe PW, Varani G, Nagai K &
Neuhaus D (1996) Solution structure of the N-terminal
RNP domain of U1A protein: the role of C-terminal
residues in structure stability and RNA binding. J Mol
Biol 257, 398–411.
22 Perez Canadillas JM & Varani G (2003) Recognition of
GU-rich polyadenylation regulatory elements by human
CstF-64 protein. EMBO J 22, 2821–2830.
23 Conte MR, Grune T, Ghuman J, Kelly G, Ladas A,
Matthews S & Curry S (2000) Structure of tandem
RNA recognition motifs from polypyrimidine tract
binding protein reveals novel features of the RRM fold.
EMBO J 19, 3132–3141.
24 Simpson PJ, Monie TP, Szendroi A, Davydova N, Tyz-
ack JK, Conte MR, Read CM, Cary PD, Svergun DI,
Konarev PV, Curry S & Matthews S (2004) Structure
and RNA interactions of the N-terminal RRM domains
of PTB. Structure (Camb) 12, 1631–1643.
25 Kielkopf CL, Lucke S & Green MR (2004) U2AF
homology motifs: protein recognition in the RRM
world. Genes Dev 18, 1513–1526.
26 Oubridge C, Ito N, Evans PR, Teo CH & Nagai K

645–650.
33 Deo RC, Bonanno JB, Sonenberg N & Burley SK
(1999) Recognition of polyadenylate RNA by the
poly(A)-binding protein. Cell 98, 835–845.
34 Ding J, Hayashi MK, Zhang Y, Manche L, Krainer
AR & Xu RM (1999) Crystal structure of the two-
RRM domain of hnRNP A1 (UP1) complexed with
single-stranded telomeric DNA. Genes Dev 13, 1102–
1115.
35 Wang X & Tanaka Hall TM (2001) Structural basis for
recognition of AU-rich element RNA by the HuD pro-
tein. Nat Struct Biol 8, 141–145.
36 Mazza C, Segref A, Mattaj IW & Cusack S (2002)
Large-scale induced fit recognition of an m(7)GpppG
cap analogue by the human nuclear cap-binding com-
plex. EMBO J 21, 5548–5557.
37 Allain FH, Gilbert DE, Bouvet P & Feigon J (2000)
Solution structure of the two N-terminal RNA-binding
domains of nucleolin and NMR study of the interaction
with its RNA target. J Mol Biol 303, 227–241.
38 Allers J & Shamoo Y (2001) Structure-based analysis of
protein–RNA interactions using the program ENTAN-
GLE. J Mol Biol 311, 75–86.
39 Varani G & Nagai K (1998) RNA recognition by RNP
proteins during RNA processing. Annu Rev Biophys
Biomol Struct 27, 407–445.
40 Showalter SA & Hall KB (2004) Altering the RNA-
binding mode of the U1A RBD1 protein. J Mol Biol
335, 465–480.
41 Crowder SM, Kanaar R, Rio DC & Alber T (1999)

49 Bono F, Ebert J, Unterholzner L, Guttler T, Izaurralde
E & Conti E (2004) Molecular insights into the interaction
of PYM with the Mago-Y14 core of the exon junction
complex. EMBO Rep 5, 304–310.
50 Kadlec J, Izaurralde E & Cusack S (2004) The struc-
tural basis for the interaction between nonsense-
mediated mRNA decay factors UPF2 and UPF3. Nat
Struct Mol Biol 11, 330–337.
51 Kielkopf CL, Rodionova NA, Green MR & Burley SK
(2001) A novel peptide recognition mode revealed by
the X-ray structure of a core U2AF35 ⁄ U2AF65 hetero-
dimer. Cell 106, 595–605.
52 Xu RM, Jokhan L, Cheng X, Mayeda A & Krainer AR
(1997) Crystal structure of human UP1, the domain of
hnRNP A1 that contains two RNA-recognition motifs.
Structure 5, 559–570.
53 Shamoo Y, Krueger U, Rice LM, Williams KR & Steitz
TA (1997) Crystal structure of the two RNA binding
domains of human hnRNP A1 at 1.75 Å resolution.
Nat Struct Biol 4, 215–222.
54 van Gelder CW, Gunderson SI, Jansen EJ, Boelens
WC, Polycarpou-Schwarz M, Mattaj IW & van Ven-
rooij WJ (1993) A complex secondary structure in U1A
pre-mRNA that binds two molecules of U1A protein is
required for regulation of polyadenylation. EMBO J 12,
5191–5200.