A knowledge-based potential function predicts the
speciﬁcity and relative binding energy of RNA-binding
proteins
Suxin Zheng1,*, Timothy A. Robertson2,* and Gabriele Varani1,2
1 Department of Chemistry, University of Washington, Seattle, WA, USA
2 Department of Biochemistry, University of Washington, Seattle, WA, USA
The sequence-speciﬁc recognition of RNA by proteins
plays a fundamental role in gene expression by direct-
ing different cellular RNAs to speciﬁc processing path-
ways or subcellular locations. Many experimental
studies have explored the molecular basis for the
sequence dependence of protein–RNA recognition [1–
4]; more recently, a few studies have explored this prob-
lem from a computational perspective as well [5–16].
However, these early studies have emphasized qualita-
tive descriptions of the recognition process; relatively
few attempts have been made to quantify the character-
istics of protein–RNA interactions using computational
approaches [17]. Here, we present a new approach for
predicting the specificity of RNA-binding proteins and
evaluating the contribution of individual amino acids
to the energetics of protein–RNA complexes.
Knowledge-based potential functions have been
employed in protein structure prediction [18–27], as
well as in the prediction of protein–protein [25,28–30]
and protein–ligand interactions [30–33]. A few studies

ber 2007, accepted 19 October 2007)
doi:10.1111/j.1742-4658.2007.06155.x
RNA–protein interactions are fundamental to gene expression. Thus, the
molecular basis for the sequence dependence of protein–RNA recognition
has been extensively studied experimentally. However, there have been very
few computational studies of this problem, and no sustained attempt has
been made towards using computational methods to predict or alter the
sequence-speciﬁcity of these proteins. In the present study, we provide a
distance-dependent statistical potential function derived from our previous
work on protein–DNA interactions. This potential function discriminates
native structures from decoys, successfully predicts the native sequences
recognized by sequence-speciﬁc RNA-binding proteins, and recapitulates
experimentally determined relative changes in binding energy due to muta-
tions of individual amino acids at protein–RNA interfaces. Thus, this work
demonstrates that statistical models allow the quantitative analysis of
protein–RNA recognition based on their structure and can be applied to
modeling protein–RNA interfaces for prediction and design purposes.
Abbreviations
KH, K homology; MD, molecular dynamics; PDB, Protein Data Bank; RRM, RNA recognition motif; SRP, signal recognition particle.
6378 FEBS Journal 274 (2007) 6378–6391 © 2007 The Authors Journal compilation © 2007 FEBS
bonds represent only approximately 25% of contacts
between protein and RNA [12], we reasoned that a
more comprehensive approach would describe these
interactions more effectively.
In the present study, we report the application of an
all-atom, distance-dependent statistical potential to the
prediction of sequence-speciﬁc recognition between
proteins and RNA. We demonstrate that this approach
can discriminate native structures of complexes from
even close docking decoys, recapitulate experimentally

primary difference is the introduction of a new pseudocount
correction, where an optimized number of
pseudocounts are added to the observed counts for
each atom pair (for additional details, see Experimen-
tal procedures). As a control, we also tested a simple
contact-counting method, wherein every contact
between protein and RNA (within a given distance
cut-off) was assigned the same score of −1.
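A minimal sketch of this contact-counting control is given below. It assumes that atoms are supplied as (x, y, z) coordinates in Å and that a contact is any protein–RNA atom pair separated by no more than the cut-off; the function name `contact_count_score` and the toy coordinates are illustrative and do not come from the original work.

```python
import math

def contact_count_score(protein_atoms, rna_atoms, cutoff=6.0):
    """Score an interface by assigning -1 to every protein-RNA atom pair
    whose separation is at most `cutoff` (in angstroms)."""
    score = 0
    for p in protein_atoms:
        for r in rna_atoms:
            if math.dist(p, r) <= cutoff:
                score -= 1
    return score

# Toy coordinates (hypothetical, for illustration only).
protein = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
rna = [(3.0, 0.0, 0.0), (30.0, 0.0, 0.0)]
toy_score = contact_count_score(protein, rna)
```

With these toy coordinates only one pair lies within 6 Å, so the score is −1; widening the cut-off to 10 Å picks up a second pair.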
Docking decoy discrimination
An important property of any potential function is its
ability to discriminate cognate (native crystallographi-
cally determined structures) from noncognate (decoy)
structures [38]. As a preliminary test of our method,
and a direct comparison with previous work, we used
our distance-dependent potential to evaluate ﬁve sets
of docking decoys generated for the application of
the Rosetta physical potential function to protein–
RNA interactions [17]. These decoys were created
using a combination of rigid-body docking and protein
side-chain repacking, and range in rmsd (relative
to the native structure) from 0.2 Å to over 20 Å.
Thus, they represent a solid basis for comparison to
a much more complex scoring method (the multiterm,
hybrid physical/statistical potential function used by
Rosetta).
When scored with the distance-dependent potential,

anticipate that the increasing availability of protein–
RNA structures, together with the availability of data
on speciﬁcity, will further improve the performance of
the knowledge-based predictive method presented here.
We retained the all atom representation because it is
already slightly better than the reduced atom
approach.
The protein–RNA score has distinctive properties
compared to the protein–DNA potential. When we
scored the protein–RNA decoy set using the protein–
DNA potential, the average Z-score was approximately
half that obtained with the protein–RNA
potential (−2.84 versus −5.45; see also supplementary
Table S4). Thus, although the chemistry of RNA and
DNA is very similar, the structure of RNA allows
for different interactions between proteins and the two
nucleic acids, and these differences are reflected in this result.
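The Z-scores quoted here can be illustrated with a short sketch. It assumes the conventional definition, (native score − mean decoy score) divided by the standard deviation of the decoy scores, which this section does not spell out; the choice of the population standard deviation and all numbers below are illustrative assumptions.

```python
import statistics

def native_z_score(native_score, decoy_scores):
    """Z-score of the native structure relative to a set of decoy scores.
    More negative values mean the native scores better (lower) than decoys."""
    mean = statistics.mean(decoy_scores)
    sd = statistics.pstdev(decoy_scores)  # population SD (an assumption)
    return (native_score - mean) / sd

# Hypothetical scores, for illustration only.
decoys = [-10.0, -12.0, -14.0, -16.0, -18.0]
z = native_z_score(-28.0, decoys)
```

Under this definition, a potential that separates native from decoy structures more cleanly yields a more negative Z-score, which is how the −5.45 versus −2.84 comparison is read.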
To test whether the statistical potential simply
reflects the size of an interface or the number
of intermolecular contacts, we also used a very simple
contact-counting potential to evaluate the same decoys;
in this method, the fitness of an interface is evaluated
by counting the number of close approaches between
the protein and RNA. Satisfactorily, this method was

discriminate near-native protein–RNA structures with
that of the force field implemented in the AMBER 8
molecular simulation package. We generated near-
native protein–RNA decoys for 21 protein–RNA
complexes by conducting molecular dynamics (MD)
simulations of the native complexes, and by selecting
multiple time-steps from the resulting trajectories for
each structure. We then scored these structures using
the distance-dependent potential function, and exam-
ined the correlations between distance scores and
AMBER energies for each decoy set.
This is a difﬁcult test of score performance because
the structures are very close to native. Indeed, neither
the distance-dependent score nor the AMBER potential
appears to be able to discriminate native structures
from these very near-native, MD-generated decoys
(average Z-score of −0.69 versus −0.59; Table 2).
Although there is no correlation of either score
with rmsd, the distance-dependent statistical potential
is somewhat correlated (average R² = 0.41) with the
energy values predicted by the AMBER force field. Thus,
it remains very difﬁcult for either approach to discrim-
inate the native structure from structures that are close
to it in energy.
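The R² values reported for these comparisons can be read as squared Pearson correlation coefficients; that reading is an assumption, since the fitting procedure is not described in this section. The sketch below computes it for two lists of per-decoy values. The numbers are invented, and the first example is deliberately constructed to be perfectly linear.

```python
def r_squared(xs, ys):
    """Squared Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return (sxy * sxy) / (sxx * syy)

# Hypothetical per-decoy values: each energy is exactly half the score.
distance_scores = [-120.0, -118.0, -121.0, -115.0]
amber_energies = [-60.0, -59.0, -60.5, -57.5]
perfect = r_squared(distance_scores, amber_energies)

# A second hypothetical pair with no linear relationship.
uncorrelated = r_squared([1.0, 2.0, 3.0, 4.0], [1.0, -1.0, -1.0, 1.0])
```

An average R² of 0.41 therefore sits between these two extremes: the two scoring schemes share some, but far from all, of their variation across decoys.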
Identifying RNA-binding sequences from
structure
Having established the performance of the statistical
potential function in decoy discrimination, we investi-

1CVJ  −7.02  −1.19  −5.11  −2.44
1EC6  −6.46  −1.09  −6.53  −3.00
1FXL  −2.66  −1.55  −2.70  −1.26
1JID  −6.29  −1.36  −9.12  −3.09
1URN  −4.80  −1.35  −8.39  −3.39
Mean ± SD  −5.45 ± 1.76  −1.31 ± 0.18  −6.37 ± 2.58  −2.64 ± 0.84
a Using a 6 Å contact cut-off.
b From Chen et al. [17] and referring to a potential lacking the directional component of hydrogen bonding (HB) interactions.
c From Chen et al. [17] and referring to the complete potential function.
Table 2. Z-scores and correlations for near-native decoys generated
by MD simulation.
Largest rmsd (Å)    Z-scores    Distance-dependent versus AMBER (R²)

the structure of existing KH domains bound to RNA
[6,41–44]. As a consequence of the assumptions of
the model, complexes containing two RNA-binding
domains were divided into independent structures
(e.g. 1CVJ_1 and 1CVJ_2 represent the ﬁrst and sec-
ond Poly A binding protein domain of structure
1CVJ, respectively), and the two domains were con-
sidered structurally and thermodynamically unrelated.
Because the model assumes that each RRM and KH
domain binds to each of four nucleotides independently,
we generated a set of 4⁴ (256) different
structures for each protein–RNA complex by computationally
‘threading’ all possible four-nucleotide combinations
onto the RNA bases nearest the center of
the β-sheet structure of the RRM. We then scored
these sequence-variant structures with the distance-
dependent potential function.
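The enumeration and ranking step can be sketched as follows. The code only lists the 256 four-nucleotide sequences and ranks a cognate sequence among them; it does not perform the structural threading itself. The scoring function `toy_score` is a hypothetical stand-in for the distance-dependent potential, and lower (more negative) scores are assumed to be better.

```python
from itertools import product

NUCLEOTIDES = "ACGU"

def all_tetramers():
    """All 4**4 = 256 four-nucleotide RNA sequences."""
    return ["".join(p) for p in product(NUCLEOTIDES, repeat=4)]

def cognate_rank(cognate, score_fn):
    """1-based rank of `cognate` among all tetramers, best (lowest) score first.
    Ties keep enumeration order because sorted() is stable."""
    ranked = sorted(all_tetramers(), key=score_fn)
    return ranked.index(cognate) + 1

def toy_score(seq):
    # Hypothetical stand-in for the statistical potential: rewards adenines.
    return -seq.count("A")

sequences = all_tetramers()
rank = cognate_rank("AAAA", toy_score)
```

In practice `score_fn` would score the threaded structure for each sequence; the rank of the cognate sequence out of 256 is the quantity reported for each domain.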
Figure 2 shows the results of this analysis. If the
potential and model of recognition were perfect, and if
each structure was sequence-speciﬁc and corresponded
to the most favorable sequence recognized by a given
domain, the cognate sequences of the tested structures
would be expected to rank as number 1. Because it is
unlikely that the cognate recognition sequences for all
domains will be consistently assigned the best score,
we expressed sequence-discrimination performance in
terms of percentiles (where perfect discrimination of
the cognate recognition sequence would result in a

(1FXL_1, rank 32). Both Pab and HuD utilize two
domains to achieve sequence-speciﬁc recognition in a
cooperative manner and do not discriminate well
between sequences that are related to their cognate rec-
ognition motif (A-rich and AU-rich sequences, respec-
tively) [46]. Notably, however, the nonsequence-speciﬁc
RNA helicase protein (PDB code: 2DB3, included as a
negative control) had an expectedly poor cognate
sequence rank of 226/256.
Estimating experimentally determined relative
RNA-binding afﬁnities
A second very important property of any potential
function is the ability to recapitulate the sequence
dependence of experimental binding energies; this is a
prerequisite if the potential is to be applied to prob-
lems of protein–RNA interface prediction or design.
Fortunately, a few structures have a relatively dense
set of experimentally determined binding constants for
interface mutations. We used these experimentally
characterized mutants to create a set of computation-
ally ‘mutated’ structures of the complexes (Table 3),
Fig. 2. Structure-based identification of RRM recognition sequences.
The cognate sequence is ranked by the distance potential
(cut-off = 6 Å) for RRM/KH domain proteins. The red line represents
the rank of cognate recognition sequences using the contact-
counting score; the blue line represents the rank of these
sequences using the distance-dependent potential. The points in
each colored line are sorted independently by rank; the x-axis is the

² = 0.97,
Fig. 3B), and statistically significant at the 95% confidence
level. Figure 3C shows a likely explanation for
this result: an intramolecular hydrogen bond formed
by the cytosine at position −5 [47]. When this nucleotide
is mutated to any other base, the intramolecular
hydrogen bond is lost, leading to a reorganization of
the RNA structure.
This result does not provide direct information on
the relative contribution of that hydrogen bond to
the overall binding energy; it is simply implied that
Table 3. Correlations between the distance-dependent score and
the experimental free energy of binding for several mutant protein–
RNA complexes.
Distance-dependent (6 Å, 10 Å, 12 Å)    Contact counting (6 Å, 10 Å, 12 Å)
Protein mutations
MS2 (no cytosine

between protein mutants and RNA containing nucleotides other
than cytosine at position −5. (B) Complexes between protein
mutants and RNA containing cytosine at position −5. (C) The characteristic
intramolecular hydrogen bond between the amino group
of C5 and the O1P atom of U6 observed in the structure of the
MS2–RNA complex containing a cytosine at position −5 that helps
organize the RNA structure for protein binding [47].
mutations must be segregated into two groups to
obtain a clear correlation between experimental and
predicted relative afﬁnities. The most likely explana-
tion for this result is that, at present, the statistical
potential does not consider RNA intramolecular con-
tacts; therefore, contributions to binding energy due to
changes in RNA structure (i.e. that occur when that
hydrogen bond is lost) cannot be captured by our cur-
rent approach.
A second example that reinforces our interpretation
of the results obtained with MS2 is provided by Fox-1
protein, which regulates alternative splicing of tissue-
speciﬁc exons by binding to the GCAUG sequence
[49]. The structure of the complex (PDB code: 2ERR)
and the experimental binding constants for two sets of
related mutations have been reported [49]: one set for
mutations on the Fox-1 protein and a second set for
mutations to its target RNA molecule. A moderately
strong correlation was observed between the distance
score and the protein mutation data (R²

binds to RNA by forming intermolecular interactions
that are not commonly observed in the database of
training structures. This hypothesis is supported by
the observation that the inclusion of a close U1A
homolog (the U2B″–U2A′ complex) in the training set
improves the results of this test as well (R² increases
from 0.04 to 0.39; Table 3). Thus, it appears that the
structure of the U1A or of its homologous complex
contains a set of protein–RNA atomic contacts (i.e.
interatomic distances) that are not well represented in
the 71 other protein–RNA complexes in our training
set.
Figure 5 shows the ﬁnal example, a universally con-
served component of the core of the signal recognition
particle (SRP). The structure of the complex (PDB
code: 1HQ1) and the binding afﬁnity of a series of
RNA mutants have been determined [54]. The distance
potential results in scores that correlate signiﬁcantly
(R² = 0.52, P ≤ 0.05) with experimental binding affinities
for mutations involving substitutions of deoxynucleotides
for their corresponding ribonucleotides.
However, as observed for Fox-1, no signiﬁcant
[Figure residue labels: Ura-1, Gua-2, Cyt-3, Ade-4]

these interactions more comprehensively. Thus, our
understanding of the mechanisms driving protein–
RNA recognition is still largely descriptive [11].
Recent work on protein–DNA interactions has
shown that quantitative models of protein–nucleic
acid recognition can provide insight into the mecha-
nisms of gene regulation [58,59], and, in the not too
distant future, promise to allow the rational design
of DNA-binding proteins with altered speciﬁcity [60].
The development of computational tools capable of
predicting the speciﬁcity of RNA-binding proteins
across entire families (such as the RRM superfam-
ily), or of redesigning the speciﬁcity of these pro-
teins, would be of equal importance in dissecting
post-transcriptional regulatory mechanisms, and in
providing new tools to interrogate gene expression
pathways.
In a previous study, our group demonstrated that a
statistical potential function could be surprisingly accu-
rate when used to predict protein–DNA interactions
from structure [36]; this result was corroborated by a
similar study published concurrently by another group
[37]. Given these results, we hypothesized that the
same approach would be equally successful with pro-
tein–RNA interfaces. Indeed, although various statisti-
cal techniques have been used by a number of groups
for the prediction of protein structures, protein–DNA
and protein–ligand interactions [18–35], such an
approach has never been applied to protein–RNA
interactions.

functions.
The question of how to generate and discriminate
near-native decoys is still an open challenge for many
areas of computational structural biology [61,62]. The
docking decoy set used here contains many near-native
decoys (e.g. < 1 Å rmsd) that can be discriminated by
the distance-dependent potential (Fig. 1). However,
when testing against the exceptionally near-native
Fig. 5. Correlation between scores generated by the
distance-dependent statistical potential and experimental binding free
energies (log K_d) for ribose-to-deoxyribose mutants of a universally
conserved protein component of the SRP.
decoys generated by extracting snapshots from MD
simulations (Table 2), we found that these near-native
decoys could not be reliably discriminated from native
structures, not even by AMBER, which was used to conduct
the MD simulations. Thus, the question of how
to create a potential that is sensitive to the extremely
subtle structural variations present in very near-native
decoys remains a challenging and important area of
research. We are hopeful that the incorporation of
terms describing the higher-order geometric preferences
of protein–RNA interfaces (e.g. the incorporation of a
directional hydrogen-bonding potential) [17] may

in our study, replicate experiments were conducted
using 6 Å, 10 Å and 12 Å distance cut-offs. In nearly
all of our tests, the use of a shorter contact cut-off
(6 Å) results in greater selectivity for structural details
of the interface (Table 1). For the prediction of
mutation energies, however, a longer cut-off appears
to outperform shorter cut-off values for some sets of
mutation data (Table 3). Some of these mutations are
not near the protein–RNA interface (e.g. one of the
U1A mutations, D79V, is 9 Å from the RNA molecule),
and only the use of a longer cut-off value can
capture these effects. In light of the differing conclu-
sions of previous research [21,23,36], these results
imply that a ‘one size ﬁts all’ approach to energy
function design may be limiting. In other words, it
may be possible to signiﬁcantly improve potential
functions by customizing their parameterization to
particular problems.
Prediction of RNA recognition sequences from
protein–RNA complex structures

data had to be divided into two classes based on the
presence or absence of a cytosine at position −5 in the
RNA. A likely explanation for the importance of
the −5 cytosine mutation is offered by the observation
that the amino group of the cytosine at position −5
makes an intramolecular hydrogen bond that increases
the propensity of the free RNA to adopt the structure
seen in the complex [48] (Fig. 3C). Because the dis-
tance potential currently measures only intermolecular
interactions, it is unable to capture the thermodynamic
effect of interactions within the RNA or protein, and
of mutation-induced changes in RNA (or protein)
structure. The good correlations of distance potential
with experimental binding energies (i.e. when sequence
mutations are grouped according to the base identity
at position −5) strongly suggest that the potential captures
the energetic contributions of intermolecular
interactions well.
The same limitations observed in the MS2 mutation
data led to the failures in prediction for RNA mutations
in the Fox-1 and SRP complexes. In the structure of the
Fox-1 complex, nucleotide U1 interacts with C3 by
forming an intramolecular hydrogen bond, whereas G2
and A4 form a non-Watson–Crick base pair [49]
(Fig. 4). Four out of seven Fox-1 RNA mutations that
were tested directly affect these intramolecular interac-
tions, which are not evaluated by the statistical potential
used in the present study. In the case of the RNA muta-

employed: the tested structure was always excluded
from the training set. Thus, every test in the present
study was conducted with a different score, and
trained using only those structures that were not
homologous to the tested protein–RNA complex.
This strategy cannot be avoided at the present time,
yet it leads to situations where the training data does
not contain enough information to capture particular
structural phenomena. For example, we observed vir-
tually no correlation between the distance-dependent
score and the experimental binding afﬁnity for muta-
tions of U1A protein until the U1A complex structure
was added to the training set (Table 3). Addition of
the homologous U2B″ complex structure (PDB code:
1A9N) to the training set improved these results con-
siderably, indicating that the training set was missing
critical structural information that would help to dis-
criminate native-like contacts unique to the U1A com-
plex (an unusually high-afﬁnity RRM, with a long,
seven-nucleotide recognition sequence) [52]. We antici-
pate that the performance of the method will improve
with the size of the structural database, as more high-
resolution protein–RNA structures become available.
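The exclusion rule described in this section (drop the tested complex and anything homologous to it before training) can be sketched as a simple filter. The helper name `training_ids` and the homology table are hypothetical; the PDB codes are taken from the text, but pairing 1URN with 1A9N as homologs is assumed here purely for illustration.

```python
def training_ids(test_id, all_ids, homologs):
    """Return the structure IDs allowed in the training set for `test_id`:
    everything except the tested structure and its listed homologs."""
    excluded = {test_id} | set(homologs.get(test_id, ()))
    return [pdb_id for pdb_id in all_ids if pdb_id not in excluded]

# Hypothetical mini-database and homology table, for illustration only.
all_ids = ["1URN", "1A9N", "1CVJ", "2ERR"]
homologs = {"1URN": ["1A9N"]}

train = training_ids("1URN", all_ids, homologs)
```

Because a different subset is used for every tested complex, each test is effectively run with its own retrained score, which is why removing a lone homolog can leave some contact types unrepresented.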
Conclusions
We have introduced a statistical potential function that
discriminates the structures of native protein–RNA
complexes from decoys, reproduces experimentally
determined relative binding afﬁnities for a number of
RNA-binding proteins, and predicts cognate binding
sequences for a large set of protein–RNA complexes.

(see supplementary Table S3). These pseudocounts are
allocated over distance bins in proportion to the background
frequency $f(d_{ij})$ values, as calculated using Eqn (4)
from a previous study [36], leading to an updated expression
for $f(d_{ij}, t_i, t_j)$:
$f(d_{ij}, t_i, t_j)_{adj} = N_{obs}(d_{ij}, t_i, t_j)$ …

) represents the
number of atoms of types $t_i$ and $t_j$ observed in the structure
training set, separated by a distance of at least $d_{ij}$.
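One way to picture the pseudocount correction is sketched below. The numerator follows the description in the text (pseudocounts spread over distance bins in proportion to the background frequency); the normalization by the total observed count plus the number of pseudocounts is an assumption, because the full expression is not reproduced in this section. The function name and all numbers are illustrative.

```python
def adjusted_frequencies(observed_counts, background_freqs, n_pseudo):
    """Pseudocount-adjusted distance distribution for one atom-type pair.

    observed_counts  -- counts per distance bin for the pair (t_i, t_j)
    background_freqs -- background frequency f(d) per bin, summing to 1
    n_pseudo         -- total number of pseudocounts added for this pair
    """
    total = sum(observed_counts) + n_pseudo  # assumed normalization
    return [
        (n_obs + n_pseudo * f_bg) / total
        for n_obs, f_bg in zip(observed_counts, background_freqs)
    ]

# Hypothetical three-bin example.
observed = [0, 2, 6]
background = [0.2, 0.3, 0.5]
adjusted = adjusted_frequencies(observed, background, n_pseudo=2.0)
```

The practical effect is that a bin with no observations (the first bin here) still receives a small non-zero frequency, so sparsely populated atom pairs do not produce undefined or extreme scores.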
As a control, we also tested a simple, contact-counting
method, wherein every contact between protein and RNA
(within a given distance cut-off) was assigned the same score
of −1.
Atom type selection
Atom score types were assigned using the method of Rob-
ertson and Varani [36]. Brieﬂy, the all-atom potential treats
every atom, in every residue, as a unique type (e.g. alanine
Cβ and arginine Cβ are considered as unique atom
types under this scheme), resulting in a total of 158 protein
and 81 RNA atom types. Using a 10 Å cut-off, there are
a total of 1 639 295 counts; with this representation, they are
distributed over 158 × 81 × 8 bins, for an average of approximately
16 counts in each bin. When using a reduced atom representation,
chemically similar atoms were grouped together
based on the CHARMM atom definitions, as previously
described [30,36].
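The bin arithmetic quoted for the all-atom representation can be checked directly. The variable names are illustrative; the numbers are those given in the text.

```python
n_protein_types = 158     # protein atom types
n_rna_types = 81          # RNA atom types
n_distance_bins = 8       # distance bins at the 10 angstrom cut-off
total_counts = 1_639_295  # observed atom-pair counts

n_bins = n_protein_types * n_rna_types * n_distance_bins
mean_counts_per_bin = total_counts / n_bins
```

This gives 102 384 bins and about 16 counts per bin on average, which shows how thinly the training data are spread and why a pseudocount correction is useful.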
Selection of protein–RNA training set
The training set contains crystal structures of protein–RNA

ated using AMBER 8 in a deformation-like process with the
ff99 force field [68]. These MD-generated decoys are especially
near-native structures; the maximum decoy rmsd for
all 21 sets is below 4 Å, and only seven decoy sets have a maximum
rmsd greater than 3 Å.
To generate these decoy sets, the initial structure of each
native complex was ﬁrst minimized in 500 steps (250 steps
of steepest-descent and 250 steps of conjugate gradient minimization),
then heated from 0 to 400 K in 20 ps using a
Langevin dynamics algorithm [69,70]. Snapshots were taken
every 0.05 ps, and a total of 400 structures were extracted
from each MD simulation. The binding free energy was calculated
using the mm_gbsa module of AMBER 8 as:
$\Delta G_{bind} = G_{complex} - (G_{protein} + G_{RNA})$
where $G_{complex}$, $G_{protein}$

Some RRM and KH domains in complex with single-stranded
DNA (PDB codes: 2UP1, 1WTB, 1X0F, 1ZZI and
1ZZJ) were also included in the test set because the recognition
of single-stranded RNA and DNA is mechanistically similar.
Protein mutations were modeled using MOE (Chemical
Computing Group, Montreal, Canada), followed by energy
minimization with AMBER; the conformation of the mutated
residue with side chain conformation most similar to the
native residue was retained.
Acknowledgements
We wish to thank Dr Yu Chen for providing the pro-
tein–RNA decoy sets and Mr Daniel Bjerre for many
valuable discussions. The study was supported by
grants from NIH.
References
1 Amosova O, Broitman SL & Fresco JR (2003) Alanine-
scanning mutagenesis of the predicted rRNA-binding
domain of ErmC′ redefines the substrate-binding site
and suggests a model for protein–RNA interactions.
Nucleic Acids Res 31, 4941–4949.
2 Law MJ, Rice AJ, Lin P & Laird-Offringa IA (2006)
The role of RNA structure in the interaction of U1A
protein with U1 hairpin II RNA. RNA 12, 1168–1178.
3 Xia T, Wan C, Roberts RW & Zewail AH (2005)
RNA–protein recognition: single-residue ultrafast
dynamical control of structural speciﬁcity and function.
PNAS 102, 13013–13018.
4 White SA, Hoeger M, Schweppe JJ, Shillingford A,
Shipilov V & Zarutskie J (2004) Internal loop mutations
in the ribosomal protein L30 binding site of the yeast

15 Steﬂ R, Skrisovska L & Allain FH-T (2005) RNA
sequence- and shape-dependent recognition by pro-
teins in the ribonucleoprotein particle. EMBO Rep 6,
33–38.
16 Frankel AD (2000) Fitting peptides into the RNA
world. Curr Opin Struct Biol 10, 332–340.
17 Chen Y, Kortemme T, Robertson T, Baker D & Varani
G (2004) A new hydrogen-bonding potential for the
design of protein–RNA interactions predicts speciﬁc
contacts and discriminates decoys. Nucleic Acids Res 32,
5147–5162.
18 Sippl M, Ortner M, Jaritz M, Lackner P & Flöckner H
(1996) Helmholtz free energies of atom pair interactions
in proteins. Fold Des 1, 289–298.
19 Sippl M (1993) Boltzmann’s principle, knowledge-based
mean ﬁelds and protein folding. An approach to the
computational determination of protein structures.
J Comput Aided Mol Des 7, 473–501.
20 Sippl MJ (1990) Calculation of conformational ensem-
bles from potentials of mean force: an approach to the
knowledge-based prediction of local structures in globu-
lar proteins. J Mol Biol 213, 859–883.
21 Samudrala R & Moult J (1998) An all-atom distance-
dependent conditional probability discriminatory func-
tion for protein structure prediction. J Mol Biol 275 ,
895–916.
22 Skolnick J, Kolinski A & Ortiz A (2000) Derivation of
protein-speciﬁc pair potentials based on weak sequence

2325–2335.
31 Ishchenko AV & Shakhnovich EI (2002) SMall Mole-
cule Growth 2001 (SMoG2001): an improved knowl-
edge-based scoring function for protein–ligand
interactions. J Med Chem 45, 2770–2780.
32 Velec HFG, Gohlke H & Klebe G (2005) DrugScoreCSD-knowledge-based
scoring function derived
from small molecule crystal data with superior recogni-
tion rate of near-native ligand poses and better afﬁnity
prediction. J Med Chem 48, 6296–6303.
33 DeWitte RS & Shakhnovich EI (1996) SMoG: de novo
design method based on simple, fast, and accurate free
energy estimates. 1. Methodology and supporting evi-
dence. J Am Chem Soc 118, 11733–11744.
34 Liu Z, Mao F, Guo J-T, Yan B, Wang P, Qu Y & Xu
Y (2005) Quantitative evaluation of protein–DNA inter-
actions using an optimized knowledge-based potential.
Nucleic Acids Res 33, 546–558.
35 Kono H & Sarai A (1999) Structure-based prediction of
DNA target sites by regulatory proteins. Proteins:
Struct Funct Genet 35, 114–131.
36 Robertson TA & Varani G (2007) An all-atom, dis-
tance-dependent scoring function for the prediction of
protein–DNA interactions from structure. Proteins:
Struct Funct Bioinform 66, 359–374.
37 Donald JE, Chen WW & Shakhnovich EI (2007) Ener-
getics of protein–DNA interactions. Nucleic Acids Res
35, 1039–1047.

a degenerate RNA pool presented in various structural
contexts. Nucleic Acids Res 19, 4931–4936.
46 Lunde BM, Moore C & Varani G (2007) RNA-binding
proteins: modular design for efﬁcient function. Nat Rev
Mol Cell Biol 8, 479–490.
47 Valegard K, Murray JB, Stonehouse NJ, van den Worm
S, Stockley PG & Liljas L (1997) The three-dimensional
structures of two complexes between recombinant MS2
capsids and RNA operator fragments reveal sequence-
speciﬁc protein–RNA interactions. J Mol Biol 270, 724–
738.
48 Johansson HE, Dertinger D, LeCuyer KA, Behlen
LS, Greef CH & Uhlenbeck OC (1998) A thermody-
namic analysis of the sequence-speciﬁc binding of
RNA by bacteriophage MS2 coat protein. PNAS 95,
9244–9249.
49 Auweter SD, Fasan R, Reymond L, Underwood JG,
Black DL, Pitsch S & Allain FH-T (2006) Molecular
basis of RNA recognition by the human alternative
splicing factor Fox-1. EMBO J 25, 163–173.
50 Oubridge C, Ito N, Evans PR, Teo CH & Nagai K
(1994) Crystal structure at 1.92 Å resolution of the
RNA-binding domain of the U1A spliceosomal
protein complexed with an RNA hairpin. Nature 372,
432–438.
51 Allain FHT, Gubser CC, Howe PWA, Nagai K,
Neuhaus D & Varani G (1996) Speciﬁcity of ribonucleo-
protein interaction determined by RNA folding during
complex formation. Nature 380, 646–650.
52 Timm H, Jessen Oubridge C, Teo CH, Pritchard C &

zipper protein with new DNA contacting region.
Biochemistry 41, 2177–2183.
60 Ashworth J, Havranek JJ, Duarte CM, Sussman D,
Monnat RJ, Stoddard BL & Baker D (2006) Computa-
tional redesign of endonuclease DNA binding and
cleavage speciﬁcity. Nature 441, 656–659.
61 Gray JJ (2006) High-resolution protein–protein docking.
Curr Opin Struct Biol 16, 183–193.
62 Wang K, Fain B, Levitt M & Samudrala R (2004)
Improved protein structure selection using decoy-
dependent discriminatory functions. BMC Struct Biol
4, 8.
63 Sippl MJ (1993) Recognition of errors in three-dimen-
sional structures of proteins. Proteins: Struct Funct
Genet 17, 355–362.
64 Berman H, Henrick K & Nakamura H (2003) Announc-
ing the worldwide Protein Data Bank. Nat Struct Mol
Biol 10, 980–980.
65 Notredame C. ExPASy sequence-redundancy tool.
Available at />redundancy.cgi.
66 Gray JJ, Moughon SE, Kortemme T, Schueler-
Furman O, Misura KMS, Morozov AV & Baker D
(2003) Protein–protein docking predictions for the
CAPRI experiment. Proteins: Struct Funct Genet 52,
118–122.
67 Gray JJ, Moughon S, Wang C, Schueler-Furman O,
Kuhlman B, Rohl CA & Baker D (2003) Protein–pro-
tein docking with simultaneous optimization of rigid-
body displacement and side-chain conformations. J Mol
Biol 331, 281–299.

discrimination.
This material is available as part of the online article
from
Please note: Blackwell Publishing is not responsible
for the content or functionality of any supplementary
materials supplied by the authors. Any queries (other
than missing material) should be directed to the corre-
sponding author for the article.
