MINIREVIEW
Top-down MS, a powerful complement to the high
capabilities of proteolysis proteomics
Fred W. McLafferty1, Kathrin Breuker2, Mi Jin1, Xuemei Han1, Giuseppe Infusini1, Honghai Jiang1, Xianglei Kong1 and Tadhg P. Begley1
1 Department of Chemistry and Chemical Biology, Baker Laboratory, Cornell University, Ithaca, NY, USA
2 Institute of Organic Chemistry and Center for Molecular Biosciences Innsbruck (CMBI), University of Innsbruck, Austria
Introduction
The MS techniques of ESI [1] and MALDI [2] have
been available for only two decades, but they have rev-
olutionized the introduction of large, nonvolatile mole-
cules such as proteins into the mass spectrometer [3,4].
Here we discuss two general types of such MS ‘proteo-
mics’ applications: (a) the identification of a protein
from among those predicted from the parent genome’s
DNA; and (b) the structural characterization of a pro-
multiply modified isomers is more efficient. Bottom-up proteolysis destroys
the information on the size of the protein and the connectivities of the pep-
tide fragments, but it has no size limit for protein digestion. In contrast,
the top-down approach has a 500 residue, 50 kDa limitation for the
extensive molecular ion dissociation required. Basic studies indicate that
this molecular ion intractability arises from greatly strengthened electro-
static interactions, such as hydrogen bonding, in the gas-phase molecular
ions. This limit is now greatly extended by variable thermal and collisional
activation just after electrospray (‘prefolding dissociation’). This process
can cleave 287 inter-residue bonds in the termini of a 1314 residue
(144 kDa) protein, specify previously unidentified disulfide bonds between
eight of 27 cysteines in a 1714 residue (200 kDa) protein, and correct
sequence predictions in two proteins, one of 2153 residues (229 kDa).
Abbreviations
BCA, bovine carbonic anhydrase; CAD, collisionally-activated dissociation; ECD, electron-capture dissociation; HAD, 3-hydroxyanthranilate-
3,4-dioxygenase; IRMPD, infrared multiphoton dissociation; PFD, prefolding dissociation; PTM, post-translational modification;
PurL, formylglycinamide ribonucleotide amidotransferase.
6256 FEBS Journal 274 (2007) 6256–6268 © 2007 The Authors Journal compilation © 2007 FEBS
here directly introduces the proteins into the mass
spectrometer, providing far higher specificity at the
expense of far higher experimental requirements. As
predicted in a prescient 2004 review [6], the top-down
method is being exploited increasingly in unique applications, with 18% of proteomics papers/posters at the
2007 meeting of the American Society for Mass Spec-
trometry concerning this newer approach.
Although ESI spectra of proteins larger than mega-
daltons have been reported [7,8], the great majority
of ESI spectra measured are those of the small
(< 3 kDa) peptides produced by the bottom-up prote-
gene. If, however, more extensive or specific data are
needed, such as on polymorphisms or PTMs, the com-
plementary top-down approach can often provide
these in a very straightforward manner. This review
also discusses alleviation of a serious previous prob-
lem: top-down molecular ion dissociations have given
few product ions for proteins > 50 kDa. The far
higher masses measured with the top-down approach
require correspondingly higher MS resolving power,
so the instrument of choice has been the expensive
Fourier transform mass spectrometer (FT MS)
[3,5,23,24]. FT MS has the added advantage that it
can give MS/MS spectra by electron-capture dissociation (ECD) [25–27], which provides far more fragment
ion information than either collisionally-activated
dissociation (CAD) [28] or infrared multiphoton disso-
ciation (IRMPD) [29]. However, ECD’s descendant,
electron-transfer dissociation [30], works well with less
expensive MS instruments, and can be applied to pep-
tides and smaller proteins [31] with versatile ion–ion
reactions [32]. Of special promise for routine top-down
applications is the recently developed Orbitrap mass
spectrometer, which has resolution and mass accuracy
capabilities approaching those of FT MS, with very
promising cost advantages [33]. ECD and electron-
transfer dissociation are less sensitive than CAD or
IRMPD, in part because they produce far more
product ions.
Identification
To date, by far the largest use of MS proteomics has
characteristic of that protein’s sequence and molecular
mass value. Thus, top-down data can give an accuracy
of identification that is orders of magnitude higher
[6,23,35]. For example, Begley and co-workers [21] isolated an enzyme YjbV involved in the B. subtilis thiamine
biosynthesis pathway for which 1D SDS-PAGE analysis indicated an approximate mass of 30 kDa
(Fig. 1). The top-down ESI/FT MS spectrum of this protein with nozzle-skimmer CAD (Fig. 1)
confirmed the YjbV sequence and demonstrated the absence of any post-translational modifications.
Not only does the measured molecular mass value of 31 407.1 Da agree with the mass of
31 406.9 Da predicted from the DNA sequence, within the limits of experimental accuracy, but
there are also 23 top-down fragment mass values that agree with those expected from
single backbone cleavages (Fig. 1). Thus, each fragment contains either the N-terminus or
C-terminus, providing extensive confirmatory sequence information (see below) for this
SDS-PAGE-purified protein.
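The agreement between the measured and predicted molecular mass values quoted above can be expressed as a relative error. The following is a minimal Python sketch and not part of the original work; the function name `ppm_error` is an illustrative assumption, and only the two mass values are taken from the text.

```python
def ppm_error(measured_da, predicted_da):
    """Relative mass error in parts per million (ppm)."""
    return (measured_da - predicted_da) / predicted_da * 1e6

# YjbV molecular mass values quoted in the text (Da).
yjbv_ppm = ppm_error(31407.1, 31406.9)
print(f"YjbV mass error: {yjbv_ppm:.1f} ppm")
```

A difference of 0.2 Da at about 31.4 kDa corresponds to roughly 6 ppm.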
For protein identifications in complex mixtures, a
dramatic advantage of the top-down approach is
that a final separation stage can be done in the
FT MS instrument. For example, after rough separa-
tion of the proteins from Arabidopsis thaliana, the
stromal protein fraction was introduced directly by
ESI into the FT MS instrument to yield an ESI
mass spectrum in which the molecular ions from 14
anhydrase (BCA) molecular ions in 25 ECD/CAD
spectra [19], with 183 bonds being cleaved in a single
‘plasma ECD’ spectrum (Fig. 4) [26]. Obviously, this
amount of mass spectral information makes possible
even higher identification reliabilities, and also extensive
de novo sequencing and structural characterization.
Fig. 1. Left: 1D SDS-PAGE chromatograms of ThiD from E. coli and of unknown YjbV
from B. subtilis. Right, above: ESI spectrum of YjbV, molecular ion isotopic peaks. Right,
below: nozzle-skimmer dissociation spectral data, YjbV fragment peaks. The ‘−20’ after
the molecular mass value signifies that the main component ion of the most abundant
isotopic peak contains 20 ¹³C atoms and has this mass value.
Characterization
The high specificity of the top-down approach for protein structural characterization is due to the extensive
molecular connectivity information that it provides, which is not destroyed by proteolysis
as it is in the bottom-up approach. The peptides from
proteolysis usually represent substantially less than
100% coverage of the protein sequence, so that even
when their mass information is consistent with a previ-
ously identified protein, the sample protein could have
missing or extra parts. In the top-down approach, an
that the predicted sequence is correct. In an early
(1993) example of top-down identification [23], our
measured molecular mass value, 29 024.2 Da, of BCA
matched well the value that was calculated,
29 024.7 Da, from the published sequence. Furthermore, MS/MS (nozzle-skimmer CAD) of the
molecular ions gave 21 terminal fragment ions that were also
consistent with the published sequence. However, our
2003 plasma ECD spectrum of BCA (Fig. 4; 183
cleavage sites) gave 512 mass values [26], of which 45
were in error by −1 Da; these values all represented
cleavages in the region of residues 10–31. This is
strong evidence that the residue reported as Asp10
should be Asn10, and Asn31 should be Asp31
(Asp –CO-OH, Asn –CO-NH2, Δm = −1 Da; note that
these changes do not affect the molecular mass value
of the protein). Detecting this error in the usual bot-
tom-up approach would be difficult, as peptides that
incorporate residues 10 or 31 would not match a pre-
dicted sequence and so would be ignored. Worse yet,
in our 1999 top-down study of BCA [5], + 1.00 Da
and + 0.99 Da errors found for peptides Phe19–
Asp33 and Asp18–Lys35 were termed ‘unexpected
(and unexplained) anomalies’. Obviously, the precision
of locating such sequence errors or PTMs is depen-
dent on obtaining fragment ion masses representing
nearby dissociations on either side of the error; in the
unusual Fig. 4 case of nearby offsetting errors, having
the molecular mass value and a sequence
consistent with those of sheep and human
carbonic anhydrases.
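The Asp10/Asn31 correction discussed above can be illustrated with a short calculation. The following Python sketch is an illustration only: the atomic masses are standard monoisotopic values, while the function `fragment_mass_error` and the dictionary `swaps` are hypothetical names introduced here. It shows why only N-terminal fragments ending between residues 10 and 30 are in error by about −1 Da, and why the two offsetting changes leave the molecular mass value unaffected.

```python
# Standard monoisotopic atomic masses (Da).
H, N, O = 1.00782503, 14.00307401, 15.99491462

# Asn carries -CO-NH2 where Asp carries -CO-OH.
DELTA_ASP_TO_ASN = (N + 2 * H) - (O + H)  # about -0.984 Da

def fragment_mass_error(n_residues, swaps):
    """Mass error of the N-terminal fragment covering residues 1..n_residues.

    swaps maps a residue position to the mass change caused by correcting it.
    """
    return sum(delta for pos, delta in swaps.items() if pos <= n_residues)

# Corrections proposed in the text: Asp10 -> Asn10 and Asn31 -> Asp31.
swaps = {10: DELTA_ASP_TO_ASN, 31: -DELTA_ASP_TO_ASN}

for size in (9, 20, 31):
    print(size, round(fragment_mass_error(size, swaps), 3))
```

Any fragment that contains both residues, including the intact molecular ion, has a net error of zero, so the molecular mass value alone cannot reveal the error.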
destroying information on their backbone location.
However, the energetic (‘nonergodic’) dissociation of
ECD is localized on the backbone, with little accompa-
nying cleavage of weaker side-chain modifications such
as glycosylated [37] and phosphorylated structures [38]
(and even of noncovalent bonding and conformational
tertiary protein structures; see below). Top-down ECD
and CAD of β-casein gave 126 out of the possible 208
backbone cleavages (Fig. 6); the ECD cleavages not
only indicate the five phosphorylation sites without
loss of these side chains, but are also so positioned
that they would have specified phosphorylation if it had
occurred at any of the other 21 possible sites
(Ser, Thr, Tyr) of β-casein [38]. Although
ECD requires the more expensive FT MS instrumentation,
it measures all product ions simultaneously,
which is of particular value for repeated quantitative
measurements, e.g. variable phosphorylation of isolated
β-casein samples.
Unexpected modifications are especially difficult for
classic and bottom-up methods, which must be selected
or tailored for the specific PTM. In the biosynthesis of
NAD, the enzyme 3-hydroxyanthranilate-3,4-dioxygen-
ase (HAD) catalyzes the oxidative ring opening of
3-hydroxyanthranilate, which, with cyclization, forms a
value.
Cys146, after which they are low by 4 Da, the
decrease of the molecular mass value. The most proba-
ble reason for a 2 Da decrease is the formation of an
S–S bond; although this was totally unexpected and
unprecedented, the top-down approach efficiently gave
a specific characterization of the inhibitor mechanism
[39]. Even if two S–S bonds had been suspected, identifying
for each its two specific cysteines out of the
10 possible pairings of the five cysteines (including Cys127)
would be difficult by classic or bottom-up methods.
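The count of 10 possible cysteine pairings follows from simple combinatorics. The Python sketch below is an illustration only; the list of five cysteines is inferred from the residues named in the text and the Fig. 7 caption, and the count of two-bond patterns is a consequence of the enumeration rather than a number stated in the original.

```python
from itertools import combinations

# Five cysteines inferred from the text and the Fig. 7 caption (assumption).
cysteines = ["Cys127", "Cys146", "Cys149", "Cys183", "Cys186"]

# Every possible single S-S bond is one unordered pair of cysteines.
pairs = list(combinations(cysteines, 2))

# Two simultaneous S-S bonds must not share a cysteine.
two_bond_patterns = [
    (p, q) for p, q in combinations(pairs, 2) if not set(p) & set(q)
]

print(len(pairs), len(two_bond_patterns))
```

The pattern reported for inhibitor-treated HAD, Cys146 to Cys149 together with Cys183 to Cys186, is one of these two-bond patterns.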
Deamidation of Asn or Gln in proteins has impor-
tant effects on enzyme activity and folding, and has
even been proposed as a biological clock [40]. However,
changing –CO-NH2 to –CO-OH only produces a
mass increase of 1 Da; as in Fig. 4, this makes the
ability of FT MS to resolve protein ion isotopic peaks
of critical importance for such a mass shift determina-
tion. The most abundant of the 13+ molecular ions of
reduced RNase A before deamidation (Fig. 8A) shows
Fig. 7. ECD, CAD and IRMPD spectral data of HAD treated with
inhibitor [22]. C-terminal fragment ions 1–4 Da below the mass
values predicted for untreated HAD clearly indicate the unexpected
S–S bonds Cys146 to Cys149 and Cys183 to Cys186.
Fig. 6. Inter-residue backbone fragmentations from the ECD spectrum
of β-casein’s three variants, molecular masses 24 008.2 Da,
ment ions were determined similarly and are plotted
for the four product samples in Fig. 9 as mass
increases (decreases) for the N-terminal (C-terminal)-
containing fragment ions. Thus, for the + 1.0 Da sam-
ple (Fig. 8B), the N-terminal fragment ions show little
increase in mass with increasing size until Asn67, with
this increase of 1.0 Da staying constant for larger
N-terminal ions and with the C-terminal ions showing
the complementary decrease. This demonstrates
directly that Asn67, the only deamidation site found
previously, is indeed deamidated before any other resi-
due. In a similar fashion, the samples with 1.8, 3.7 and
4.4 Da increases show that Asn71 and Asn94 are
nearly equally reactive as the next sites, followed by
Asn34 and then Gln74 [40]. Other examples show the
utility of top-down MS/MS for such kinetic studies
[17,22,41].
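The localization logic described above, in which the mass shift of N-terminal fragment ions steps up once the fragment includes the modified residue, can be sketched as follows. This is a minimal Python illustration under stated assumptions: the function `bracket_mass_shift`, its `threshold` parameter and the example values are all hypothetical and are not measured data from the original work.

```python
def bracket_mass_shift(fragments, threshold=0.5):
    """Bracket the modified residue from N-terminal fragment mass shifts.

    fragments: iterable of (n_residues, mass_shift_da) pairs.
    Returns (largest unshifted size, smallest shifted size); the modified
    residue lies after the first value and at or before the second.
    Returns None if the data do not contain both kinds of fragment.
    """
    ordered = sorted(fragments)
    unshifted = [n for n, shift in ordered if shift < threshold]
    shifted = [n for n, shift in ordered if shift >= threshold]
    if not unshifted or not shifted:
        return None
    return max(unshifted), min(shifted)

# Hypothetical values for illustration only (fragment size, shift in Da).
example = [(40, 0.0), (60, 0.1), (66, 0.0), (67, 1.0), (80, 1.0), (100, 0.9)]
bracket = bracket_mass_shift(example)
print(bracket)
```

With these illustrative values the bracket closes on residue 67, mirroring the Asn67 assignment described in the text. The precision of such an assignment depends on having fragments that end close to either side of the modified site.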
Top-down quantitative analysis
Measuring the differences in protein expression levels
that result from disease states, environment, etc. is
critically important in many biomedical investiga-
tions. The protein quantities in cases of normal and
perturbed expression are compared accurately by iso-
topically labeling the proteins from one and compar-
ing in their mixture the corresponding peaks of their
respective peptides, usually differing by three or
more mass units [9–12]. The kinetic deamidation
study above (Fig. 9), in a similar fashion, compares
the quantities of proteins differing in the position of
deamidation (only a +1 Da change) with the multi-
exchange identifying reactive regions of the conforma-
tion [43–45], ion mobility measuring conformational
cross-sections [46,47], ECD identifying regions of ter-
tiary noncovalent bonding, as these are preserved when
backbone bonds are cleaved [48,49], and infrared
photodissociation spectroscopy characterizing func-
tional group environments [44,50]. For example,
charge sites, such as the protonated side chains of
basic residues, in solution are solvated out into the
aqueous phase, while in the gas phase they are instead
solvated onto the protein backbone, with this apparently
favored if the backbone is in an α-helical structure [44–50].
ECD itself causes negligible cleavage of this tertiary
structure. However, its noncovalent bonds have sub-
stantially lower bond dissociation energies, in general,
so that limited activation by earlier or concurrent
CAD or IRMPD can denature the tertiary structure
sufficiently to produce fragment ions by ECD back-
bone cleavage (activated ion ECD [27]), without this
activation also forming abundant CAD products.
However, for protein molecular ions larger than
50 kDa, electrosprayed from denatured solutions,
this tertiary structure has become so strong and exten-
sive that conventional activation by CAD or IRMPD
gives few or no backbone cleavages, making the top-
down approach ineffective [51].
A possible solution to this problem was indicated by
the study of conformational changes occurring during
solvent evaporation immediately after electrospray
Vpre) and in the 10⁻³ Torr region after the skimmer (Vpost). In general,
Vpre produces many low-energy collisions to
cleave noncovalent bonds, whereas Vpost produces
fewer collisions with energies approaching the accelerating
voltage to cleave backbone bonds. Different
combinations of Vpre, Vpost and capillary temperature
values in 11 PFD spectra gave 173 different inter-residue
backbone cleavages (Fig. 11). In a serendipitous
discovery, additives to the ESI solution such as ammonium
tartrate increased the number of cleavages by
~50%, with a total of 21 spectra showing 287 different
cleavages (Fig. 11) [36]. These are only between the
first 240 residues from each end, so that here they
provide extensive (~60%) sequence coverage. For
example, these data clearly show that the predicted
N-terminal Met is not present; this changes the pre-
the S–S bond. With this, eight additional S–S bonds
could be specified [36].
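The sequence coverage quoted above for the terminal regions can be checked with simple arithmetic. The Python sketch below is illustrative only: it assumes roughly 240 cleavable inter-residue bonds at each end, and it takes the 1314-residue length quoted in the abstract for the protein with 287 cleavages; the function name `cleavage_coverage` is hypothetical.

```python
def cleavage_coverage(n_cleaved, n_bonds):
    """Fraction of inter-residue bonds for which a cleavage was observed."""
    return n_cleaved / n_bonds

# 287 cleavages within about 240 residues of each terminus (assumed 2 x 240 bonds).
terminal = cleavage_coverage(287, 2 * 240)

# The same 287 cleavages against all bonds of a 1314-residue chain.
overall = cleavage_coverage(287, 1314 - 1)

print(f"terminal regions: {terminal:.0%}, whole protein: {overall:.0%}")
```

The coverage is high in the terminal regions but much lower over the whole chain, consistent with the statement that cleavages are restricted to the termini.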
The largest protein examined, mycocerosic acid
synthase, had a predicted [56] molecular mass of
229 067 Da (2154 residues), whereas ESI gave
228 934 ± 60 Da. Five PFD spectra designated 62
cleavages by omitting the predicted N-terminal Met,
correcting the molecular mass value to 228 936 Da to
agree with that measured. Its ‘ball of spaghetti’ is more
difficult to unravel; cleavages were limited to 134 and
182 residues from the N-terminus and C-terminus,
respectively. Very recently, in collaboration with
M. Boyne and N. Kelleher (University of Illinois,
Urbana, IL), PFD has also been implemented on an
8.4 Tesla FT MS instrument, despite its substantially
different ion entrance system, which includes an ion
funnel and octupole for ion storage.
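The molecular mass correction described above for loss of the N-terminal Met can be reproduced numerically. The Python sketch below is an illustration only; the average residue mass of methionine (about 131.19 Da) is a standard value not given in the original, and the variable names are hypothetical.

```python
# Values quoted in the text (Da).
PREDICTED_MASS = 229067.0
MEASURED_MASS = 228934.0
MEASUREMENT_UNCERTAINTY = 60.0

# Average mass of a methionine residue, C5H9NOS (standard value, assumption).
MET_RESIDUE_MASS = 131.19

corrected_mass = PREDICTED_MASS - MET_RESIDUE_MASS

# Compare both candidate values with the ESI measurement.
agrees_before = abs(PREDICTED_MASS - MEASURED_MASS) <= MEASUREMENT_UNCERTAINTY
agrees_after = abs(corrected_mass - MEASURED_MASS) <= MEASUREMENT_UNCERTAINTY

print(round(corrected_mass), agrees_before, agrees_after)
```

Only the Met-free sequence falls within the stated ± 60 Da of the measured value.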
Conclusions
The top-down and bottom-up proteomics approaches
are obviously complementary. The identification of
proteins from among those predicted by the DNA
sequence still has by far the largest sample demands.
In most cases, the bottom-up approach, requiring less
sophisticated instrumentation and expertise, should be
tried first for qualitative identification, although increas-
ing demands for more accurate quantitation provide a
promising area for the top-down approach [36,57,58].
Reliability of identification can be far superior with the
Fig. 11. PFD spectral data of PurL. Inter-residue backbone fragmentations are indicated by: N-terminal-containing b fragment ions (left, above
line); C-terminal-containing y ions (right, above line); and secondary fragment ions (below line). Top line: 173 different fragmentations from
2 Karas M & Hillenkamp F (1988) Laser desorption ioni-
zation of proteins with molecular masses exceeding
10,000 daltons. Anal Chem 60, 2299–2301.
3 Henry KD, Williams ER, Wang BH, McLafferty FW,
Shabanowitz J & Hunt DF (1989) Fourier-transform
mass spectrometry of large molecules by electrospray
ionization. Proc Natl Acad Sci USA 86, 9075–9078.
4 Tanaka K, Waki H, Ido Y, Akita S, Yoshida Y &
Yoshida T (1988) Protein and polymer analyses up to
m/z 100,000 by laser ionization time-of-flight mass
spectrometry. Rapid Commun Mass Spectrom 2,
151–153.
5 Kelleher NL, Lin HY, Valaskovic GA, Aaserud DJ,
Fridriksson EK & McLafferty FW (1999) Top down
versus bottom up protein characterization by tandem
high-resolution mass spectrometry. J Am Chem Soc 121,
806–812.
6 Kelleher NL (2004) Top-down proteomics. Anal Chem
76, 197A–203A.
7 Rostom AA, Fucini P, Benjamin DR, Juenemann R,
Nierhaus KH, Hartl FU, Dobson CM & Robinson CV
(2000) Detection and selective dissociation of intact
ribosomes in a mass spectrometer. Proc Natl Acad Sci
USA 97, 5185–5190.
8 Loo JA, Berhane B, Kaddis CS, Wooding KM, Xie Y,
Kaufman SL & Chernushevich IV (2005) Electrospray
ionization mass spectrometry and ion mobility analysis
of the 20S proteasome complex. J Am Soc Mass
Spectrom 16, 998–1008.
9 Henzel WJ, Watanabe C & Stults JT (2003) Protein
18 Horn DM, Zubarev RA & McLafferty FW (2000)
Automated de novo sequencing of proteins by tandem
high-resolution mass spectrometry. Proc Natl Acad Sci
USA 97, 10313–10317.
19 Sze SK, Ge Y, Oh HB & McLafferty FW (2002) Top
down mass spectrometry of a 29 kDa protein for charac-
terization of any posttranslational modification to within
one residue. Proc Natl Acad Sci USA 99, 1774–1779.
20 Zabrouskov V, Giacomelli L, van Wijk KJ & McLafferty
FW (2003) A new approach for plant proteomics.
Characterization of chloroplast proteins of Arabidopsis
thaliana by top-down mass spectrometry. Mol Cell
Proteomics 2, 1253–1260.
21 Park JH, Burns K, Kinsland C & Begley TP (2004)
Characterization of two kinases involved in thiamine
pyrophosphate and pyridoxal phosphate biosynthesis in
Bacillus subtilis: 4-amino-5-hydroxymethyl-2-methyl-
pyrimidine kinase and pyridoxal kinase. J Bacteriol
186, 1571–1573.
22 Xu G, Zhai H, Narayan M, McLafferty FW & Scheraga
HA (2004) Simultaneous characterization of the
reductive unfolding pathways of RNase B isoforms by
top-down mass spectrometry. Chem Biol 11, 517–524.
23 Beu S, Senko MW, Quinn JP & McLafferty FW (1993)
Improved Fourier-transform ion-cyclotron-resonance
mass spectrometry of large biomolecules. J Am Soc
Mass Spectrom 4, 190–192.
24 Patrie SM, Charlebois JP, Whipple D, Kelleher NL,
Ausio J, Shabanowitz J & Hunt DF (2005) Protein
identification using sequential ion/ion reactions and
tandem mass spectrometry. Proc Natl Acad Sci USA
102, 9463–9468.
32 Pitteri SJ & McLuckey SA (2005) Recent developments
in the ion/ion chemistry of high-mass multiply charged
ions. Mass Spectrom Rev 24, 931–958.
33 McAlister GC, Phanstiel D, Good DM, Berggren WT
& Coon JJ (2007) Implementation of electron-transfer
dissociation on a hybrid linear ion trap–orbitrap mass
spectrometer. Anal Chem 79, 3525–3534.
34 Siuti N & Kelleher NL (2007) Decoding protein modifi-
cations using top-down mass spectrometry. Nat Methods
4, 817–821.
35 Zamdborg L, LeDuc RD, Glowacz KJ, Kim YB,
Viswanathan V, Spaulding IT, Early BP, Bluhm EJ,
Babai S & Kelleher NL (2007) ProSight PTM 2.0:
improved protein identification and characterization for
top down mass spectrometry. Nucleic Acids Res 35,
701–706.
36 Han X, Jin M, Breuker K & McLafferty FW (2006)
Extending top-down mass spectrometry to proteins with
masses >200 kDa. Science 314, 109–112.
37 Mirgorodskaya E, Roepstorff P & Zubarev RA (1999)
Localization of O-glycosylation sites in peptides by elec-
tron capture dissociation in a Fourier transform mass
spectrometer. Anal Chem 71, 4431–4436.
38 Shi SDH, Hemling ME, Carr SA, Horn DM, Lindh I &
McLafferty FW (2001) Phosphopeptide/phosphoprotein
mapping by electron capture dissociation mass spec-
relating ion cross sections to H/D exchange measurements.
J Am Soc Mass Spectrom 16, 1427–1437.
46 Hoaglund-Hyzer CS, Counterman AE & Clemmer DE
(1999) Anhydrous protein ions. Chem Rev 99, 3037–3079.
47 Koeniger SL & Clemmer DE (2007) Resolution and
structural transitions of elongated states of ubiquitin.
J Am Soc Mass Spectrom 18, 322–331.
48 Breuker K, Oh HB, Horn DM, Cerda BA & McLafferty FW
(2002) Detailed unfolding and folding of gaseous
ubiquitin ions characterized by electron capture dissoci-
ation. J Am Chem Soc 124, 6407–6420.
49 Breuker K, Oh HB, Lin C, Carpenter BK & McLafferty
FW (2004) Nonergodic and conformational control in
electron capture dissociation of protein ions. Proc Natl
Acad Sci USA 101, 14011–14016.
50 Oh HB, Lin C, Hwang HY, Zhai H, Breuker K,
Zabrouskov V, Carpenter BK & McLafferty FW (2005)
Infrared photodissociation spectroscopy of electro-
sprayed ions in a Fourier-transform mass spectrometer.
J Am Chem Soc 127, 4076–4083.
51 Hicks LM, Mazur MT, Miller LM, Dorrestein PC,
Schnarr NA, Khosla C & Kelleher NL (2006)
Investigating nonribosomal peptide and polyketide
biosynthesis by direct detection of intermediates on
>70 kDa polypeptides using Fourier-transform mass
spectrometry. Chembiochem 7, 904–907.
52 Zhai H, Han X, Breuker K & McLafferty FW (2005)
Consecutive ion activation for top down mass spectrom-
1537–1546.
59 Garcia BA, Joshi S, Thomas CE, Chitta RK, Diaz RL,
Busby SA, Andrews PC, Ogorzalek Loo RR, Shabano-
witz J, Kelleher NL et al. (2006) Comprehensive phos-
phoprotein analysis of linker histone H1 from
Tetrahymena thermophila. Mol Cell Proteomics 5,
1593–1609.
60 Thomas CE, Mizzen CA & Kelleher NL (2006) Mass
spectrometric characterization of human histone H3: a
bird’s eye view. J Proteome Res 5, 240–247.
61 Jiang L, Smith JN, Anderson SL, Ma P, Mizzen CA &
Kelleher NL (2007) Global assessment of combinatorial
post-translational modification of core histones in yeast
using contemporary mass spectrometry. LYS4 trimethy-
lation correlates with degree of acetylation on the same
H3 tail. J Biol Chem 21, 27923–27934.
62 Calderone CT, Iwig DF, Dorrestein PC, Kelleher NL &
Walsh CT (2007) Incorporation of nonmethyl branches
by isoprenoid-like logic: multiple beta-alkylation events
in the biosynthesis of myxovirescin A1. Chem Biol 14,
835–846.