INTRODUCTION
Spike is a trimeric surface glycoprotein of SARS-CoV-2, the virus
responsible for the COVID-19 pandemic (1, 2). Spike’s binding to human
angiotensin-converting enzyme 2 (ACE2) is critical for SARS-CoV-2
penetration of a host cell and initiation of infection (3, 4, 5). Spike
is the surface antigen and target of all currently available COVID-19
vaccines (6), and improved understanding of spike’s structures and
functions is likely to result in more effective SARS-CoV-2 vaccines (7).
Spike trimers undergo several dynamic structural changes related to
binding ACE2, cleavage by protease TMPRSS2, unfolding their S2 domains,
and finally re-folding to pull viral and target cell membranes into
juxtaposition (8). Spike’s structural re-arrangements are required steps
during SARS-CoV-2 infection (9) and are also potential mechanistic
targets for spike-neutralizing reagents such as the host’s antibodies
(10, 11). A well-established methodology to measure changes in
(glyco)protein higher-order structure is hydrogen/deuterium exchange
mass spectrometry (HDX-MS) (12). Previous publications have described
HDX-MS analyses of spike, including its interaction with ACE2 (4, 5, 13,
14) and changes in structural dynamics specific to different variants of
concern, including Alpha, Beta, Delta, and Omicron (1, 13, 15, 16).
HDX-MS provides a framework of sample preparation, proteolytic
digestion, peptide identification, and deuterium uptake measurement over
time to identify changes in (glyco)protein conformation (12, 17). In an
HDX-MS analysis comparing two states of the (glyco)protein of interest,
changes in a peptide’s amide backbone hydrogen exchange are interpreted
as indications of movement or stabilization of α-helices, β-sheets, and
other hydrogen-bonded elements of secondary structure (18). Solvent
accessibility also plays a role in the deuterium labeling of proteins
(19).
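
The two-state comparison described above can be illustrated with a minimal Python sketch. It assumes that each state is summarized as a mapping from labeling time (in seconds) to the mean deuterium uptake of one peptide (in Da). The function name, the state names, and all numerical values are hypothetical and are not taken from the data in this report.

```python
from typing import Dict

def uptake_difference(state_a: Dict[float, float],
                      state_b: Dict[float, float]) -> Dict[float, float]:
    """Return uptake(B) - uptake(A) in Da for labeling times present in both states."""
    shared_times = sorted(set(state_a) & set(state_b))
    return {t: state_b[t] - state_a[t] for t in shared_times}

# Hypothetical uptake values (Da) for one peptide, keyed by labeling time (s).
native = {10.0: 1.25, 60.0: 2.5, 600.0: 3.0}
heated = {10.0: 1.75, 60.0: 3.5, 600.0: 3.0, 3600.0: 4.0}

# Positive values mean the peptide takes up more deuterium in the second state.
differences = uptake_difference(native, heated)
for time_s, delta in differences.items():
    print(f"{time_s:>7.0f} s  difference = {delta:+.2f} Da")
```

Time points measured in only one state are skipped in this sketch, so the comparison is restricted to matched labeling times.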
An important goal of HDX-MS analysis is maximizing sequence coverage of
the (glyco)protein of interest, since gaps could include regions with
informative deuterium labeling. For example, localization of a
monoclonal antibody’s epitope (20) on an antigenic (glyco)protein of
interest can be challenging if sequence coverage of the antigenic
(glyco)protein such as spike (16, 21) is incomplete (22). A key step in
modern HDX-MS analysis for obtaining sequence coverage is on-line
proteolytic digestion (23) of the (glyco)protein of interest to generate
peptides amenable to liquid chromatography (LC)/MS detection. HDX-MS
sample preparation typically includes a low pH (~2.5)
quenching step immediately before proteolytic digestion (12), limiting
digestion to acid-tolerant proteases such as pepsin (24) or
aspergillopepsin (23, 25). These proteases generate overlapping, mostly
non-specific (26) peptides that are nonetheless reproducible for a given
protein substrate. Previous HDX-MS analyses of spike have shown sequence
coverage gaps when using on-line digestion with pepsin (1, 4, 5, 13).
Specifically considering glycoproteins, sequence coverage gaps in HDX-MS
analyses are often associated with N-glycosylation “sequons” (the
amino acid sequence Asn-Xaa-Ser/Thr, where Xaa is not Pro, with a glycan
portion composed of 2 to 11+ hexose subunits covalently bound to the
Asn residue) (27). The reasons for these coverage gaps at
N-glycosylation sequons potentially include 1) steric inhibition of the
on-line protease’s cleavage by the bulky glycan group covalently bound
to an Asn residue (28), resulting in fewer short peptides containing the
sequon, and 2) lack of detection of the resulting high-mass
glycopeptides during subsequent LC/MS and LC/tandem mass spectrometry
(LC/MS/MS) analyses. Although several previous HDX-MS analyses of spike
(4, 15) or IgG (29) have detected peptides containing the amino acid
sequence Asn-Xaa-Ser/Thr, these are not bona fide “glycopeptides”
because the mass(es) and possible identity(ies) of any covalently bound
glycan group(s) were not specified. Some recent HDX-MS
publications (1, 14) report glycopeptide data from a separate (non-HDX)
analysis, but such data do not provide information about the deuteration
of peptides with covalently bound glycans.
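
The relationship between sequons and sequence coverage gaps can be sketched in Python as follows. This is an illustrative example only: the function names, the toy sequence, and the peptide list are hypothetical and do not represent spike or any data set cited here. Peptides are assumed to be given as 1-based, inclusive (start, end) residue ranges.

```python
import re
from typing import List, Tuple

# Asn-Xaa-Ser/Thr with Xaa != Pro; the lookahead allows overlapping motifs.
SEQUON = re.compile(r"(?=N[^P][ST])")

def find_sequons(sequence: str) -> List[int]:
    """Return 1-based positions of the Asn residue of each sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(sequence)]

def coverage_gaps(length: int, peptides: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Return 1-based inclusive residue ranges not covered by any peptide."""
    covered = [False] * length
    for start, end in peptides:
        for i in range(start - 1, end):
            covered[i] = True
    gaps: List[Tuple[int, int]] = []
    gap_start = None
    for i, is_covered in enumerate(covered):
        if not is_covered and gap_start is None:
            gap_start = i + 1           # first uncovered residue of a new gap
        elif is_covered and gap_start is not None:
            gaps.append((gap_start, i))  # gap ended at the previous residue
            gap_start = None
    if gap_start is not None:
        gaps.append((gap_start, length))
    return gaps

def sequons_in_gaps(sequence: str, peptides: List[Tuple[int, int]]) -> List[int]:
    """Return sequon Asn positions that fall inside a coverage gap."""
    gaps = coverage_gaps(len(sequence), peptides)
    return [p for p in find_sequons(sequence)
            if any(start <= p <= end for start, end in gaps)]

# Hypothetical 17-residue sequence and identified peptides.
toy_sequence = "MKNVSAANPTGGNGTLL"
toy_peptides = [(1, 6), (5, 10), (15, 17)]

print(find_sequons(toy_sequence))                    # [3, 13]
print(coverage_gaps(len(toy_sequence), toy_peptides))  # [(11, 14)]
print(sequons_in_gaps(toy_sequence, toy_peptides))   # [13]
```

In the toy example, Asn8 is followed by Pro and is therefore not reported as a sequon, while the sequon at Asn13 lies within the only coverage gap.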
Glycan identity is an important aspect of glycoprotein analysis because
microheterogeneity (the cohort of all the different glycan structures
bound to a particular N-glycosylation sequon) (30) influences
glycoprotein structures and functions (31). The impact of
microheterogeneity on HDX-MS analyses of glycoproteins is not presently
known because detecting and identifying glycopeptides with their glycans
still covalently bound requires advanced MS/MS methods and appropriate
data processing for detection and assignment of glycans (32, 33, 34,
35).
The SARS-CoV-2 spike glycoprotein has 22 N-glycosylation sequons per
monomer, and several publications describe spike’s microheterogeneity (32, 36,
37, 38). Spike’s high level of glycosylation has caused significant gaps
in sequence coverage and incomplete HDX-MS data in previous studies (1,
4, 5, 13). During our HDX-MS analyses of spike we applied a previously
described method for detecting glycopeptides (39, 40) to the
deuterium-labeled D614G variant (41). We believe this is the first
report directly measuring the deuteration of peptides with
N-glycosylation sequons and covalently bound glycan groups to determine
the impact of microheterogeneity on the HDX-MS dynamics of SARS-CoV-2
spike. Heat treatment of spike was used to significantly change protein
structure and demonstrate the utility of deuterated glycopeptide data to
improve HDX-MS conformational analysis of glycoproteins.