3.4 Simulated pepsin digestion of T cell epitopes
Most soybean proteins are broken down by pepsin in the stomach, and
only digestion-resistant peptides containing epitopes can reach the
gastric mucosa and interact with immune cells to trigger allergic
reactions [35]. The PeptideCutter tool was therefore used to simulate
the gastric digestion of each of the seven soybean allergens. The pepsin
hydrolysis sites of the T cell epitopes are shown in Table 9 and further
confirm the potential allergenicity.
As shown in Table 9, all epitopes except the T cell epitope
“SKDNVISQIPSQVQE” in the P11827 protein contained multiple pepsin
hydrolysis sites, and most of the digested peptides were shorter than 12
amino acids. Since T cell epitopes are generally believed to be 12-20
amino acids long [15], the following peptides would be the
digestion-resistant regions: “DDGTRRLVVSKNKP” (aa 179-192) in the P01070
protein; “TPVVAVSIIDTNS” (aa 158-170), “RHNIGQTSSPDI” (aa 322-333), and
“SQQARQIKNNNP” (aa 468-479) in the P04776 protein; “NPHIGINVNSIRSIKTTS”
(aa 168-185) in the P05046 protein; “SKDNVISQIPSQVQE” (aa 558-572) in
the P11827 protein; “EEQRQQEGVIVE” (aa 201-212) in the P25974 protein;
and “LVTDADNVIPKA” (aa 23-34) in the P26987 protein. Among these 8
digestion-resistant regions, isoleucine (I), valine (V), serine (S),
asparagine (N), and glutamine (Q) were the most abundant amino acids;
isoleucine (I), valine (V), and asparagine (N) each occurred in 7 of the
regions, and serine (S) occurred in 6.
Interestingly, all 8 digestion-resistant regions except “TPVVAVSIIDTNS”
(aa 158-170) in P04776 are exposed on the protein surface (Tables 2-8),
and, as shown in Table 9, six of these 8 peptides retain allergenicity.
Amino acids with high surface exposure have been reported to be likely
to bind antibodies and form B cell epitopes; however, the relationship
between T cell epitopes and surface exposure has not yet been
elucidated.
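The length criterion and the residue tally described in this section can
be sketched as follows. This is a minimal illustration rather than the
pipeline used in the study: the peptide sequences are copied from the
text, the 12-20 amino acid window is the one quoted from [15], and the
names RESISTANT_REGIONS, is_epitope_sized, and residue_totals are
hypothetical.

```python
from collections import Counter

# Digestion-resistant regions as listed in the text (accession, sequence).
RESISTANT_REGIONS = [
    ("P01070", "DDGTRRLVVSKNKP"),
    ("P04776", "TPVVAVSIIDTNS"),
    ("P04776", "RHNIGQTSSPDI"),
    ("P04776", "SQQARQIKNNNP"),
    ("P05046", "NPHIGINVNSIRSIKTTS"),
    ("P11827", "SKDNVISQIPSQVQE"),
    ("P25974", "EEQRQQEGVIVE"),
    ("P26987", "LVTDADNVIPKA"),
]

# Length window for a T cell epitope, taken from the text (12-20 aa).
MIN_LEN, MAX_LEN = 12, 20


def is_epitope_sized(peptide, min_len=MIN_LEN, max_len=MAX_LEN):
    """Return True if a digestion fragment is long enough to hold an epitope."""
    return min_len <= len(peptide) <= max_len


def residue_totals(peptides):
    """Count how often each amino acid occurs across all peptides."""
    totals = Counter()
    for peptide in peptides:
        totals.update(peptide)
    return totals


peptides = [seq for _, seq in RESISTANT_REGIONS]
totals = residue_totals(peptides)
# The five most frequent residues across the listed regions.
top_five = [residue for residue, _ in totals.most_common(5)]

for accession, seq in RESISTANT_REGIONS:
    print(accession, seq, len(seq), is_epitope_sized(seq))
print(sorted(top_five))
```

Any fragment list produced by a digestion simulation can be passed
through is_epitope_sized in the same way to separate short fragments
from epitope-sized ones.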
4. Discussion
At present, the mechanism of food allergy is not fully understood, which
has hindered the development of hypoallergenic foods and immunotherapy
[36, 37]. The identification of T cell epitopes can provide
important information about the cellular mechanisms involved in the
transition from food tolerance to allergy [38], and the analysis of
cytokines related to Th2 cells can also improve our understanding of the
immune response. Soybean is one of the most widely cultivated edible
legumes. In the 50 years from 1968 to 2018, world soybean production
increased 8.4-fold and the global planted area increased 4.3-fold,
while global soybean consumption increased from 24 million tons to 57
million tons between 1999 and 2019 [39]. Soybean is rich in
carbohydrates, protein, fat, cellulose, and minerals such as calcium and
magnesium [40]. Although soybean is a high-quality protein source for
humans and animals, the prevalence of sensitization to its major
allergens has attracted wide attention. After entering the human body,
soybean protein allergens are processed in lysosomes and then bound to
MHC class II molecules. The MHC class II-T cell epitope complexes then
interact with CD4+ T cells, which synthesize and secrete cytokines such
as IL-4; these play a key role in promoting the proliferation and
differentiation of antigen-presenting cells and stimulating the
production of IgE [22].
In this study, the T cell epitopes of seven soybean protein allergens
were comprehensively identified based on the binding affinity between
the epitopes and HLA class II molecules, the ability to induce IL-4
production, and antigenicity analysis; the larger fragments released by
pepsin digestion were also analyzed. To the best of our knowledge, this
may be the first paper to analyze T cell epitopes of soybean allergens.
The combination of the consensus method and NetMHCIIpan in IEDB has been
used to predict T cell epitopes of allergens that can bind to MHC II
molecules, and the NetMHCIIpan method has been recognized as the most
accurate epitope prediction method [41]. Ramesh et al. [14] predicted 54
T cell epitopes of the peanut allergen Ara h 1 with the NetMHCIIpan
method, and 32 T cell epitopes were confirmed with the ProImmune REVEAL
assay, indicating the high accuracy (32 confirmed/37 predicted) of the
NetMHCIIpan method. Pascal et al. [42] identified 4 T cell epitope
regions of the peanut allergen Ara h 2 via T cell proliferation assays
and cytokine profile analysis, and all of these T cell epitopes were
further shown to be MHC II binders by the NetMHCIIpan and NetMHCII
methods.
To further improve the accuracy of T cell epitope prediction, a previous
study combined the consensus method and the NetMHCIIpan method in the
IEDB database with other bioinformatics tools, and all 6 predicted T
cell epitopes showed good potential in T cell proliferation and cytokine
release [43]. In addition to these established T cell epitope prediction
methods, the current study also evaluated the allergenicity of the T
cell epitopes and their ability to promote release of the Th2
response-related cytokine IL-4, which should further improve the
prediction accuracy.
As shown in Tables 2-8, since no literature on T cell epitope mapping
of soybean has been available until now, all of these T cell epitopes
are identified here for the first time. The soybean allergen Gly m 4
belongs to the PR family
and is cross-reactive with the birch pollen allergen Bet v 1.
Jahn-Schmid et al. [44] found that “TLLRAVESYLLAHSD” (aa 142-156) of
the Bet v 1 allergen was the major T cell epitope and could cross-react
with the soybean Gly m 4 peptide “ALFKAIEAYLLAHPD” (aa 142-156); this
peptide is consistent with the T cell epitopes “ADALFKAIEAYLLAH” (aa
140-154) and “DALFKAIEAYLLAHP” (aa 141-155) of Gly m 4 predicted in this
study. In addition, comparison with the conformational epitope of Gly m
4 reported by Husslik et al. [45] showed that many conformational
epitope residues are located in the Gly m 4 T cell epitope regions
“LYKALVTDADNVIPKA” (aa 19-34), “KKITFLEDGETKFVLHKIESI” (aa 54-74),
“AGPNGGSAGKLTVKY” (aa 106-120), and “AKADALFKAIEAYLL” (aa 138-152)
reported in the current study.
Allergen-specific T cells play an important role in allergic reactions
and are obvious targets for immunotherapeutic intervention [17].
Recently, Candreva et al. [46] developed a new strategy for preventing
milk allergy based on oral administration of a soybean-derived peptide
that was cross-reactive with bovine caseins. This peptide contained both
T and B cell epitopes of the soybean allergen Gly-m-Bd-30K and could
stimulate T cells without causing IgE
cross-linking on basophils and mast cells. In this study, the T cell
epitope region “GGSILSGFTLEFLEHAFSVD” (aa 217-236) in the P04776
allergen is consistent with the B cell epitope “GGSILSGFTLEFLEHAFSV” (aa
217-235) previously identified by peptide scanning [47]. The T cell
epitope regions “EEQRQQEGVIVELSK” (aa 201-215), “NPIYSNNFGKFFEIT” (aa
246-260), and “DIFLSSVDINEGALLLPHFNS” (aa 271-291) in the P25974
allergen are consistent with the epitopes identified by
co-immunoprecipitation and mass spectrometry [48]. These T cell
epitopes, which contain B cell epitope residues, could be candidates for
allergy treatment via Synthetic Peptide Immuno-Regulatory Epitopes for
related cross-reactive soybean allergens.
Selecting potential T cell epitopes is difficult because of the
complexity of HLA alleles. In this study, the epitopes
“YIKDVFRVIPSEVLS” (aa 477-491), “KDVFRVIPSEVLSNS” (aa 479-493), and
“DVFRVIPSEVLSNSY” (aa 480-494) in the P04347 protein could bind more
than 9 HLA molecules, and the epitopes “AKADALFKAIEAYLL” (aa 138-152)
and “ADALFKAIEAYLLAH” (aa 140-154) in the P26987 protein could bind more
than 13 HLA molecules. Because these epitopes were also predicted by all
three methods (Tables 3 and 8), the five new fragments are considered
the most likely epitope candidates.
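The selection logic in this paragraph (broad HLA binding combined with
agreement between prediction methods) can be sketched as a simple
filter. The sketch below is illustrative only: the Candidate record, the
method labels, the toy entries, and the min_alleles cut-off are all
hypothetical placeholders and are not values taken from Tables 3 and 8.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    label: str               # placeholder identifier, not a real peptide
    alleles_bound: int       # number of HLA molecules predicted to bind
    predicted_by: frozenset  # labels of the methods that predicted it


# Hypothetical labels standing in for the three prediction methods.
REQUIRED_METHODS = frozenset({"method_a", "method_b", "method_c"})


def select_promiscuous(candidates, min_alleles, required=REQUIRED_METHODS):
    """Keep candidates that bind more than min_alleles HLA molecules and
    were predicted by every required method, broadest binders first."""
    kept = [
        c for c in candidates
        if c.alleles_bound > min_alleles and required <= c.predicted_by
    ]
    return sorted(kept, key=lambda c: c.alleles_bound, reverse=True)


# Toy input, invented purely to exercise the filter.
toy = [
    Candidate("toy_1", 14, REQUIRED_METHODS),
    Candidate("toy_2", 10, REQUIRED_METHODS),
    Candidate("toy_3", 15, frozenset({"method_a", "method_c"})),
    Candidate("toy_4", 4, REQUIRED_METHODS),
]

selected = select_promiscuous(toy, min_alleles=9)
print([c.label for c in selected])
```

Here toy_3 is dropped because one method did not predict it, and toy_4
is dropped because it binds too few HLA molecules.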
Compared with the entire protein sequence, the frequency of
phenylalanine (F), isoleucine (I), asparagine (N), valine (V), lysine
(K), and histidine (H) is increased in the T cell epitopes of most
soybean allergens. According to previous studies, the presence of lysine
(K) has a significant effect on T cell stimulation, and alpha-helical
secondary structure can promote antigenicity [49]. Another study also
found that isoleucine (I) and histidine (H) were the key amino acids in
the T cell epitope of the dust mite allergen Der p 2, because they might
be direct contact residues and might also indirectly affect peptide
binding by changing the conformation of adjacent amino acid side chains
[50].
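One way to express the comparison between epitope composition and
whole-protein composition is an enrichment ratio per residue. The sketch
below is an assumption about how such a comparison could be computed,
not a description of the calculation behind Fig. 1; the function names
and the toy sequences are hypothetical.

```python
from collections import Counter


def residue_frequencies(sequences):
    """Relative frequency of each amino acid over the pooled sequences."""
    counts = Counter("".join(sequences))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {residue: n / total for residue, n in counts.items()}


def enrichment(protein, epitopes):
    """Ratio of epitope frequency to whole-protein frequency per residue.

    A ratio above 1 means the residue is more frequent in the epitopes
    than in the full sequence. Overlapping epitopes are pooled as given,
    so residues shared by several epitopes are counted more than once.
    """
    background = residue_frequencies([protein])
    foreground = residue_frequencies(epitopes)
    return {
        residue: foreground[residue] / background[residue]
        for residue in foreground
        if residue in background
    }


# Toy sequences, invented only to show the arithmetic.
toy_protein = "AAAAKKFFGG"
toy_epitopes = ["KKFF"]
ratios = enrichment(toy_protein, toy_epitopes)
print({residue: round(value, 2) for residue, value in ratios.items()})
```

In the toy case K and F each make up 20% of the protein but 50% of the
epitope, so both have a ratio of 2.5.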
A previous study showed that MHC molecules tended to bind hydrophobic
amino acids at most positions except the penultimate position: they
could bind both hydrophilic and hydrophobic amino acids at positions 4
and 6, and hydrophobic amino acids at positions 1, 2, 3, 5, and 8
[7]. Similarly, in the present paper, random forest models also
showed that the amino acids at positions p1, p2, p4, p5, p6, and p13
contributed substantially to allergenicity. For positions p4, p5, and
p6, the three physicochemical properties (z1, z2, and z3) contributed to
allergenicity in a similar way. For positions p1 and p2, the
contributions of the physicochemical properties to allergenicity were in
the order z1 > z2 > z3, whereas the order for position p13 was
z3 > z2 > z1. Furthermore, the hydrophobic
(z1) residues at positions p1 and p2 contributed to allergenicity in
most soybean allergen models. At position p1 of the P04776, P05046,
P11827, and P25974 models and position p2 of the P04347, P04776, P11827,
and P26987 models, there are many hydrophobic residues that can promote
allergenicity (Tables 2-8 and Fig. 2). Positively charged amino acids,
such as arginine (R), lysine (K), and histidine (H), tend to be located
at position p13 and play an important role in allergenicity in the
soybean models P04347, P04776, P25974, and P05046.
In this study, the frequency of phenylalanine (F), isoleucine (I),
asparagine (N), valine (V), lysine (K), and histidine (H) residues
increased in the T cell epitope regions (Fig. 1). This is consistent
with the current analysis: hydrophobic residues including phenylalanine
(F), isoleucine (I), and valine (V) located at positions p1 and p2
contributed to allergenicity, whereas positively charged amino acids
such as lysine (K) and histidine (H) located at positions p6 and p13
promoted allergenicity (Tables 3-5, Table 7, and Fig. 2). Previous
studies also confirmed that positively charged amino acids provide a net
charge that increases the activity of the epitope peptide antigen
[50].
In the simulated pepsin digestion, most of the T cell epitopes from
soybean allergens could be hydrolyzed by pepsin into small peptides
(<12 aa), and most of the digestion-resistant fragments are located on
the protein surface. The digestion-resistant epitope regions contained
many hydrophobic amino acids, including isoleucine (I) and valine (V),
and uncharged amino acids, including serine (S), asparagine (N), and
glutamine (Q). A hydrophobic amino acid at the second position of such a
peptide might contribute to allergenicity, as in the
digestion-resistant fragment “LVTDADNVIPKA” (aa 23-34) in P26987. The T
cell epitopes “EEQRQQEGVIVELSK” (aa 201-215) and “EQRQQEGVIVELSKE” (aa
202-216), which have higher IL4pred scores, can also resist pepsin
hydrolysis and have great potential to enter the body and cause a Th2
cell response. To develop hypoallergenic soybean products, proteases
that can hydrolyze at the above five amino acids (I, V, S, N, and Q) can
be selected. For example, the endopeptidase Glu-C, which can hydrolyze
at glutamate (E) and glutamine (Q) [51, 52, 53], can be used to
specifically destroy the digestion-resistant T cell epitopes
“SKDNVISQIPSQVQE” (aa 558-572) (P11827) and “EEQRQQEGVIVE” (aa 201-212)
(P25974).
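The effect of such a protease on the two digestion-resistant epitopes
can be illustrated with a simplified cleavage model. The sketch assumes
that the enzyme cuts on the C-terminal side of every target residue and
ignores any influence of neighbouring residues or reaction conditions.
The residue set is a parameter; {"E", "Q"} follows the statement in the
text above and is not an independent claim about enzyme specificity.
The function name cleave_after is hypothetical.

```python
def cleave_after(peptide, residues):
    """Split a peptide after every occurrence of a target residue.

    Simplified model: cleavage is assumed on the C-terminal side of each
    target residue; a target at the last position produces no cut.
    """
    if not peptide:
        return []
    fragments = []
    start = 0
    last_index = len(peptide) - 1
    for i, amino_acid in enumerate(peptide):
        if amino_acid in residues and i < last_index:
            fragments.append(peptide[start:i + 1])
            start = i + 1
    fragments.append(peptide[start:])
    return fragments


# Target residues as stated in the text (an assumption of this sketch).
TARGETS = {"E", "Q"}
# Shortest T cell epitope length quoted in the text.
MIN_EPITOPE_LEN = 12

for epitope in ("SKDNVISQIPSQVQE", "EEQRQQEGVIVE"):
    fragments = cleave_after(epitope, TARGETS)
    longest = max(len(fragment) for fragment in fragments)
    print(epitope, fragments, longest < MIN_EPITOPE_LEN)
```

Under these assumptions every resulting fragment is shorter than 12
amino acids, which is the sense in which the epitopes would be
destroyed.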