3.4 Simulated pepsin digestion of T cell epitopes
Most soybean proteins are broken down by pepsin in the stomach, and only digestion-resistant peptides that contain epitopes can reach the gastric mucosa and interact with immune cells to trigger allergic reactions [35]. The PeptideCutter tool was therefore used to simulate the gastric digestion of the seven soybean allergens. The pepsin hydrolysis sites within the T cell epitopes are shown in Table 9, and they further support the potential allergenicity of these epitopes.
As shown in Table 9, all epitopes except the T cell epitope “SKDNVISQIPSQVQE” in the P11827 protein contained multiple pepsin hydrolysis sites, and most of the digested peptides were shorter than 12 amino acids. Since T cell epitopes are generally believed to be 12-20 amino acids long [15], the following peptide sequences would be the digestion-resistant regions: “DDGTRRLVVSKNKP” (aa 179-192) in the P01070 protein; “TPVVAVSIIDTNS” (aa 158-170), “RHNIGQTSSPDI” (aa 322-333), and “SQQARQIKNNNP” (aa 468-479) in the P04776 protein; “NPHIGINVNSIRSIKTTS” (aa 168-185) in the P05046 protein; “SKDNVISQIPSQVQE” (aa 558-572) in the P11827 protein; “EEQRQQEGVIVE” (aa 201-212) in the P25974 protein; and “LVTDADNVIPKA” (aa 23-34) in the P26987 protein. Among these 8 digestion-resistant regions, isoleucine (I), valine (V), serine (S), asparagine (N), and glutamine (Q) were the most abundant amino acids; isoleucine (I), valine (V), and asparagine (N) each occurred in 7 of the regions, and serine (S) occurred in 6. Interestingly, all 8 digestion-resistant regions are exposed on the surface of the proteins except “TPVVAVSIIDTNS” (aa 158-170) in P04776 (shown in Tables 2-8), and, as shown in Table 9, six of the 8 peptides retain allergenicity. Amino acids with high surface exposure have been reported to be likely to bind antibodies and form B cell epitopes; however, the relationship between T cell epitopes and surface exposure has not yet been elucidated.
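The filtering step described above (cut the sequence at the predicted hydrolysis sites, then keep only fragments long enough to hold a T cell epitope) can be sketched as follows. This is a minimal illustration and not the PeptideCutter rule set: the cleavage rule (cut C-terminal to F or L), the names `digest` and `resistant_fragments`, and the toy sequence are all assumptions introduced here, and the real pepsin specificity also depends on neighbouring residues and pH.

```python
# Simplified in-silico digestion; NOT the PeptideCutter rule set.
# Assumption: cleavage occurs C-terminal to every residue in CLEAVE_AFTER.
CLEAVE_AFTER = frozenset("FL")   # hypothetical, simplified pepsin preference
MIN_EPITOPE_LEN = 12             # lower length bound for T cell epitopes used in the text


def digest(sequence, cleave_after=CLEAVE_AFTER):
    """Split a sequence after each cleavable residue.

    Returns (start, end, fragment) tuples with 1-based inclusive positions.
    """
    fragments = []
    start = 0
    for i, residue in enumerate(sequence):
        # A cut after the last residue would produce an empty fragment, so skip it.
        if residue in cleave_after and i < len(sequence) - 1:
            fragments.append((start + 1, i + 1, sequence[start:i + 1]))
            start = i + 1
    fragments.append((start + 1, len(sequence), sequence[start:]))
    return fragments


def resistant_fragments(sequence, min_len=MIN_EPITOPE_LEN, cleave_after=CLEAVE_AFTER):
    """Keep only fragments long enough to still contain a T cell epitope."""
    return [f for f in digest(sequence, cleave_after) if len(f[2]) >= min_len]


toy = "MKFSKDNVISQIPSQVQELAAF"   # made-up sequence, for illustration only
for start, end, fragment in resistant_fragments(toy):
    print(start, end, fragment)
```

With the toy input, only the 16-residue fragment spanning positions 4-19 passes the length filter; the two 3-residue fragments are discarded.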
4. Discussion
At present, the mechanisms of food allergy are not fully understood, which has hindered the development of hypoallergenic foods and immunotherapy [36, 37]. The identification of T cell epitopes can provide important information about the cellular mechanisms involved in the transition from food tolerance to allergy [38], and the analysis of cytokines related to Th2 cells can also improve our understanding of the immune response. Soybean is one of the most widely cultivated edible legumes. In the 50 years from 1968 to 2018, world soybean production increased 8.4 times and the global planted area increased 4.3 times, while global soybean consumption increased from 24 million tons to 57 million tons between 1999 and 2019 [39]. Soybean is rich in carbohydrates, protein, fat, cellulose, and minerals such as calcium and magnesium [40]. Although soybean is a high-quality protein source for humans and animals, the prevalence of sensitization to its major allergens has attracted wide attention. After entering the human body, soybean protein allergens are processed in lysosomes and then bind to MHC class II molecules. The MHC class II-T cell epitope complex then interacts with CD4+ T cells, which synthesize and secrete cytokines such as IL-4; IL-4 plays a key role in promoting the proliferation and differentiation of antigen-presenting cells and in stimulating the production of IgE [22]. In this study, the T cell epitopes of seven soybean protein allergens were comprehensively identified on the basis of the binding affinity between epitopes and HLA class II molecules, the ability to induce IL-4 production, and antigenicity analysis, and the larger fragments released by pepsin digestion were also analyzed. To the best of our knowledge, this may be the first paper to analyze T cell epitopes of soybean allergens.
The combination of the consensus method and NetMHCIIpan in IEDB has been used to predict T cell epitopes of allergens that can bind to MHC II molecules, and the NetMHCIIpan method has been recognized as the most accurate epitope prediction method [41]. Ramesh et al. [14] predicted 54 T cell epitopes of the peanut allergen Ara h 1 with the NetMHCIIpan method, and 32 T cell epitopes were confirmed with the ProImmune REVEAL assay, indicating the high accuracy (32 confirmed/37 predicted) of the NetMHCIIpan method. Pascal et al. [42] identified 4 T cell epitope regions of the peanut allergen Ara h 2 by T cell proliferation assay and cytokine profile analysis, and all of these T cell epitopes were further shown to be MHC II binders by the NetMHCIIpan and NetMHCII methods. To further improve the accuracy of T cell epitope prediction, a previous study combined the consensus method and the NetMHCIIpan method in the IEDB database with other bioinformatics tools, and all 6 predicted T cell epitopes showed good potential in T cell proliferation and cytokine release [43]. In addition to these established T cell epitope prediction methods, the current study also evaluated the allergenicity of the T cell epitopes and their ability to promote release of the Th2 response-related cytokine IL-4, which should further improve the prediction accuracy.
As shown in Tables 2-8, since no literature on T cell epitope mapping of soybean has been available until now, all of the T cell epitopes are identified here for the first time. The soybean allergen Gly m 4 belongs to the PR family and is cross-reactive with the birch pollen allergen Bet v 1. Jahn-Schmid et al. [44] found that “TLLRAVESYLLAHSD” (aa 142-156) of the Bet v 1 allergen was the major T cell epitope and could cross-react with the soybean Gly m 4 peptide segment “ALFKAIEAYLLAHPD” (aa 142-156); this peptide fragment is consistent with the T cell epitopes “ADALFKAIEAYLLAH” (aa 140-154) and “DALFKAIEAYLLAHP” (aa 141-155) of Gly m 4 predicted in this study. In addition, compared with the conformational epitope of Gly m 4 reported by Husslik et al. [45], many of the conformational epitope amino acid residues were present in the Gly m 4 T cell epitope regions “LYKALVTDADNVIPKA” (aa 19-34), “KKITFLEDGETKFVLHKIESI” (aa 54-74), “AGPNGGSAGKLTVKY” (aa 106-120), and “AKADALFKAIEAYLL” (aa 138-152) reported in the current study.
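The agreement between the predicted Gly m 4 epitopes and the peptide segment reported by Jahn-Schmid et al. [44] can be checked from the residue ranges alone. The sketch below is only an illustration of that comparison; the helper names `overlap` and `overlap_length` are introduced here and are not part of the original analysis.

```python
def overlap(range_a, range_b):
    """Return the shared 1-based inclusive residue range, or None if disjoint."""
    start = max(range_a[0], range_b[0])
    end = min(range_a[1], range_b[1])
    return (start, end) if start <= end else None


def overlap_length(range_a, range_b):
    """Number of residues shared by two inclusive residue ranges."""
    shared = overlap(range_a, range_b)
    return 0 if shared is None else shared[1] - shared[0] + 1


# Residue ranges as given in the text.
reported = (142, 156)                      # ALFKAIEAYLLAHPD
predicted = {
    "ADALFKAIEAYLLAH": (140, 154),
    "DALFKAIEAYLLAHP": (141, 155),
}

for peptide, span in predicted.items():
    print(peptide, overlap(span, reported), overlap_length(span, reported))
```

For these ranges the two predicted 15-mers share 13 and 14 residues, respectively, with the reported segment.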
Allergen-specific T cells play an important role in allergic reactions and are obvious targets for immunotherapeutic intervention [17]. Recently, Candreva et al. [46] developed a new strategy for preventing milk allergy based on oral administration of a soybean-derived peptide that is cross-reactive with bovine caseins. This peptide contained both T and B cell epitopes of the soybean allergen Gly-m-Bd-30K and could stimulate T cells without causing IgE cross-linking on basophils and mast cells. In this study, the T cell epitope region “GGSILSGFTLEFLEHAFSVD” (aa 217-236) in the P04776 allergen is consistent with the B cell epitope “GGSILSGFTLEFLEHAFSV” (aa 217-235) previously identified by peptide scanning [47]. The T cell epitope regions “EEQRQQEGVIVELSK” (aa 201-215), “NPIYSNNFGKFFEIT” (aa 246-260), and “DIFLSSVDINEGALLLPHFNS” (aa 271-291) in the P25974 allergen are consistent with the epitopes identified by co-immunoprecipitation and mass spectrometry [48]. These T cell epitopes, which contain B cell epitope amino acids, could be candidates for treating allergy to the related cross-reactive soybean allergens with Synthetic Peptide Immuno-Regulatory Epitopes.
Selecting potential T cell epitopes is a difficult task because of the complexity of HLA alleles. In this study, the epitopes “YIKDVFRVIPSEVLS” (aa 477-491), “KDVFRVIPSEVLSNS” (aa 479-493), and “DVFRVIPSEVLSNSY” (aa 480-494) in the P04347 protein could bind more than 9 HLA molecules, and the epitopes “AKADALFKAIEAYLL” (aa 138-152) and “ADALFKAIEAYLLAH” (aa 140-154) in the P26987 protein could bind more than 13 HLA molecules. Because these epitopes were also predicted by all three methods (shown in Tables 3 and 8), these five new fragments are considered the most likely epitope candidates.
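The selection rule stated above (an epitope binds more HLA molecules than a per-protein threshold and is predicted by all three methods) can be written as a simple filter. The sketch below is a hypothetical illustration: the `Epitope` record, the function `select_candidates`, and every count in the example list are placeholders introduced here, not values from Tables 3 and 8.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Epitope:
    label: str
    protein: str
    alleles_bound: int      # number of HLA molecules predicted to bind
    methods_agreeing: int   # number of prediction methods that report the epitope


# Per-protein thresholds taken from the text ("more than 9", "more than 13").
MIN_ALLELES = {"P04347": 9, "P26987": 13}
N_METHODS = 3


def select_candidates(epitopes, min_alleles=MIN_ALLELES, n_methods=N_METHODS):
    """Keep epitopes exceeding their protein's threshold and found by all methods."""
    selected = []
    for epitope in epitopes:
        threshold = min_alleles.get(epitope.protein)
        if threshold is None:
            continue  # no threshold is stated for this protein
        if epitope.alleles_bound > threshold and epitope.methods_agreeing == n_methods:
            selected.append(epitope)
    return selected


# Placeholder records: the counts below are invented for illustration only.
examples = [
    Epitope("epitope_A", "P04347", 10, 3),
    Epitope("epitope_B", "P04347", 9, 3),
    Epitope("epitope_C", "P26987", 14, 2),
    Epitope("epitope_D", "P26987", 15, 3),
]
print([e.label for e in select_candidates(examples)])
```

With these placeholder records, only `epitope_A` and `epitope_D` satisfy both conditions.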
Compared with the entire protein sequence, the frequencies of phenylalanine (F), isoleucine (I), asparagine (N), valine (V), lysine (K), and histidine (H) are higher in the T cell epitopes of most soybean allergens. According to previous studies, the presence of lysine (K) had a significant effect on T cell stimulation, and an alpha-helical secondary structure could promote antigenicity [49]. Another study found that isoleucine (I) and histidine (H) were the key amino acids in the T cell epitope of the dust mite allergen Der p 2, because they might act as direct contact moieties and might also indirectly affect peptide binding by changing the conformation of adjacent amino acid side chains [50].
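The frequency comparison described above can be expressed as a ratio of each residue's frequency in the pooled epitopes to its frequency in the full protein, with a ratio above 1 indicating enrichment. The following is a minimal sketch under that assumption; the function names and the toy sequences are introduced here for illustration and are not the procedure or data behind Fig. 1.

```python
from collections import Counter


def residue_frequencies(sequences):
    """Relative frequency of each residue across the pooled sequences."""
    counts = Counter("".join(sequences))
    total = sum(counts.values())
    return {aa: n / total for aa, n in counts.items()}


def enrichment(protein, epitopes):
    """Ratio of residue frequency in the epitopes to that in the full protein."""
    background = residue_frequencies([protein])
    foreground = residue_frequencies(epitopes)
    return {
        aa: foreground[aa] / background[aa]
        for aa in foreground
        if aa in background
    }


# Made-up sequences, for illustration only.
toy_protein = "AAAAFIVKGGGG"
toy_epitopes = ["FIVK"]

for aa, ratio in sorted(enrichment(toy_protein, toy_epitopes).items()):
    print(aa, round(ratio, 2))
```

Note that pooling overlapping epitopes counts shared residues more than once, so overlapping regions would need to be merged first if each residue should be counted a single time.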
A previous study showed that MHC molecules tended to bind hydrophobic amino acids in most positions except the penultimate position: positions 4 and 6 could bind both hydrophilic and hydrophobic amino acids, whereas positions 1, 2, 3, 5, and 8 bound hydrophobic amino acids [7]. Similarly, in the present paper, the random forest models showed that amino acids in positions p1, p2, p4, p5, p6, and p13 contributed substantially to allergenicity. For positions p4, p5, and p6, the three physicochemical properties (z1, z2, and z3) contributed to allergenicity to a similar extent. For positions p1 and p2, the contribution of the physicochemical properties to allergenicity followed the order z1>z2>z3, whereas the order for position p13 was z3>z2>z1. Furthermore, the hydrophobic (z1) residues at positions p1 and p2 contributed to allergenicity in most soybean allergen models. At position p1 of the P04776, P05046, P11827, and P25974 models and at position p2 of the P04347, P04776, P11827, and P26987 models, there are many hydrophobic residues that can promote allergenicity (shown in Tables 2-8 and Fig. 2). Positively charged amino acids, such as arginine (R), lysine (K), and histidine (H), tend to be located at position p13 and play an important role in allergenicity in the soybean models P04347, P04776, P25974, and P05046. In this study, the frequencies of phenylalanine (F), isoleucine (I), asparagine (N), valine (V), lysine (K), and histidine (H) residues were higher in the T cell epitope regions (Fig. 1). This is consistent with the position analysis: hydrophobic residues, including phenylalanine (F), isoleucine (I), and valine (V), located at positions p1 and p2 contributed to allergenicity, whereas positively charged amino acids, such as lysine (K) and histidine (H), located at positions p6 and p13 promoted allergenicity (shown in Tables 3-5, Table 7, and Fig. 2). Previous studies also confirmed that positively charged amino acids provide a net charge that increases the activity of the epitope peptide antigen [50].
In the simulated pepsin digestion, most of the T cell epitopes from the soybean allergens could be hydrolyzed by pepsin into small peptides (<12 aa), and most of the digestion-resistant fragments are located on the surface of the proteins. The digestion-resistant epitope regions were rich in hydrophobic amino acids, including isoleucine (I) and valine (V), and in non-charged amino acids, including serine (S), asparagine (N), and glutamine (Q). A hydrophobic amino acid at the second position of such a peptide might contribute to allergenicity, as in the digestion-resistant fragment “LVTDADNVIPKA” (aa 23-34) in P26987. The T cell epitopes “EEQRQQEGVIVELSK” (aa 201-215) and “EQRQQEGVIVELSKE” (aa 202-216), which have higher IL-4pred scores, can also resist pepsin hydrolysis and have great potential to enter the body and cause a Th2 cell response. To develop hypoallergenic soybean products, proteases that can hydrolyze at the five amino acids above (I, V, S, N, and Q) can be selected. For example, the endopeptidase Glu-C, which can hydrolyze at glutamate (E) and glutamine (Q) [51, 52, 53], could be used to specifically destroy the digestion-resistant T cell epitopes “SKDNVISQIPSQVQE” (aa 558-572) (P11827) and “EEQRQQEGVIVE” (aa 201-212) (P25974).
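The proposed protease treatment can be illustrated with a small sketch that cuts the two digestion-resistant epitopes at a chosen set of residues and checks whether any fragment of at least 12 amino acids survives. The cleavage model (a cut C-terminal to every listed residue in the isolated peptide) is a simplifying assumption made here, the residue set "EQ" simply follows the specificity stated in the text, and the helper names are hypothetical; actual protease specificity depends on the enzyme and reaction conditions.

```python
def cleave_after(peptide, residues):
    """Split a peptide C-terminal to each residue in `residues` (simplified model)."""
    fragments, start = [], 0
    for i, aa in enumerate(peptide):
        # Skip a cut after the final residue, which would yield an empty fragment.
        if aa in residues and i < len(peptide) - 1:
            fragments.append(peptide[start:i + 1])
            start = i + 1
    fragments.append(peptide[start:])
    return fragments


def is_disrupted(peptide, residues, min_len=12):
    """True if no fragment is long enough to hold a T cell epitope."""
    return all(len(f) < min_len for f in cleave_after(peptide, residues))


# Digestion-resistant epitopes named in the text.
targets = {"SKDNVISQIPSQVQE": "P11827", "EEQRQQEGVIVE": "P25974"}

for peptide, protein in targets.items():
    print(protein, cleave_after(peptide, "EQ"), is_disrupted(peptide, "EQ"))
```

Under this simplified rule both peptides are reduced to fragments shorter than 12 amino acids when cuts at E and Q are allowed. For the isolated peptide “SKDNVISQIPSQVQE”, the outcome depends on Q being in the cleavage set, because its only E is the C-terminal residue.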