Results
A total of 2,179 unique allergen entries were collected from the WHO/IUIS and AllergenOnline databases and analyzed for significant literature co-mentions with 21,380 pathological terms. Remarkably, 1,143 of these allergens (52.45%) were not significantly correlated with any pathological term (p-val≤1E-2), suggesting that they could have weak or nonexistent clinical relevance, given the lack of significant information on their biological activity in the literature. Indeed, the large majority of the allergens without any significant correlation with symptoms, 1,064 out of 1,143 (93.1%), are never mentioned in PubMed according to our searches; 760 of them were retrieved from AllergenOnline and 304 from the WHO/IUIS database. Of the remaining 79 allergens with retrieved articles but no significant co-mentions with symptoms, almost half (38) had only one article gathered from PubMed, whereas another 38 were mentioned only between 2 and 7 times. These observations indicate that, for most allergens, the absence of a statistical association with any type of clinical symptom is due to the insufficient number of scientific articles available for them in PubMed. The 3 allergens with no associated symptoms but reported in 9 or more articles retrieved from PubMed belong to uncommon food sources (i.e., the fungal allergens Asp o 13, Asp o 21 and Pen ch 35). The remaining annotated allergen entries (1,036) were associated with a great variety of clinical signs and symptoms (1,180 in total) via 14,009 relationships.
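The counts in the paragraph above form a nested breakdown, and a short script makes the bookkeeping explicit. The sketch below only re-derives the reported subtotals and shares from the figures quoted in the text; the variable names are illustrative and are not taken from the authors' pipeline.

```python
# Counts quoted in the text; all variable names are illustrative assumptions.
total_allergens = 2179
no_significant_term = 1143      # no significant co-mention with any term
never_in_pubmed = 1064          # subset with no PubMed articles at all
from_allergenonline = 760
from_who_iuis = 304
single_article = 38
two_to_seven_mentions = 38
nine_or_more_articles = 3
relationships = 14009

# Derived subtotals.
with_articles_no_term = no_significant_term - never_in_pubmed
annotated = total_allergens - no_significant_term

# Shares, expressed as percentages.
share_no_term = 100 * no_significant_term / total_allergens
share_never_in_pubmed = 100 * never_in_pubmed / no_significant_term

# Average number of symptom relationships per annotated allergen.
mean_relationships = relationships / annotated

print(f"allergens with articles but no significant term: {with_articles_no_term}")
print(f"annotated allergens: {annotated}")
print(f"share without significant term: {share_no_term:.3f}%")
print(f"share of those never in PubMed: {share_never_in_pubmed:.3f}%")
print(f"mean relationships per annotated allergen: {mean_relationships:.1f}")
```

The subtotals agree with the text (79 and 1,036), and the two shares match the reported percentages to within rounding.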
Figure 1A shows the distribution of the number of symptoms for the allergens associated with 50 or more symptoms. The allergen associated with the largest number of symptoms is Can f 3 (serum albumin of the domestic dog), which is associated with 313 terms. Some of these relationships for Can f 3 could be artifacts due to the use of that protein and/or the animal per se in biomedical studies.
Overall, the symptomatology associated with the allergens is consistent with current knowledge about them. For example, well-known allergens such as the peanut allergen Ara h 1 or the birch pollen allergen Bet v 1 were associated with a large variety of well-described clinical symptoms, whereas gluten proteins were correlated with celiac disease, and infections like anisakiasis and aspergillosis were exclusively associated with allergens from Anisakis simplex (and related worm parasites) and fungal allergens, respectively. Likewise, aeroallergens were predominantly correlated with symptoms involving the respiratory system (e.g., rhinitis, asthma, wheezing, sneezing, conjunctivitis, bronchial disorders). There are also trivial associations with general terms such as “allergy” or “hypersensitivity”, as well as some “artifacts” such as the relationship between “decrease circulating IgE” and several allergens. This association was inferred from articles describing structural modifications of allergens in their native form, through processing or chemical/enzymatic approaches, to reduce their allergenic potential.
Figure 1B shows the distribution of the number of allergens for the symptoms associated with 50 or more allergens. As expected, the generic term “allergy” is associated with most allergens (824 out of 1,036).
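The two distributions summarized in Figures 1A and 1B can both be derived from a single list of allergen-symptom relationships by counting partners on each side. The following is a minimal sketch of that counting step, not the authors' implementation; the function names, the pair representation, and the toy data (placeholder labels, with a threshold of 2 instead of 50) are assumptions made purely for illustration.

```python
from collections import Counter


def degree_counts(pairs):
    """Count distinct partners on each side of (allergen, symptom) pairs."""
    unique_pairs = set(pairs)  # ignore duplicated relationships
    symptoms_per_allergen = Counter(allergen for allergen, _ in unique_pairs)
    allergens_per_symptom = Counter(symptom for _, symptom in unique_pairs)
    return symptoms_per_allergen, allergens_per_symptom


def at_least(counts, threshold):
    """Keep entries with count >= threshold, sorted by descending count."""
    kept = [(name, n) for name, n in counts.items() if n >= threshold]
    return sorted(kept, key=lambda item: (-item[1], item[0]))


# Hypothetical toy relationships (placeholder labels, not real results).
toy_pairs = [
    ("allergen_A", "symptom_x"),
    ("allergen_A", "symptom_y"),
    ("allergen_A", "symptom_z"),
    ("allergen_B", "symptom_x"),
    ("allergen_B", "symptom_y"),
    ("allergen_C", "symptom_x"),
    ("allergen_C", "symptom_x"),  # duplicate, counted once
]

per_allergen, per_symptom = degree_counts(toy_pairs)
top_allergens = at_least(per_allergen, 2)  # analogue of Figure 1A
top_symptoms = at_least(per_symptom, 2)    # analogue of Figure 1B
print(top_allergens)  # [('allergen_A', 3), ('allergen_B', 2)]
print(top_symptoms)   # [('symptom_x', 3), ('symptom_y', 2)]
```

With the real relationship list, setting the threshold to 50 would select the allergens and symptoms plotted in the two panels.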
Figure 2 shows, as an example, the symptomatology retrieved by our system for Bet v 1 (major birch pollen allergen). The terms include the very general allergy, as well as the typical rhinitis, conjunctivitis and pruritus/itching, and the rarer anaphylactic shock. This list of terms nicely recapitulates those manually compiled for this allergen in dedicated resources and provides a clear picture of the symptomatology of this pollen allergen.