Large-scale evaluation using entry/exposure routes
With the procedure described above, we obtain the pathological terms significantly co-mentioned with a given allergen, which are expected to reflect its symptomatology. As this set of annotations is the first of its kind, there are no curated resources or gold standards against which to evaluate its overall quality. One way to evaluate the quality of these relationships at large scale with the gathered data is to assess whether the symptoms associated with the allergens of a given exposure route are scientifically sound (e.g., allergens with “dermal/skin” exposure are expected to be related to skin-related symptoms such as “urticaria”).
To detect “enriched” symptoms among the allergens with a given exposure route, we used the same statistical test as above. Taking the list of significant allergen-symptom co-mention relationships generated with the procedure above, we applied eq. 1 above with na the number of relationships involving allergens with a given exposure route, ns the number of relationships involving a given clinical sign, b the number of relationships involving both that clinical sign and that exposure route, and P the total number of relationships. This yields the p-value for the null hypothesis that the association between allergens with that exposure route and that symptom is due to chance alone. Consequently, low p-values indicate significant relationships between symptoms and exposure routes in the previously generated set of allergen-HPO relationships. For this approach, we considered four allergen exposure routes, obtained from the original allergen databases as explained above: “Airway”, “Dermal-skin”, “Ingestion” and “Injection”.
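The counting scheme described here can be illustrated with a minimal sketch. Eq. 1 is not reproduced in this section, so the code assumes it is a one-sided hypergeometric (over-representation) test; if eq. 1 has a different form, only the body of enrichment_p_value would change. The function names, the (allergen, symptom) pair representation, and the route_of mapping (which assumes a single exposure route per allergen) are hypothetical and introduced only for illustration.

```python
from math import comb


def count_terms(relationships, route_of, route, symptom):
    """Return (b, n_a, n_s, P) for one exposure route and one symptom.

    relationships: list of (allergen, symptom) pairs, i.e. the significant
                   co-mention relationships (hypothetical representation).
    route_of:      dict mapping each allergen to its exposure route
                   (assumption: one route per allergen).
    """
    P = len(relationships)
    n_a = sum(1 for a, _ in relationships if route_of[a] == route)
    n_s = sum(1 for _, s in relationships if s == symptom)
    b = sum(1 for a, s in relationships
            if route_of[a] == route and s == symptom)
    return b, n_a, n_s, P


def enrichment_p_value(b, n_a, n_s, P):
    """Upper-tail hypergeometric probability of observing at least b
    relationships that involve both the route and the symptom.

    ASSUMPTION: this stands in for eq. 1, which is not shown here.
    """
    if not (0 <= n_a <= P and 0 <= n_s <= P):
        raise ValueError("n_a and n_s must lie between 0 and P")
    upper = min(n_a, n_s)            # largest possible overlap
    if b > upper:
        return 0.0
    # Smallest overlap that is possible at all, or b if that is larger.
    lower = max(b, 0, n_a + n_s - P)
    tail = sum(comb(n_s, k) * comb(P - n_s, n_a - k)
               for k in range(lower, upper + 1))
    return tail / comb(P, n_a)
```

Calling count_terms once per (exposure route, symptom) pair and passing the result to enrichment_p_value gives one p-value per pair; under the stated assumption, low values flag symptoms over-represented among the allergens of that route.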