Large-scale evaluation using entry/exposure routes
With the procedure described above we obtain the pathological terms
significantly co-mentioned with a given allergen, which are expected to
reflect its symptomatology. As this set of annotations is the first of
its kind, there are no curated resources or gold standards to evaluate
its global quality. A way to evaluate the quality of these relationships
at large scale with the gathered data is to assess whether the symptoms
associated with the allergens of a given exposure route are scientifically
sound (e.g., allergens with “dermal/skin” exposure are expected
to be related to skin-related symptoms such as “urticaria”).
To detect “enriched” symptoms among the allergens with a given exposure
route, we used the same statistical test as above. From the list of
significant allergen-symptom co-mention relationships generated with the
procedure above, let n_a be the number of relationships involving
allergens with a given exposure route, n_s the number of relationships
involving a given clinical sign, b the number of relationships involving
both the clinical sign and the exposure route, and P the total number of
relationships. We applied eq. 1 above to these four values. This returns
the p-value under the null hypothesis that allergens with that exposure
route are associated with that symptom
by chance. Consequently, low p-values indicate significant relationships
between symptoms and exposure routes in the previously generated set of
allergen-HPO relationships. For this approach, we considered four
allergen exposure routes, obtained from the original allergen databases
as explained above: “Airway”, “Dermal-skin”, “Ingestion” and
“Injection”.
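As an illustration, the counting step and the enrichment test can be sketched in Python. This is a minimal sketch and not the implementation used in this work: it assumes that eq. 1 is the upper tail of a hypergeometric distribution, and it assumes that each allergen has a single exposure route. The function names, the allergen names and the toy relationships are hypothetical.

```python
from math import comb


def enrichment_p_value(P, n_a, n_s, b):
    """Probability of observing at least b relationships that involve both
    the exposure route and the symptom, given the totals P, n_a and n_s.
    ASSUMPTION: eq. 1 is taken to be a hypergeometric upper tail."""
    upper = min(n_a, n_s)
    total = comb(P, n_a)
    tail = sum(comb(n_s, i) * comb(P - n_s, n_a - i) for i in range(b, upper + 1))
    return tail / total


def count_relationships(relationships, routes, route, symptom):
    """Derive (P, n_a, n_s, b) from a list of (allergen, symptom) pairs.
    ASSUMPTION: `routes` maps each allergen to exactly one exposure route."""
    P = len(relationships)
    n_a = sum(1 for a, _ in relationships if routes[a] == route)
    n_s = sum(1 for _, s in relationships if s == symptom)
    b = sum(1 for a, s in relationships if routes[a] == route and s == symptom)
    return P, n_a, n_s, b


# Hypothetical toy data, for illustration only (not results from this work).
routes = {
    "allergen_A": "Dermal-skin",
    "allergen_B": "Dermal-skin",
    "allergen_C": "Ingestion",
    "allergen_D": "Airway",
}
relationships = [
    ("allergen_A", "urticaria"), ("allergen_A", "rhinitis"),
    ("allergen_B", "urticaria"), ("allergen_B", "eczema"),
    ("allergen_C", "anaphylaxis"), ("allergen_C", "urticaria"),
    ("allergen_D", "rhinitis"), ("allergen_D", "asthma"),
]

P, n_a, n_s, b = count_relationships(relationships, routes, "Dermal-skin", "urticaria")
p_value = enrichment_p_value(P, n_a, n_s, b)
print(P, n_a, n_s, b, p_value)
```

With only eight toy relationships no route-symptom pair can reach significance; the sketch is meant only to show how n_a, n_s, b and P are derived from the relationship list before the test is applied.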