Score distributions for currently used tools
The distributions of values of the retrieved features, many of which are currently used as deleteriousness/pathogenicity prediction scores, were plotted for Benign and Pathogenic variants, as well as for Variants of Uncertain Significance (VUS). Figure 3 shows the distributions for three of the most commonly used tools, namely SIFT, PolyPhen and Revel. Because SIFT assigns a value of 0 to deleterious variants, in contrast with the typical value of 1 for deleterious/pathogenic variants, its histogram was plotted using 1-SIFT to allow easier comparison with the other tools. As seen in Figure 3, a large proportion of Benign variants have 1-SIFT scores ≈ 1, suggesting an overestimation of deleteriousness. PolyPhen scores also include values ≈ 1 for Benign variants, as well as values ≈ 0 for Pathogenic variants. For VUS, SIFT and PolyPhen have pronounced distributions with peaks at the extreme values, while the Revel scores have a less markedly bimodal distribution. An ideal prediction score would separate VUS into two clear clusters (in a similar way to PolyPhen) while avoiding classification errors.
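The 1-SIFT transformation and the per-class histograms described above can be illustrated with a minimal sketch. It is not the code used to produce Figure 3: the record layout, the function names, and the toy values are all hypothetical, and scores are assumed to lie in [0, 1].

```python
from collections import defaultdict


def invert_sift(score):
    """Flip a SIFT score so that 1 indicates deleterious, as for the other tools."""
    return 1.0 - score


def binned_fractions(values, n_bins=10):
    """Fraction of values in each of n_bins equal-width bins on [0, 1]."""
    counts = [0] * n_bins
    for v in values:
        # A score of exactly 1.0 is placed in the last bin.
        idx = min(int(v * n_bins), n_bins - 1)
        counts[idx] += 1
    total = len(values)
    if total == 0:
        return [0.0] * n_bins
    return [c / total for c in counts]


def distributions_by_class(records, n_bins=10):
    """Group scores by (clinical class, tool) and bin them.

    `records` is an iterable of (clinical_class, tool, score) tuples;
    this layout is an assumption made for the example.
    """
    grouped = defaultdict(list)
    for clinical_class, tool, score in records:
        if tool == "SIFT":
            # Report SIFT as 1-SIFT so all tools share the same orientation.
            tool, score = "1-SIFT", invert_sift(score)
        grouped[(clinical_class, tool)].append(score)
    return {key: binned_fractions(vals, n_bins) for key, vals in grouped.items()}


# Made-up records, for illustration only (not data from this study).
toy_records = [
    ("Benign", "SIFT", 0.0),
    ("Benign", "SIFT", 0.9),
    ("Pathogenic", "SIFT", 0.0),
    ("VUS", "Revel", 0.5),
]
result = distributions_by_class(toy_records, n_bins=2)
```

Each entry of `result` holds the binned fractions for one class and tool, which could then be passed to any plotting library to draw histograms comparable across tools.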