High performance across different variant consequence types
Most of the currently available tools, e.g. SIFT (Vaser, Adusumalli, Leng, Sikic, & Ng, 2016), PolyPhen (Adzhubei, et al., 2010), and Revel (Ioannidis, et al., 2016), are designed to score missense variants exclusively, which lowers their performance on this dataset because it includes diverse variant consequence types. Only a third of the variants in this dataset are missense, and there are significant numbers of synonymous, non-coding transcript exon, intron, and splice variants. Other consequence types, namely frameshift and nonsense variants, are present in smaller numbers. For this reason, to compare the trained models with the benchmarked tools more fairly, the ROC curves of the same models were also plotted for the subset of each variant consequence type. On the subset of missense variants (Supplementary Figure S2a), the RF and the MLP (AUC of 0.97) outperform the SVM (0.96). As most tools are designed for this consequence type, their AUCs are generally higher here than for the other variant types. Revel yields an AUC of 0.96, equal to the SVM and slightly lower than the MLP and RF. The commonly used SIFT and PolyPhen had lower AUCs than the other analyzed tools (0.81 and 0.85, respectively). M-CAP (AUROC=0.95), MetaLR, and MetaSVM (AUROC=0.93) also perform well on missense variants.
For the splice variants (Supplementary Figure S2b), our RF yields an AUC of 0.97, the MLP an AUC of 0.93, and the SVM an AUC of 0.90. CADD yields an AUC of 0.95, outperforming our MLP and SVM but not our RF. For synonymous variants (Supplementary Figure S2c), the AUROCs are consistently lower. On these variants, our three models reach an AUC of 0.89, outperforming CADD (AUROC=0.57). For non-coding mRNA variants (Supplementary Figure S2d), CADD outperforms our models: the RF yielded an AUROC of 0.89, the SVM of 0.85, and the MLP of 0.89, all lower than the CADD AUROC of 0.93. For the intron variants (Supplementary Figure S2e), the RF yields an AUROC of 0.89, the SVM of 0.84, and the MLP of 0.83, while CADD yielded an AUROC of 0.76. Our models misclassify coding INDEL variants (Supplementary Figure S2f), showing AUROCs lower than 0.5, while CADD has an AUROC of 0.78. In the case of intergenic variants (Supplementary Figure S2g), the three models yielded an AUROC of 0.58, while CADD yielded an AUROC of 0.89. For other variant types (Supplementary Figure S2h), our models perform better, with AUROC=0.92 for the RF and the SVM, and AUROC=0.89 for the MLP. For this category, CADD reaches an AUROC of 0.95.
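The per-consequence comparison described above amounts to grouping the variants by consequence type and computing one AUROC per group. The following is a minimal sketch of that stratification, not the pipeline used in this study: the function names, the `(consequence, label, score)` record layout, and the toy values are assumptions made for illustration. The AUROC is computed with the rank-based (Mann-Whitney) formulation, in which a value below 0.5 means that benign variants tend to receive higher scores than pathogenic ones.

```python
from collections import defaultdict


def auroc(labels, scores):
    """Area under the ROC curve via the Mann-Whitney U statistic.

    labels: iterable of 0/1 (1 = pathogenic); scores: higher = more pathogenic.
    Returns None when the subset lacks one of the two classes.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None
    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auroc_by_consequence(records):
    """Compute one AUROC per consequence type.

    records: iterable of (consequence, label, score) tuples (hypothetical layout).
    """
    groups = defaultdict(lambda: ([], []))
    for consequence, label, score in records:
        groups[consequence][0].append(label)
        groups[consequence][1].append(score)
    return {c: auroc(ys, ss) for c, (ys, ss) in groups.items()}


# Toy, made-up records for illustration only (not data from this study).
toy = [
    ("missense", 1, 0.9), ("missense", 0, 0.2),
    ("missense", 1, 0.7), ("missense", 0, 0.8),
    ("synonymous", 1, 0.6), ("synonymous", 0, 0.1),
    ("intron", 0, 0.3),
]
result = auroc_by_consequence(toy)
```

Returning `None` for a subset that contains only one class avoids reporting an undefined AUROC for sparsely populated consequence types. The pairwise computation is quadratic in subset size, so a sort-based implementation would be preferable for large subsets.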