MADaM, an accurate and fast unsupervised algorithm for genotyping of
short sequencing reads
Abstract
We present MADaM (Multiplexed Amplicon Data Miner), an original
algorithm for de novo genotyping of short sequencing reads that
does not require an assembly step. It classifies the reads using
an original set of features, which are embedded with t-SNE and then
clustered with the DBSCAN algorithm. We apply the algorithm to three
different approaches and datasets, showing that the software is well
suited to rapid genotyping of highly variable regions such as MHC-HLA
exon 2 without prior knowledge such as SNP positions or known alleles.
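
To make the clustering idea concrete, here is a minimal, self-contained sketch of density-based grouping of reads. It is not the MADaM implementation. The abstract does not specify the feature set, so 2-mer counts are used purely as a placeholder. The t-SNE embedding step is omitted to keep the example dependency-free, so DBSCAN runs directly on the count vectors. The function names (`kmer_profile`, `dbscan`), the toy reads, and the values of `eps` and `min_pts` are all illustrative assumptions.

```python
from collections import Counter
from itertools import product
from math import dist


def kmer_profile(read, k=2):
    """Count vector over all 4**k DNA k-mers (placeholder feature set)."""
    counts = Counter(read[i:i + k] for i in range(len(read) - k + 1))
    return [counts["".join(p)] for p in product("ACGT", repeat=k)]


def dbscan(points, eps, min_pts):
    """Plain DBSCAN; returns one label per point, with -1 meaning noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)

    def neighbours(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be relabelled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            nbrs = neighbours(j)
            if len(nbrs) >= min_pts:  # j is a core point: keep expanding
                queue.extend(nbrs)
    return labels


# Toy reads: two hypothetical alleles with single-base variants, plus one outlier.
reads = [
    "ACGTACGTACGT", "ACGTACGTACGT", "ACGTACGAACGT", "ACGTACGTACGA",
    "GGCCTTAAGGCC", "GGCCTTAAGGCC", "GGCCTTAAGGCA",
    "TTTTTTTTTTTT",
]
labels = dbscan([kmer_profile(r) for r in reads], eps=2.5, min_pts=3)
print(labels)
```

With these toy inputs, reads derived from the same hypothetical allele fall into the same cluster and the outlier read is flagged as noise. On real data, both the features and the DBSCAN parameters would need to be chosen for the dataset at hand.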