Sequences used in this study
Ninety-four protein sequences of the surface (S) glycoprotein from different
isolates of SARS-CoV-2 were retrieved on 22 March 2020 from the GenBank
database through the National Center for Biotechnology Information
(NCBI). A further 731 genomes from different isolates of the SARS-CoV-2
virus were retrieved from the GISAID EpiCoV database on the same date. Only
complete, high-coverage genomes were selected from GISAID; low-coverage
sequences were excluded. The sequences used in this study are listed in
Supplementary Figures 1 and 2.
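
As an illustration of the selection step described above, the following Python sketch filters FASTA records by genome length and by the fraction of ambiguous bases. It is not the procedure used in this study: the function names and both thresholds (a minimum length of 29,000 nt and at most 1% N) are assumptions chosen for illustration, since the cut-offs applied are not stated here.

```python
from typing import Dict, Iterable, Iterator, Tuple

# Illustrative thresholds only (assumptions); the study does not state its cut-offs.
MIN_LENGTH = 29_000      # assumed minimum length for a "complete" genome
MAX_N_FRACTION = 0.01    # assumed maximum fraction of ambiguous 'N' bases


def parse_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header = None
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts; emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def is_complete_high_coverage(
    seq: str,
    min_length: int = MIN_LENGTH,
    max_n_fraction: float = MAX_N_FRACTION,
) -> bool:
    """Return True if seq is long enough and has few ambiguous bases."""
    if not seq or len(seq) < min_length:
        return False
    n_count = seq.upper().count("N")
    return n_count / len(seq) <= max_n_fraction


def filter_genomes(
    lines: Iterable[str],
    min_length: int = MIN_LENGTH,
    max_n_fraction: float = MAX_N_FRACTION,
) -> Dict[str, str]:
    """Keep only the records that pass both the length and the N-content check."""
    return {
        header: seq
        for header, seq in parse_fasta(lines)
        if is_complete_high_coverage(seq, min_length, max_n_fraction)
    }
```

Because `filter_genomes` accepts any iterable of lines, an open file handle for a downloaded FASTA file (hypothetical name `genomes.fasta`) can be passed directly, for example `filter_genomes(open("genomes.fasta"))`.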