Sequences used in this study
Ninety-four protein sequences of the surface (S) glycoprotein from different SARS-CoV-2 isolates were retrieved on 22 March 2020 from the GenBank database through the National Center for Biotechnology Information (NCBI). A further 731 genomes from different SARS-CoV-2 isolates were retrieved from the GISAID EpiCoV database on the same date. Only complete, high-coverage genomes were selected from GISAID; low-coverage sequences were excluded. The sequences used in this study are listed in Supplementary Figures 1 and 2.
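For readers who want to apply a comparable completeness and coverage filter to their own FASTA download, a minimal Python sketch follows. It is not the procedure used in this study, where selection relied on the criteria offered by the GISAID database itself. The function names and both thresholds (`MIN_LENGTH`, `MAX_N_FRACTION`) are hypothetical choices made for illustration: the sketch treats a genome as complete if it reaches a minimum length, and as high coverage if the fraction of ambiguous `N` bases is small.

```python
from typing import Dict, Iterator, Tuple

# Hypothetical thresholds for illustration only; they are assumptions,
# not values reported in this study.
MIN_LENGTH = 29000        # minimum genome length (bases) to count as complete
MAX_N_FRACTION = 0.01     # maximum fraction of ambiguous 'N' bases allowed


def parse_fasta(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def n_fraction(seq: str) -> float:
    """Return the fraction of bases in seq that are 'N' (case-insensitive)."""
    if not seq:
        return 1.0  # treat an empty sequence as entirely uncovered
    return seq.upper().count("N") / len(seq)


def select_genomes(
    text: str,
    min_length: int = MIN_LENGTH,
    max_n_fraction: float = MAX_N_FRACTION,
) -> Dict[str, str]:
    """Keep records that are long enough and contain few ambiguous bases."""
    return {
        header: seq
        for header, seq in parse_fasta(text)
        if len(seq) >= min_length and n_fraction(seq) <= max_n_fraction
    }
```

Calling `select_genomes` on the contents of a FASTA file returns a dictionary of the retained records keyed by header. Both thresholds can be overridden per call, so the same function can be tested on short toy sequences.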