All selected sequences were aligned using the MAFFT (Multiple Alignment using Fast Fourier Transform) algorithm (Katoh, Misawa, Kuma, & Miyata, 2002), and the alignment was visualized in Jalview 2.11.0. The aligned sequences were used to construct a phylogenetic tree by the neighbor-joining method, with 500 bootstrap replicates and a 95% site coverage cutoff, in Molecular Evolutionary Genetics Analysis (MEGA X) software (Kumar, Stecher, Li, Knyaz, & Tamura, 2018). Interactive Tree Of Life (iTOL) v5 (Letunic & Bork, 2019) was used to adjust the branch and label colors of the phylogenetic tree. Non-synonymous mutations, their specific mutation sites, and their global frequencies were determined using ‘CoVsurver enabled by GISAID’, based on viral sequences in GISAID’s EpiCoV database. These data were then checked and validated against the aligned sequences. 3D structural visualization of the spike glycoproteins was performed using the same application, which annotates the structural positions of mutations and amino acid substitutions by processing coronavirus crystal structures in the PDB. Countries with a high frequency of specific mutations were marked on a geographic heat map using the online tool Maptive. A mutation-time plot for Southeast Asia was prepared by analyzing the sequences available in GISAID for the first 15 days of each month. Additionally, the numbers of COVID-19 infections were collected for each month from January to May 2020.
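To illustrate what a site coverage cutoff and the validation of reported mutations against the alignment involve, the following is a minimal Python sketch. It is not the procedure implemented in MEGA X or CoVsurver. The function names, the set of characters treated as missing, and the toy alignment are all assumptions made for illustration. The sketch keeps only alignment columns in which at least 95% of sequences carry a resolved character, then reports the frequency of each substitution relative to a reference sequence at the retained columns.

```python
from collections import Counter

# Characters treated as missing or ambiguous (an assumption for this sketch).
MISSING = set("-?NnXx")


def filter_by_site_coverage(alignment, cutoff=0.95):
    """Return indices of columns whose coverage is at least `cutoff`.

    Coverage is the fraction of sequences with a resolved (non-missing)
    character in that column.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    n_seqs = len(alignment)
    return [
        col
        for col in range(length)
        if sum(seq[col] not in MISSING for seq in alignment) / n_seqs >= cutoff
    ]


def substitution_frequencies(alignment, reference, columns):
    """Frequency of each non-reference character at the given columns.

    Keys are (1-based position, reference character, observed character);
    values are fractions of the resolved characters in that column.
    """
    freqs = {}
    for col in columns:
        ref = reference[col]
        observed = Counter(
            seq[col] for seq in alignment if seq[col] not in MISSING
        )
        total = sum(observed.values())
        for residue, count in observed.items():
            if residue != ref:
                freqs[(col + 1, ref, residue)] = count / total
    return freqs


# Toy data (hypothetical, not taken from the study).
toy_reference = "ATGC"
toy_alignment = ["ATGC", "ATGC", "A-GC", "ATTC"]

kept_columns = filter_by_site_coverage(toy_alignment, cutoff=0.95)
toy_freqs = substitution_frequencies(toy_alignment, toy_reference, kept_columns)
print(kept_columns)
print(toy_freqs)
```

In the toy alignment, the second column is dropped because only three of the four sequences are resolved there, and the G-to-T change in the third column is reported at a frequency of 0.25.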
Phylogenetic relationships and transmission patterns were examined (on 23 May 2020) by analyzing 329 genome sequences from Southeast Asia (India 220, Bangladesh 13, Thailand 80, Indonesia 9, Sri Lanka 6, Nepal 1) using country filters on Nextstrain, an online real-time pathogen evolution tracking tool (Hadfield et al., 2018).
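As a quick consistency check on the sample composition, the short Python sketch below sums the per-country counts quoted above and computes each country's share of the total. The variable names are illustrative only, and this is not part of the Nextstrain workflow.

```python
# Per-country sequence counts as reported in the text.
sequence_counts = {
    "India": 220,
    "Bangladesh": 13,
    "Thailand": 80,
    "Indonesia": 9,
    "Sri Lanka": 6,
    "Nepal": 1,
}

# The total should match the 329 genomes stated in the text.
total_sequences = sum(sequence_counts.values())

# Fraction of the dataset contributed by each country.
country_shares = {
    country: count / total_sequences
    for country, count in sequence_counts.items()
}

print(total_sequences)
for country, share in country_shares.items():
    print(f"{country}: {share:.1%}")
```

The counts sum to 329, and India accounts for roughly two thirds of the sequences, which is worth keeping in mind when interpreting the regional patterns.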