All selected sequences were aligned using the MAFFT (Multiple
Alignment using Fast Fourier Transform) algorithm (Katoh, Misawa, Kuma,
& Miyata, 2002), and the alignment was visualized in Jalview 2.11.0. A
phylogenetic tree was constructed from the aligned sequences using the
neighbor-joining method with 500 bootstrap replicates and a 95% site
coverage cutoff in Molecular Evolutionary Genetics Analysis (MEGA X)
software (Kumar, Stecher, Li, Knyaz, &
Tamura, 2018). Interactive Tree Of Life (iTOL) v5 (Letunic & Bork,
2019) was used to adjust the branch and label colors of the phylogenetic
tree. Non-synonymous mutations with their specific mutation sites and
their global frequencies were determined using ‘CoVsurver enabled by
GISAID’, based on viral sequences in GISAID’s EpiCoV database. These
data were carefully checked against the aligned sequences.
Three-dimensional structural visualization of the spike glycoproteins
was performed using the same application, which annotates the structural
positions of mutations and amino acid substitutions based on coronavirus
crystal structures in the PDB. Countries with a high frequency of specific
mutations were marked on a geographic heat map using the online tool
Maptive. A mutation-time plot for Southeast Asia was prepared by
analyzing the sequences from the first 15 days of each month available
in GISAID. Additionally, the number of COVID-19 infections was
collected for each month from January to May 2020.
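
The binning behind such a mutation-time plot can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure actually used: the function name `monthly_mutation_frequency`, the record layout (a date paired with a set of mutation labels), the use of the collection date, and the example records are all hypothetical and are not data from this study.

```python
from collections import defaultdict
from datetime import date


def monthly_mutation_frequency(records, mutation, max_day=15):
    """Return {(year, month): fraction of sequences carrying `mutation`}.

    Only sequences dated within the first `max_day` days of a month are
    counted (assumption: the date is the collection date).
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for collected, mutations in records:
        if collected.day > max_day:
            continue  # outside the first-15-days window
        key = (collected.year, collected.month)
        totals[key] += 1
        if mutation in mutations:
            hits[key] += 1
    return {key: hits[key] / totals[key] for key in sorted(totals)}


# Hypothetical records for illustration only.
records = [
    (date(2020, 3, 2), {"D614G"}),
    (date(2020, 3, 10), set()),
    (date(2020, 3, 20), {"D614G"}),  # excluded: after day 15
    (date(2020, 4, 5), {"D614G"}),
    (date(2020, 4, 12), {"D614G", "L54F"}),
]
freq = monthly_mutation_frequency(records, "D614G")
```

Plotting the resulting monthly fractions against time gives one line per mutation; the monthly infection counts can be drawn on a second axis.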
Phylogenetic relationships and transmission patterns were examined (on
23 May 2020) by analyzing 329 genome
sequences from Southeast Asia (India 220, Bangladesh 13, Thailand 80,
Indonesia 9, Sri Lanka 6, Nepal 1) using country filters on Nextstrain,
an online real-time pathogen evolution tracking tool (Hadfield et
al., 2018).
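
As a quick consistency check, the per-country counts above can be tallied against the stated total. The snippet is only an arithmetic sketch; the variable names are illustrative and the counts are copied from the text.

```python
# Per-country sequence counts as reported in the text.
counts = {
    "India": 220,
    "Bangladesh": 13,
    "Thailand": 80,
    "Indonesia": 9,
    "Sri Lanka": 6,
    "Nepal": 1,
}

total = sum(counts.values())

# Percentage share of each country in the analyzed set.
shares = {country: round(100 * n / total, 1) for country, n in counts.items()}

print(total)  # 329
```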