2.2.5.3 Phylogenetic analysis
Phylogenetic analyses of ASVs, cASVs, and aminotypes are conducted within the Analyze pipeline, and all output files are stored in a dedicated directory within the results directory. Sequences are first aligned with MUSCLE (v5.1; Edgar, 2021), and the alignment is then trimmed automatically with trimAl (v1.4.1; Capella-Gutiérrez et al., 2009) using a heuristics-based approach. By default, substitution models are tested with ModelTest-NG (v0.1.7; Darriba et al., 2020). A maximum-likelihood tree is then generated with IQ-TREE (v2.2.0.3; Minh et al., 2020). The substitution model used for tree inference can be set by the user, taken from the ModelTest-NG results, or selected automatically with ModelFinder (Kalyaanamoorthy et al., 2017). The resulting tree is used for phylogrouping with TreeCluster and is visualized in the Analyze report, where the user can color-code nodes by sequence identity, taxonomy hit, MED group, or phylogroup assignment.
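The sequence of steps described above can be illustrated with a minimal Python sketch that only assembles the command lines and does not run them. It is not the pipeline's actual code. The function names, file names, and the TreeCluster threshold of 0.1 are hypothetical. The sketch also makes three assumptions: that the trimAl heuristic corresponds to the `-automated1` option, that a user-specified model takes precedence over a ModelTest-NG result, and that ModelFinder is requested through IQ-TREE's `-m MFP` option. The ModelTest-NG run itself is omitted, and its best model is passed in as a plain string.

```python
from pathlib import PurePosixPath
from typing import List, Optional


def choose_model(user_model: Optional[str],
                 modeltest_model: Optional[str]) -> str:
    """Pick the substitution model string passed to IQ-TREE.

    Assumed precedence (not stated in the text): user setting first,
    then a ModelTest-NG result, otherwise ModelFinder via "MFP".
    """
    return user_model or modeltest_model or "MFP"


def build_phylogeny_commands(fasta: str, outdir: str,
                             user_model: Optional[str] = None,
                             modeltest_model: Optional[str] = None,
                             threshold: float = 0.1) -> List[List[str]]:
    """Return argv lists for alignment, trimming, tree inference, clustering."""
    out = PurePosixPath(outdir)
    aligned = str(out / "aligned.fasta")      # hypothetical file names
    trimmed = str(out / "trimmed.fasta")
    prefix = str(out / "tree")
    treefile = prefix + ".treefile"           # IQ-TREE writes <prefix>.treefile
    clusters = str(out / "phylogroups.tsv")
    model = choose_model(user_model, modeltest_model)
    return [
        # MUSCLE v5 alignment
        ["muscle", "-align", fasta, "-output", aligned],
        # trimAl heuristic trimming (assumed to be -automated1)
        ["trimal", "-in", aligned, "-out", trimmed, "-automated1"],
        # IQ-TREE 2 maximum-likelihood tree
        ["iqtree2", "-s", trimmed, "-m", model, "--prefix", prefix],
        # TreeCluster phylogrouping; the threshold value is hypothetical
        ["TreeCluster.py", "-i", treefile, "-t", str(threshold), "-o", clusters],
    ]


if __name__ == "__main__":
    for argv in build_phylogeny_commands("asvs.fasta", "results/phylogeny"):
        print(" ".join(argv))
```

Each returned list could be passed to `subprocess.run(argv, check=True)` in order, so that a failure in alignment or trimming stops the run before tree inference.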