R-value thresholds influence module size and phylogenetic relatedness of OTUs binned into a module
A key parameter in SCNIC is the R-value threshold used to pick modules. A high R-value threshold would be expected to bin only very tightly correlated microbes with strong relationships, while less stringent thresholds may identify community-level patterns among more loosely connected microbes. To illustrate this concept, we binned OTUs from the HIV dataset into modules using the SMD method at R-value thresholds between 0.2 and 1.0. As expected, more OTUs were binned into modules at lower R-value thresholds, and as the threshold increased, fewer modules of smaller average size were formed (Figure 4A). To illustrate the effect of the R-value threshold on the nature of the identified modules, we compare SCNIC outputs at R-value thresholds of 0.2, 0.4, and 0.65. As shown in Figure 4, which visualizes modules in Cytoscape using SCNIC output files, the R-value threshold influences the size and connectivity of the network. We also illustrate the effect of different thresholds by examining the correlations between OTUs in the first module output by SCNIC, which is the largest module (module-0) (Figure 5). All OTUs in module-0 are positively correlated with each other, since SCNIC only captures positive correlations.
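The effect of the threshold on module size can be sketched with a toy example. The code below is not SCNIC's SMD implementation. It is a simplified stand-in that assumes a module is a group of OTUs in which every pair has a correlation at or above the threshold, and it builds such groups by greedy complete-linkage merging. The OTU names, the correlation values, and the function `complete_linkage_modules` are all hypothetical and serve only as illustration.

```python
from itertools import combinations

# Hypothetical OTU names and made-up correlation coefficients (not from the HIV dataset).
NAMES = ["otu_a", "otu_b", "otu_c", "otu_d", "otu_e", "otu_f"]
CORR = [
    [1.00, 0.90, 0.60, 0.30, 0.25, 0.10],
    [0.90, 1.00, 0.62, 0.35, 0.30, 0.05],
    [0.60, 0.62, 1.00, 0.45, 0.40, 0.15],
    [0.30, 0.35, 0.45, 1.00, 0.80, 0.10],
    [0.25, 0.30, 0.40, 0.80, 1.00, 0.10],
    [0.10, 0.05, 0.15, 0.10, 0.10, 1.00],
]


def complete_linkage_modules(corr, names, min_r):
    """Group OTUs so that every pair inside a group has correlation >= min_r.

    Assumption: this greedy complete-linkage merge is a simplified stand-in,
    not the SMD algorithm as implemented in SCNIC.
    """
    clusters = [[i] for i in range(len(names))]
    while True:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            # Complete linkage: a merge is judged by its weakest cross-pair.
            weakest = min(corr[i][j] for i in clusters[a] for j in clusters[b])
            if weakest >= min_r and (best is None or weakest > best[0]):
                best = (weakest, a, b)
        if best is None:
            break  # no remaining merge keeps every pair above the threshold
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    # Singletons are unbinned OTUs, so only groups of two or more count as modules.
    modules = [sorted(names[i] for i in c) for c in clusters if len(c) > 1]
    return sorted(modules, key=len, reverse=True)


for threshold in (0.2, 0.4, 0.65):
    modules = complete_linkage_modules(CORR, NAMES, threshold)
    binned = sum(len(m) for m in modules)
    print(f"R >= {threshold}: {len(modules)} module(s), {binned} OTUs binned, {modules}")
```

With these toy values, a threshold of 0.2 yields a single five-OTU module, 0.4 splits it into a three-OTU and a two-OTU module, and 0.65 leaves only two pairs, with fewer OTUs binned overall.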
Microbes co-occurring in the same environmental niche have previously been observed to be phylogenetically closer on average [4]. This is likely because phylogenetic relatedness correlates with functional relatedness, for example through greater shared genome content, which favors success in similar environments [55]. We show that increasing the R-value threshold results in modules whose OTUs are more phylogenetically similar on average (Figure 5).
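One simple way to summarize how phylogenetically similar the OTUs in a module are is the mean pairwise phylogenetic distance among its members. The sketch below assumes that summary. This section does not state which measure was used, and the distances, OTU names, and the function `mean_pairwise_distance` are hypothetical.

```python
from itertools import combinations

# Hypothetical pairwise phylogenetic distances (e.g. tree branch lengths); not real data.
DIST = {
    frozenset(("otu_a", "otu_b")): 0.25,
    frozenset(("otu_a", "otu_c")): 0.50,
    frozenset(("otu_b", "otu_c")): 0.75,
}


def mean_pairwise_distance(module, dist):
    """Average phylogenetic distance over all unordered OTU pairs in a module."""
    pairs = list(combinations(sorted(module), 2))
    if not pairs:
        raise ValueError("a module needs at least two OTUs")
    return sum(dist[frozenset(pair)] for pair in pairs) / len(pairs)


# A looser module (three OTUs) versus a tighter one (two OTUs).
print(mean_pairwise_distance(["otu_a", "otu_b", "otu_c"], DIST))
print(mean_pairwise_distance(["otu_a", "otu_b"], DIST))
```

A lower mean for modules found at a higher threshold would correspond to the trend described above.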