Data visualization, Ordination, and PERMANOVA
As a qualitative assessment of microbial communities, we constructed compositional bar plots of the relative abundance of bacterial genera in each bat fly species (ggplot2 , v.3.1.0; (Wickham, 2011). Any genus with a relative abundance <1% of the total reads in a bat fly species were condensed into a “Low Abundance” group. To quantitatively assess variation between microbiome communities, principal coordinates analysis (PCoA), implemented in phyloseq , was used to visualize differences in microbial communities captured by Euclidean distance between phylogenetic isometric log-ratio -transformed relative abundances. Metabarcoding using high-throughput sequencing is compositional in nature – meaning the total observations (reads per ASV) for a sample contains no information about the total number of microbes and is dependent on the sequencing capacity of the instrument (Fernandes et al., 2014; Gloor et al., 2017; Gloor & Reid, 2016; Tsilimigras & Fodor, 2016; Xia & Sun, 2017). To correct for the compositional nature of 16S rRNA sequencing data, isometric log-ratio transformations were implemented in the R packagephilr v.1.8.1 (Silverman et al., 2017). This transformation utilizes a user-provided bacterial phylogeny to standardize the abundance of bacterial taxa in a sample by the abundance of its sister taxon, creating “balances” at each node on the phylogeny (Silverman et al., 2017). Euclidean distances between philr balances provide phylogenetic and abundance information about the bacteria in a sample, similar to weighted UniFrac, that can be used for ordination and down-stream statistical analysis (Gloor et al., 2017; Silverman et al., 2017).
To test whether landscape variables, parasite variables, or host bat variables were correlated with microbial community composition, we used PERMANOVA with 9,999 permutations (adonis command in the R package vegan , v2.5.4; Anderson, 2014). When sampling is uneven between groups, PERMANOVA is sensitive to heteroscedasticity (Anderson & Walsh, 2013). Homogeneity of dispersion of each group of microbiomes was confirmed using betadisper permuted 999 times withpermutest (R package vegan ). As this is a nested system (within each fragment, we expect to see a subset of bat species, and within each bat species, only a subset of bat flies occur, and within each bat fly only a subset of bacterial taxa occur), assessing each variable separately ignores the interactions that could impact our conclusions. Sequential (Type I) sum of squares was used to account for the nonindependence of variables in testing for significant differentiation between microbiome communities. In each test, parasite species was the first variable, followed by one additional variable, and the interaction between parasite species and the additional variable (e.g., pairwise sample distance matrix ~ parasite species + log-scaled area + the interaction between parasite species and log-scaled area). The additionally examined variables were bat species, bat sex, bat individual, region (REGUA area or southern sites), log2-scaled area, log2-scaled isolation, distance from source, protection status (within REGUA or outside of REGUA, excluding the southern sites), and sampling site. All PERMANOVA analyses were performed on a dataset containing all localities with taxa filtered at 0.01% relative abundance per sample, a dataset containing all localities with taxa filtered more strictly at 0.1% relative abundance per sample, only the REGUA area localities (no southern sites), and only unprotected REGUA area localities (no localities within REGUA).
When sampling is uneven, sequential sum of squares is sensitive to the order in which variables appear in the equation. To overcome this limitation, we examined the impact of landscape within the four most well-sampled bat fly species using only fly samples collected from the REGUA area with bacterial taxa filtered using a threshold of 0.01% relative abundance per sample. We ordinated samples within these species separately from the rest of the data and estimated variation explained by landscape variables using PERMANOVA on individual variables.