Methods

DNA extraction

Eight M. festivus were collected at Anavilhanas National Park in the Rio Negro (AM, Brazil) (2°41’46.1S, 60°46’33.3W) in 2015, using small seine net-fishing. Shortly after collection, fish were dissected using sterile instruments, under a flame, to extract a fin and a gut sample according to the methodology described by Sylvain etal. (2019). Samples were frozen at -80 °C right after the dissection and until the DNA extraction. Genomic DNA was extracted following the manufacturer’s instructions of the QIAGEN DNeasy® Blood and Tissues kit and stored at −80°C until amplification. Double strand DNA was quantified using the Qubit® 2.0 Fluorometer.

Blocking primers development

Using purified DNA from six M. festivus fin samples, the V4-V5 region of the rRNA 18S gene of M. festivus was amplified by polymerase chain reaction (PCR) to precisely describe the target gene. We used the following amplification primers: 566F and 1289R [Appendix B] (Hadziavdic et al . 2014) in order to have a sequence of the target gene which should have 723 base pairs. Having a longer sequence was meant to obtain a high sequence quality to help at designing the blocking primers. The final concentrations of the PCR reaction (50 µL total volume) contained: 1X of Q5 Reaction Buffer, 1X of High GC Enhancer, 200 µM of dNTPs, 0.5 µM of each amplification primer, 0.02 U/µl of Q5 High-Fidelity DNA Polymerase and 3 µl of DNA template (lower than 1000 ng of DNA in total). We used the following PCR program: (a) 30s 98°C; (b) 10s 98°C; (c) 30s 60°C; (d) 20s 72°C; (e) 2min at 72°C; using 30 amplification cycles. The amplicons were purified with 0.7X AMPure beads (Beckman Coulter Genomics) and DNA quality was tested using Nanodrop 2000. The six M. festivus 18S rRNA V4-V5 gene amplicons were sequenced by Sanger sequencing at the IBIS Plateforme d’Analyse Génomique to precisely describe the target gene. The six resulting sequences’ electropherograms were assembled jointly using the CAP3 denovo assembler implemented in the UGENE suite to obtain a robust consensus sequence (Sequences are available on Genbank NCBI with the following accession numbers: MZ736888:MZ736899[accn]).
The consensus sequence was aligned to known 18S rRNA gene sequences of 7 genera of parasites typically found in fish’s gut, i.e.,Cryptosporidium hominis , Myxobolus cultus ,Lepeophtheirus salmonis , Myxobolus cerebralis ,Caenorhabditis elegans , Ichthyophthirius multifiliis andAglaiogyrodactylus forficuloides . We used the MUSCLE (Madeira etal . 2019) alignment program, also implemented in UGENE (Okonechnikov et al. 2012). The consensus sequence is available in supporting information (Parasite_Alignment.fa). We manually compared the M. festivus 18S rRNA gene V3-V4 region consensus sequence to the seven parasite sequences to search for regions which can strongly differentiate M. festivus from other taxa. Three unique regions were detected and used as templates to develop blocking primers [Fig. 1]. From these unique genomic regions, we designed four blocking primers, two forward and two reverse primers, to specifically bind to the chosen regions of M. festivus’ V4-V5 rRNA gene [Fig. 1]. These primers, modified with a C3-spacer to inhibit DNA amplification, were manufactured by Sigma-Aldrich Corp. in Canada.
The first combination of primers [635F-C3 and 1062R-C3], henceforth represented by the letter F , binds to DNA regions far from the universal primers binding site. On the contrary, the other combination [816F-C3 and 846R-C3], henceforth represented by the letterM , is composed of primers binding at the center of the target gene and stopping the elongation midway.

Blocking primer evaluation

PCR amplification and DNA sequencing

DNA from eight M. festivus gut samples collected at Anavilhanas National Park in the Rio Negro was used as PCR material to test the efficiency of the blocking primer combinations. Four gut samples were filled with food (Sample ID 1 to 4), while the other four samples had a moderate to null food bolus, leading to a higher host DNA to gut content ratio (Sample ID 5 to 8). The V4-V5 region (≈ 550 base pairs) of the 18S rRNA gene was amplified by PCR using the amplification primers 616*F, 574*F and 1132R [Appendix B] (Hugerth et al . 2014). We used two forward primers to amplify a higher diversity of organisms in samples since 616*F and 574*F does not amplify the same range of organisms. This set of primers was designed to optimise parasite species detection (Kounosu et al . 2019). We produced seven different PCR reaction mixes for each sample. For each PCR reaction mix, the final concentrations were (50 µl total volume): 1X of Q5 Reaction Buffer, 1X of High GC Enhancer, 200 µM of dNTPs, 0.5 µM of each amplification primer, 0.02 U/µl of Q5 High-Fidelity DNA Polymerase and 3 µl of DNA template (lower than 1000 ng of DNA in total). To this reaction mix we added either 2X, 5X or 10X, in comparison to amplification primers, of each of the two combination of blocking primers (i.e., F-primersand M-primers ) and included a control group without blocking primers. The following PCR program was used: (a) 30s 98°C; (b) 10s 98°C; (c) 30s 57°C; (d) 20s 72°C; (e) 2min at 72°C; 35 amplification cycles. Amplified DNA was purified with 0.5X AMPure beads (Beckman Coulter Genomics) since using a lower PCR product to beads volume ratio helps to eliminate shorter sequences, primers dimers, proteins, and phenols. Post‐PCR DNA concentration and quality were assessed on Nanodrop and by electrophoresis on 2% agarose gels. Dual indexing PCR was completed using the following concentrations in the reaction mix (25 µl total volume): 1X of Q5 Reaction Buffer, 1X of High GC Enhancer, 200 µM of dNTPs, 0.5 µM of each unique indexing primer, 0.02 U/µl of Q5 High-Fidelity DNA Polymerase and 1 µl of DNA template from the amplifying PCR. We used the following PCR program: 30 s 98°C; (b) 10 s 98°C; (c) 30 s 60°C; (d) 20 s 72°C; (e) 2 min at 72°C; 12 amplification cycles. We ran the PCR product on a 2% electrophoresis gel to assess the efficiency of the PCR reaction, purified the amplified DNA with 0.7X AMPure beads (Beckman Coulter Genomics) and verified the DNA concentration and quality on NanodropTM 2000. Positive and negative controls were included at each step of the PCR amplifications. After purification, multiplex sequencing was performed using the MiSeq platform from Illumina®  (Illumina), by the Plateforme d’analyses génomiques  at the Institut de Biologie Intégrative et des Systèmes  (IBIS) of Université Laval using the MiSeq Reagent V3 (2 x 300 bp) Kit according to the manufacturer’s protocols. Positive and negative controls were also sequenced but did not result in any sequence including amplification primers, supporting the absence of contamination during PCRs.

Data processing and taxonomic assignation

The analysis of amplicon sequences was done at the Institut de Biologie Intégrative et des Systèmes  (IBIS) at Université Laval . After trimming with CutAdapt (Martin 2011), 2,835,254 sequences were obtained (mean of 33,753 paired end sequences per sample). The demultiplexed fastq paired sequence files were processed through dada2 (Callahan et al . 2016) using the functionytfilterAndTrim () with the following parameters: 280 for the read truncation length, total removal of reads including a nucleotide with a Phred score of two of lower and a maximum expected error of two for forward reads and four for reverse reads. The filtered reads were then fed to the error rate learning, dereplication and ASV inference steps using the functions learnErrors (), derepFastq () anddada (). Then, forward and reverse reads were merged using the function mergePairs() using the default parameters. Chimeric sequences were removed using the removeBimeraDenovo () function with the “consensus” method parameter. Taxonomic assignation was done using the function assignTaxonomy from DADA2 in R (Wang etal . 2007) based on the SILVA Release 138.1 Ref NR 99 for SSU, which is based on a 99% criterion to remove redundant sequences in the database. We verified that the sequencing depth of every sample was sufficient using a rarefaction analysis [Fig. S1]. We calculated a taxonomic tree based on the DNA sequences using the R packageape (Paradis et al . 2004, Paradis & Schliep 2019) and aPhyloseq (McMurdie & Holmes 2013) object was produced inR to do subsequent analyses.

Statistical analyses

We estimated blocking primers efficiency by calculating the relative abundance of the class Actinopterygii, the class of M. festivus , in samples using the R package Phyloseq . Then, we grouped samples according to the blocking primer used during their amplification and produced barplots representing the mean relative abundances of Actinopterygii in each group. Since the relative abundances were not normally distributed, we used a non-parametric paired Wilcoxon test to verify for significant differences in host amplicons relative abundance between blocking primer groups.
Using the R functions plot_richness from Phyloseqand estimate_pd from btools , we estimated the observed (Total number of ASVs), Faith (Phylogenetic diversity), Shannon, and Simpson alpha diversity indexes of the 5 blocking primer groups. We calculated a two-way ANOVA using both the blocking primer groups and the Gut-ID as explicative variables for each diversity index estimated. We verified the assumptions of normality and homoscedasticity of variances using Shapiro-Wilk tests on residuals and Levene tests for each group. In cases where the assumptions were not respected, we used a square root transformation which lowered the impact of anomaly high alpha diversity values. When the ANOVA was significant for a given Alpha diversity metric, we used a Tukey’s HSD to verify for pairwise differences between groups considering a 95 % confidence interval. Afterward, we computed the same diversity analyses based on a dataset where we removed the class Actinopterygii, using the function “subset_taxa()” fromPhyloseq , to verify if host related ASVs diversity were driving the differences between groups. For both the datasets with and without host related ASVs, we produced boxplots representing the four alpha diversity metrics in function of the experimental groups considered.
A two-way PERMANOVA based on Bray-Curtis dissimilarity indexes was calculated using the function adonis in the R packagevegan considering the blocking primer groups and the gut IDs of samples as the independent variables. To verify for significant differences between blocking primer groups, we used a post-hoc pairwise Adonis test with the function pairwise.adonis2 . Afterwards, PCoAs based on Bray-Curtis dissimilarity indexes and Weighted Unifrac phylogenetic distances were produced using the functionplot_ordination from the R package PHYLOSEQ to illustrate the results of the PERMANOVA. Again, we produced another PCoA based on Bray-Curtis dissimilarity index using the dataset without host related ASVs. More specifically, we removed the class Actinopterygii from the Phyloseq object and produced a PCoA to assess the impact of blocking primers while removing the influence of host related ASVs.
We produced a taxonomic barplot considering the top 10 % ASVs in abundance (43 most abundant ASVs) present in both the M-primersand control groups for the four samples with a high relative abundance of host DNA using the function plot_bar from Phyloseq inR . We produced the barplot considering the Class as the taxonomic resolution. We omitted the Phylum Streptophyta since its overabundance in most samples masked the detectability of other more-interesting taxonomic units and is food related.