Discussion

Blocking primer development

From the three unique genomic regions discovered in the V4-V5 region of the M. festivus 18S rRNA gene, we developed two combinations of blocking primers that were specific enough to anneal accurately to the host’s target region. Elongation arrest blocking primers were developed in pairs, one forward and one reverse, to block amplification on both DNA strands. They were modified at the 3’ end with a C3 spacer, a three-carbon chain, to stop the advancement of the DNA polymerase. The blockage should result in the formation of shorter, incomplete amplicons that will not be sequenced because they do not possess the two adapter sequences required for Illumina MiSeq sequencing.
Vestheim and Jarman (2008) previously attempted to develop elongation arrest blocking primers, but the attempt was unsuccessful because the primers completely inhibited the PCR in their study. The authors also suggested that elongation arrest blocking primers are less efficient because they do not interact directly with the DNA polymerase (Vestheim and Jarman 2008). Yet, no data currently support this lower efficiency. Moreover, Belda et al. (2017) achieved a blockage of more than 80 % using elongation arrest with peptide nucleic acid (PNA), whereas annealing blocking primers failed to reduce host DNA in the same study. Furthermore, the genetic regions adjacent to universal primers are usually highly conserved (Rojahn et al. 2021), which leaves few host-specific sites for annealing blocking primers. For this reason, elongation arrest primers are more versatile and easier to develop, making them suitable for any species and a wide range of studies. A replicable method for developing such primers should facilitate further eukaryotic metagenomic studies.
Our first pair of primers (F-primers) [635F-C3 and 1062R-C3] bound to DNA regions far from the annealing sites of the amplification primers [Fig. 1]. The F-primers set was meant to block the DNA polymerase towards the end of elongation, forming an incomplete amplicon that lacks the 3’ complementary sequence required for the barcode indexing PCR. Since the indexing sequence is required for MiSeq sequencing, the incomplete amplicons should not be sequenced.
The second pair (M-primers) [816F-C3 and 846R-C3] was composed of primers binding at the center of the target gene and stopping the elongation midway [Fig. 1]. This results in the formation of incomplete amplicons of about half the length of a normal amplicon. The formation of short amplicons has two main advantages. Firstly, the incomplete amplicons are visible on electrophoresis gels (e.g., polyacrylamide), on which the efficacy of the blocking primers can be assessed quickly after the PCR amplification. Secondly, these shorter amplicons are easy to filter out with AMPure beads from Beckman Coulter Genomics. Size selection based on the bead-to-DNA ratio tends to retain longer amplicons and remove primer dimers and other unwanted contaminants. A bead-to-DNA ratio of 0.5X removes most amplicons shorter than 300 bp, facilitating the removal of incomplete amplicons that could otherwise negatively affect the indexing PCR.

Blocking primer evaluation

There was a marked difference in the relative abundance of host-related ASVs among the eight guts sampled. This large variation was caused by the state of the gut at sampling. While four guts were filled with food and contained a low ratio of host tissue to gut content (Gut IDs 1 to 4), the other four contained a higher relative mass of host tissue (Gut IDs 5 to 8). Blocking primers are especially needed in this second case, where the detection of the eukaryotic communities of the gut is masked by the high abundance of host DNA in samples.
The F-primers did not significantly reduce the amount of host DNA in samples [Fig. 2]. This could be due to the annealing position of the F-primers, located far from the annealing sites of the amplification primers, where the polymerase starts elongation. The long distance between the starting position of the polymerase and the blocking primers could make the blockage ineffective. While we have no way of verifying this hypothesis, host DNA sequences were full length in the samples amplified using the F-primers, indicating that the polymerase was not blocked during the PCR amplification. Considering the inefficacy of the F-primers, it would have been interesting to test blocking primers complementary to the ones used in this study. The complementary sequences of the F-primers would have blocked the amplification of host DNA nearer to the starting position of the polymerase. The resulting amplicons would have been very short, approximately 50 base pairs, and could potentially lead to a better blockage of host DNA amplification, since these truncated sequences would be removed directly during the size selection step using AMPure beads from Beckman Coulter Genomics.
Conversely, M-primers significantly reduced the relative abundance of host DNA by 66 % in samples [Fig. 2]. These primers also consistently reduced host DNA abundance, demonstrating their efficacy at both low and high relative abundances of host DNA [Table 1]. While we achieved a significant reduction of host DNA amplification in samples using elongation arrest blocking primers, we were not able to reach complete inhibition. Previous studies reached near-complete inhibition of host target gene amplification using both annealing blocking primers and CCSAS (Vestheim and Jarman 2008, Liu et al. 2019, Zhong et al. 2021). Furthermore, Zhong et al. (2021) have already developed guide RNAs for 16 000 referenced eukaryotes. Their method could also be applied to unreferenced organisms using Sanger sequencing, as we did in this study. Consequently, future research should consider testing and optimizing this novel approach, which could represent a major advance for the field.
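To make the reported quantity concrete, the following minimal Python sketch shows one way a reduction in host relative abundance could be computed from ASV read counts. It assumes the reduction is expressed relative to the control amplification; the function names, ASV labels and read counts are hypothetical and are not taken from the data of this study.

```python
def host_relative_abundance(counts, host_asvs):
    """Fraction of reads in one sample assigned to host-related ASVs."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample contains no reads")
    host_reads = sum(n for asv, n in counts.items() if asv in host_asvs)
    return host_reads / total


def relative_reduction(control_fraction, treated_fraction):
    """Reduction of the host fraction, expressed relative to the control."""
    if control_fraction == 0:
        raise ValueError("control contains no host reads")
    return (control_fraction - treated_fraction) / control_fraction


# Hypothetical read counts for one gut amplified without and with blocking primers.
host = {"host_asv_1", "host_asv_2"}
control = {"host_asv_1": 600, "host_asv_2": 200, "diet_asv_1": 150, "parasite_asv_1": 50}
blocked = {"host_asv_1": 200, "host_asv_2": 60, "diet_asv_1": 550, "parasite_asv_1": 190}

control_fraction = host_relative_abundance(control, host)
blocked_fraction = host_relative_abundance(blocked, host)
reduction = relative_reduction(control_fraction, blocked_fraction)
print(f"host fraction: {control_fraction:.2f} -> {blocked_fraction:.2f}")
print(f"relative reduction: {reduction:.1%}")
```

Expressing the reduction relative to the control keeps guts with low and high starting host content comparable, which is why this form is assumed here.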
As in the study of Vestheim and Jarman (2008), using elongation arrest blocking primers such as the M-primers at a concentration above 2X completely inhibited the PCR amplification. This could be caused by the short complementary region between the two blocking primers, which may favour the formation of primer dimers. Consequently, the PCR may be inhibited by the formation of secondary structures at high concentrations, ultimately leading to unwanted interactions with the polymerase. Non-specific binding could also occur at high concentrations of the M-primers. Being unable to use higher concentrations of blocking primers is a major shortcoming, since previous studies consistently obtained stronger amplification inhibition when using blocking primers at 10X (Vestheim and Jarman 2008, Clerissi et al. 2018, Su et al. 2018). It is therefore possible that using a higher concentration of elongation arrest blocking primers would lead to near-complete inhibition of host target gene amplification.
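The complementarity between two blocking primers can be screened in silico before ordering them. The sketch below is a simplified illustration and not the procedure used in this study: it reports the longest stretch over which two primers could pair in antiparallel orientation, ignoring thermodynamics, mismatches and the C3 modification. The sequences are hypothetical placeholders, not the 816F-C3 and 846R-C3 sequences.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def longest_complementary_run(primer_a, primer_b):
    """Length of the longest perfectly complementary antiparallel stretch.

    A stretch of primer_a pairs with primer_b when it also occurs in the
    reverse complement of primer_b, so the problem reduces to finding the
    longest common substring (solved here by dynamic programming).
    """
    a = primer_a.upper()
    b_rc = reverse_complement(primer_b)
    best = 0
    previous = [0] * (len(b_rc) + 1)
    for i in range(1, len(a) + 1):
        current = [0] * (len(b_rc) + 1)
        for j in range(1, len(b_rc) + 1):
            if a[i - 1] == b_rc[j - 1]:
                current[j] = previous[j - 1] + 1
                best = max(best, current[j])
        previous = current
    return best


# Hypothetical primer sequences, used only to illustrate the calculation.
forward_blocker = "ACGTTGCAAGGCTA"
reverse_blocker = "TAGCCTTGGGGG"
print(longest_complementary_run(forward_blocker, reverse_blocker))
```

A long complementary run would flag a pair as prone to primer dimers; a dedicated thermodynamic tool would be needed to estimate dimer stability at a given primer concentration.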

Alpha diversity

There was a reduction in Faith’s phylogenetic alpha diversity in samples from the M-primers group. However, the reduction in this alpha diversity metric was mainly caused by the lower number of host-related ASVs present in samples amplified with the M-primers. Indeed, the Shannon and Simpson alpha diversity indexes, two metrics that account for evenness, were not significantly different. Here, the evenness gained from the reduced abundance of overrepresented host ASVs counterbalances the loss of some ASVs from the class Actinopterygii in the M-primers group. In support of this, there is no significant difference in alpha diversity between groups when Actinopterygii are omitted from the dataset. This confirms that the differences in alpha diversity between the M-primers and control groups were mainly caused by the presence of many ASVs from this class. According to these alpha diversity results, the M-primers specifically inhibited the amplification of host DNA sequences, since their use did not hinder the detection of other ASVs in samples.
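The interplay between ASV loss and evenness can be illustrated with a small worked example. The Python sketch below uses hypothetical read counts, takes ASV richness as a simple stand-in for Faith’s phylogenetic diversity (which requires a phylogenetic tree), and assumes the Gini-Simpson form of the Simpson index (1 minus the sum of squared proportions); none of these values come from this study.

```python
import math


def richness(counts):
    """Number of ASVs with at least one read."""
    return sum(1 for c in counts if c > 0)


def shannon(counts):
    """Shannon index H' = -sum(p * ln p) over ASVs with reads."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def gini_simpson(counts):
    """Gini-Simpson index, 1 - sum(p^2)."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)


# Hypothetical sample: the first three entries are host-related ASVs.
control = [700, 50, 50, 100, 60, 40]
# Same sample with blocking primers: one host ASV is reduced and two are lost.
blocked = [300, 0, 0, 350, 200, 150]

for name, sample in (("control", control), ("blocked", blocked)):
    print(name, richness(sample), round(shannon(sample), 3), round(gini_simpson(sample), 3))
```

In this toy example, richness drops from six to four ASVs while both evenness-weighted indexes increase, which shows how a loss of host ASVs can lower a richness-driven metric without lowering Shannon or Simpson diversity.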
However, the use of blocking primers did not lead to the detection of an increased alpha diversity in samples. Moreover, we only detected a mean of 30 ASVs per sample, which is a low total diversity. For this reason, the removal of a small number of host-related ASVs in the M-primers group could represent an important shift in the communities observed in samples. Strikingly, 20 % of all ASVs in the dataset are related to vertebrates, while the diet of M. festivus should not include vertebrates. Indeed, the species has a generalist diet mostly composed of detritus and periphyton (Pires et al. 2015). For this reason, most of these Vertebrata ASVs probably originate from host DNA present in samples. They could also come from M. festivus picking at other fish, as M. festivus was previously observed cleaning other fish of endoparasites (Severo-Neto and Froehlich 2015). Overall, the alpha diversity data suggest that using blocking primers allowed the targeted eukaryotic diversity to be recovered while requiring a lower sequencing depth.

Beta diversity

M-primers led to a significant shift in beta diversity in the four samples that contained a high ratio of host tissue to gut content (Gut IDs 5 to 8) but had a limited influence on the beta diversity of the four other samples (Gut IDs 1 to 4) [Fig. 3]. Consequently, the M-primers reduced the impact of a high starting relative mass of host tissue by homogenizing both types of samples. This is a major advantage of using blocking primers, as it mitigates the sampling bias that occurs when fish are collected in the field. This is even more important when working with low-diversity samples, as the background noise caused by the high abundance of host sequences has a major influence on the community analysed.
As observed in the alpha diversity data, omitting the class Actinopterygii negated the effect of the M-primers on the beta diversity [Fig. S5]. While these results support a high specificity of the blocking primers for the host target gene, they do not rule out the possibility that our blocking primers would also inhibit the amplification of other teleosts. This limits the use of the method for describing the diet of carnivorous teleosts. Still, diet studies should rather focus on the food bolus, which contains fewer host sequences and higher-quality prey DNA, as it has not yet been completely digested.

Blocking primers for parasite detection

The use of M-primers enhanced the detection of low-abundance parasitic taxonomic classes in gut IDs 5 to 8 by reducing the background noise caused by the high relative abundance of host DNA [Fig. 4]. Indeed, the M-primers enhanced the detection of parasitic classes of high interest, such as trematodes in gut ID 6 [Fig. 4]. This class of parasitic helminths is known to infect fish as intermediate hosts and mammals as definitive hosts, posing a potential risk for Amazonian communities. We also detected Arcellinida in gut ID 7, a class of Amoebae that could be a parasitic taxon infecting M. festivus. This supports the usefulness of blocking primers for parasite screening, as parasitic taxa are usually present in lower abundances than host tissues and food in gut samples.
More broadly, the implementation of a metataxonomic approach combined with blocking primers optimized for parasite screening allowed the detection of multiple other potential parasitic infections of M. festivus. For instance, we detected a ciliate (Ciliophora) of the genus Nyctotherus, a parasite known to cause diarrhea and digestive problems in pet turtles (Satbige et al. 2017, Suzuki et al. 2020). To our knowledge, this genus has only been documented once infecting a fish (Earl and Jiménez 1969). We also detected Microsporidia at very low abundance in gut ID 4 but cannot draw conclusions about their parasitic role in M. festivus, considering the small amount of data that we collected. These results highlight the potential of developing metataxonomic approaches to describe host-parasite relationships in a region as diverse as the Amazonian rainforest. The use of a metataxonomic pipeline favors the detection of unknown parasitic infections. These species would potentially not have been detected using conventional taxonomic identification methods, which rely on light microscopy and are still the main methods used in tropical parasite ecology studies. However, DNA-based methods are not conclusive about the infectiousness of the taxa detected in the gut, as dead specimens could also be amplified. Here, the gold standard would be a combination of DNA-based and phenotypic data confirming the presence of a parasite at the infectious stage.