Sequencing metrics
The pilot sequencing (N = 79, including sequencing blanks) produced a total of 412 million paired-end raw reads, which the initial processing steps (pairing, filtering by length, quality and ambiguity, and demultiplexing) reduced to 342 million (“Pilot”, Table 1). Removal of chimeric, erroneous and rare sequences further reduced the dataset, and after taxonomic identification 284 million reads (69% of raw) distributed over 49 697 zOTUs remained. Most of these assigned reads were subsequently filtered out: 98% were assigned to the consumer taxon Maxillopoda, whereas reads identified as contaminants and symbionts accounted for 1.6% and 0.2%, respectively. The final dataset of putative prey comprised 1.2 million reads (0.4% of the assigned reads) in 1500 zOTUs. Distributed over 75 real samples, this corresponds to an average of 16 000 prey reads per copepod consumer in the pilot.
The final sequencing, with an increased number of samples (N = 456, including sequencing blanks), yielded 5.4 billion paired-end raw reads (“Full”, Table 1). Of these, approximately 4.3 billion reads (79% of raw) in 130 000 zOTUs were subsequently assigned to taxonomy. After discarding zOTUs assigned to Maxillopoda (98% of assigned reads), contaminants (1.1%) and symbionts (0.2%), the putative prey comprised 52.2 million reads in 22 391 zOTUs. This corresponded to 1.2% of the assigned reads, or 1.0% of the raw reads, and a mean depth of ~120 000 prey reads per copepod consumer. Compared to yields reported in relevant literature using dissection or blocking primers, the average number of prey reads per sample in both sequencing runs was more than twice as high (Table 2).
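As a rough cross-check of the proportions reported above, the arithmetic can be sketched as follows. The read counts are the rounded values quoted in the text, not the exact counts from Table 1, so results may differ slightly from the reported percentages. The function name `summarize` and the dictionary layout are illustrative assumptions. The number of real samples in the full run is not stated in the text, so per-sample depth is computed for the pilot only.

```python
# Rounded read counts taken from the text (assumption: exact counts are in Table 1).
runs = {
    "pilot": {"raw": 412e6, "assigned": 284e6, "prey": 1.2e6, "real_samples": 75},
    # Number of real samples in the full run is not given in the text.
    "full": {"raw": 5.4e9, "assigned": 4.3e9, "prey": 52.2e6, "real_samples": None},
}


def summarize(raw, assigned, prey, real_samples=None):
    """Return read proportions (in percent) and, if possible, mean prey depth."""
    summary = {
        "assigned_pct_of_raw": 100 * assigned / raw,
        "prey_pct_of_assigned": 100 * prey / assigned,
        "prey_pct_of_raw": 100 * prey / raw,
    }
    if real_samples:
        # Mean number of prey reads per real sample (one copepod consumer each).
        summary["prey_reads_per_sample"] = prey / real_samples
    return summary


for name, counts in runs.items():
    print(name)
    for key, value in summarize(**counts).items():
        print(f"  {key}: {value:.2f}")
```

Because the inputs are rounded, the full-run share of assigned reads computed this way falls between 79% and 80%, which agrees with the reported 79% only to within rounding.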
Table 1: Summary of read and zOTU abundances before and during bioinformatic processing (Steps 1-4), according to sample type (real samples or extraction negatives) and taxonomic identity (consumer, symbiont, contamination, prey). The total number of samples (N) is presented for both sequencing runs, and the numbers of extraction negatives and real samples are indicated in parentheses for the pilot (np) and for the full sequencing (nf). Sample types and identified taxa are also presented with their percentage contributions to the total of assigned reads (percentage of assigned; POA) or to assigned reads from real samples (POA).