Secondary uses of shallow shotgun metagenomic data
Our analysis of metabolic pathways revealed patterns initially overlooked by our taxonomic emphasis on bacteria: Archaea-associated pathways were more abundant among horses with access to sandwort. Re-analysis of shotgun metagenomic community profiles in which Archaea were retained confirmed that Archaea were at greater average relative abundance when sandwort was present. Interestingly, Christensenellaceae is known to exist in tight co-occurrence patterns with Archaea in the mammalian gut. An association between Archaea and sandwort in the shallow shotgun dataset may parallel the association we observed between Christensenellaceae and sea sandwort in the 16S dataset. These results further highlight an ancillary advantage of shallow shotgun metagenomic sequencing: its ability to characterize abundant archaeal and eukaryotic constituents of the microbiome alongside bacterial communities. Notably, the proportions of shallow shotgun sequencing reads classified as Bacteria (97%), Archaea (2.3%), and Fungi (0.3%) are in line with expectations set by qPCR-based quantitation of these groups in horse feces. In contrast, characterizing archaeal and fungal communities using an amplicon sequencing approach would require separate primers and additional sequencing.
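As an illustration of how domain-level read proportions of this kind might be tallied, the following is a minimal sketch. It assumes each read has already been assigned a domain label by some upstream classifier; the function name `domain_proportions` and the toy input are hypothetical and are not taken from our pipeline or our data.

```python
from collections import Counter


def domain_proportions(read_domains):
    """Return the fraction of classified reads assigned to each domain.

    read_domains: iterable of domain labels, one per read, with None
    marking reads that the (assumed) upstream classifier left unassigned.
    """
    # Unclassified reads are excluded from the denominator.
    counts = Counter(d for d in read_domains if d is not None)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {domain: n / total for domain, n in counts.items()}


# Toy input for illustration only (not data from this study).
toy_reads = ["Bacteria"] * 8 + ["Archaea"] + ["Fungi"] + [None] * 2
props = domain_proportions(toy_reads)
for domain, fraction in sorted(props.items()):
    print(f"{domain}: {fraction:.1%}")
```

Whether unclassified reads belong in the denominator is a design choice; here they are excluded so that the proportions describe classified reads only.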
In addition to more comprehensive profiling of the microbiome, shallow shotgun sequencing data might provide dietary insights. For example, we observed reads which mapped to genes related to the metabolic pathways for alginate degradation. In nature, alginate occurs primarily in brown seaweeds, which Sable Island horses are known to consume when they wash ashore. The presence of alginate degradation pathways could indicate a niche for the metabolism of brown seaweed biomass in the Sable Island horse microbiome. Alternatively, reads mapping to alginate degradation-related genes could derive from undigested seaweed in the horse feces. This alternative explanation is partly supported by our observation of genes related to other metabolic pathways known to occur in brown seaweed, including the mannitol cycle.
The presence of diet-derived sequences in shallow shotgun metagenomic datasets presents both a problem and an opportunity. Dietary confounds of functional analyses might be circumvented by restricting functional profiling to shotgun reads previously identified as microbial, or by filtering reads to remove known dietary items, if reference genomes for dietary items are available. Although diet-derived sequences in shallow shotgun metagenomic datasets represent a potential confound, they might also give researchers the opportunity to reconstruct host diet alongside the microbiome. When benchmarked against dietary metabarcoding, dietary reconstruction using shotgun sequences performs well, but requires deep sequencing. However, deep sequencing has been required only because previous studies relied on small marker-region reference databases, rather than mapping reads to the genomes of dietary items, which are frequently unavailable. As publicly available genomic data increase, so too might the viability of shallow shotgun-based characterization of host diet from fecal samples.
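The read-partitioning idea described above can be sketched as follows. This is a hedged illustration, not the procedure used in this study: it assumes a per-read taxonomic lineage is already available from some classifier, and the function name `split_reads_by_origin`, the read identifiers, and the taxon sets are all hypothetical.

```python
def split_reads_by_origin(classifications, microbial_taxa, dietary_taxa):
    """Partition read IDs into microbial, dietary, and other.

    classifications: dict mapping read ID -> tuple of taxon names
        (any ranks), or None if the read was unclassified.
    microbial_taxa / dietary_taxa: sets of taxon names; a read matches
        a set if any name in its lineage is a member.
    Dietary matches take precedence, so flagged diet reads never reach
    functional profiling.
    """
    microbial, dietary, other = [], [], []
    for read_id, lineage in classifications.items():
        if not lineage:
            other.append(read_id)
        elif any(taxon in dietary_taxa for taxon in lineage):
            dietary.append(read_id)
        elif any(taxon in microbial_taxa for taxon in lineage):
            microbial.append(read_id)
        else:
            other.append(read_id)
    return microbial, dietary, other


# Toy lineages for illustration only (not data from this study).
toy_classifications = {
    "read1": ("Bacteria", "Christensenellaceae"),
    "read2": ("Archaea", "Methanobacteriaceae"),
    "read3": ("Eukaryota", "Phaeophyceae"),  # brown algae
    "read4": None,                           # unclassified
    "read5": ("Eukaryota", "Fungi"),
}
microbial, dietary, other = split_reads_by_origin(
    toy_classifications,
    microbial_taxa={"Bacteria", "Archaea", "Fungi"},
    dietary_taxa={"Phaeophyceae"},
)
print(microbial, dietary, other)
```

Under this sketch, only the `microbial` reads would be passed to functional profiling, while the `dietary` reads could be retained separately for diet reconstruction.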
Beyond the scope of any one study, we argue that shallow shotgun metagenomic sequences have greater long-term value than 16S rRNA gene amplicon datasets. Although the sequencing depth inherent to this approach is too low for de novo assembly on a single-sample basis, shallow shotgun sequencing can still generate a tremendous wealth of metagenomic sequence data in aggregate. Co-assembly methods could be used to create system-specific microbial reference genomes using the same sequencing data created to profile the microbiome, at a cost comparable to amplicon sequencing. Study-system-specific metagenome-assembled genomes (MAGs) would allow for more precise classifications, and permit the use of functional profilers which infer total genomic contents. Incidentally, sequencing a breadth of samples at shallow depths might allow for more efficient MAG assembly than deeply sequencing a handful of samples. Even if co-assembled shallow shotgun reads are insufficient for MAG recovery, they are useful for identifying samples to target with strategic supplemental deep short-read or long-read sequencing.
Beyond the microbiome, if sequences of dietary origin are abundant in shotgun metagenomic datasets, co-assemblies might recover genomic contigs from the host diet. These contigs could aid diet reconstruction (discussed above) by relaxing the reliance on short marker regions. However, this requires that genomes of species closely related to the dietary items are available, so that contigs can be classified. Alternatively, in systems where an appreciable fraction of shotgun sequence reads are derived from the host, shallow shotgun sequencing may be capable of characterizing the microbiome while also providing low-pass genotyping of the host.