Secondary uses of shallow shotgun metagenomic data
Our analysis of metabolic pathways revealed a pattern that our initial
taxonomic emphasis on bacteria had overlooked: Archaea-associated
pathways were more abundant among horses with access to sandwort.
Re-analysis of shotgun metagenomic community profiles in which Archaea
were retained confirmed that Archaea had a greater average relative
abundance when sandwort was present. Interestingly, Christensenellaceae
is known to co-occur tightly with Archaea in the mammalian gut. An association between
Archaea and sandwort in the shallow shotgun dataset may parallel the
association we observed between Christensenellaceae and sea
sandwort in the 16S dataset. These results further highlight an
ancillary advantage of shallow shotgun metagenomic sequencing: its
ability to characterize abundant archaeal and eukaryotic constituents of
the microbiome alongside bacterial communities. Notably, the proportions
of shallow shotgun sequencing reads classified as Bacteria
(97%), Archaea (2.3%), and Fungi (0.3%) are in line with
expectations set by qPCR-based quantitation of these groups in horse
feces. In contrast, characterizing archaeal and fungal
communities using an amplicon sequencing approach would require separate
primers and additional sequencing.
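As a minimal sketch of how such domain-level proportions can be tallied, the Python snippet below assumes that each classified read has already been assigned a single top-level label by a taxonomic classifier. The function name `domain_proportions` and the toy labels are hypothetical and are not taken from our pipeline or data.

```python
from collections import Counter


def domain_proportions(read_domains):
    """Return the percentage of classified reads assigned to each top-level group.

    read_domains: iterable of labels (one per classified read), e.g. "Bacteria".
    """
    counts = Counter(read_domains)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {domain: 100.0 * n / total for domain, n in counts.items()}


# Toy labels for illustration only (hypothetical, not data from this study).
toy_labels = ["Bacteria"] * 8 + ["Archaea"] + ["Fungi"]
toy_proportions = domain_proportions(toy_labels)
```

Because all groups are counted from the same pool of shotgun reads, their proportions are directly comparable, which is not the case when each group is amplified with its own primer set.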
In addition to more comprehensive profiling of the microbiome, shallow
shotgun sequencing data might provide dietary insights. For example, we
observed reads which mapped to genes related to the metabolic pathways
for alginate degradation. In nature, alginate occurs primarily in brown
seaweeds, which Sable Island horses are known to consume when they wash
ashore. The presence of alginate degradation pathways could indicate a
metabolic niche for the breakdown of brown seaweed biomass in the Sable
Island horse microbiome. Alternatively, reads mapping to alginate
degradation-related genes could derive from undigested seaweed in the
horse feces. This alternative explanation is partly supported by our observation of
genes related to other metabolic pathways known to occur in brown
seaweed, including the mannitol cycle.
The presence of diet-derived sequences in shallow shotgun metagenomic
datasets presents both a problem and an opportunity. Dietary confounds of
functional analyses might be circumvented by restricting functional
profiling to shotgun reads previously identified as microbial, or by
filtering reads to remove known dietary items, if reference genomes for
dietary items are available. Although diet-derived sequences in shallow
shotgun metagenomics datasets represent a potential confound, they might
also provide researchers the opportunity to reconstruct host diet,
alongside the microbiome. When benchmarked against dietary
metabarcoding, dietary reconstruction using shotgun sequences performs
well but requires deep sequencing. However, deep sequencing has been
required only because previous studies relied on small marker-region
reference databases rather than mapping reads to the genomes of dietary
items, which are frequently unavailable. As publicly available genomic
data increase, so too might the viability of shallow shotgun-based
characterization of host diet from fecal samples.
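The two filtering strategies proposed above for limiting dietary confounds can be sketched as follows. This is an illustration under stated assumptions rather than the procedure used in this study: we assume a per-read table of top-level taxonomic labels and, optionally, a set of read IDs that aligned to dietary reference genomes. The names `select_reads_for_functional_profiling` and `MICROBIAL_GROUPS`, and the read IDs, are hypothetical.

```python
# Assumption: these top-level labels are the ones treated as "microbial".
MICROBIAL_GROUPS = {"Bacteria", "Archaea", "Fungi"}


def select_reads_for_functional_profiling(classifications,
                                          dietary_hits=frozenset(),
                                          microbial_only=True):
    """Return IDs of reads to pass on to functional profiling.

    classifications: dict mapping read ID -> top-level taxonomic label.
    dietary_hits: read IDs that aligned to a dietary reference genome.
    microbial_only: if True, keep only reads labelled as microbial.
    """
    kept = []
    for read_id, label in classifications.items():
        if read_id in dietary_hits:
            continue  # strategy 2: remove reads matching known dietary items
        if microbial_only and label not in MICROBIAL_GROUPS:
            continue  # strategy 1: restrict to reads identified as microbial
        kept.append(read_id)
    return kept


# Toy input (hypothetical read IDs and labels).
toy_classifications = {
    "read1": "Bacteria",
    "read2": "Archaea",
    "read3": "unclassified",
    "read4": "Bacteria",
}
toy_dietary_hits = {"read4"}  # e.g. a read that also aligned to a seaweed genome

strict = select_reads_for_functional_profiling(toy_classifications, toy_dietary_hits)
lenient = select_reads_for_functional_profiling(
    toy_classifications, toy_dietary_hits, microbial_only=False
)
```

The strict setting discards unclassified reads, some of which may be microbial, whereas the lenient setting retains them but depends on dietary reference genomes being available.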
Beyond the scope of any one study, we argue that shallow shotgun
metagenomic sequences have greater long-term value than 16S rRNA gene
amplicon datasets. Although the sequencing depth inherent to this
approach is too low for de novo assembly on a per-sample
basis, shallow shotgun sequencing can still generate a tremendous wealth
of metagenomic sequence data in aggregate. Co-assembly methods could be
used to create system-specific microbial reference genomes, using the
same sequencing data generated to profile the microbiome, yet at a cost
comparable to amplicon sequencing. Study-system-specific
metagenome-assembled genomes (MAGs) would
allow for more precise classifications and permit the use of functional
profilers that infer total genomic contents. Incidentally, sequencing
a breadth of samples at shallow depths might allow for more efficient
MAG assembly than deeply sequencing a handful of samples. Even if
co-assembled shallow shotgun reads are insufficient for MAG recovery,
they remain useful for identifying samples to target with strategic
supplemental deep short-read or long-read sequencing.
Beyond the microbiome, if sequences of dietary origin are abundant in
shotgun metagenomic datasets, co-assemblies might recover genomic
contigs from the host diet. These contigs could be used for diet
reconstruction (discussed above), reducing reliance on
short marker regions. However, this requires
that genomes of species closely related to the dietary items
be available, so that contigs can be classified. Alternatively, in
systems where an appreciable fraction of shotgun sequence reads is
derived from the host, shallow shotgun sequencing may be capable of
characterizing the microbiome while also providing low-pass genotyping
of the host.