Introduction
Our understanding of host-associated microbiomes has relied heavily upon
16S rRNA gene amplicon sequencing of bacterial communities.
Amplicon-based profiling of the microbiome is affordable and supported
by a suite of accessible bioinformatic and analytical tools. But despite
their popularity, amplicon sequencing data: 1) are biased by 16S rRNA
gene copy number variation, 2) often only resolve to the level of genus
and 3) do not provide direct estimates of microbiome functional
potential. Hence, our ability to interpret the causes and consequences
of microbiome variation using 16S data is too often relegated to the
realm of speculation based on coarse taxonomic profiles.
In response to the limitations inherent in 16S approaches, there is a
growing interest in shotgun metagenomic sequencing. By randomly
sequencing the entire microbial genomic content of a sample, shotgun
sequencing can reconstruct microbial communities at finer resolutions
than amplicon sequencing, and provide direct estimates of microbiome
functional potential from microbial gene contents. However, the library
preparations and deep sequencing (10–100 million read-pairs per
sample) required to generate shotgun metagenomic data can be an order of
magnitude more expensive than amplicon sequencing on a per-sample basis.
These expenses make shotgun metagenomic sequencing infeasible for
large sample-sets.
Recent reductions in the per-base-cost of DNA sequencing and the
development of in-house and commercially available high-throughput
library preparation techniques have made shotgun sequencing more
affordable. Cost reductions notwithstanding, deep sequencing can still
be prohibitively expensive for large sample-sets. However, depending on
the research questions of interest, the characterization of major
patterns in microbiota communities and functional profiles may not
require deep sequencing.
Shotgun sequencing at lower depths than is conventional (shallow shotgun
sequencing) has been proposed as a cost-effective method for
characterizing microbial communities. Although not an equivalent
substitute for deep shotgun sequencing, shallow shotgun sequencing can
outperform amplicon-based community characterization at comparable
costs, with the additional benefit of capturing major variation in
microbial gene content. In an early test of shallow shotgun sequencing
capabilities in humans (retroauricular crease, stool, sub/supragingival
plaque, and tongue dorsum microbiomes), accurate species-level
differential abundances were observed among taxa that occurred within
samples at relative abundances as low as 0.05% of reads, in datasets
rarefied to 0.5 million reads/sample. Furthermore, at depths of only
1000 reads/sample, coarse biological patterns in alpha and beta
diversity were still evident.
Despite its promise, shallow shotgun sequencing has limitations. For
instance, the same low sequencing depth that makes shallow shotgun
sequencing economical also renders de novo assembly approaches
ineffective, which forces a reliance on read-based profilers. Therefore,
shallow shotgun sequencing data cannot be used for novel gene discovery,
identification of rare taxa, or to create metagenome assembled genomes
(MAGs) using single sample assemblies. An inability to use de
novo assembly approaches means that shallow shotgun reads can only be
classified if they match references within genome databases. Publicly
available microbial genomes are heavily biased towards isolates or MAGs
from humans, and lab or production animals. Therefore, the utility of
shallow shotgun sequencing needs to be assessed for host-associated
microbial communities which are likely to be underrepresented within
genome databases.
Here, we evaluate the ability of shallow shotgun sequencing to
characterize the taxonomic composition and functional potential of the
fecal microbiome of a free-ranging horse (Equus ferus caballus)
population living on Sable Island, Nova Scotia, Canada. Although horses
have been the subject of many 16S rRNA gene amplicon studies,
including studies of Sable Island horses, they have not benefitted from deep
shotgun metagenomic studies. As a prevalent domesticated mammal, the
major bacterial clades observed in horses might be similar to those
observed in other domesticated species, which also originate from a
human agricultural environment, but which have been the subject of
deeply sequenced metagenomic studies, for example: cows, pigs, sheep,
and chickens. Therefore, bacterial species unique to Sable Island
horses are likely to have close relatives in available microbial genome
reference databases, and so may be suitable for shallow shotgun
sequencing.
First, to determine the depth at which shallow shotgun sequencing
remains viable, we analyzed a successively rarefied, deeply sequenced
dataset of 16 fecal microbiome samples. Second, to validate the efficacy
of more affordable library preparation methods, we compared sequencing
results generated using prevailing library preparation methods (Illumina
Nextera XT) to those created using a new high-throughput technique
(iGenomx Riptide, now Twist Bioscience Riptide). Third, we compared
shallow shotgun sequencing data to 16S rRNA gene amplicon sequencing of
the same DNA extracts, to quantify the concordance between amplicon and
shallow shotgun metagenomic based estimates of microbiota community
structure. Fourth, using an expanded 83-sample dataset, we tested
whether biological patterns in the microbiome that were first
observed in a 16S rRNA gene amplicon dataset (e.g., diet effects and
spatial structuring) could be replicated via shallow shotgun sequencing
of the same samples. Fifth, we re-analyzed this 83-sample dataset using
profiles of microbiome functional potential derived from shallow shotgun
sequencing, to evaluate the purported advantage of a shallow shotgun
sequencing approach.
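
As an illustration of what successive rarefaction involves, the sketch below subsamples a single sample's read counts, without replacement, to a series of decreasing depths. It is a minimal example under stated assumptions and not the pipeline used in this study: the function names (`rarefy`, `successive_rarefaction`), the toy counts, and the chosen depths are all hypothetical, and each depth is drawn independently from the full sample rather than nested within the previous subsample.

```python
import numpy as np

def rarefy(counts, depth, rng):
    """Draw `depth` reads without replacement from per-taxon read counts."""
    counts = np.asarray(counts, dtype=np.int64)
    if depth > counts.sum():
        raise ValueError("depth exceeds the number of reads in the sample")
    # Expand counts into one taxon label per read, then subsample the reads.
    reads = np.repeat(np.arange(counts.size), counts)
    kept = rng.choice(reads, size=depth, replace=False)
    # Collapse the retained reads back into per-taxon counts.
    return np.bincount(kept, minlength=counts.size)

def successive_rarefaction(counts, depths, seed=0):
    """Rarefy one sample to each depth (hypothetical helper, deepest first)."""
    rng = np.random.default_rng(seed)
    return {d: rarefy(counts, d, rng) for d in sorted(depths, reverse=True)}

# Hypothetical toy sample: read counts for five taxa (not data from this study).
toy_counts = [5000, 3000, 1500, 450, 50]
profiles = successive_rarefaction(toy_counts, depths=[5000, 1000, 100])

for depth, profile in profiles.items():
    observed_taxa = int((profile > 0).sum())
    print(depth, profile.tolist(), "observed taxa:", observed_taxa)
```

Comparing the profiles across depths shows how low-abundance taxa tend to drop out as depth decreases, which is the kind of information loss a rarefaction series is intended to expose.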