Genomic DNA extraction, library construction, and sequence analysis
A Water DNA Kit (Bioteke, Beijing, China) was used to extract genomic
DNA from water samples, and a NucleoSpin Soil Kit (Macherey-Nagel, North
Rhine-Westphalia, Germany) was used to extract genomic DNA from sediment
samples, according to the manufacturers' protocols. The extracted DNA
was quantified with a Qubit Fluorometer (Thermo Fisher, Waltham, MA,
USA) and a Qubit dsDNA BR Assay Kit (Invitrogen, Carlsbad, CA, USA), and
DNA quality was checked by 1% agarose gel electrophoresis.
The V4–V5 variable regions of the bacterial 16S rRNA genes were
amplified with degenerate PCR primers, 515F (5′-GTGCCAGCMGCCGCGGTAA-3′)
and 907R (5′-CCGTCAATTCMTTTRAGTTT-3′) (Shan et al., 2015). Both forward
and reverse primers were tagged with Illumina adapter, pad, and linker
sequences. PCR amplification was performed in a 50-µL reaction
containing 30 ng of template DNA, fusion PCR primers, and PCR master
mix. The cycling conditions were as follows: initial denaturation at
95°C for 3 min; 30 cycles of 95°C for 45 s, 56°C for 45 s, and 72°C for
45 s; and a final extension at 72°C for 10 min. The PCR products were
purified using Agencourt AMPure XP beads (Beckman Coulter, Brea, CA,
USA) and eluted in elution buffer. Library quality was assessed using an
Agilent 2100 Bioanalyzer (Agilent Technologies, Palo Alto, CA, USA). The
validated libraries were sequenced on an Illumina MiSeq platform
(Illumina, San Diego, CA, USA) following the standard protocol,
generating 2 × 250-bp paired-end reads.
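
As an illustration of what "degenerate" means for the two primers above, the following minimal Python sketch expands the IUPAC ambiguity codes (M = A or C; R = A or G) into the individual oligonucleotides each primer represents. The function and variable names are hypothetical and are not part of the original workflow; only the primer sequences are taken from the text.

```python
from itertools import product

# IUPAC nucleotide codes mapped to the bases each code stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(primer: str) -> list:
    """Return every non-degenerate sequence encoded by an IUPAC primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


# Primer sequences as given in the text (5' to 3').
PRIMER_515F = "GTGCCAGCMGCCGCGGTAA"
PRIMER_907R = "CCGTCAATTCMTTTRAGTTT"

variants_515f = expand_degenerate(PRIMER_515F)
variants_907r = expand_degenerate(PRIMER_907R)

print(len(variants_515f), "variants for 515F")
print(len(variants_907r), "variants for 907R")
```

With one ambiguity code in 515F and two in 907R, the sketch yields two and four oligonucleotide variants, respectively.
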
All raw sequences from this study have been deposited in the National
Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA)
database under the BioProject number PRJNA774972 and the accession
number SUB10569930.
The raw FASTQ files were processed and analyzed using the QIIME2
platform (v2019.02; https://qiime2.org)
(Hall et al., 2018). Quality control, annotation, and statistical
calculations were performed using the standard QIIME2 pipeline
(Bokulich et al., 2018). Sequences were demultiplexed to assign each
read to its sample of origin. The sequences were then denoised using
DADA2 (Prodan et al., 2020) to obtain amplicon sequence variants (ASVs;
equivalent to operational taxonomic units defined at 100% sequence
similarity) (Gonzalez et al., 2019). The ASVs were taxonomically
classified against the Greengenes 16S rRNA gene database (Bolyen et al.,
2019). Before subsequent analysis, all chloroplast, mitochondrial,
archaeal, and eukaryotic sequences were removed. To minimize the
influence of unequal sequencing effort, the ASV table was rarefied to an
equal sequencing depth across samples. ASVs with a relative abundance of
<0.01% were also removed to produce the ASV abundance table for each
sample (Sogin et al., 2006).
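
The rarefaction and low-abundance filtering steps described above can be sketched in plain Python as follows. This is a minimal illustration and not the QIIME2 implementation used in the study: the function names, the example counts, and the rarefaction depth of 50,000 reads are hypothetical, and the sketch assumes that the 0.01% threshold is applied to per-sample relative abundance.

```python
import random
from collections import Counter


def rarefy(counts: dict, depth: int, seed: int = 0) -> dict:
    """Subsample reads without replacement so the sample totals `depth` reads."""
    total = sum(counts.values())
    if total < depth:
        raise ValueError("sample has fewer reads than the rarefaction depth")
    # Build one entry per read, then draw `depth` reads without replacement.
    pool = [asv for asv, n in counts.items() for _ in range(n)]
    drawn = random.Random(seed).sample(pool, depth)
    return dict(Counter(drawn))


def drop_rare(counts: dict, min_rel_abundance: float = 1e-4) -> dict:
    """Keep ASVs whose relative abundance is at least 0.01% (1e-4) of the sample."""
    total = sum(counts.values())
    return {asv: n for asv, n in counts.items() if n / total >= min_rel_abundance}


# Hypothetical read counts for a single sample (100,000 reads in total).
sample = {"ASV1": 60000, "ASV2": 39997, "ASV3": 3}

rarefied = rarefy(sample, depth=50000)  # hypothetical depth
filtered = drop_rare(rarefied)

print(sum(rarefied.values()), "reads after rarefaction")
print(sorted(filtered), "ASVs retained after filtering")
```

In this toy sample, ASV3 can contribute at most 3 of the 50,000 subsampled reads (0.006%), so it always falls below the 0.01% threshold and is dropped, whereas the two abundant ASVs are retained.
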