Genomic DNA extraction, library construction, and sequence analysis
Genomic DNA was extracted from water samples with a Water DNA Kit (Bioteke, Beijing, China) and from sediment samples with a NucleoSpin Soil Kit (Macherey-Nagel, North Rhine-Westphalia, Germany), following the manufacturers' protocols. The extracted DNA was quantified with a Qubit Fluorometer (Thermo Fisher, Waltham, MA, USA) and a Qubit dsDNA BR Assay Kit (Invitrogen, Carlsbad, CA, USA), and DNA quality was checked by 1% agarose gel electrophoresis.
The V4–V5 variable regions of the bacterial 16S rRNA genes were amplified with the degenerate PCR primers 515F (5′-GTGCCAGCMGCCGCGGTAA-3′) and 907R (5′-CCGTCAATTCMTTTRAGTTT-3′) (Shan et al., 2015). Both forward and reverse primers were tagged with Illumina adapter, pad, and linker sequences. PCR amplification was performed in a 50-µL reaction containing 30 ng of template DNA, fusion PCR primers, and PCR master mix. The PCR cycling conditions were as follows: 95°C for 3 min; 30 cycles of 95°C for 45 s, 56°C for 45 s, and 72°C for 45 s; and a final extension at 72°C for 10 min. The PCR products were purified using Agencourt AMPure XP beads (Beckman-Coulter, Brea, CA, USA) and eluted in elution buffer. The DNA libraries were qualified using an Agilent 2100 Bioanalyzer (Agilent Technologies, Palo Alto, CA, USA). The validated libraries were sequenced on an Illumina MiSeq platform (Illumina, San Diego, CA, USA) following the standard pipelines, which generated 2 × 250-bp paired-end reads. All raw sequences from this study have been deposited in the National Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA) database under the BioProject number PRJNA774972 and the accession number SUB10569930.
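As an illustration of what "degenerate" means for the two primers above, the following minimal Python sketch expands the IUPAC ambiguity codes (M = A or C; R = A or G) into the unambiguous oligonucleotides that each primer represents. The function name `expand_degenerate` and the variable names are hypothetical and are not part of the original workflow.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(primer):
    """Return every unambiguous sequence encoded by a degenerate primer.

    Hypothetical helper for illustration only.
    """
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


# Primer sequences as given in the text (5' to 3').
PRIMER_515F = "GTGCCAGCMGCCGCGGTAA"
PRIMER_907R = "CCGTCAATTCMTTTRAGTTT"

variants_515f = expand_degenerate(PRIMER_515F)  # one M -> 2 sequences
variants_907r = expand_degenerate(PRIMER_907R)  # one M and one R -> 4 sequences

for name, variants in (("515F", variants_515f), ("907R", variants_907r)):
    print(name, len(variants))
    for seq in variants:
        print("  ", seq)
```

Under these definitions, 515F corresponds to a mixture of two oligonucleotides and 907R to a mixture of four.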
The raw FASTQ files were processed and analyzed using the QIIME2 v2019.02 (https://qiime2.org) platform (Hall et al., 2018). Quality control, annotation, and statistical calculations were performed with the standard QIIME2 pipeline (Bokulich et al., 2018). Demultiplexing was conducted to assign each sequence to its source sample. The sequences were then denoised with DADA2 (Prodan et al., 2020) to obtain amplicon sequence variants (ASVs, i.e., operational taxonomic units defined at 100% sequence similarity) (Gonzalez et al., 2019). The resulting ASVs were taxonomically classified against the Greengenes 16S rRNA gene database (Bolyen et al., 2019). Before subsequent analysis, all chloroplast, mitochondrial, archaeal, and eukaryotic sequences were removed. To minimize the influence of unequal sequencing effort, each sample in the ASV table was rarefied to a common sequencing depth. All ASVs with a relative abundance of <0.01% were also removed to produce the final ASV abundance table for each sample (Sogin et al., 2006).
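The last two steps (rarefaction and removal of low-abundance ASVs) can be sketched in plain Python as follows. This is only an illustration under stated assumptions, not the QIIME2 implementation used in the study: the functions `rarefy` and `drop_rare`, the toy counts, the rarefaction depth, and the random seed are all hypothetical, and the 0.01% threshold is assumed here to be applied within each sample after rarefaction.

```python
import random
from collections import Counter


def rarefy(counts, depth, seed=0):
    """Subsample one sample's ASV counts to `depth` reads without replacement.

    Hypothetical helper; `counts` maps ASV identifiers to read counts.
    """
    total = sum(counts.values())
    if total < depth:
        raise ValueError("sample has fewer reads than the rarefaction depth")
    # Expand the counts into one entry per read, then draw `depth` reads.
    pool = [asv for asv, n in counts.items() for _ in range(n)]
    picked = random.Random(seed).sample(pool, depth)
    return dict(Counter(picked))


def drop_rare(counts, min_rel_abundance=1e-4):
    """Remove ASVs whose relative abundance is below the threshold (0.01%)."""
    total = sum(counts.values())
    return {asv: n for asv, n in counts.items()
            if n / total >= min_rel_abundance}


# Toy ASV table (assumed values): two samples with unequal sequencing depth.
table = {
    "sample_A": {"ASV1": 60000, "ASV2": 39997, "ASV3": 3},
    "sample_B": {"ASV1": 20000, "ASV2": 30000},
}

# Assumption: rarefy every sample to the depth of the shallowest sample.
depth = min(sum(c.values()) for c in table.values())

filtered = {}
for sample, counts in table.items():
    rarefied = rarefy(counts, depth)
    filtered[sample] = drop_rare(rarefied)
    print(sample, sum(rarefied.values()), filtered[sample])
```

Rarefying to the shallowest sample is one common choice; a fixed depth chosen from rarefaction curves would slot into the same code by replacing `depth`.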