2.4. Taxonomic and functional profiling
The sequencing reads were first quality-filtered using Trimmomatic [13] v0.36 and PRINSEQ [14] v0.20.4. Human reads were removed using KneadData v0.6.1 (https://bitbucket.org/biobakery/kneaddata). High-quality non-human reads were mapped against a custom database using Kraken2 [15] v2.0.9. The custom database comprised 29,943 complete microbial genomes downloaded on 3 May 2020: 19,362 bacterial, 368 archaeal, 9,346 viral, and 867 fungal. The complete bacterial, archaeal, and viral genomes were downloaded from the RefSeq database using the --download-library option of kraken2-build. The complete fungal genomes were manually downloaded from the GenBank database. The taxonomic classification results were filtered using a confidence score threshold of 0.20. Only species with more than 10 reads in at least one sample were retained. The species profiles were decontaminated (see below), and the reads derived from non-contaminant species were used for functional analysis. Functional profiling was performed using HUMAnN2 [16] v0.11.1. The abundance profiles of gene families (UniRef90) were summarized into abundances of KEGG Orthology (KO) groups, Enzyme Commission (EC) categories, EggNOG clusters of orthologous groups, and Pfam protein families. In addition, the non-contaminant reads were mapped against the protein homolog sequences of the antimicrobial resistance genes in the CARD database [17] (May 2020 release) using DIAMOND [18] v0.9.22.123.
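The species retention rule described above (more than 10 reads in at least one sample) can be illustrated with a minimal Python sketch. It is not the authors' code: the function name `retain_species`, the dictionary layout (species name mapped to per-sample read counts), and the toy counts are all hypothetical and serve only to make the threshold logic concrete.

```python
from typing import Dict, List

# Threshold from the text: a species is kept only if it has strictly
# more than 10 reads in at least one sample.
MIN_READS = 10


def retain_species(
    counts: Dict[str, List[int]], min_reads: int = MIN_READS
) -> Dict[str, List[int]]:
    """Keep species whose read count exceeds min_reads in any sample.

    `counts` maps a species name to its read counts, one per sample
    (a hypothetical layout chosen for this illustration).
    """
    return {
        species: per_sample
        for species, per_sample in counts.items()
        if any(reads > min_reads for reads in per_sample)
    }


# Hypothetical toy data: three species across three samples.
toy_counts = {
    "Species A": [0, 11, 3],     # kept: 11 reads in the second sample
    "Species B": [10, 10, 10],   # dropped: never exceeds 10 reads
    "Species C": [250, 0, 0],    # kept: 250 reads in the first sample
}

kept = retain_species(toy_counts)
print(sorted(kept))
```

Note that the comparison is strict, so a species with exactly 10 reads in every sample is removed, which matches the wording "more than 10 reads".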