2.12 | Library preparation, transcriptome sequencing, and analysis
RNA integrity and total amount were assessed with an Agilent 2100 Bioanalyzer. Standard (non-strand-specific) libraries were constructed with the NEBNext® Ultra™ RNA Library Prep Kit for Illumina®, and strand-specific libraries with the NEBNext® Ultra™ Directional RNA Library Prep Kit for Illumina®. After the libraries passed quality control, they were pooled according to their effective concentrations and the target amount of downstream data, and then sequenced on an Illumina platform, which uses sequencing by synthesis. Library construction and Illumina sequencing were performed by Novogene Co., Ltd. (Beijing, China).
Clean data were obtained by quality trimming of the raw data, and the Q20, Q30, and GC content of the clean data were calculated. All downstream analyses were based on the high-quality clean data. The genome of Pse. libanotica was used as the reference genome. Reads were aligned with HISAT2 (v2.0.5), which was chosen because it can build a database of splice junctions from the gene model annotation file and thus gives better mapping results than non-splice-aware mapping tools. The mapped reads of each sample were assembled with StringTie (v1.3.3b) (Pertea et al., 2015) using a reference-based approach.
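The quality metrics named above can be illustrated with a minimal sketch. It assumes Phred+33 quality encoding and defines Q20 and Q30 as the fractions of bases with a Phred score of at least 20 and 30, respectively; the function name `read_quality_summary` and the input layout are hypothetical and do not describe the sequencing provider's actual pipeline.

```python
def read_quality_summary(reads, phred_offset=33):
    """Summarize Q20, Q30, and GC content for a set of reads.

    `reads` is an iterable of (sequence, quality_string) pairs, as they
    would appear in FASTQ records. Phred+33 encoding is assumed.
    """
    total_bases = 0
    q20_bases = 0
    q30_bases = 0
    gc_bases = 0

    for sequence, quality in reads:
        total_bases += len(sequence)
        # Count G and C bases, ignoring case.
        gc_bases += sum(1 for base in sequence.upper() if base in "GC")
        for char in quality:
            score = ord(char) - phred_offset  # decode the Phred score
            if score >= 20:
                q20_bases += 1
            if score >= 30:
                q30_bases += 1

    if total_bases == 0:
        raise ValueError("no bases to summarize")

    return {
        "q20": q20_bases / total_bases,
        "q30": q30_bases / total_bases,
        "gc": gc_bases / total_bases,
    }


if __name__ == "__main__":
    # One toy read of four bases with Phred scores 40, 40, 20, and 2.
    print(read_quality_summary([("GCAT", "II5#")]))
```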
Reads mapped to each gene were counted with featureCounts (v1.5.0). The FPKM of each gene was then calculated from its read count and gene length. Differential expression analysis between two conditions/groups was performed with the DESeq2 R package (v1.20.0). The resulting P-values were adjusted with the Benjamini-Hochberg procedure to control the false discovery rate. Genes with an adjusted P-value < 0.05 in the DESeq2 analysis were considered differentially expressed.
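The two calculations in this paragraph can be sketched as follows. The sketch assumes the conventional definition FPKM = count × 10^9 / (gene length in bp × total mapped fragments), with the total taken as the sum of counts over all genes, and the standard Benjamini-Hochberg step-up adjustment. The function names are hypothetical; this is not the code of featureCounts or DESeq2.

```python
def fpkm(counts, lengths):
    """Compute FPKM per gene from raw counts and gene lengths (bp).

    Assumption: the library size is the sum of counts over all genes.
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no mapped fragments")
    return {
        gene: count * 1e9 / (lengths[gene] * total)
        for gene, count in counts.items()
    }


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(p_values)
    # Indices of the P-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


if __name__ == "__main__":
    toy_counts = {"geneA": 100, "geneB": 300}
    toy_lengths = {"geneA": 1000, "geneB": 2000}
    print(fpkm(toy_counts, toy_lengths))

    toy_p = [0.01, 0.04, 0.03, 0.5]
    padj = benjamini_hochberg(toy_p)
    # Genes passing the adjusted P-value < 0.05 threshold used in the text.
    print([p < 0.05 for p in padj])
```

Note that DESeq2 models raw counts directly; FPKM values serve only as a length-normalized summary of expression.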
Gene Ontology (GO) enrichment analysis of the differentially expressed genes was performed with the clusterProfiler R package, with correction for gene length bias. GO terms with a corrected P-value < 0.05 were considered significantly enriched in differentially expressed genes. The clusterProfiler R package was also used to test the statistical enrichment of differentially expressed genes in KEGG pathways.
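The statistical idea behind such an enrichment test can be sketched as a one-sided hypergeometric (over-representation) test. This is an assumption about the test form, not a reproduction of the clusterProfiler implementation; it does not include the gene length bias correction mentioned above, and the function name and arguments are hypothetical.

```python
from math import comb


def enrichment_p_value(hits, n_de, n_annotated, n_background):
    """One-sided hypergeometric P-value for term over-representation.

    hits         -- differentially expressed genes annotated to the term
    n_de         -- all differentially expressed genes
    n_annotated  -- background genes annotated to the term
    n_background -- all background genes
    Returns P(X >= hits) when n_de genes are drawn without replacement.
    """
    upper = min(n_de, n_annotated)
    denominator = comb(n_background, n_de)
    numerator = sum(
        comb(n_annotated, i) * comb(n_background - n_annotated, n_de - i)
        for i in range(hits, upper + 1)
    )
    return numerator / denominator


if __name__ == "__main__":
    # Toy example: 10 background genes, 5 annotated to the term,
    # 2 differentially expressed genes, both annotated to the term.
    print(enrichment_p_value(hits=2, n_de=2, n_annotated=5, n_background=10))
```

In practice the P-values from all tested GO terms or KEGG pathways would then be corrected for multiple testing before the 0.05 threshold is applied.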