2.6 Iso-Seq and ssRNA-Seq data processing and lncRNA identification
RNA preparation, library construction, and sequencing for Iso-Seq (Li et al., 2020) and ssRNA-Seq (Li et al., 2017) were described previously. All sequencing data were deposited in NCBI under BioProject IDs PRJNA198574 and PRJNA377165. For Iso-Seq, total RNA was extracted using TRIzol reagent (Life Technologies) and mRNA was enriched with oligo(dT) magnetic beads. The enriched mRNA was reverse transcribed into cDNA using the Clontech SMARTer PCR cDNA Synthesis Kit. Two libraries (Normal and Cold) were constructed and sequenced on the Pacific Biosciences (PacBio) Sequel II platform by Gene Denovo Biotechnology Co., Ltd. (Guangzhou, China). The raw reads were classified and clustered into consensus transcripts using the SMRT Link v5.0.1 pipeline (Gordon et al., 2015) provided by PacBio and then mapped to the reference genome using minimap2 (Li, 2018). Long non-coding RNAs (lncRNAs) were identified according to the pipeline described previously (Li et al., 2017). Transcripts in the intersection of the non-protein-coding potential results and the non-protein annotation results were chosen as lncRNA candidates. For ssRNA-Seq data processing, clean reads from two samples (Normal and Cold, three replicates per sample) were mapped to the full-length lncRNA isoforms and the cassava reference genome with HISAT (Kim, Langmead & Salzberg, 2015). Read counts for each lncRNA were quantified with RSEM (Li & Dewey, 2011), and the expression level of each transcript was estimated as fragments per kilobase of exon model per million mapped reads (FPKM). Differentially expressed lncRNAs were identified with the DESeq2 package (Love, Huber & Anders, 2014). Significant changes were defined by |log2 FC| > 1 and a multiple-testing-adjusted q-value (false discovery rate, FDR) < 5% as cut-offs.
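As an illustration of the three selection and quantification rules described above, the following minimal Python sketch shows the candidate intersection, the FPKM definition, and the significance cut-offs. It is not the pipeline used in this study: RSEM and DESeq2 perform these calculations internally (RSEM, for example, reports FPKM based on an effective transcript length), and the function names, argument names, and example values are hypothetical.

```python
import math


def select_lncrna_candidates(noncoding_by_potential, noncoding_by_annotation):
    """Keep only transcript IDs called non-coding by BOTH filters (set intersection)."""
    return set(noncoding_by_potential) & set(noncoding_by_annotation)


def fpkm(fragments, length_bp, total_mapped_fragments):
    """Fragments per kilobase of exon model per million mapped fragments.

    FPKM = fragments * 1e9 / (length_bp * total_mapped_fragments)
    """
    if length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    return fragments * 1e9 / (length_bp * total_mapped_fragments)


def is_significant(log2_fc, fdr, lfc_cutoff=1.0, fdr_cutoff=0.05):
    """Apply the |log2 FC| > 1 and FDR < 5% cut-offs (both strict inequalities)."""
    # A missing adjusted p-value (None or NaN) is treated as not significant.
    if fdr is None or math.isnan(fdr):
        return False
    return abs(log2_fc) > lfc_cutoff and fdr < fdr_cutoff


if __name__ == "__main__":
    # Hypothetical transcript IDs and values, for illustration only.
    candidates = select_lncrna_candidates(["tx1", "tx2", "tx3"], ["tx2", "tx3", "tx4"])
    print(sorted(candidates))
    print(fpkm(fragments=500, length_bp=2000, total_mapped_fragments=10_000_000))
    print(is_significant(log2_fc=-2.0, fdr=0.01))
```

Both cut-offs are applied as strict inequalities, so a transcript with a log2 fold change of exactly 1 is not retained, and the absolute value keeps both up- and down-regulated lncRNAs.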