3.1 Overview of cassava lncRNAs detected by long-read and
short-read transcriptome sequencing
In this study, combined analysis of Iso-Seq and ssRNA-seq data yielded
a total of 70 146 unique transcripts, of which 95.7% were predicted to
be protein-coding and 4.3% were predicted to be long non-coding RNAs
(lncRNAs). The number of lncRNAs predicted from the Iso-Seq data is
shown in Figure 1a.
A total of 3 004 high-confidence full-length lncRNAs were identified
with an average length of 1 424 bp. To study these lncRNAs in detail, we
classified them into 5 different categories based on their genomic
distribution and potential functions. Among these lncRNAs, a large
proportion (44.6%) were sense lncRNAs, 19.5% were produced from the
antisense strand of protein-coding genes, and 19.2% appeared to be
intergenic non-coding RNAs. As in Arabidopsis (Liu, Jung, Xu, Wang,
Deng, Bernad, Arenas-Huertero & Chua, 2012), only a few cassava lncRNAs
(11 of 3 004) acted as precursor transcripts of miRNAs or siRNAs, as
determined by aligning miRNA sequences from miRBase to our lncRNA
collection (Table S1). We then estimated the expression level of each transcript
using FPKM and found that these lncRNAs had lower expression levels
than mRNAs (Figure 1b), and that 13.7% of them were detected
specifically under either cold or normal conditions. We also found that
the expression levels of the sense and intronic lncRNAs were higher than
those of other lncRNA types but slightly lower than those of mRNAs. By
comparing global expression levels of
lncRNAs between normal and cold-treated conditions (Figure 1c), we
detected 316 lncRNAs whose expression was significantly altered by cold
treatment, of which 139 were induced (Figure 1d-e, Table S1).
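
For readers unfamiliar with the FPKM measure used above, the following is a minimal sketch of the standard definition (fragments per kilobase of transcript per million mapped fragments). It is not the quantification pipeline used in this study, which is not described in this section. The function name and the example numbers are hypothetical and chosen only for illustration.

```python
def fpkm(fragment_count, transcript_length_bp, total_mapped_fragments):
    """Return FPKM for one transcript.

    FPKM = fragments * 1e9 / (transcript length in bp * total mapped fragments)
    The factor 1e9 combines the per-kilobase (1e3) and per-million (1e6) scalings.
    """
    if transcript_length_bp <= 0:
        raise ValueError("transcript_length_bp must be positive")
    if total_mapped_fragments <= 0:
        raise ValueError("total_mapped_fragments must be positive")
    return fragment_count * 1e9 / (transcript_length_bp * total_mapped_fragments)


# Hypothetical example: 100 fragments mapped to a 1 424 bp transcript
# in a library of 20 million mapped fragments (illustrative values only).
example_value = fpkm(100, 1424, 20_000_000)
print(f"FPKM = {example_value:.2f}")
```

Because FPKM divides by both transcript length and library size, values can be compared between transcripts of different lengths within a sample; comparisons across samples, such as normal versus cold-treated conditions, typically require additional statistical testing.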