3.1 De novo transcriptome assembly and functional annotation
Raw reads generated by RNA sequencing for each of the three tissues
(brain, liver, and cochlea) from R. episcopus, R. rex, R. marshalli,
R. osgoodi, and R. pusillus (Table 1) ranged from 44.617 Mb to
64.037 Mb (Supporting Information Table S2).
After quality control, about 44.127 Mb (6.62 Gb) to 63.363 Mb (9.5 Gb)
of clean reads from the three tissues remained for de novo assembly
(Supporting Information Table S2). The clean reads from the three
tissues were pooled for each species. We assembled the transcripts for
each species and extracted the unigenes for the subsequent analyses.
Among the unigene sets, the total nucleotide length was greatest for
R. episcopus (199.489 Mb) and smallest for R. osgoodi (138.272 Mb). The
contig N50 values of all samples ranged from 914 bp to 1,173 bp
(Supporting Information Table S3).
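For readers unfamiliar with the N50 statistic reported above, the following is a minimal sketch of how it is conventionally computed from a list of contig lengths. It is not the pipeline used in this study; the function name n50 and the example lengths are hypothetical and serve only as illustration.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (in bp).

    N50 is the length L such that contigs of length >= L together
    contain at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("at least one contig length is required")

    # Walk through contigs from longest to shortest, accumulating bases.
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical contig lengths (bp), not data from this study.
example_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
example_n50 = n50(example_lengths)
```

A higher N50 indicates that more of the assembled sequence lies in longer contigs, which is why it is commonly reported alongside total assembly length.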
GO terms were identified for the annotated genes across the three GO
ontologies (biological process, molecular function, and cellular
component) (Supporting Information Figure S1). The GO term
distributions were similar among the species, and more unigenes were
annotated under biological process than under the other two
ontologies. The GO annotations thus covered the main GO categories,
supporting the reliability of the downstream functional analyses of
the candidate genes.