3.1 De novo transcriptome assembly and functional annotation
Raw reads generated by RNA sequencing of each of the three tissues (brain, liver, and cochlea) from R. episcopus, R. rex, R. marshalli, R. osgoodi, and R. pusillus (Table 1) ranged from 44.617 Mb to 64.037 Mb (Supporting Information Table S2). After quality control, about 44.127 Mb (6.62 Gb) to 63.363 Mb (9.5 Gb) of clean reads from the three tissues remained for de novo assembly (Supporting Information Table S2). The clean reads of the three tissues were pooled for each species. We obtained the transcripts for each species and extracted the unigenes for the following analyses. Among the unigene sets, the nucleotide length was greatest for R. episcopus (199.489 Mb) and smallest for R. osgoodi (138.272 Mb). The contig N50 values of all samples ranged from 914 bp to 1,173 bp (Supporting Information Table S3).
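For readers unfamiliar with the contig N50 statistic reported above, the following is a minimal sketch of how it is commonly computed: contigs are sorted from longest to shortest, and N50 is the length of the contig at which the running total first reaches half of the total assembly length. The function name `n50` and the toy lengths are illustrative assumptions; they are not the pipeline or the data used in this study.

```python
def n50(lengths):
    """Return the N50 (in bp) of a collection of contig lengths."""
    if not lengths:
        raise ValueError("need at least one contig length")
    # Sort contigs from longest to shortest.
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first contig that brings the running sum to at least
        # half of the total assembly length defines N50.
        if running >= half_total:
            return length


# Toy contig lengths in bp (hypothetical, not from Table S3).
toy_lengths = [2000, 1500, 1200, 900, 600, 400, 300]
print(n50(toy_lengths))
```

With these toy values the total length is 6,900 bp, and the two longest contigs together (3,500 bp) already exceed half of that, so the reported N50 is the length of the second contig.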
GO terms were identified for each of the annotated genes and covered the three ontology categories (biological process, molecular function, and cellular component) (Supporting Information Figure S1). The GO terms were similar across these species, and the biological process category annotated more unigenes than the other categories. The GO annotation broadly represented the main GO classifications, supporting the integrity of the downstream functional analyses of the candidate genes.