Transcriptome Assembly and Transcriptome Assessment
To generate a comprehensive transcriptome for N.
riversi, we combined all RNAseq data and generated a de novo transcriptome assembly using Trinity v2.10.0 (Haas et al. 2013).
We used default settings, but normalized the input reads in silico based
on the calculated maximum read coverage. To provide a quantitative
assessment of transcriptome completeness, we first assessed the number
of full-length transcripts using blastx v2.7.1 (Camacho et al. 2009) to query the UniProt Swiss-Prot database
(UniProt Consortium 2019). We then examined alignment scores relative to
a set of near-universal single-copy orthologs using the software BUSCO
v2.0 (Seppey et al. 2019b). We selected the reference gene set
for Endopterygota (OrthoDB v9), which contains 2,442 genes. To
further refine the transcriptome assembly, a super-transcriptome was
generated by merging the de novo transcriptome from
Trinity and the annotated genome assembly (see below) using
Necklace v1.11 (Davidson & Oshlack 2018). The goal of this
step was to produce a compact but comprehensive set of transcribed
genes that reflects the total evidence. The super-transcriptome assembly has
been made publicly available on NCBI (TSA Accession: GIWW00000000).
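
The full-length transcript assessment described above can be illustrated with a minimal sketch. It assumes the blastx results were written in tabular form with the columns qseqid, sseqid, sstart, send and slen (for example via `-outfmt "6 qseqid sseqid sstart send slen"`), which is an assumption made for illustration and not necessarily the command used here. The function names and the 80% coverage cutoff are likewise hypothetical. For each Swiss-Prot protein, the sketch records the largest fraction of its length covered by any single alignment, then counts the proteins covered at or above the cutoff. It does not merge multiple alignments to the same protein, so it may underestimate coverage.

```python
def parse_hits(lines):
    """Yield (sseqid, sstart, send, slen) from tab-separated blastx rows.

    Assumed column order (hypothetical): qseqid, sseqid, sstart, send, slen.
    """
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        _qseqid, sseqid, sstart, send, slen = line.split("\t")
        yield sseqid, int(sstart), int(send), int(slen)


def best_coverage(lines):
    """Map each protein to the highest percent of its length covered
    by any single alignment."""
    best = {}
    for sseqid, sstart, send, slen in parse_hits(lines):
        span = abs(send - sstart) + 1  # aligned positions on the protein
        cov = 100.0 * span / slen
        best[sseqid] = max(cov, best.get(sseqid, 0.0))
    return best


def count_full_length(best, min_cov=80.0):
    """Count proteins covered at or above min_cov percent.

    The 80% default is an illustrative cutoff, not a value from the study.
    """
    return sum(1 for cov in best.values() if cov >= min_cov)


if __name__ == "__main__":
    # Made-up example rows, for demonstration only.
    example = [
        "t1\tP1\t1\t100\t100",
        "t2\tP1\t1\t50\t100",
        "t3\tP2\t11\t40\t60",
    ]
    coverage = best_coverage(example)
    print(coverage)
    print(count_full_length(coverage))
```

In practice the same functions could be applied to an open file handle instead of the example list, since both are iterables of lines.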