Transcriptome sequencing and annotation of Macquarie perch and golden perch genomes
To facilitate genome annotation, we performed mRNA sequencing for both species. For Macquarie perch, we used an adult female (sample ID MP_527) captured in November 2017 from Lake Dartmouth; samples of liver, ovary, brain, kidney and muscle were collected. For golden perch, we used an adult (sample ID GPTT01, aged at 3 years by otolith analysis), sexed in the field as putatively male and captured in the Ovens River, Victoria, in April 2018; samples of brain, gills, heart, gonads, kidney, liver and cheek muscle were collected. Immediately after the fish were humanely killed, tissue samples were collected, preserved in DNA/RNA Shield (Zymo Research) and stored at -80°C (Macquarie perch) or -20°C (golden perch).
Total RNA was extracted from individual tissue samples using Quick-RNA Kits (Zymo Research) and quantified using a TapeStation (Agilent), and 440 ng of RNA per tissue was pooled. The pooled RNA was enriched for mRNA with poly-T beads using the NEBNext® Poly(A) mRNA Magnetic Isolation Module (NEB). The enriched mRNA was processed using Universal Plus mRNA-Seq Library Preparation Kits. The libraries were pooled with libraries from other projects and sequenced on one of the four lanes of an S4 flow cell on an Illumina NovaSeq 6000, with the aim of obtaining 20 Gb of data per sample.
A repeat library was constructed de novo for the assembled genome with RepeatModeler2 (Flynn et al., 2020) and used to soft-mask repeats in the genome with RepeatMasker v4.0.9 (Smit, Hubley, & Green, 2013-2015). Transcriptome reads were aligned to the repeat-masked genome using STAR v2.7.1a (Dobin et al., 2013). The transcriptome alignment (a single-species BAM file) and the repeat-masked genome were used as input for protein-coding gene prediction in BRAKER v2.1.2 (Bruna, Hoff, Stanke, Lomsadze, & Borodovsky, 2020). The predicted proteomes were functionally annotated using InterProScan 5 (Jones et al., 2014).
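The order of the annotation steps described above can be sketched as a list of command lines for one species. This is a minimal illustration, not the commands actually run in this study: the file names, thread count, working-directory layout, paired-end gzipped reads, and output paths (the RepeatModeler "-families.fa" file and the BRAKER "augustus.hints.aa" protein file) are assumptions, and the function build_annotation_commands is hypothetical. The sketch only builds and prints the commands; it does not execute them.

```python
import shlex
from pathlib import Path


def build_annotation_commands(genome, reads_r1, reads_r2, species,
                              threads=16, workdir="annotation"):
    """Return the ordered command lines for one species (nothing is executed).

    All paths and option values are illustrative assumptions, not the
    settings used in the study.
    """
    work = Path(workdir)
    t = str(threads)
    repeat_db = str(work / f"{species}_repeats")      # RepeatModeler database name
    repeat_lib = repeat_db + "-families.fa"           # assumed RepeatModeler output
    mask_dir = work / "masked"
    masked = str(mask_dir / (Path(genome).name + ".masked"))
    star_index = str(work / "star_index")
    star_prefix = str(work / "star" / f"{species}_")
    bam = star_prefix + "Aligned.sortedByCoord.out.bam"
    braker_dir = work / "braker"
    proteins = str(braker_dir / "augustus.hints.aa")  # assumed BRAKER output path

    return [
        # 1. De novo repeat library
        ["BuildDatabase", "-name", repeat_db, genome],
        ["RepeatModeler", "-database", repeat_db, "-pa", t],
        # 2. Soft-masking (-xsmall writes repeats in lower case)
        ["RepeatMasker", "-lib", repeat_lib, "-xsmall", "-pa", t,
         "-dir", str(mask_dir), genome],
        # 3. Index the masked genome and align RNA-seq reads
        ["STAR", "--runMode", "genomeGenerate", "--runThreadN", t,
         "--genomeDir", star_index, "--genomeFastaFiles", masked],
        ["STAR", "--runThreadN", t, "--genomeDir", star_index,
         "--readFilesIn", reads_r1, reads_r2, "--readFilesCommand", "zcat",
         "--outSAMtype", "BAM", "SortedByCoordinate",
         "--outFileNamePrefix", star_prefix],
        # 4. Gene prediction from the BAM file and the soft-masked genome
        ["braker.pl", f"--genome={masked}", f"--bam={bam}",
         f"--species={species}", "--softmasking", f"--cores={t}",
         f"--workingdir={braker_dir}"],
        # 5. Functional annotation of the predicted proteins
        ["interproscan.sh", "-i", proteins, "-f", "tsv",
         "-o", str(work / f"{species}_interpro.tsv")],
    ]


if __name__ == "__main__":
    # Hypothetical input file names, for illustration only.
    for cmd in build_annotation_commands(
            "genome.fa", "rna_R1.fastq.gz", "rna_R2.fastq.gz", "perch_example"):
        print(shlex.join(cmd))
```

Building the commands as lists keeps each step inspectable before anything is run; each list could be passed to subprocess.run once the paths have been checked against the installed tool versions.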