3.2 Similarity in gene expression among samples
Similarity in gene expression among biological replicates (i.e., individuals belonging to the same treatment group) gives an idea of the reproducibility of our data and of the overall variation among samples. Similarity in gene expression within and among groups can be estimated using sample correlations or Euclidean distances (see Materials and Methods for further details). Pearson correlation coefficients (r) for biological replicates were equal to or above 0.9 for 97% of comparisons (samples from the same tissue within a group) (Table S3 on Dryad). This indicates that, although gene expression varies among individuals, biological replicates are generally very similar.
Pearson r values between the two sequencing platforms for NEB were all above 0.9 for samples belonging to the same group (Supporting Information Table S2), indicating that the different sequencing methods did not influence the number of uniquely mapped reads. Finally, r values among different tissues (for QuantSeq) and between QuantSeq and NEB were generally <0.5 and sometimes negative, suggesting different levels of gene expression among tissues and, for the same mapped genes, between the two library types.
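As a minimal sketch of the two similarity measures named above, the following Python code computes Pearson r and the Euclidean distance between pairs of samples, and the fraction of pairwise comparisons with r at or above a threshold. The sample names and expression values are hypothetical and serve only as illustration; they are not data from this study, and the code is not the pipeline described in Materials and Methods.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length expression vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def euclidean_distance(x, y):
    """Euclidean distance between two equal-length expression vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def fraction_at_or_above(samples, threshold=0.9):
    """Fraction of all pairwise sample comparisons with r >= threshold."""
    names = sorted(samples)
    pairs = [(a, b) for i, a in enumerate(names) for b in names[i + 1:]]
    hits = sum(pearson_r(samples[a], samples[b]) >= threshold for a, b in pairs)
    return hits / len(pairs)

# Hypothetical transformed expression values for five genes (illustration only).
samples = {
    "rep1": [5.1, 8.3, 2.0, 6.7, 9.4],
    "rep2": [5.0, 8.1, 2.2, 6.9, 9.1],
    "other_tissue": [9.0, 2.1, 7.5, 3.0, 1.2],
}

r_replicates = pearson_r(samples["rep1"], samples["rep2"])
r_tissues = pearson_r(samples["rep1"], samples["other_tissue"])
d_replicates = euclidean_distance(samples["rep1"], samples["rep2"])
share_high = fraction_at_or_above(samples, threshold=0.9)
```

In this toy example the two replicates are highly correlated and separated by a small distance, whereas the comparison across tissues yields a negative r, mirroring the qualitative pattern reported above.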
Heatmaps of the distance matrices for the different group comparisons provide hierarchical clustering based on sample distances. When heatmaps were built combining QuantSeq data from the three tissues, we found three clusters corresponding to the three tissues (Figure 2A). However, within each cluster, as also shown by the heatmaps built with data from each tissue separately, samples belonging to different groups clustered together, indicating no clear difference in gene expression among the tested groups (Supporting Information Figure S1). A lack of difference in gene expression among the groups was also found using NEB data (Figure 2).
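The clustering underlying such heatmaps can be sketched as agglomerative clustering on pairwise Euclidean distances. The Python example below is an illustration under stated assumptions: average linkage is assumed here (the linkage method is not specified in this section), and the sample names and expression profiles are hypothetical.

```python
import math

def euclidean(x, y):
    """Euclidean distance between two equal-length expression vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def average_linkage_clusters(profiles, n_clusters):
    """Merge the two closest clusters (average linkage) until n_clusters remain.

    profiles: dict mapping sample name -> list of expression values.
    Returns a sorted list of clusters, each a sorted list of sample names.
    """
    clusters = [[name] for name in sorted(profiles)]

    def linkage(c1, c2):
        # Average of all pairwise distances between members of the two clusters.
        total = sum(euclidean(profiles[a], profiles[b]) for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = linkage(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        merged = sorted(clusters[i] + clusters[j])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return sorted(clusters)

# Hypothetical profiles: three tissues, two treatment groups each (illustration only).
profiles = {
    "tissueA_group1": [9.0, 1.0, 1.2],
    "tissueA_group2": [8.8, 1.1, 1.0],
    "tissueB_group1": [1.0, 9.1, 1.1],
    "tissueB_group2": [1.2, 8.9, 0.9],
    "tissueC_group1": [1.1, 1.0, 9.0],
    "tissueC_group2": [0.9, 1.2, 9.2],
}

tissue_clusters = average_linkage_clusters(profiles, n_clusters=3)
```

With these toy profiles, samples group by tissue while the two groups within each tissue fall into the same cluster, which is the pattern described for the combined heatmap.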
Finally, the comparison of QuantSeq and NEB data revealed differences in gene expression between the two methods; however, this difference was not associated with any of the groups (Figure 2). Principal component analysis (PCA), another way to visualize variation in gene expression among samples, further supports the lack of differences among sampling methods and times of tissue harvesting, as well as the differentiation between QuantSeq and NEB and among the three sampled tissues (Figures 3 and 4 and Supporting Information Figure S2).
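To illustrate how PCA separates samples along the main axis of variation, the following Python sketch extracts the first principal component by power iteration on the gene covariance matrix. This is a simplified stand-in for a full PCA, not the procedure used for Figures 3 and 4; the function name and the expression values are hypothetical.

```python
import math

def first_principal_component(rows, n_iter=200):
    """First principal component via power iteration.

    rows: list of samples, each a list of expression values (one per gene).
    Returns (axis, scores, explained_fraction).
    Assumes the starting vector is not orthogonal to the leading axis.
    """
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    # Sample covariance matrix of the genes (p x p).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(p)]
           for i in range(p)]
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(n_iter):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient gives the variance captured along the axis.
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(p))
                     for i in range(p))
    total_variance = sum(cov[i][i] for i in range(p))
    scores = [sum(c[j] * v[j] for j in range(p)) for c in centered]
    return v, scores, eigenvalue / total_variance

# Hypothetical samples: first three from one library type, last three from another.
rows = [
    [8.0, 2.0, 5.0], [8.2, 2.1, 5.1], [7.9, 1.9, 4.9],
    [2.0, 7.0, 5.0], [2.1, 7.2, 5.1], [1.9, 6.9, 4.9],
]

axis, scores, explained = first_principal_component(rows)
```

In this toy example the scores of the two sets of samples have opposite signs on the first component, which captures nearly all of the variance; samples that do not differ systematically would instead overlap along the axis.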