3.2 Similarity in gene expression among samples
Similarity in gene expression among biological replicates - i.e.,
individuals belonging to the same treatment group - gives an idea of
the reproducibility of our data and of the overall variation among samples.
Similarity in gene expression within and among groups can be estimated
using the sample correlation or Euclidean distances (see Materials and
Methods for further details). Pearson correlation coefficients (r) for
biological replicates were equal to or above 0.9 for 97% of comparisons
(same tissue, and same tissue within a group) (Table S3 on Dryad). This
indicates that although variation in gene expression occurs among
individuals, biological replicates are generally very similar.
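
As a minimal sketch of the replicate comparison described above, the snippet below computes Pearson's r for every pair of samples and the share of pairs at or above a threshold. The sample names, expression values, and the helper names `pearson_r`, `pairwise_pearson`, and `fraction_at_or_above` are hypothetical and chosen only for illustration; they are not taken from the study's pipeline.

```python
from itertools import combinations
from math import sqrt

# Hypothetical toy expression values (e.g. log-transformed counts per gene).
# Sample names and numbers are invented for illustration only.
expression = {
    "rep1": [5.1, 2.3, 8.7, 0.4, 6.2],
    "rep2": [5.0, 2.6, 8.9, 0.3, 6.0],
    "rep3": [4.8, 2.1, 8.5, 0.9, 6.4],
}


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


def pairwise_pearson(samples):
    """Return {(sample_a, sample_b): r} for every unordered pair of samples."""
    return {
        (a, b): pearson_r(samples[a], samples[b])
        for a, b in combinations(samples, 2)
    }


def fraction_at_or_above(r_by_pair, threshold=0.9):
    """Fraction of pairwise comparisons whose r is at or above the threshold."""
    values = list(r_by_pair.values())
    return sum(r >= threshold for r in values) / len(values)


r_by_pair = pairwise_pearson(expression)
share_high = fraction_at_or_above(r_by_pair, threshold=0.9)
```

Restricting the pairs passed to `pairwise_pearson` to samples from the same tissue and group would give the within-group comparisons reported in the text.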
Pearson r values between the two sequencing platforms for NEB were all
above 0.9 for samples belonging to the same group (Supporting
Information Table S2), indicating that different sequencing methods did
not influence the number of uniquely mapped reads. Finally, r values among
different tissues (for QuantSeq) and between QuantSeq and NEB were
generally <0.5 and sometimes negative, suggesting different
levels of gene expression among tissues and, for the same mapped genes,
between the two library types.
Heatmaps of the distance matrices for the different group comparisons
provide hierarchical clustering based on sample distances. When heatmaps
were made combining data from the three different tissues for QuantSeq,
we found three clusters corresponding to the three different tissues
(Figure 2A). However, within each cluster, as also shown by the heatmaps
built with data from each tissue separately, samples belonging to
different groups are clustered together, indicating no clear difference
in gene expression among the tested groups (Supporting Information
Figure S1). Lack of difference in gene expression among the different
groups was also found using NEB data (Figure 2).
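
The clustering behind such heatmaps can be sketched as follows: Euclidean distances are computed between all sample pairs, and the two closest clusters are merged repeatedly. This is only an illustration under stated assumptions. The profiles are invented, average linkage is an assumed choice (the linkage method is not specified in this section), and the procedure stops at a fixed number of clusters rather than building a full dendrogram.

```python
from itertools import combinations
from math import dist  # Euclidean distance between two points (Python 3.8+)

# Hypothetical toy profiles: two samples per tissue, three genes each.
profiles = {
    "tissueA_1": [9.0, 1.0, 1.2],
    "tissueA_2": [8.7, 1.3, 1.0],
    "tissueB_1": [1.1, 9.2, 0.8],
    "tissueB_2": [0.9, 8.8, 1.1],
    "tissueC_1": [1.0, 1.2, 9.1],
    "tissueC_2": [1.3, 0.9, 8.9],
}


def distance_matrix(samples):
    """Euclidean distance for every unordered pair of samples."""
    return {
        frozenset((a, b)): dist(samples[a], samples[b])
        for a, b in combinations(samples, 2)
    }


def average_linkage(cluster_a, cluster_b, distances):
    """Mean distance over all cross-cluster sample pairs (assumed linkage)."""
    pairs = [(a, b) for a in cluster_a for b in cluster_b]
    return sum(distances[frozenset(p)] for p in pairs) / len(pairs)


def agglomerate(samples, n_clusters):
    """Merge the two closest clusters repeatedly until n_clusters remain."""
    distances = distance_matrix(samples)
    clusters = [frozenset([name]) for name in samples]
    while len(clusters) > n_clusters:
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: average_linkage(pair[0], pair[1], distances),
        )
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return clusters


clusters = agglomerate(profiles, n_clusters=3)
```

With these toy values the three resulting clusters coincide with the three hypothetical tissues, mirroring the pattern described for the combined heatmap.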
Finally, the comparison of QuantSeq and NEB revealed differences in gene
expression between the two methods; however, these differences were not
associated with any of the groups (Figure 2). Principal component
analysis (PCA), another way to visualize variation in gene expression
among samples, further supports both the lack of differences among sampling
methods and times of tissue harvesting and the differentiation between
QuantSeq and NEB and among the three sampled tissues (Figures 3 and 4
and Supporting Information Figure S2).
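
To illustrate how PCA separates samples along the main axis of variation, the sketch below centers a small expression matrix, builds the gene-by-gene covariance matrix, and finds the first principal component by power iteration. The data and the names `libA`/`libB` are hypothetical, and power iteration is used here only to keep the example dependency-free; an analysis like the one reported would normally rely on a dedicated statistical package.

```python
from math import sqrt

# Hypothetical toy data: each entry is one sample, values are genes.
samples = {
    "libA_1": [8.0, 7.5, 1.0],
    "libA_2": [8.4, 7.9, 1.2],
    "libB_1": [2.0, 1.6, 1.1],
    "libB_2": [1.7, 2.1, 0.9],
}


def center_columns(rows):
    """Subtract each gene's mean across samples."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows]


def covariance(rows):
    """Gene-by-gene sample covariance matrix of already centered data."""
    n, p = len(rows), len(rows[0])
    return [
        [sum(r[i] * r[j] for r in rows) / (n - 1) for j in range(p)]
        for i in range(p)
    ]


def first_component(cov, iterations=200):
    """Leading eigenvector of the covariance matrix via power iteration."""
    p = len(cov)
    vec = [1.0 / sqrt(p)] * p
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(p)) for i in range(p)]
        norm = sqrt(sum(v * v for v in nxt))
        vec = [v / norm for v in nxt]
    return vec


centered = center_columns(list(samples.values()))
pc1 = first_component(covariance(centered))

# PC1 score of each sample: projection of its centered profile onto pc1.
scores = {
    name: sum(v * w for v, w in zip(row, pc1))
    for name, row in zip(samples, centered)
}
```

In this toy case the two hypothetical library types fall on opposite sides of the first component, which is the kind of separation a PCA plot makes visible.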