4. Discussion
We have investigated whether documented differences in defence
phenotypes between three closely related species that are attacked by
the same parasitoid wasp species (G.
pusilla >G. tenella >G.
calmariensis ), are reflected in genomic signatures of positive
selection among them. We found that the relative ranking of species by
immune performance was predictive of the number of immune genes
exhibiting signatures of positive selection. However, this pattern
differed depending on the type of test that we used, as well as by the
region of the genome analysed by a given test.
Our analysis of the coding region of genes with a modified MK-test based
approach showed that higher numbers of immune genes (N = 7) were
under positive selection in the species with the strongest capacity to
encapsulate wasp eggs (G. pusilla ), compared with the species
lacking the capacity to encapsulate wasp eggs (G. calmariensis, N= 1) or having an intermediate encapsulation capacity (G. tenella,
N = 4). These findings suggest that evolutionary adaptations of host
defences are not only a consequence of changes in regulatory genes but
are also due to changes in protein-coding regions. Even though the
identified genes have yet to be functionally characterized in these
species, their importance in mediating defence against parasitoids was
partly confirmed by previous comparative gene expression analyses (Yang
et al., 2020). This results contrasts with the pattern from LASSI Plus,
where an equal number of immune related genes were identified as
outliers in the three species. While this pattern may in part be due to
differences in the power of the two methods to detect selection at
different time scales, it is also notable that most genes (4/6)
identified as outliers in G. pusilla were also previously found
as being differentially expressed following wasp attack whereas none of
the genes were such for G. calmariensis .
Even though we could analyze only a subset of immune genes in theGalerucella genome (96 out of 166), the detected genes indicate
the type of processes most likely under selection. Most importantly, the
analysis suggested that the observed phenotypic differences in the
capacity to encapsulate wasp eggs (Fors et al., 2014) are likely
explained by genes involved in wasp egg detection and in the recruitment
of hemocytes building the capsule (Yang et al., 2020), whereas genes
involved in melanization or wound healing processes seem to be more
conserved. A key observation from the previous gene expression analyses
was that attack by A. parviclava increased expression of multiple
genes in G. pusilla and almost none in G. calmariensis(Yang et al., 2020). This finding indicates that differences in defense
phenotypes between these species can best be explained by genes coding
for proteins involved during the early phases of the immune process. The
current analysis identified several promising candidates, such assanta-maria , Corin , PGRP-LE , Tak1 andGrass , that have been suggested to be involved in either the
recognition or signalling processes. In addition, two of these genes
(Grass and PGRP-LE) were are among those identified as outliers in LASSI
Plus when including flanking regions. The specific functions for these
five genes have not been identified in Galerucella , but some
studies exist from Drosophila . First, both santa-maria andCorin have been suggested to encode proteins with scavenger
receptor or serine-type endopeptidase activities (Cao & Jiang, 2018),
where Corin has a transmembrane domain that may function as a
receptor to the Toll pathway (Irving et al., 2001). Second,PGRP-LE encodes an intracellular protein that binds to
diaminopimelic acid-type peptidoglycans to activate the IMD/Relish
pathway (Bosco-Drayon et al., 2012), but the function may primarily be
anti-bacterial (Libert, Chao, Chu, & Pletcher, 2006). Third,Grass has been predicted to be involved in the Toll signalling
pathway, which may be the most important signalling pathway in the
immune response against wasp attack in Drosophila (Carton et al.,
2008; Lemaitre & Hoffmann, 2007; Wertheim et al., 2005), whereasTak1 may be involved in the change between the JNK and IMD
pathways. Among these genes, a particularly interesting candidate isCorin , which was under positive selection in both G.
pusilla and G. tenella , the two species with relatively more
efficient encapsulation responses. Even though G. tenella was not
included in the gene expression analysis, we know from other studies
that this species has an encapsulation capacity that is intermediate
between the two other species (Fors et al., 2016). Previous work onDrosophila show upregulation of Corin following bacterial
infections (Irving et al., 2001), but not following parasitoid attack
(Salazar Jaramillo et al., 2017). Unfortunately, this gene was data
deficient in our gene expression analysis.
These recognition and signaling genes are sufficient for explaining
differences in defense phenotypes, but we also detected positive
selection on genes involved in downstream regulation of hemocytes.
First, cher was under positive selection in both G.
tenella and G. pusilla , and has previously been shown to
negatively regulate lamellocyte differentiation (Rus et al., 2006).
Lamellocytes are key elements in the capsules killing parasitoid eggs ofGalerucella and are induced from precursors after attack byA. parviclava (Fors et al., 2016; Fors et al., 2014), and
similarly induced by Drosophila following parasitoid attack (Kim-Jo,
Gatti, & Poirié, 2019). Two other genes involved in lamellocyte
regulation similarly have evidence for positive selection, one inG. pusilla (zfh1 ) and one on G. tenella(Cyt-b5 ). zfh1 is one element in a transcription factor
cascade that plays a role as the switch between plasmatocyte and
lamellocyte fate (Frandsen, Gunn, Muratoglu, Fossett, & Newfeld, 2008),
and Cyt-b5 encodes a conserved hemoprotein that is required for
hemocyte regulation (Kleinhesselink, Conway, Sholer, Huang, & Kimbrell,
2011). These two latter genes were also differentially expressed
following wasp infection involving A. parviclava and G.
pusilla (Yang et al., 2020).
When examining other gene functions identified as experiencing positive
selection, we found that genes coding for imaginal disc pattern
formation and wing disc pattern formation were under positive selection
in all species. Imaginal discs are epithelial sacs found in insect larva
that later develop into cuticular structures (e.g., head, wing, limbs,
thorax) of adult insects, and the wing disc is among the largest
imaginal discs in insects (Blair, 2009). These findings are not
surprising as the beetle species are morphologically differentiated,
both in size and colour. Other genes varied between species, but those
under positive selection in G. calmariensis specifically involved
genes coding for metabolic functions such as the metabolism of
carbohydrate derivatives, oligosaccharides, amino sugars, sulphur
compounds and catechol-containing compounds. These sets of positively
selected gene functions indicate the importance of energy allocation and
dietary transitions during evolution in this species. Moreover, pathways
related to the nervous systems were also found to be positively selected
in G. calmariensis . Positive selection on nervous system-related
genes were previously documented in social insects such as bees and ants
(Roux et al., 2014; Woodard et al., 2011) but have rarely been reported
in beetles. These patterns may indicate changes either in the capacity
to detect host plants or mates and may thus be involved in the species
differentiation. In G. tenella , several pathways related to lipid
metabolic processes were enriched in positively selected genes. Some of
these processes, such as sterol metabolism, may be linked to needs to
handle differences in plant chemistry from the different host plants
(Rosaceae vs. L. salicaria ). Other lipids, such as sphingolipids,
have been suggested to be involved in cell defences and could be
interesting to study in relation to wasp attack. Finally, a unique
pathway under positive selection in G. tenella was involved in
the pigmentation metabolic process, which may potentially explain the
different spotting and pigmentation patterns in the species.
With regards to our HDPRFMK tests, there are some methodological
concerns warranting attention. First, we were only able to include 4,154
orthologous genes in the analysis, which may have reduced the power to
detect important GO terms. Whether these removals have biased our
conclusions is unclear, but by discarding genes of low coverage and
removing bias due to recent gene birth death dynamics, this smaller set
of genes provides a robust gene set for inferring selection dynamics and
is sufficient for estimating genome wide selection dynamics. Second, a
general problem in MK tests is their sensitivity to the demographic
history (Eyre-Walker, 2002; McDonald & Kreitman, 1991). In our
analysis, we approached this problem by using HDMKPRF, which attempts to
correct for demography effects during the detection of selection
patterns by accounting for changes in effective population size and
other population genetic parameters. One potential concern, however, is
that the proportion of substitutions driven by positive selection were
highest in G. calmariensis even though this species had the
smallest Ne . Third, an additional issue when
examining selection patterns is slightly deleterious mutations. If such
mutations are segregating in the population, the degree of positive
selection will be underestimated. Removing deleterious mutations during
selection analysis is therefore widely accepted, by setting a threshold
for removing minor alleles (Bierne & Eyre-Walker, 2004; Fay, Wyckoff,
& Wu, 2001; Fay et al., 2002). We only removed singleton polymorphisms,
which increased the proportion of negatively selected sites but retained
the relative ranking of α among species. Thus, while absolute
α-estimates in our analysis are certainly underestimated, the relative
ranking among species is likely correct.
Our investigation of signatures of positive selection from genome wide
analysis in each of the three species failed to reveal an enrichment of
immune genes biased towards the most defended species, similar to our
predictions and what was observed in the HDPRFMK test. Because we
assessed the overlap of the identified sweep regions with both gene
bodies, as well as up to 10kb flanking regions as a proxy for gene
regulatory regions, these findings are not because we limited our sweep
detection to either regulatory or coding regions; we captured putative
candidate sweep regions for both. Thus, we conclude that differences in
the results from the two tests are more reflective of their inherent
differences, which are diverse.
The HDPRFMK test could be considered to be much more conservative than
LASSI Plus, as it requires the fixation of multiple amino acid mutations
and hence multiple rounds of selective sweeps in order to result in
significant departures from neutral expectations. However, because the
test uses all such events since a last common ancestor was shared with
the other species in the analysis, while conservative, this length of
time provides many opportunities for selective sweeps, making it a
powerful test, yet narrow in the underlying scenario it can detect. In
contrast, LASSI Plus, like nearly all tests reliant upon outliers in the
site frequency spectrum, is only able to detect positive selection
events on a much more recent time horizon, likely a small fraction of
the time since the last common ancestor. While limited in the time
horizon such a test can query over, most substantial selective sweeps
are likely to be detected and these are identifiable nearly anywhere in
the genome; LASSI Plus should be able to detect one of the many
selective sweeps that build up the amino acid fixations underlying
HDPRFMK tests, as well as sweeps happening anywhere else. In sum, while
the HDPRFMK test is narrowly focused upon selection acting on amino acid
variation, it gains power by covering a deep time horizon. LASSI Plus,
while able to detect positive selection nearly anywhere in the genome,
can only see events in the relatively recent past.