Discussion
Discriminating whether the most appropriate model to reconstruct the
demographic history of a species is structured orunstructured should be the first step in empirical population
genetics investigations, particularly when targeting species of
conservation concerns. Even when an extensive spatial sampling is
lacking, an ABC model selection approach can actually distinguish
whether the sampled deme belongs or not to a meta-population (similarly
to previous studies (Maisano Delser et al., 2019; Peter, Wegmann, &
Excoffier, 2010)). This should inform on how to interpret the variation
in Ne through time traced using unstructured models. Among
the four species considered here, the tiger shark is the only panmictic.
This means that the reconstructed stairwayplot has a direct
biological interpretation, with the shark experiencing a mild ancestral
expansion and a recent ~4-fold bottleneck around 2,000
years B.P. (consistent with the results of (Pirog et al., 2019)) (Figure
3). How to interpret the stairwayplot in the remaining three
species? Coalescence simulations provided helpful and general insights
into the understanding of the relation between the inferences performed
under unstructured and structured models. We first focus
on scenarios simulated under the SST, with parameters close to those
estimated in real data. The first and most striking result is that we
systematically observed a recent bottleneck under all simulated
scenarios (Table 2, Figure 4, 5, 6, S1, and S2). This result could seem
at a first glance surprising and due to an artefact. However, this is
not the case, as: i) the signal does not depend on the inferential
algorithm chosen to analyse the data (i.e., the stairwayplot ),
since the normalized spectra showed a deficit in singletons compared to
the other low frequency classes (Figure 4, 5, 6 and S1), which is
typical of a recent population decline; ii) it is consistent with the
distribution of the Inverse Instantaneous Coalescent Rate (IICR)
computed in one diploid individual, which shows a signature of decline
under similar meta-population models (Chikhi et al., 2018; O. Mazet,
Rodríguez, Grusea, Boitard, & Chikhi, 2016; Rodríguez et al., 2018).
The results of our simulations are consistent with the recent bottleneck
observed in the three shark species (Figure 3), with its intensity
inversely correlated to the estimated Nm (i.e., stronger forC. melanopterus and C. limbatus than for C.
amblyrhynchos ). In our SST model there is an instantaneous colonization
of the array of demes at Tcol , which corresponds also to a
demographic expansion (i.e., the total number of individuals in the
array of deme is larger than those in the ancestral deme). However, this
signature is detected only for Nm ≥5 when Tcol is neither
too recent nor too old (at equilibrium) (Figure 5, 6, S1 and S2). In
these scenarios, the beginning of the expansion retrieved by thestairwayplot broadly corresponds to the simulated Tcol .
This again corroborates the results obtained for the three shark
species, since the two species with higher Nm displayed indeed an
ancestral expansion in the stairwayplot with a timing consistent
with the estimated Tcol (Table 1, Figure 2 and 3). Similarly, it
explains why we could not retrieve the ancestral expansion for C.
melanopterus nor estimate Tcol under the SST model: this appears
to be a property of the coalescent pattern and it is not related to the
amount of data available (see below).
It is now straightforward to frame all these findings under the
coalescent perspective. The coalescent history of the lineages sampled
from a single deme in an SST (or FIM) model can be separated for
simplicity into three phases: the scattering , thecollecting and the ancestral phase. Going backward in
time, lineages will coalesce in the sampled deme with a rate according
to both Nm and N until all lineages either have coalesced
or migrated to another deme. This is the scattering phase
described in the seminal works of (Wakeley, 1998, 1999). Thescattering phase was considered instantaneous for mathematical
tractability, with its outcome dependent on Nm only, but later
works could disentangle the effect of N and m on the shape
of the gene genealogy (Mona, 2017). The collecting phase starts
when the lineages which did not coalesce have migrated to other demes of
the array: they will then coalesce according to a Kingman process with a
rate scaled by Nm and the number of demes of the array (Wakeley,
1999). Finally, all surviving lineages (in non-equilibrium model) will
reach the ancestral deme at Tcol , where they will coalesce at a
rate depending only on the Nanc parameter. In species with lowNm , the rate of coalescent during the scattering phase is
very fast since lineages have low probability of emigrating from the
sampled deme and high probability of coalescence due to the smallN . Once all the lineages are dispersed in the array of demes,
there will be two possible outcomes: i) in equilibrium model, we shift
to the collecting phase, where the rate of coalescent drops since
lineages will hardly fall in the same deme again; ii) in non-equilibrium
model, with the parameters we have simulated here, there will be very
few (if any) coalescent events during the collecting phase and the
transition will be directly from the scattering to theancestral phase. Both the collecting and theancestral phases have a rate of coalescent lower than thescattering phase, which determines the observed recent drop inNe for all simulated scenarios. Remarkably, the decline inNe is much stronger in equilibrium model, since the rate of
coalescent is much lower in the collecting than in theancestral phase (Figure 4, 5, 6, S1, and S2). Low Nmspecies will therefore have only two coalescent phases, thescattering and either the collecting (in equilibrium
model) or the ancestral (in non-equilibrium model) which is why
the signature of the ancestral expansion is lost. For growing Nm ,
in equilibrium model there will be again only two coalescent phases,
namely the scattering and collecting , with the latter
having lower rate of coalescent than the former independently of the
simulated parameters. This is why we observed always a strong bottleneck
consistent with the distribution of the IICR statistics in any
equilibrium model (Chikhi et al., 2018; Olivier Mazet et al., 2015;
Rodríguez et al., 2018). In non-equilibrium model, there will be two
different situations: a) Tcol (in generations) is of the same
order of the deme size N . In this setting, going backward in time
few lineages would have escaped the sampled demes before Tcol .
This corresponds to a shift in the coalescent rate directly from thescattering to the ancestral phase, resulting in a
bottleneck of lower intensity compared to an equilibrium model (Figure
4, 5, 6 and S1), for the same reasons as above; b) Tcol (in
generations) is larger than N . In this setting, some coalescent
events may occur during the collecting phase, at a rate much
slower than the two other phases. This determines the hump observed in
the stairwayplot (Figure 4, 5, 6) and explains why in this window
of parameters it is also possible to correctly estimate Tcolusing our ABC framework. Further simulations under the FIM model
confirmed those patterns even though the ancestral expansion could be
detected for lower long-term Nm than the corresponding SST
scenario (Figure S3). This is probably due to a higher apparent
connectivity underlined the by FIM, where lineages can move more freely
during the collecting phase in comparison to SST where migrants only
come from the closest neighbours. If many coalescent events occur during
the collecting phase, the change in coalescent rate will affect
the resulting gene genealogy and it will be detected by thestairwayplot (or any other unstructured method based on
coalescent theory).
Using coalescent arguments, we clarified why simple meta-population
models with constant connectivity generate a gene genealogy harbouring a
signature of a recent decline for any parameters’ combination. The
signature of bottleneck detected by the stairwayplot in the three
shark species best described by SST can be therefore interpreted as a
consequence of the underlying structure. However, connectivity likely
changes through time. For instance, human activities have likely
impacted the evolutionary history of a large number of species either by
decreasing their effective population size and/or by fragmenting their
habitat (i.e., reducing migration rates between demes). This intuitively
should exacerbate the signature of population decline in the resulting
gene genealogy. However, it remains to be shown whether this signature
is qualitatively and quantitatively distinguishable from models with
constant connectivity. This is a question of fundamental importance to
understand whether it is possible to detect recent bottleneck in
structured populations. To this end, we further investigated by
coalescent simulations the expected gene genealogy in SST-CH (and
FIM-CH) models with a change in connectivity 10 or 50 generations B.P.,
which matches the beginning of extensive anthropogenic influence on
biodiversity considering our species’ generation time (Ceballos et al.,
2015). The resulting gene genealogies were poorly affected by the recent
drop in connectivity, with both the normalized SFS and the inferredNe dynamic following the same trajectory of the corresponding
scenario with the same long-term Nm and Tcol (Figures 7,
8, S5, S6, S7, S8, S9 and S10). We noticed the drop in N (Figure
8, S6, S9 and S10) had stronger influence than the drop in m(Figure 7, S5, S7 and S8), consistent with previous finding showing that
the distribution of coalescent events depends not only by the Nmcompound parameter but also by their individuals values (Mona, 2017).
These results imply that the simulated change in connectivity is too
recent to significantly alter the pattern of coalescent events during
the scattering phase and that a recent drop can be hardly
detected on the basis of the SFS only. Our empirical data are consistent
with these findings: when we compared SST vs. SST-CH models in the three
shark species using the ABC framework, we failed to clearly distinguish
the two models (Table S5, Table S6, Figure S4). This seems a paradox: we
observed a recent bottleneck in species of conservation concern usingunstructured model, but we cannot exclude that this is just the
consequence of population structure.
This study highlight once more the importance to explicitly test for
meta-population structure before interpreting the demographic signals
detected by unstructured models, similarly to what advocated
previously by (Maisano Delser et al., 2019; Rodríguez et al., 2018). If
the meta-population structure hypothesis is rejected, the variation ofNe through time can be directly interpreted as the demographic
history of the population under investigation, such as the case of tiger
shark. Otherwise, this variation is still related to demographic events,
but it has to be explained in the light of population structure and its
consequence on the rate of coalescent events. We showed by coalescent
simulations how to interpret such variation: the recent bottleneck
detected by the stairwayplot in demes belonging to a
meta-population is a consequence of the coalescent process. In other
words, any inferential method implementing an unstructured model
will detect such decline (if enough data is available) since it is a
property of the gene genealogy. Importantly, the gene genealogy is only
slightly affected by recent changes in connectivity if the time of this
change in generations is of the same order of the size of the deme.
Our study highlights a key issue in conservation genetics as a recent
decline inferred by an unstructured model can be mis-interpreted
as a consequence of recent anthropic pressures (Ceballos et al., 2015)
when it actually results from meta-population structure. This is all the
more alarming since the majority of species is likely organised in
meta-populations across their range rather than panmictic at a large
scale. We therefore stress the necessity for an educated choice of tools
to correctly uncover the recent trend of a species and design proper
conservation programs. For instance, detecting a recent bottleneck in
meta-populations will require summary statistics measuring the linkage
disequilibrium (Boitard, Rodríguez, Jay, Mona, & Austerlitz, 2016;
Kerdoncuff, Lambert, & Achaz, 2020) and/or the inferential framework
based on the IICR (Chikhi et al., 2018; Rodríguez et al., 2018) coupled
with whole genome data. On a positive note, we showed that the
colonization time of the array of demes can be estimated to some extent
(and under some combinations of parameters) by unstructuredmodels. We believe that this is particularly important because it has
been shown that the simple instantaneous colonization process we used
here behaves similarly to a spatial explicit range expansion (Hamilton,
Stoneking, & Excoffier, 2005; Mona, 2017), which is certainly a more
realistic model but more difficult to investigate. We are aware that the
meta-population models here tested are simple and the parameters chosen
are specific of the three shark species we focused on. Nevertheless, the
time-scale separation of the coalescent process is general, and it
allows explaining intuitively any structured models. The four shark
species here used as an example has the merit to cover a large spectrum
of LHT and consequently a large spectrum of demographic scenarios, going
from a highly structured to a panmictic population: this has strong
implications on the distribution of coalescent times and therefore on
the interpretation of the observed data.
In this study we found that population structure, independently from the
degree of connectivity between demes and the migration matrix relating
them, intrinsically determines a variation in the rate of coalescent
events through time. We showed that the intensity and the direction(s)
of such variation related to the demographic parameters of the
meta-population in a predictable way. Our results highlight the
importance of detecting population structure (which depends on LHT among
other factors) before performing any demographic inferences but, at the
same time, they reveal the utility of unstructured models to
describe the shape of the gene genealogy, which is the final product of
the evolutionary history of a species. A combination of structured andunstructured models is therefore the key to best characterize the
evolutionary history of a species. We call for a change in perspective
when investigating the demographic history of a species: the focus
should be put in the reconstruction of the variation of both Nand m through time, which requires certainly new methodological
development and probably more data.