2.2.1 | Simulating the admixed population, effective population size, and sampling individuals
At each generation, MetHis performs simple individual-centered,
forward-in-time Wright-Fisher (Fisher, 1922; Wright, 1931) simulations
in a panmictic population of diploid effective size \(N_{g}\). For a
given individual in population H at the following generation (g + 1),
MetHis independently draws each parent from source population S with
probability \(s_{S,g}\) (Figure 1, Table 1), or from population H with
probability \(h_{g}=1-\sum_{S\in\{\text{Afr},\text{Eur}\}}s_{S,g}\),
randomly builds a haploid gamete of independent markers for each parent,
and pairs the two constructed gametes to create the new individual.
Here, we decided to neglect mutation over the 21 generations of
admixture considered. This is reasonable when studying relatively recent
admixture histories and considering independent genotyped SNP markers.
Nevertheless, for users interested in microsatellite variation and
longer admixture histories, MetHis readily implements a standard
General Stepwise Mutation Model allowing for insertion or deletion
(Estoup, Jarne, & Cornuet, 2002), with parameters set by the user
(Supplementary Note S1).
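The reproduction step described above can be illustrated with a minimal sketch. This is not the MetHis implementation: the function names, the representation of an individual as a pair of allele lists, and the use of Python's standard `random` module are all assumptions made for illustration. Mutation is neglected, as in the text, and markers are treated as independent.

```python
import random

def make_gamete(parent, rng):
    """Build a haploid gamete from a diploid parent (hap_a, hap_b).

    Markers are independent, so each marker copies one of the two
    parental alleles with probability 1/2. No mutation is applied.
    """
    hap_a, hap_b = parent
    return [a if rng.random() < 0.5 else b for a, b in zip(hap_a, hap_b)]

def draw_parent(pop_h, src_afr, src_eur, s_afr, s_eur, rng):
    """Draw one parent: African source with probability s_afr, European
    source with probability s_eur, otherwise population H with
    probability h_g = 1 - s_afr - s_eur."""
    u = rng.random()
    if u < s_afr:
        pool = src_afr
    elif u < s_afr + s_eur:
        pool = src_eur
    else:
        pool = pop_h
    return rng.choice(pool)

def next_generation(pop_h, src_afr, src_eur, s_afr, s_eur, n_g, rng):
    """Simulate population H at generation g + 1 (hypothetical helper).

    Each population is a list of individuals; an individual is a pair
    of haplotypes (lists of 0/1 alleles). n_g is the diploid size N_g.
    """
    if s_afr < 0 or s_eur < 0 or s_afr + s_eur > 1 + 1e-9:
        raise ValueError("contributions must be >= 0 and sum to at most 1")
    offspring = []
    for _ in range(n_g):
        # The two parents are drawn independently of each other.
        parent_1 = draw_parent(pop_h, src_afr, src_eur, s_afr, s_eur, rng)
        parent_2 = draw_parent(pop_h, src_afr, src_eur, s_afr, s_eur, rng)
        offspring.append((make_gamete(parent_1, rng),
                          make_gamete(parent_2, rng)))
    return offspring
```

At the founding generation, population H is empty and the two source contributions sum to 1, so every parent comes from a source population; in later generations the previous `pop_h` is passed back in with the contributions for that generation.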
To focus on the admixture process itself without excessively inflating
the parameter space, we consider, for each of the nine competing models,
an admixed population H with constant effective population size
\(N_{g}\) = 1000 diploid individuals. Note, however, that MetHis readily
allows the user to parameterize stepwise or continuous changes in
\(N_{e}\) (Supplementary Note S1).
After each simulation, we randomly draw individual samples matching the
sample sizes in our observed dataset (see 2.4.3). We sample individuals
until our sample set contains no individuals related at the first-degree
cousin level or closer, either within each population or between
population H and either source population, based on explicit parental
flagging during the last two generations of the simulations. Note that
this is done to best mimic, a priori, the observed case-study dataset;
excluding related individuals is an option set by the user in MetHis
(Supplementary Note S1).
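One way to picture the exclusion of related individuals is the following sketch. It is a simplified illustration rather than the MetHis procedure: it assumes that the parental flagging gives, for each individual of the final generation, the set of identifiers of its parents and grandparents, and it treats two individuals as related whenever these sets overlap. The function name, the dictionary layout, and the greedy acceptance rule are all hypothetical.

```python
import random

def sample_unrelated(ancestors, n_sample, rng):
    """Draw n_sample individuals sharing no parent or grandparent.

    ancestors: dict mapping an individual id to the set of ids of its
    parents and grandparents (assumed output of the parental flagging
    over the last two generations).
    """
    candidates = list(ancestors)
    rng.shuffle(candidates)          # random order = random sampling
    kept = []
    seen = set()                     # ancestors of all kept individuals
    for ind in candidates:
        # Accept the candidate only if it shares no flagged ancestor
        # (so is neither sibling nor first cousin) with those kept.
        if seen.isdisjoint(ancestors[ind]):
            kept.append(ind)
            seen |= ancestors[ind]
            if len(kept) == n_sample:
                return kept
    raise ValueError("not enough unrelated individuals to sample")
```

In this sketch, individuals from population H and from the source populations could be placed in the same dictionary, so that relatedness between populations is screened in the same pass.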
2.2.2 | Simulating source populations
MetHis, in its current form, does not simulate the source populations
for the admixture process modeled in Verdu and Rosenberg (2011). Source
populations can be simulated separately using existing genetic data
simulation software such as the sequential coalescent simulator
fastsimcoal2 (Excoffier, Dupanloup, Huerta-Sanchez, Sousa, & Foll, 2013;
Excoffier & Foll, 2011).
Another possibility for simulating source populations arises if genetic
data are already available for the known source populations, as is the
case in our case studies of enslaved-African descendants in the Americas
(see 2.4.3). We consider here that the African and European source
populations are very large populations at drift-mutation
equilibrium, accurately represented by the Yoruban YRI and British GBR
datasets here investigated (see 2.4.3). Therefore, we first build two
separate datasets each comprising 20,000 haploid genomes of 100,000
independent SNPs, each SNP being randomly drawn in the site frequency
spectrum (SFS) observed for the YRI and GBR datasets respectively. These
two datasets are used as fixed gamete reservoirs for the African and
European sources separately, at each generation of the forward-in-time
admixture process. From these reservoirs, we build an effective
individual gene pool of diploid size \(N_{g}\) by
randomly pairing gametes while avoiding selfing. These virtual source
populations provide the parental pool for simulating individuals in the
admixed population H with MetHis, at each generation. Thus, while
our gamete reservoirs are fixed, the parental genetic pools are randomly
built anew at each generation. Again, note that this is not necessary to
the implementation of MetHis for investigating complex admixture
histories; source populations can be simulated separately by the user at
will.
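The construction of a fixed gamete reservoir and of the per-generation parental pool can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the SFS is assumed to be given as counts of SNPs per allele-count class, each reservoir allele is assumed to be drawn as a Bernoulli trial at the sampled frequency, and "avoiding selfing" is read as never pairing a gamete with itself. Function and parameter names are hypothetical.

```python
import random

def build_reservoir(sfs_counts, n_gametes, n_snps, rng):
    """Build a fixed reservoir of haploid genomes from an observed SFS.

    sfs_counts[k - 1] is the number of observed SNPs whose allele is
    carried by k of the n_chrom sampled chromosomes, for
    k = 1 .. n_chrom - 1 (assumed input layout).
    """
    n_chrom = len(sfs_counts) + 1
    classes = list(range(1, n_chrom))
    # Draw one SFS class per simulated SNP, weighted by class size.
    drawn = rng.choices(classes, weights=sfs_counts, k=n_snps)
    freqs = [k / n_chrom for k in drawn]
    # Assumption: alleles are independent Bernoulli draws at that frequency.
    return [[1 if rng.random() < f else 0 for f in freqs]
            for _ in range(n_gametes)]

def build_parental_pool(reservoir, n_g, rng):
    """Build N_g diploid individuals by pairing two distinct gametes.

    Called anew at each generation; the reservoir itself stays fixed.
    """
    pool = []
    for _ in range(n_g):
        i, j = rng.sample(range(len(reservoir)), 2)   # i != j
        pool.append((reservoir[i], reservoir[j]))
    return pool
```

With the settings given in the text, `n_gametes` would be 20,000 and `n_snps` 100,000 for each of the two sources, and `build_parental_pool` would be called once per generation and per source with `n_g` set to the source's effective size.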