Low-coverage and population assignment
Because most of our nonbreeding samples were feathers and most of our
breeding samples were blood, we expected the nonbreeding samples to have
lower sequencing depth than the breeding samples. The nonbreeding
samples are of unknown breeding origin, so to confirm that assignment
remains accurate at low depth, we first tested assignment accuracy using
low-coverage breeding samples of known origin.
We used the set of individuals from our ESSBPs to estimate population
allele frequencies (our training set) and used the remaining breeding
samples as a test set. We created two data sets from the test-set
individuals by downsampling them to 0.1X and 0.01X. We chose these
depths because most nonbreeding samples had coverage above 0.1X and the
lowest-coverage sample had 0.02X. To determine assignment accuracy for
individuals with low sequencing depth, we assigned the test sets back to
the standardized breeding populations and compared each assigned
population with the known population of origin.
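
The evaluation described above can be illustrated with a minimal Python
sketch. It is not the pipeline used in this study: the function names,
sample IDs, and population labels are hypothetical, and the assignment
step itself is assumed to have been run elsewhere. The sketch shows only
the read fraction needed to downsample a sample to a target depth and
the comparison of assigned versus known populations.

```python
from collections import defaultdict


def downsample_fraction(current_depth, target_depth):
    """Fraction of reads to keep so that mean depth drops to target_depth."""
    if current_depth <= 0:
        raise ValueError("current_depth must be positive")
    # A sample already at or below the target keeps all of its reads.
    return min(1.0, target_depth / current_depth)


def assignment_accuracy(known, assigned):
    """Return (overall accuracy, per-population accuracy).

    known and assigned map sample ID -> population label; every sample
    in known must also be present in assigned.
    """
    correct = 0
    per_pop = defaultdict(lambda: [0, 0])  # population -> [correct, total]
    for sample, true_pop in known.items():
        hit = assigned[sample] == true_pop
        correct += hit
        per_pop[true_pop][0] += hit
        per_pop[true_pop][1] += 1
    overall = correct / len(known)
    by_pop = {pop: c / n for pop, (c, n) in per_pop.items()}
    return overall, by_pop


# Hypothetical test-set individuals of known breeding origin.
known = {"s1": "East", "s2": "East", "s3": "West", "s4": "West"}
# Hypothetical assignments obtained after downsampling.
assigned = {"s1": "East", "s2": "West", "s3": "West", "s4": "West"}

overall, by_pop = assignment_accuracy(known, assigned)
print(overall, by_pop)
```

Reporting accuracy per population as well as overall makes it easier to
see whether errors at low depth are concentrated in particular breeding
populations.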