Species distribution modelling
We modelled the breeding range of L. c. superciliosus and
projected it into the LGM. Data on occurrence in Japan were obtained
from a summary of breeding localities based on the National Surveys
conducted by the Ministry of Environment of Japan and literature (M
Kitazawa and his colleagues (in prep.) hereafter referred to as ‘review
data’). Additionally, on 12 May 2019 we downloaded data from
Global Biodiversity Information
Facility (GBIF; https://www.gbif.org/,
http://doi.org/10.15468/dl.igj5uw). A polygon bounding the presumed
distribution of L. c. superciliosus according to Lefranc &
Worfolk (1997) was used to clip the data. Records were limited only to
the breeding season (May to August) to avoid integration of migrating
birds. After a few quality-checking processes, final coordinates were
thinned to include only one occurrence per 20 km square covering the
Japanese archipelago (Figure S1.1).
We constructed a model with the final coordinate data. Our purpose is to
infer the continental area that might have been suitable for L. c.
superciliosus as its breeding habitat during the LGM. Therefore, any
ecological differences between the current archipelago and the
continental area need to be integrated in the model (Elith et al.,
2011). We clipped the environmental data to include the present
distribution of the Brown Shrike regardless of subspecies and this
environmental data was defined as the modelling domain. Of the available
WorldClim data (version 1.4; Hijmans et al. 2005), only one variable
from each pair of correlated variables (|r|
> 0.7) was used, resulting in seven variables: bio1, 2, 3,
5, 12, 14 and 15.
We used the R package ‘ENMeval’ v. 0.3.0 (Muscarella et al., 2014) to
execute the maximum entropy (MaxEnt) algorithm (Phillips, Anderson, &
Schapire, 2006) across a range of several settings. Combinations of
three feature classes (linear, a combination of linear and quadratic,
and hinge) and regularization multipliers ranging from 0.5 to 3
(increments of 0.5) were tested. The ‘maxent.jar’ option was used as the
model algorithm. From the geographical modelling domain defined above,
10,000 background points were sampled randomly. The ‘block’ method was
used to partition data for cross-validation since it assesses model
transferability better than other methods (Muscarella et al., 2014). The
hinge feature, with a regularization multiplier of 1.5, was selected by
the lowest AICc. The present distribution was predicted, and then it was
projected to the LGM environmental data using cloglog transformation.
The LGM environmental data were drawn by Hijmans et al. (2005) from
three different climate models: the community climate system model
(CCSM, Gent et al., 2011), the model for interdisciplinary research on
climate model (MIROC-ESM, Hasumi & Emori, 2004) and the
Max-Planck-Inst. Für Meteorologie climate model (MPI-ESM-P, Liepert &
Lo, 2013). Multivariate environmental similarity surface (MESS) was also
calculated and extrapolation was restricted to regions indicated by
negative values from this analysis (See Appendix S1 for the detailed
procedures on the SDM analysis).
Migratory route analysis
Data retrieved from light-level geolocators were analysed in the R
package ‘GeoLight’ version 2.0.0 (Lisovski & Hahn, 2012) and ‘SGAT’
version 0.1.3 (Wotherspoon, Sumner, & Lisovski, 2015). We prepared an R
script by following the manual provided by Lisovski et al. (2019). We
determined sunset and sunrise transitions for specific dates by light
intensity thresholds of 0.5. We incorporated a statistical modelling
approach, called the ‘group model’ in a Bayesian framework implemented
in ‘SGAT’. This model refines the estimated locations by means of three
components: the twilight model, a movement model, and spatial masks. The
twilight model incorporates the deviation between determined and actual
twilight times in calibration periods of each tag. We used twilight data
of 26 days after geolocator deployment as the calibration period. We
fitted gamma distribution as an error distribution of detection of
twilight. To define a movement model and spatial masks separately for
stationary and movement periods, the ChangeLight function in ‘GeoLight’
was applied to discriminate between these periods from the twilight
data. A probability threshold for discrimination was set to 0.9. For the
movement model, we set a flight speed prior with a gamma distribution
for periods of movement (shape = 13, rate = 0.3), determined allowing
for the average air speed of 12.9 m/s for Lanius collurio(Bruderer & Boldt, 2001). We conducted a sensitivity analysis for the
gamma distribution, and we found that priors did not greatly influence
our interpretation of the results (Table S1.2). In the group model,
twilight times grouped for a stationary period are treated as records at
a single location, thus no movement model was defined. For a probability
mask, birds must cross the sea between the Japanese archipelago and the
continent and must stopover on land. Therefore, a spatial mask for
stationary periods was designed only to include terrestrial land while
both sea and land areas were allowed for movement periods. We slightly
modified our analysis for bird
#V5604-025 because its geolocator
stopped recording during the autumn equinox period (see Appendix S1 for
details).
We interpolated latitudinal estimations around the equinox periods by
setting a ‘tol’ value between 0.08 and 0.1. As burn-in and fine tuning,
we ran 120,000 chains with three iterations, and the final paths were
determined by the last 15,000 samples with three iterations. Regarding
large uncertainties involved in migration estimations by geolocators
(Lisovski et al., 2019), we standardized the posterior probabilities of
location estimates on each raster cell, calculated their means among the
three individuals, and we presented them as a probability distribution
in a region of interest where we expect birds to retrace the route of
past colonisation. We also inferred median paths of the three migratory
tracks. Because birds spend more time at stopover sites than at an
instantaneous point along a migratory path, the former will be indicated
with relatively higher probabilities than the latter.
Results: