Environmental variables
Altitude and a set of 19 bioclimatic variables were used in this
study (Table 1). The bioclimatic variables were derived from
monthly meteorological data (Fick & Hijmans, 2017). They can be
clustered into four groups which represent annual trends, seasonality
and extreme or limiting environmental factors (van Zonneveld, Castaneda,
Scheldeman, van Etten, & Van Damme, 2014). All variables had a spatial
resolution of 5 arc-minutes (~10 km × 10 km) and covered
180°W to 180°E longitude and 60°S to 84°N latitude.
The altitude data were obtained from CGIAR-CSI (available at https://srtm.csi.cgiar.org/).
The current global bioclimatic data were
downloaded from CHELSA (available at
http://chelsa-climate.org/). They were average values for the period
1979-2013 (Karger et al., 2017). The future variables were derived from the
projections of eight global climate
models (GCMs): BCC-CSM2-MR, CNRM-CM6-1, CNRM-ESM2-1, CanESM5,
IPSL-CM6A-LR, MIROC-ES2L, MIROC6, MRI-ESM2-0. These GCMs were selected
according to their data availability and involvement in relevant model
intercomparison projects within CMIP6 (Moseid et al., 2020). The
future climate data were 20-year averages for 2021-2040, 2041-2060,
2061-2080 and 2081-2100 and for four Shared Socio-economic Pathways
(SSPs): 126, 245, 370 and 585. These data were downloaded from WorldClim
(available at http://www.worldclim.org). To reduce the regional bias
of any single GCM, a multi-model ensemble (MME) was used to
derive average values for the future climates (Mendlik & Gobiet, 2016;
Pierce, Barnett, Santer, & Gleckler, 2009). Because
collinearity among variables can reduce the accuracy
of SDMs, seven variables (shown in bold in Table 1) were selected
from the 20 environmental variables according to the multiple correlation
coefficient (|R| < 0.6) and the variance
inflation factor (VIF < 10) among them (Naimi &
Araújo, 2016).
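The two preprocessing steps described above, MME averaging and collinearity screening, can be illustrated with a minimal Python sketch. It is not the workflow used in this study: the GCM labels, variable values and function names are hypothetical, the |R| criterion is read here as a pairwise Pearson correlation (an assumption), and the greedy keep-or-drop order is only one of several possible screening strategies. VIF is computed as the diagonal of the inverse correlation matrix.

```python
from math import sqrt

def ensemble_mean(gcm_layers):
    """Cell-wise mean across GCMs; gcm_layers maps a GCM label to a list of cell values."""
    layers = list(gcm_layers.values())
    return [sum(cells) / len(layers) for cells in zip(*layers)]

def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def screen_by_correlation(variables, threshold=0.6):
    """Greedy screen (assumed strategy): keep a variable only if its |r|
    with every variable already kept is below the threshold."""
    kept = []
    for name in variables:
        if all(abs(pearson(variables[name], variables[k])) < threshold for k in kept):
            kept.append(name)
    return kept

def invert(matrix):
    """Gauss-Jordan inverse of a small square matrix with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def vif(variables):
    """VIF of each variable, taken from the diagonal of the inverse correlation matrix."""
    names = list(variables)
    corr = [[pearson(variables[a], variables[b]) for b in names] for a in names]
    inv = invert(corr)
    return {name: inv[i][i] for i, name in enumerate(names)}

# Hypothetical values of one future variable in two grid cells for three GCMs.
future_layer = {"GCM_A": [10.0, 12.0], "GCM_B": [11.0, 14.0], "GCM_C": [12.0, 13.0]}
mme_layer = ensemble_mean(future_layer)

# Hypothetical values of three candidate variables sampled at five locations.
toy = {
    "bio01": [1.0, 2.0, 3.0, 4.0, 5.0],
    "bio02": [2.0, 4.0, 6.0, 8.0, 11.0],  # nearly collinear with bio01
    "bio12": [3.0, 1.0, 4.0, 1.0, 5.0],
}
kept = screen_by_correlation(toy, threshold=0.6)
kept_vif = vif({name: toy[name] for name in kept})
print(mme_layer, kept, kept_vif)
```

In this toy example the second variable is dropped because it is almost a linear function of the first, and the remaining variables have VIF values well below 10.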
Modelling procedure
The R package biomod2 was developed for species distribution
modelling. It includes 10 species distribution algorithms:
Artificial Neural Network (ANN), Classification Tree Analysis
(CTA), Flexible Discriminant Analysis (FDA), Generalized Additive Model
(GAM), Generalized Boosting Model (GBM), Generalized Linear Model (GLM),
Multiple Adaptive Regression Splines (MARS), Maximum Entropy (MAXENT),
Random Forests (RF) and Surface Range Envelop (SRE) (Thuiller,
Georges, Engler, & Breiner, 2016). Because several of these algorithms
require absence records, 10,000 pseudo-absences were randomly selected, and
this selection was repeated five times, following a common strategy (Guisan, Thuiller, &
Zimmermann, 2017; Merow, Smith, & Silander, 2013). Eighty percent of the presence and
pseudo-absence data were used to calibrate the models, and the remainder was
used for model testing (Guisan et al., 2017). Model
calibration and evaluation were repeated 10 times. Response curves and
the relative contribution of each variable were calculated. AUC
(area under the receiver operating characteristic curve) and TSS (true
skill statistic) were used as performance criteria for
the 10 algorithms. Models with AUC greater than 0.90 and
TSS greater than 0.6 were considered to perform well in
species distribution modelling (Allouche, Tsoar, & Kadmon, 2006; Swets,
1988). An ensemble modelling approach was then applied in biomod2
to build ensemble models and reduce the bias caused by the choice of
a single algorithm. Using these ensemble models, the potential distributions
of A. annua under current and future climate scenarios were projected.
Finally, based on these projections, the distribution patterns,
range sizes and range shifts under the different climate scenarios were analyzed
and compared.
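The evaluation criteria described above can be made concrete with a short Python sketch of AUC, TSS and the AUC > 0.90 / TSS > 0.6 filter. It is an illustration only, not the biomod2 implementation: the function names, the fixed 0.5 cut-off used to binarize predictions for TSS, and all scores and observations below are hypothetical assumptions.

```python
def tss(observed, predicted, threshold=0.5):
    """True skill statistic = sensitivity + specificity - 1.
    observed holds 1 (presence) or 0 (pseudo-absence); predicted holds suitability
    scores, binarized at a fixed threshold (an assumption made for this sketch)."""
    tp = fp = tn = fn = 0
    for obs, score in zip(observed, predicted):
        pred = 1 if score >= threshold else 0
        if obs == 1 and pred == 1:
            tp += 1
        elif obs == 1:
            fn += 1
        elif pred == 1:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity + specificity - 1

def auc(observed, predicted):
    """AUC as the probability that a presence scores higher than a
    pseudo-absence, counting ties as one half (rank-based formulation)."""
    pos = [s for o, s in zip(observed, predicted) if o == 1]
    neg = [s for o, s in zip(observed, predicted) if o == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def well_performing(scores, auc_min=0.90, tss_min=0.6):
    """Names of models whose (AUC, TSS) pair exceeds both thresholds."""
    return [name for name, (a, t) in scores.items() if a > auc_min and t > tss_min]

# Hypothetical test-set observations and predicted suitability scores.
observed = [1, 1, 1, 0, 0, 0]
predicted = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
print(auc(observed, predicted), tss(observed, predicted))

# Hypothetical (AUC, TSS) pairs for three algorithms.
scores = {"RF": (0.95, 0.80), "GLM": (0.92, 0.55), "SRE": (0.75, 0.45)}
print(well_performing(scores))
```

Both thresholds must be exceeded at once, so a model with a high AUC but a TSS of 0.6 or lower is excluded from the well-performing set.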