Model validation and model setting
Seven modeling techniques were employed in this study: Generalized
Linear Models (GLM), Random Forest (RF), Generalized Additive Models
(GAM), Artificial Neural Networks (ANN), Multivariate Adaptive
Regression Splines (MARS), Generalized Boosted Models (GBM), and Maximum
Entropy Models (MaxEnt). SDMs were generated using the ensemble
forecasting framework of the ‘biomod2’ package in R with the following
parameters. MARS models had a maximum interaction degree of 2, whilst RF
models were fitted by growing 750 trees with half of the available
predictors sampled as split candidates at each node. The default settings and the
maximum iteration count of 1000 were applied to MaxEnt models. GAMs
were fitted with a logit link function, whereas GLMs were fitted with a
binomial error distribution. GBMs were generated
by performing 5000 three-fold cross-validation procedures to determine
the optimal number of trees to keep, with a maximum variable interaction
depth of 7. The default settings were used to fit ANN
models. This approach has been used in previous research.
Because our dataset contained presence data only, we generated a set of
10,000 background points chosen at random across the study area.
As in previous species distribution modeling research, the
occurrence dataset was randomly divided into a 70% sample for model
calibration and a 30% sample for evaluating model performance. We
fitted 175 SDMs in total (7 algorithms × 5 splitting replicates for
model evaluation × 1 repetition × 5 species).
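
The sampling design described above can be illustrated with a minimal sketch. It is not the biomod2 workflow used in the study, which ran in R; it is a plain Python illustration of the same bookkeeping: drawing 10,000 random background points, repeating a random 70/30 calibration/evaluation split five times, and counting the resulting model runs. The rectangular extent, the synthetic presence records, the species labels, and all function names are assumptions introduced for illustration only.

```python
import random
from itertools import product

# Values taken from the text.
ALGORITHMS = ["GLM", "RF", "GAM", "ANN", "MARS", "GBM", "MaxEnt"]
N_SPLITS = 5                # splitting replicates for model evaluation
N_REPETITIONS = 1           # repetitions
N_SPECIES = 5
N_BACKGROUND = 10_000       # random background points
CALIBRATION_FRACTION = 0.7  # 70% calibration, 30% evaluation


def sample_background(extent, n_points, rng):
    """Draw n_points uniformly at random inside a rectangular extent.

    extent = (xmin, xmax, ymin, ymax). The rectangle is an assumption;
    a real study area is usually an irregular polygon or raster mask.
    """
    xmin, xmax, ymin, ymax = extent
    return [(rng.uniform(xmin, xmax), rng.uniform(ymin, ymax))
            for _ in range(n_points)]


def split_occurrences(records, fraction, rng):
    """Randomly split records into (calibration, evaluation) subsets."""
    shuffled = records[:]          # copy so the input is left untouched
    rng.shuffle(shuffled)
    cut = round(fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def enumerate_runs():
    """List every (species, repetition, split, algorithm) combination."""
    species = [f"species_{i + 1}" for i in range(N_SPECIES)]  # hypothetical labels
    return list(product(species, range(N_REPETITIONS),
                        range(N_SPLITS), ALGORITHMS))


rng = random.Random(42)  # fixed seed so the sketch is reproducible

# Hypothetical extent and synthetic presence records, for illustration only.
extent = (0.0, 10.0, 0.0, 5.0)
background = sample_background(extent, N_BACKGROUND, rng)
presences = sample_background(extent, 200, rng)

# One independent random 70/30 split per splitting replicate.
splits = [split_occurrences(presences, CALIBRATION_FRACTION, rng)
          for _ in range(N_SPLITS)]
runs = enumerate_runs()

print(len(background), len(splits[0][0]), len(splits[0][1]), len(runs))
```

Each splitting replicate reshuffles the full set of presence records, so the five evaluation subsets are drawn independently and may overlap, as in repeated random subsampling rather than k-fold cross-validation.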