Model complexity
Overall, our results showed that 5xH was the most common RM-FC combination
for fish, which agrees well with earlier findings that higher RM values
combined with more complex FCs such as H are the better model settings
for species with few occurrence records (Shcheglovitova and Anderson
2013, Galante et al. 2018). For the odonates, the most common RM-FC
combination was 5xL, which also agrees with the need to use RM values
greater than the default setting (Galante et al. 2018).
However, our results also showed that other optimal models had different
RM-FC combinations for both fish and odonates (Figure 2a-b). Further,
the five model selection approaches agreed poorly on the RM-FC
combinations of the optimal models they selected for both fish and
odonates. The EXP approach agreed most with ORTEST_PER for fish, while
for odonates it agreed most with AUCDIFF_PER and ORTEST_BAL. Taken
together, these findings suggest a need for taxon-specific model tuning,
in line with the recognized need to tune models for the specific species
in a given study (Galante et al. 2018).
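The tallies discussed above can be illustrated with a minimal sketch. The species names, the selections and the definition of agreement (the share of species for which two approaches picked the same RM-FC combination) are assumptions made for illustration only; they are not the data or the agreement measure used in this study.

```python
from collections import Counter

# Hypothetical selections: species -> {approach: RM-FC of the optimal model}.
selections = {
    "species_a": {"EXP": "5xH", "ORTEST_PER": "5xH", "AUCDIFF_PER": "3xL"},
    "species_b": {"EXP": "5xH", "ORTEST_PER": "5xH", "AUCDIFF_PER": "5xH"},
    "species_c": {"EXP": "2xLQ", "ORTEST_PER": "5xH", "AUCDIFF_PER": "2xLQ"},
}


def most_common_combination(selections):
    """Return the RM-FC combination chosen most often, with its count."""
    counts = Counter(
        combo
        for per_species in selections.values()
        for combo in per_species.values()
    )
    return counts.most_common(1)[0]


def agreement(selections, approach_a, approach_b):
    """Share of species for which two approaches chose the same RM-FC
    combination (an assumed, simple definition of agreement)."""
    same = sum(
        1 for chosen in selections.values()
        if chosen[approach_a] == chosen[approach_b]
    )
    return same / len(selections)


top_combo, top_count = most_common_combination(selections)
exp_vs_ortest = agreement(selections, "EXP", "ORTEST_PER")
```

Low pairwise agreement under such a measure would correspond to the pattern described above, where different selection approaches favor different RM-FC settings for the same species.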
Our results showed that all approaches chose models with percentile ORs
equal to or greater than the theoretically expected 10%, suggesting
generally overfit models (Galante et al. 2018). Other studies have also
found percentile ORs greater than 10% when a small number of occurrences
was used (Muscarella et al. 2014, Radosavljevic and Anderson 2014,
Galante et al. 2018). However, compared with the AUCDIFF approaches, the
ORTEST and EXP approaches not only chose more optimal models with larger
percentile ORs, but also chose models with larger AUCDIFF and more
parameters for both fish and odonates. These findings suggest that the
ORTEST and EXP approaches might have selected overfit and
over-parameterized optimal models (Muscarella et al. 2014, Radosavljevic
and Anderson 2014, Galante et al. 2018).
However, an earlier study found over-parameterization to be a lesser
issue than under-parameterization (Warren and Seifert 2011).
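The two overfitting metrics discussed above can be sketched as follows. This is a minimal illustration that assumes the commonly used definitions: the percentile OR is the share of test presences falling below the suitability value that omits 10% of training presences, and AUCDIFF is training AUC minus test AUC. All scores, function names and the truncation rule for the percentile are hypothetical and do not reproduce the software used in this study.

```python
def auc(presence_scores, background_scores):
    """AUC as the probability that a presence point scores higher than a
    background point, with ties counted as half."""
    wins = 0.0
    for p in presence_scores:
        for b in background_scores:
            if p > b:
                wins += 1.0
            elif p == b:
                wins += 0.5
    return wins / (len(presence_scores) * len(background_scores))


def percentile_threshold(train_scores, percentile=10):
    """Suitability value below which `percentile` % of training presences
    fall (truncating to a whole number of points is one convention)."""
    ranked = sorted(train_scores)
    n_omitted = int(len(ranked) * percentile / 100)
    return ranked[n_omitted]


def omission_rate(test_scores, threshold):
    """Share of test presences predicted below the threshold."""
    return sum(1 for s in test_scores if s < threshold) / len(test_scores)


# Hypothetical predicted suitability values.
train = [0.15, 0.35, 0.40, 0.55, 0.60, 0.70, 0.75, 0.80, 0.90, 0.95]
test = [0.20, 0.30, 0.50, 0.85]
background = [0.05, 0.10, 0.25, 0.45, 0.65]

threshold = percentile_threshold(train)          # 10% training threshold
or_10 = omission_rate(test, threshold)           # percentile OR on test data
auc_diff = auc(train, background) - auc(test, background)
```

In this toy case the test omission rate is far above the expected 10% and AUCDIFF is positive, which is the pattern that the text above interprets as a sign of overfitting.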
Our use of a relaxed balance threshold to first generate binary
suitable/unsuitable habitat maps before choosing the EXP optimal model
might have overcome model overfitting and under-prediction in the EXP
approach (Pearson et al. 2007, Radosavljevic and Anderson 2014).
In contrast, selecting models solely on the basis of smaller percentile
ORs, AUCDIFF and number of parameters may favor overly relaxed
(over-predicting) models (Galante et al., 2019), a tendency that can be
aggravated by our use of a relaxed balance threshold to generate the
binary habitat map. Therefore, although the AUCDIFF approaches chose
optimal models with smaller AUCDIFF and fewer parameters than the ORTEST
and EXP approaches, in our context these optimal models are not
necessarily the best. Further, optimal models chosen by the AUCDIFF
approaches had lower AUCTEST values than those chosen by the EXP
approach, followed by the ORTEST approaches. Therefore, the AUCDIFF
approaches might have chosen a greater number of over-predicting optimal
models with lower discriminatory power (Warren and Seifert 2011).