2.2 Environmental data
We downloaded raster-format environmental variables commonly used in
spatial modeling of marine environments from the global repositories
Bio-Oracle (Assis et al. 2018) and MARSPEC (Sbrocco and Barber 2013) at
four depths (superficial, benthic maximum, benthic mean and benthic
minimum). Then, we classified the variables into five groups: Bsurf
(superficial Bio-Oracle; n = 10), Bma (maximum benthic Bio-Oracle; n =
8), Bme (mean benthic Bio-Oracle; n = 8), Bmi (minimum benthic
Bio-Oracle; n = 8), and Msurf (superficial MARSPEC; n = 6). This protocol
was used to select the repository/depth combination that maximizes the
explanatory capacity of the predictors (Table 1). To standardize the spatial
resolution of the variables, we resampled the MARSPEC variables to match
the 5 arc-minutes cell resolution of Bio-Oracle. Subsequently, we
resampled both repositories at a resolution of 10 arc-minutes with the
same method, to assess the effect of spatial resolution on our results
(Connor et al. 2019).
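
As a minimal illustration of the change from 5 to 10 arc-minutes, the sketch below coarsens a small grid by a factor of two. The text does not name the resampling method, so the block mean used here is an assumption, as are the function name aggregate_by_factor and the use of None for cells without data (e.g., land).

```python
from statistics import fmean


def aggregate_by_factor(grid, factor=2):
    """Coarsen a 2-D grid by averaging non-overlapping factor x factor blocks.

    Assumption: the block mean stands in for the unspecified resampling
    method. Cells equal to None are treated as no-data and ignored; a block
    with no valid cells yields None. Trailing rows or columns that do not
    fill a complete block are dropped.
    """
    n_rows, n_cols = len(grid), len(grid[0])
    coarse = []
    for r in range(0, n_rows - n_rows % factor, factor):
        row_out = []
        for c in range(0, n_cols - n_cols % factor, factor):
            block = [grid[r + i][c + j]
                     for i in range(factor) for j in range(factor)]
            valid = [v for v in block if v is not None]
            row_out.append(fmean(valid) if valid else None)
        coarse.append(row_out)
    return coarse


# Hypothetical 4 x 4 grid at 5 arc-minutes; None marks cells without data.
fine = [
    [1.0, 3.0, 5.0, 7.0],
    [1.0, 3.0, 5.0, 7.0],
    [2.0, None, None, None],
    [2.0, 2.0, None, None],
]
coarse = aggregate_by_factor(fine, factor=2)  # 2 x 2 grid at 10 arc-minutes
```

Each 10 arc-minute cell summarizes four 5 arc-minute cells, so fine-scale variation within a block is lost; this is the kind of effect the resolution comparison is meant to expose.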
For each species, we defined a polygon representing a hypothesis of its
historical accessibility (area M; sensu Soberon and Peterson
2005) based on the marine biogeographical provinces of the world
(Spalding et al. 2007) and ocean current information
(earth.nullschool.net/#current/ocean/surface/currents/patterson). We
first selected as area M all the provinces with at least one
presence record of the species in question. Then, based on the natural
history of the species and the geographical configuration of its
occurrences, we restricted or expanded this area depending on whether we
considered that the currents could act as physical barriers or dispersal
channels. We also defined a calibration area and masked the variables
to it, retaining only the marine eco-regions (a finer
sub-regionalization than the provinces) that intersect the presence
records of the species in question. Finally, to
reduce collinearity and dimensionality of predictors, we eliminated,
individually for each species, the variables in each of the five groups
that had a pairwise Pearson correlation >0.8, using the function
correlation_finder in the “ntbox” package (Feng et al. 2019; Mateo
et al. 2013; Osorio-Olvera et al. 2020).
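
The correlation filter can be sketched as follows. This is an illustration in Python, not the implementation of correlation_finder in “ntbox”: the function names, the greedy order-based selection, and the use of the absolute value of the correlation are assumptions, and the variable values are invented for the example.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return sxy / sqrt(sxx * syy)


def drop_correlated(variables, threshold=0.8):
    """Greedily keep variables whose |r| with every kept variable <= threshold.

    Assumption: variables are visited in dictionary order, so an earlier
    variable is retained in preference to a later, correlated one.
    """
    kept = []
    for name, values in variables.items():
        if all(abs(pearson(values, variables[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical values of three predictors at five calibration cells.
group = {
    "temp_mean": [10.0, 12.0, 14.0, 16.0, 18.0],
    "temp_max": [11.0, 13.0, 15.0, 17.0, 20.0],   # nearly collinear with temp_mean
    "salinity": [35.0, 33.0, 36.0, 32.0, 34.0],
}
selected = drop_correlated(group, threshold=0.8)
```

In this sketch the filter would be run once per species and per variable group, on the values extracted within that species' calibration area, so the retained set can differ between species.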