2.2 Environmental data
We downloaded environmental variables commonly used in spatial modeling of marine environments, in raster format, from the global repositories Bio-Oracle (Assis et al. 2018) and MARSPEC (Sbrocco and Barber 2013) at four depths (superficial, benthic maximum, benthic mean and benthic minimum). We then classified the variables into five groups: Bsurf (superficial Bio-Oracle; n = 10), Bma (maximum benthic Bio-Oracle; n = 8), Bme (mean benthic Bio-Oracle; n = 8), Bmi (minimum benthic Bio-Oracle; n = 8), and Msurf (superficial MARSPEC; n = 6). This grouping allowed us to select the repository/depth combination that maximizes the explanatory capacity of the predictors (Table 1). To standardize the spatial resolution of the variables, we resampled the MARSPEC variables to match the 5 arc-minute cell resolution of Bio-Oracle. Subsequently, we resampled the variables from both repositories to a resolution of 10 arc-minutes with the same method, to assess the effect of spatial resolution on our results (Connor et al. 2019).
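As an illustration of the coarsening step from 5 to 10 arc-minutes, the sketch below aggregates each 2 x 2 block of cells into one cell by averaging the valid values. The resampling method actually used is not stated here, so block averaging is an assumption, and the function name `aggregate_mean`, the use of `None` for cells without data (e.g. land), and the example values are all hypothetical.

```python
def aggregate_mean(grid, factor=2):
    """Coarsen a 2-D grid by averaging non-missing cells in each block.

    Assumption: block averaging stands in for the unspecified resampling
    method; missing cells are encoded as None. Rows or columns left over
    when the grid size is not a multiple of `factor` are discarded.
    """
    n_rows = len(grid) // factor
    n_cols = len(grid[0]) // factor
    coarse = []
    for i in range(n_rows):
        row = []
        for j in range(n_cols):
            # Collect the factor x factor block of fine cells.
            block = [
                grid[i * factor + di][j * factor + dj]
                for di in range(factor)
                for dj in range(factor)
            ]
            valid = [v for v in block if v is not None]
            # A coarse cell is missing only if every fine cell is missing.
            row.append(sum(valid) / len(valid) if valid else None)
        coarse.append(row)
    return coarse


# Hypothetical 5 arc-minute values (2 rows x 4 columns).
fine = [
    [1.0, 3.0, None, None],
    [5.0, 7.0, None, 4.0],
]
coarse = aggregate_mean(fine)  # one row, two 10 arc-minute cells
```

In practice a raster library would handle georeferencing and the choice of interpolation method; the sketch only shows the arithmetic of halving the resolution.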
For each species, we defined a polygon representing a hypothesis of its historical accessibility (area M; sensu Soberon and Peterson 2005) based on the marine biogeographical provinces of the world (Spalding et al. 2007) and ocean current information (earth.nullschool.net/#current/ocean/surface/currents/patterson). We first selected as area M all the provinces with at least one presence record of the species in question. Then, based on the natural history of the species and the geographical configuration of its occurrences, we limited or expanded this area depending on whether we considered that the currents could act as physical barriers or as dispersal channels. We also defined a calibration area and masked the variables by selecting only the areas within the intersection between the presence records of the species in question and the marine eco-regions, a finer sub-regionalization than the provinces. Finally, to reduce the collinearity and dimensionality of the predictors, we eliminated, individually for each species, the variables in each of the five groups that had a pairwise Pearson correlation >0.8, using the function correlation_finder in the “ntbox” package (Feng et al. 2019; Mateo et al. 2013; Osorio-Olvera et al. 2020).
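The correlation filter can be sketched as a greedy pass over the variables in one group: a variable is kept only if its Pearson correlation with every variable already kept does not exceed the threshold. This is a minimal illustration, not the ntbox implementation; the rule for choosing which member of a correlated pair to keep (here, the first one encountered), the use of the absolute value of the correlation, and all names and values below are assumptions.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences.

    Neither sequence may be constant (zero variance would divide by zero).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def drop_correlated(variables, threshold=0.8):
    """Return names of variables kept after a greedy correlation filter.

    Assumption: variables are visited in insertion order, and a variable is
    dropped if |r| > threshold with any variable already kept.
    """
    kept = []
    for name, values in variables.items():
        if all(abs(pearson(values, variables[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical values extracted at five cells of a calibration area.
group = {
    "temp_mean": [10.0, 12.0, 14.0, 16.0, 18.0],
    "temp_max": [12.0, 14.1, 15.9, 18.2, 20.0],
    "salinity": [35.0, 34.0, 36.0, 33.0, 35.5],
}
selected = drop_correlated(group)  # temp_max is dropped (r with temp_mean > 0.8)
```

Because the result of a greedy filter depends on the order in which variables are visited, the order would need to be fixed (for example by ecological relevance) for the selection to be reproducible.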