2.1. Study area
The study area includes 69 rivers from 15 basins located in southern France, Spain, and Portugal, covering most of the western part of the European Mediterranean region (Fig. 1). Briefly, the French stations are located on 4 temporary tributaries in the Rhone-Mediterranean basin district, very close to the Mediterranean Sea. In Spain, we studied 44 rivers, of which 16 are located in river basin districts that flow into the Mediterranean Sea (Catalonian, Ebro, Júcar, Segura and Andalusian Mediterranean basins) and 28 in districts that flow into the Atlantic Ocean (Tagus, Guadiana, Guadalquivir, Guadalete and Barbate basins). The 21 Portuguese stations are on rivers that drain into the Atlantic Ocean and belong to the Guadiana, Algarve, Sado, Mira, Vouga, Mondego, Lis, Douro, Cávado, Ave and Leça river basin districts.
Under natural conditions, all the rivers in these basins show a Mediterranean flow regime, which alternates between two periods: a high-flow period during the wet season (i.e. autumn to winter) and a low-flow period during the dry season (i.e. late spring and summer). However, the basins under study cover a large geographical area with a wide gradient of climatic, topographic and geologic conditions, which implies notable differences in hydrological regimes. The climate is mostly temperate, although some stations are located in semi-arid regions of southern Spain. According to the Köppen-Geiger classification, the studied NPRS correspond to hot (Csa), warm (Csb), or cool (Csc) summer Mediterranean climates, and to hot (BSh) or cold (BSk) semi-arid climates. Although the climate follows a common pattern of mild, wet winters and dry summers that may be hot or cold, the ranges of precipitation and temperature differ. Land use in the studied basins is dominated by agriculture, reflecting human use of the natural environment. The rivers and their tributaries are heavily regulated by dams and weirs, which have substantially altered the natural flow regime and reduced the number of unaltered gauging stations with flow data records.
(Here Fig. 1)
2.2. Hydrologic data
We used daily flow records from gauging stations in NPRS minimally impacted by human activities. Due to the lack of unaltered stations with adequate data in Mediterranean NPRS, we accepted that most of the series contained missing data (Table A1). Selecting NPRS in near-natural conditions meant avoiding hydrological alteration, diversion, or cessation of flow caused by transverse barriers (large dams or smaller weirs) located upstream. Data records were obtained from different sources. In Spain, they were obtained from the national database of public gauging stations of CEDEX (Centre for Hydrographic Studies; https://ceh.cedex.es/), or from the corresponding River Basin Authority. For this purpose, we first identified gauging stations without altered flow conditions using the national inventory of barriers on non-perennial rivers (available at https://sig.mapama.gob.es/geoportal). In France and Portugal, station records were obtained from the SMIRES project database (https://www.smires.eu/). To identify the gauging stations in NPRS close to natural conditions, we used the AMBER Barrier Atlas (https://amber.international/european) and the hydrological pressures collected in the European WISE database (www.eea.europa.eu/data-and-maps/data/wise-wfd-4). Table A1 in the Supporting Information gives further details on the stations: the river basin where each is located, country, coordinates, gauging station code, length of the data period, number of days in the data series, and number of days with missing data.
We used gauging stations with more than 15 years of daily flow records, except for one station with 11 years in the Tagus basin in Portugal (Table A1). The median length of the data period was 36 years (IQR=26-43 years). All the series were validated for the calculation of zero-flow hydrological indices with the smires package (https://github.com/mundl/smires). For each station we calculated a set of 315 hydrological indices previously reported in studies focused on perennial rivers (Eng et al., 2017; Olden and Poff, 2003), drought events in NPRS (Costigan et al., 2017; Delso et al., 2017) and low river flows (Henriksen et al., 2006; Kennard et al., 2010). Following Richter et al. (1996), the indices were classified into five groups characterizing hydrological conditions related to: (i) magnitude, (ii) frequency, (iii) duration, (iv) timing, and (v) rate of change of flow or drought events. A list of the calculated indices and their main characteristics is shown in Appendix B.
Finally, the hydrological indices were coded in the R programming language. We used the lfstat package to calculate the number and duration of zero-flow events. The hydrostats package was used to calculate Colwell's indices of predictability and seasonality, the rate of change in flow magnitude, and the asymmetry (skewness) of the hydrological series. Three conditions were adopted when calculating the indices. First, all years of each series were used, even those with incomplete records. Second, the hydrological year was set to begin at the start of the Julian calendar year. Third, we applied four thresholds to define days without flow: 0, 1, 2, and 5 l/s. These thresholds account for false zero-flow readings associated with the limitations and uncertainties of measuring days without flow at gauging stations in NPRS.
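The effect of the zero-flow threshold on the number and duration of zero-flow events can be illustrated with a minimal Python sketch. It is not the lfstat implementation used in the study: the function name, the example series, the use of "less than or equal to" for the comparison, and the treatment of missing days as non-dry are all assumptions made for illustration.

```python
import numpy as np

# Thresholds named in the text, in litres per second.
THRESHOLDS_LS = (0, 1, 2, 5)


def zero_flow_summary(flow_ls, threshold_ls):
    """Count zero-flow days and events in one daily series (hypothetical helper).

    flow_ls: daily flow in l/s; NaN marks a missing record.
    Assumption: a day is "zero flow" when flow <= threshold_ls.
    Assumption: a missing day is not counted as dry, so it splits an event.
    """
    flow = np.asarray(flow_ls, dtype=float)
    with np.errstate(invalid="ignore"):
        dry = flow <= threshold_ls  # comparisons with NaN evaluate to False
    # Pad with False so that events touching either end of the series are closed.
    padded = np.concatenate(([False], dry, [False]))
    changes = np.diff(padded.astype(int))
    starts = np.flatnonzero(changes == 1)   # first day of each dry run
    ends = np.flatnonzero(changes == -1)    # first day after each dry run
    durations = ends - starts
    return {
        "zero_flow_days": int(dry.sum()),
        "n_events": int(durations.size),
        "max_duration": int(durations.max()) if durations.size else 0,
    }


# Invented 10-day series in l/s, for illustration only.
example_flow = [12.0, 3.0, 0.0, 0.0, 1.5, 0.5, np.nan, 0.0, 4.0, 6.0]
summaries = {t: zero_flow_summary(example_flow, t) for t in THRESHOLDS_LS}
```

In this invented series the count of zero-flow days rises from 3 at a threshold of 0 l/s to 7 at 5 l/s, which shows why indices were computed separately for each threshold.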
2.3. Hydrologic classification for non-perennial Mediterranean rivers and streams
We used principal component analysis (PCA), an unsupervised statistical learning technique, to examine the relationships between the hydrological indices. Given the strong correlation between hydrological indices, we also used PCA to reduce dimensionality by selecting the most prominent metrics for each attribute. Correlation matrices were used to equalize the contribution of the indices to the PCA regardless of their scale. Following the PCA, we selected the indices with the highest loading coefficients (in absolute terms) on the first five components, which accounted for approximately 70% of the total inertia for each of the zero-flow thresholds. Specifically, the loading coefficients allowed us to reduce the number of hydrological indices by more than half, but this was not enough to choose, for each attribute, the indices that best define the hydrological pattern of NPRS and reflect the diversity of Mediterranean temporary flows. Thus, we decided to select one index for each attribute (magnitude, frequency, duration, timing, and rate of change) based on expert criteria supported by statistical analysis. To this end, we considered both how often an index had the highest absolute loading on the first five PCA dimensions across thresholds, and hierarchical clusters based on the correlation distance between the PCA-selected indices within each attribute (Appendix C). PCAs and clusters were run with the FactoMineR package.
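The loading-based selection step can be sketched in Python as follows. This is a simplified illustration with NumPy, not the FactoMineR workflow used in the study; the function names, the index names, and the random data matrix are hypothetical, and loadings are assumed to be eigenvectors scaled by the square root of their eigenvalues.

```python
import numpy as np


def pca_correlation(X):
    """PCA on the correlation matrix of X (rows = stations, columns = indices)."""
    X = np.asarray(X, dtype=float)
    corr = np.corrcoef(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)          # eigh: symmetric matrices
    order = np.argsort(eigvals)[::-1]                # sort components by variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    explained = eigvals / eigvals.sum()              # share of total inertia
    # Assumed definition of loading: correlation between index and component.
    loadings = eigvecs * np.sqrt(np.clip(eigvals, 0.0, None))
    return explained, loadings


def top_index_per_component(loadings, names, n_components=5):
    """Name of the index with the largest absolute loading on each component."""
    return [names[int(np.argmax(np.abs(loadings[:, k])))]
            for k in range(n_components)]


# Invented data: 69 stations x 8 indices, two of them strongly correlated.
rng = np.random.default_rng(0)
X = rng.normal(size=(69, 8))
X[:, 1] = X[:, 0] + 0.1 * rng.normal(size=69)
names = [f"index_{i}" for i in range(8)]

explained, loadings = pca_correlation(X)
cumulative = np.cumsum(explained)                    # inertia kept by first k components
selected = top_index_per_component(loadings, names)
```

Working from the correlation matrix is equivalent to standardising each index first, so indices measured in different units contribute on the same footing.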
Self-Organizing Maps (SOM) were used to classify temporary rivers into hydrological types according to their similarity in the selected indices. SOM is an unsupervised machine learning technique that uses an artificial neural network to project multidimensional data onto a two-dimensional map of nodes. Training is an iterative process that assigns a weight to each node on the map: for each sample, the node at the minimum distance is chosen and the neighbourhood of the nodes is established. We followed a previously proposed rule to determine the optimal number of nodes in the map: the number of nodes should be close to 5√n, where n is the number of samples analysed (in this study n=345). Consequently, our map should have approximately 93 nodes, arranged in a layer of 9 rows x 10 columns. Additionally, we evaluated the quality of the maps by means of the quantization and topographic errors of different layers (from 2 x 2 to 10 x 10). To identify clusters on the SOM output map and draw their boundaries, we applied hierarchical clustering. The optimal number of clusters for the SOM output was determined using the NbClust package in R, whereas the SOM analysis and its graphical representation were generated with the kohonen package.
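The map-size rule described above can be checked with a short Python sketch. The function name is hypothetical, and restricting the search to rectangular grids with sides between 2 and 10 is an assumption that mirrors the range of layers evaluated in the text.

```python
import math


def som_grid_candidates(n_samples, min_side=2, max_side=10):
    """Rank rectangular grids by how close rows*cols is to 5*sqrt(n_samples)."""
    target = 5 * math.sqrt(n_samples)
    grids = [(rows, cols)
             for rows in range(min_side, max_side + 1)
             for cols in range(rows, max_side + 1)]   # rows <= cols avoids duplicates
    grids.sort(key=lambda rc: abs(rc[0] * rc[1] - target))
    return target, grids


target_nodes, ranked_grids = som_grid_candidates(345)
best_rows, best_cols = ranked_grids[0]
```

With n=345 the target is about 92.9 nodes, and the closest grid within the assumed range is 9 x 10 (90 nodes), in agreement with the layout stated in the text.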
2.4. Comparing methods for Mediterranean non-perennial rivers and streams classification
Finally, we compared our results with three other hydrological classifications developed for Mediterranean rivers. Firstly, we used the classification of the Ministerial Order for Hydrological Planning in Spanish legislation, which classifies rivers into four categories depending on the number of days with presence of water throughout the year: (i) permanent river courses, if water flows every day of the year, (ii) temporary river courses, with presence of water during an average period of 300 days per year, (iii) intermittent, with water flowing between 100 and 300 days per year, and (iv) ephemeral, flowing less than 100 days per year. Secondly, we classified rivers following the Italian legislation, which depends on the number of months with presence of water: (i) temporary river courses, without presence of water for at least 2 of the last 5 years, (ii) intermittent, with more than 8 months with water, (iii) ephemeral, with less than 8 months with water, and (iv) episodic, with water only after heavy precipitation events. Lastly, we applied the classification developed by the LIFE+ TRivers project to evaluate the hydrological flow regime in NPRS according to the effect of the aquatic phase on biological communities. This classification is provided by the free TREHS software tool and comprises four aquatic regime types, or hydrotypes: permanent or perennial, intermittent-pools, intermittent-dry, and episodic or ephemeral. To compare the different classifications, we produced an alluvial plot with the ggalluvial and circlize packages.
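As an illustration of the first of these schemes, the Spanish day-count categories can be written as a small Python function. The function name is hypothetical and the class boundaries are assumptions: the text does not state how values exactly at 100 or 300 days are assigned, nor how leap years are handled, so the limits below are one possible reading.

```python
def classify_spanish_order(mean_flow_days):
    """Assign a category from the mean number of days per year with flow.

    Assumed boundaries (not stated exactly in the text):
      permanent     365 days or more (water flows every day)
      temporary     at least 300 but fewer than 365 days
      intermittent  at least 100 but fewer than 300 days
      ephemeral     fewer than 100 days
    """
    if mean_flow_days < 0:
        raise ValueError("mean_flow_days must be non-negative")
    if mean_flow_days >= 365:
        return "permanent"
    if mean_flow_days >= 300:
        return "temporary"
    if mean_flow_days >= 100:
        return "intermittent"
    return "ephemeral"


# Invented station values, for illustration only.
example_days = {"station_A": 365, "station_B": 320, "station_C": 180, "station_D": 40}
example_classes = {name: classify_spanish_order(d) for name, d in example_days.items()}
```

A table of such labels per station, one column per classification scheme, is the kind of input from which an alluvial plot can be drawn.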