2.2 Study area and species distribution data
The study area comprises approximately 31 200 km² in
the southeast part of Poland, extending from latitude 49° to 50.2°N
and from longitude 19° to 23°E (Figure 1). The area is environmentally
diverse, mostly owing to altitude, which ranges from
160 to 2503 m a.s.l. Additional factors underlying diversity are
correlated with climate, land use systems, land relief, and human
population density. In the northern part, the lowland areas are used for
agriculture and the foothills are dominated by forests, whereas the southern
part has high mountains with alpine vegetation. In addition to the
north–south altitudinal gradient, there is a climatic gradient of
continentality, with a greater temperature range in the eastern part of the
study region (Szabo-Takacs, Farda, Zahradníček, & Štěpánek, 2015).
Within the study region, this gradient is strongly correlated with a decrease
in precipitation towards the east (Appendix, Tab. S.3). The study area includes a
densely populated industrial landscape (Silesia), urban agglomerations
(largest city Kraków), and moderately populated agricultural areas, as
well as sparsely populated areas in the mountains. The detailed
characteristics of the study area (climate, topography, land use
structure, and human population density) were previously described by
Szymura et al. (2018).
FIGURE 1 Location of the study region (green) against a background of
land relief (a), and the distribution of the communication network and
settlements against a background of altitude within the study region (b).
The data on distribution of the studied Solidago species were
obtained from the atlas Distribution of Kenophytes in the Polish
Carpathians and their Foreland (Zając & Zając, 2015), which shows maps
of species presence or absence in a 2 × 2 km grid in the Polish part of
the Carpathian Mountains and their foreland, Central Europe. The
fieldwork for the atlas combined floristic surveys of particular
regions (e.g., mountain ranges, particular towns and surrounding areas)
with explorations focused exclusively on neophytes in given regions.
These observations were supplemented with
additional data from phytosociological relevés, herbarium records, and
published materials. The fieldwork was carried out by several dozen
professional botanists as well as graduate students, who sampled on a
predefined 2 × 2 km grid (Zając A., personal communication).
These records represent the ‘survey’ type of data in the nomenclature of
Elith et al. (2020). Such data, with true absence records, enable
species distribution models to be less biased and to perform better,
compared with presence-only records, the ‘collection’ data type
(Barbet-Massin et al., 2012; Elith et al., 2020). This distinction is of
particular importance for examination of wide-ranging and tolerant
species (Brotons, Thuiller, Araújo, & Hirzel, 2004). To reduce the
possible effect of lower sampling effort in some regions (Bailey, Boyd,
Hjort, Lavers, & Field, 2017; Yang, Ma, & Kreft, 2013), the
potentially undersampled squares were excluded from modelling. For this
purpose, we used a ‘target group approach’ (Chapman, Pescott, Roy, &
Tanner, 2019; Phillips et al., 2009) and a previously established model
which explains neophyte richness (the ‘target group’ in this case) as a
function of environmental and socio-economic variables in the studied
region (Szymura et al., 2018). We assumed that the squares with the
most strongly negative model residuals (i.e., squares where recorded neophyte
richness was much lower than predicted by the model) indicated
potentially undersampled regions. After preliminary testing, we decided
to exclude from modelling 25% of squares (1950 squares) with the
most strongly negative residual values and simultaneously without any invasive
Solidago records (for details of this calculation, see Appendix 1).
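The exclusion rule described above can be illustrated with a minimal sketch. It is not the authors' implementation (the actual calculation is given in Appendix 1), and it assumes one particular reading of the rule: squares are ranked by residual (observed minus predicted neophyte richness), the lowest 25% are taken as candidates, and only candidates without any invasive Solidago record are excluded. The class `Square`, the function `flag_undersampled`, and the toy values are hypothetical and serve only as illustration.

```python
from dataclasses import dataclass


@dataclass
class Square:
    """One 2 x 2 km grid square (hypothetical structure, for illustration)."""
    square_id: str
    observed_richness: float   # recorded neophyte richness
    predicted_richness: float  # richness predicted by the target-group model
    has_solidago: bool         # any invasive Solidago record in the square


def flag_undersampled(squares, fraction=0.25):
    """Return ids of squares treated as potentially undersampled.

    Assumed reading of the rule: take the `fraction` of squares with the
    most negative residuals, then keep only those with no Solidago record.
    """
    # Residual = observed - predicted; a strongly negative value means far
    # fewer neophytes were recorded than the model expects.
    ranked = sorted(
        squares,
        key=lambda s: s.observed_richness - s.predicted_richness,
    )
    n_lowest = int(len(ranked) * fraction)
    candidates = ranked[:n_lowest]
    # Squares with a Solidago record are retained even if the residual is low.
    return [s.square_id for s in candidates if not s.has_solidago]


# Toy data (invented values, not from the atlas).
squares = [
    Square("A", 2, 10, False),  # residual -8
    Square("B", 5, 6, False),   # residual -1
    Square("C", 3, 9, True),    # residual -6, but Solidago recorded
    Square("D", 7, 5, False),   # residual +2
    Square("E", 4, 4, True),    # residual  0
    Square("F", 6, 3, False),   # residual +3
    Square("G", 9, 8, False),   # residual +1
    Square("H", 3, 5, True),    # residual -2
]

excluded = flag_undersampled(squares, fraction=0.25)
print(excluded)
```

In this toy example, squares A and C fall in the lowest quarter of residuals, but only A is flagged, because C contains a Solidago record. Under the alternative reading, in which the ranking is restricted beforehand to squares without Solidago records, the filter would have to be applied before the cut-off rather than after it.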