Main Body:
The aggregation of large datasets from museum records and community
science provides a valuable resource for macroscale ecological analyses.
However, such data include spatiotemporal and taxonomic biases that must
be addressed (Troudet et al. 2017; Ries et al. 2019).
Given these biases, proper data curation and appropriate modelling
strategies are necessary to ensure valid inferences.
Fric et al. 2020 used occurrence data for 100 species aggregated in the
Global Biodiversity Information Facility (GBIF) in temperate regions of
North America and Europe to track phenology across latitudes. Estimating
phenology metrics and trends from large occurrence datasets is possible,
but requires sufficient data density and appropriate statistical methods
(Taylor and Guralnick 2017). The data from this study were frequently
too sparse and insufficiently curated to estimate phenological patterns
across latitudes. Further, using regression of residuals resulted in
spurious patterns; after correcting for altitude and year, onset and
termination phenology appeared the same at low and high latitudes for
most species, contrary to previous findings (Karlsson 2014, Matechou et
al. 2014). We show that applying appropriate data curation and methods,
most species demonstrated later onset and shorter flight periods at
higher latitudes.
Many species analyzed (in 105 datasets separated by continent) in Fric
et al. (2020) had insufficient data for independent analysis. Data were
analyzed with as few as 15 occurrence records across >20
degrees latitude and >100 years. Phenological “onset” and
“termination” of flight periods were extracted simply as the first and
last day-of-year (DOY) values within latitudinal bands, pooled across
all years and altitudes. Pooling data increased spatiotemporal bias,
lowered the resulting power to detect patterns, and resulted in only one
observation date being used as both “onset” and “termination” of
flight periods (resulting in one-day flight periods and “peak flight”)
in an average of 20% of latitudinal bands per species (Figure 1).
Fric et al.’s data curation was inadequate regarding spatial precision
and outlier detection. Altitudes were extracted using imprecise GIS
coordinates, sometimes representing sea floor or vague place names (eg,
”Mt Shasta”) and skewed left, giving high altitude observations outsized
leverage in regressions. Temporal outliers were problematic; one
species’ onset at 68° N was in January, when the next occurrence across
all latitudes was in June. No sources were cited for species traits, and
we found evidence documenting additional generations in portions of
their range for 22 species identified as obligate univoltine
(Supplemental Table 1).
Finally, the analytical approach in Fric et al. (2020) produced biased
results. Beyond regressing individual species’ phenometrics against
latitude, altitude, and year separately, regression of residuals was
used for corrected regressions. This resulted in biased parameter
estimates due to collinearity among explanatory variables and reduced
statistical significance (Freckleton 2002). Results suggested most
species’ onset (67 datasets) and termination (71 datasets) were similar
across latitudes (Figure 2). These results were surprising, considering
well-documented delayed and/or shortened flight periods at high
latitudes (Karlsson 2014).
We sought to validate those results with a more robust analysis,
applying stricter data standards and curation. For 72 species (76
datasets) we confirmed as univoltine, we filtered data for altitude
(0-500m) and timing (March-November). We calculated phenometrics for
year-latitude combinations with at least 10 observations. Only 22
datasets, all European, met these requirements in at least three
latitudinal bands (Supplemental Table 1). For these, onset and
termination were estimated from a Weibull distribution using R package
phest (Pearse 2017) and bounded by days (60,330). To estimate unbiased
parameters, we modeled each species phenometric using multiple
regression (DOY ~ latitude + year) using R version 3.6.2
(R Core Team 2019). We compared our results to those from Fric et al.
Supplemental Table 2.
We were unable to validate most patterns reported in the original study.
Our results varied substantially in both onset and termination across
species (Figure 2). In contrast to Fric et al. (2020), we found
significantly later and/or shorter flight periods at higher latitudes
for most species. These new results were consistent with the latitudinal
gradient in climate and growing season length (Kobayashi et al. 2016).
This evidence of inaccurate phenological patterns also discredits Fric
et al.’s downstream trait analyses.
Despite this critique, we recognize that occurrence data have great
potential to address many ecological questions. New aggregations of
large datasets provide valuable inputs for macroscale ecological
research, and the sheer amount of data accumulated across time and space
may provide statistical power. However, “with great power must come
great responsibility” (Lee 1962); robust scientific inference requires
careful data curation and robust analytical models. Other phenology
metrics are less confounded by abundance and effort (Belitz et al.
2020); integrated community models with random species effects or
informed priors better suit community phenological analyses (Ellwood et
al. 2012). We enthusiastically support continued digitization and use of
collection data in ecological analysis, but urge researchers to exercise
caution when using these data.