Including genetic and environmental variation improves model performance

The inclusion of temperature predictors improved model performance relative to genotype-only models for both traits. Genetic variation alone explained only a small proportion of phenotypic variation and could be captured in a GSM. This supports our hypothesis that GxE alters the genetic architecture of traits across environments and renders individual markers less informative (Brachi et al., 2010; El-Soda et al., 2014; Fournier-Level et al., 2011; Linde et al., 2006), which is consistent with results from provenance testing in different tree species (Benito Garzón et al., 2019). The method used to compute pairwise genetic similarity did not affect model performance (Tables S2 & S3, Appendix S2), suggesting that the different methods produced functionally equivalent descriptions of genetic similarity. Eu-ahsunthornwattana and colleagues (2014) previously reported high correlation between different genetic similarity estimates in humans, suggesting that our framework is likely to be broadly applicable.
Phenotypic variation was mainly explained by temperature differences between the plantings. This was expected: climate-responsive traits are by definition affected by environmental conditions, and the influence of temperature on plant phenotypes is well-established (Anderson et al., 2012; Arft et al., 1999; Foden et al., 2007; Schwartz & Hanes, 2010; Sun et al., 2020; Zhao et al., 2017). Models predicting DTB performed better than those predicting SP, likely because phenological traits have higher heritability than seed traits (Fournier-Level et al., 2013; Gnan et al., 2014). Moreover, the photothermal model of A. thaliana flowering time (Chew et al., 2012) suggests a mostly-linear relationship between flowering phenology and temperature that was effectively captured by our model. Still, internal validation results for SP are comparable to those for yield traits reported in non-model species (Deomano et al., 2020; Velazco et al., 2019).
Within the existing literature, our framework sits alongside similar models designed to predict climate response. These include frameworks like ΔTraitSDMs (Benito Garzón et al., 2019), which link functional traits to species distributions, and landscape genomic models that measure genomic offset based on associations between genetic and environmental variation (Capblancq et al., 2020; Gougherty et al., 2021; Supple et al., 2018). Our approach differs from these methods on three main points.
Firstly, we focus on predicting quantitative traits. This provides a more straightforward measure of prediction accuracy and simplifies the interpretation of model results. In the absence of trait data, mismatch between present and future conditions has been quantified using metrics like FST (Gougherty et al., 2021) and habitat suitability (Benito Garzón et al., 2019). These metrics cannot be estimated for individuals and do not provide actionable targets that land managers can directly manipulate or select for, in contrast to the simplicity of using functional traits.
Secondly, we use fine-scale time series data instead of low-resolution environmental predictors such as the Bioclim variables (Fick & Hijmans, 2017; cf. Gougherty et al., 2021; Supple et al., 2018). This functional approach defines conditions as experienced by plants throughout their growing period, rather than through summary climate variables that condense years of weather data into a single statistic. This is necessary because A. thaliana plants can occupy the same geographical site but experience very different environments due to variation in germination time (Donohue et al., 2005). Predictors based on monthly, quarterly, or yearly averages cannot account for the multiple seasonal cohorts germinating in a single location. Moreover, long-term averages cannot account for the effects of climate change on temperature variability (Bathiany et al., 2018; Schär et al., 2004; Screen, 2014) or for the distinct responses of plants to changes in mean temperature and temperature variability (Burghardt et al., 2016; Scheepens et al., 2018; Wheeler et al., 2000). Experimental studies have typically used a consistent increase in temperature to simulate climate change (Fournier-Level et al., 2016; Li et al., 2014; E. S. Post et al., 2008; Sherry et al., 2007; Springate & Kover, 2014) while maintaining current patterns of variability (Springate & Kover, 2014), but this may not reflect actual patterns of climate change. By considering both the daily temperature range and temperature variation between days, our predictions may better match trait values seen in natural populations. This is particularly relevant because revegetation will introduce plants to uncontrolled conditions.
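To illustrate why a long-term average can hide differences that growing-period predictors retain, the following minimal sketch summarises an hourly temperature series into a mean, a mean daily range, and a between-day standard deviation. It is an illustration under stated assumptions, not the predictor set used in our models: the function name summarise_growing_period, the 24-readings-per-day layout, and the two synthetic cohorts are all hypothetical.

```python
from statistics import mean, pstdev


def summarise_growing_period(hourly_temps, hours_per_day=24):
    """Summarise an hourly temperature series covering one growing period.

    Hypothetical helper for illustration only. Assumes the series starts at
    the beginning of a day and holds `hours_per_day` readings per day; a
    trailing partial day is dropped.
    """
    n_days = len(hourly_temps) // hours_per_day
    if n_days < 1:
        raise ValueError("need at least one full day of readings")

    # Split the series into complete days.
    days = [
        hourly_temps[d * hours_per_day:(d + 1) * hours_per_day]
        for d in range(n_days)
    ]
    daily_means = [mean(day) for day in days]
    daily_ranges = [max(day) - min(day) for day in days]

    return {
        "mean_temp": mean(daily_means),            # average over the period
        "mean_daily_range": mean(daily_ranges),    # within-day variation
        "between_day_sd": pstdev(daily_means),     # day-to-day variation
    }


# Two synthetic cohorts (made-up values) with the same overall mean.
# Cohort A: every day swings between 10 and 20 degrees.
cohort_a = summarise_growing_period(([10.0] * 12 + [20.0] * 12) * 2)
# Cohort B: one constant 10-degree day followed by one constant 20-degree day.
cohort_b = summarise_growing_period([10.0] * 24 + [20.0] * 24)

for name, summary in (("A", cohort_a), ("B", cohort_b)):
    print(name, summary)
```

Both synthetic cohorts share the same mean temperature, so an averaged predictor would treat them as identical. They differ in the other two summaries: cohort A has all of its variation within days, and cohort B has all of it between days.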
Thirdly, we account for GxE (analogous to phenotypic plasticity; Ghalambor et al., 2007) in a way that facilitates out-of-sample predictions. GxE is the immediate and potentially adaptive response of organisms to environmental change (Ghalambor et al., 2007). Trait models typically consider GxE on the basis of genotypic and/or environmental identity (Montesinos-López et al., 2018; de Oliveira et al., 2020). In contrast, our ancestry-based approach describes GxE continuously, which permits estimation of phenotypic plasticity even in novel genotypes. We rely on the assumption that shared neutral ancestry results in similar climate responses, which may not be true if GxE depends on only a few important variants. However, environmental blocking results indicate that our approach can be applied to traits with high heritability in the absence of knowledge regarding specific trait architecture, which lends itself well to application in poorly characterised species.
The consistent focus on transferability to novel conditions results in a model devoid of identity-based descriptors. Both genetic and environmental variation are described continuously, using predictors that can be straightforwardly estimated for any novel environment-genotype combination. This approach does have a tradeoff in terms of multicollinearity: we relied on multiple non-independent predictors to describe both genetic and environmental variation. Thus, the final model contains multiple correlated predictors and does not lend itself to biological interpretation. However, our validation results show that this does not impact its ability to predict the climate response of multiple genotypes and identify those suitable for revegetation.