Distinction From Model Selection
Covariate selection using the backdoor criterion is fundamentally distinct from information-based model selection techniques. The backdoor criterion is based on counterfactual reasoning: it identifies the conditions under which an observational distribution equals the distribution that would be expected under a randomized controlled experiment (Pearl 2009). Unlike model selection, the backdoor criterion was created specifically to answer cause-and-effect questions from observational data. Further, whereas model selection relies on the data to determine the best model, the backdoor criterion relies above all on domain knowledge, encoded in a DAG, to determine which covariates to adjust for when answering a given causal query. The use of DAGs and the subsequent application of the backdoor criterion allow ecologists to move away from automated model selection toward an approach that empowers them to think critically about the cause-and-effect relationships in their study system. The use of DAGs also facilitates open critique of causal assumptions, and therefore of the causal conclusions that rest on them, which in turn can lead to productive scientific debate that deepens our understanding of ecological phenomena (e.g., see Schoolmaster Jr. et al. 2020; rebuttal by Grace et al. 2021; and reply by Schoolmaster Jr. et al. 2021).
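A small simulation may help make this distinction concrete. The sketch below is illustrative only and is not drawn from any of the studies cited here: the variables `x`, `y`, and `c`, the effect size of 0.5, and the helper `fit_ols` are all hypothetical. It assumes a DAG in which `x` causes `y` and both cause a third variable `c` (a collider). Under this assumed DAG there are no backdoor paths from `x` to `y`, so the backdoor criterion calls for no adjustment and excludes `c` because it is a descendant of `x`. AIC, by contrast, only rewards predictive fit.

```python
import math
import random


def fit_ols(predictors, response):
    """Ordinary least squares with an intercept; returns (coefficients, AIC)."""
    n = len(response)
    cols = [[1.0] * n] + [list(p) for p in predictors]
    k = len(cols)
    # Augmented normal equations [X'X | X'y].
    a = [[sum(u * v for u, v in zip(cols[i], cols[j])) for j in range(k)]
         + [sum(u * v for u, v in zip(cols[i], response))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for i in range(k):
        pivot = max(range(i, k), key=lambda r: abs(a[r][i]))
        a[i], a[pivot] = a[pivot], a[i]
        for r in range(k):
            if r != i:
                factor = a[r][i] / a[i][i]
                a[r] = [vr - factor * vi for vr, vi in zip(a[r], a[i])]
    coefs = [a[i][k] / a[i][i] for i in range(k)]
    fitted = [sum(b * col[t] for b, col in zip(coefs, cols)) for t in range(n)]
    rss = sum((yt - ft) ** 2 for yt, ft in zip(response, fitted))
    # Gaussian AIC up to an additive constant; +1 parameter for residual variance.
    aic = n * math.log(rss / n) + 2 * (k + 1)
    return coefs, aic


# Hypothetical data-generating process (an assumption for illustration):
# x -> y with a true causal effect of 0.5, and x -> c <- y (c is a collider).
random.seed(1)
n = 5000
x = [random.gauss(0, 1) for _ in range(n)]
y = [0.5 * xi + random.gauss(0, 1) for xi in x]
c = [xi + yi + random.gauss(0, 1) for xi, yi in zip(x, y)]

# Adjustment set implied by the backdoor criterion for this DAG: empty.
coefs_dag, aic_dag = fit_ols([x], y)
# Model that also includes the collider, as a purely predictive search might.
coefs_collider, aic_collider = fit_ols([x, c], y)

effect_dag = coefs_dag[1]            # estimate of the effect of x on y
effect_collider = coefs_collider[1]  # same coefficient after adding c

print("effect of x, backdoor-based model:", round(effect_dag, 3))
print("effect of x, model with collider: ", round(effect_collider, 3))
print("AIC prefers collider model:", aic_collider < aic_dag)
```

In this assumed setup, the model containing the collider attains the lower AIC because `c` is a strong predictor of `y`, yet its coefficient on `x` is pushed away from the true effect (to roughly -0.25 in expectation, a change of sign). The model chosen from the DAG fits the data less well but recovers an estimate close to 0.5. The data alone cannot reveal that `c` should be excluded; that information comes from the causal assumptions encoded in the DAG.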
Currently, DAGs and the backdoor criterion are used far less often than predictive model selection techniques for understanding causal relationships in ecology. Thus far, the backdoor criterion has been applied to understand the causes of species-level trait covariation (Cronin and Schoolmaster Jr. 2018), biodiversity-ecosystem function correlations (Schoolmaster Jr. et al. 2020), and the causal drivers of coral-algal regime shifts (Arif et al. 2021). As these varied examples demonstrate, the backdoor criterion can be applied widely to understand ecological causal relationships. Increasing its use across ecological studies will require a shift in culture toward openly discussing causality. While predictive model selection techniques can play an important role in developing good statistical models, they should not be conflated with causal inference (Laubach et al. 2021). Ultimately, ecologists must start to rely on valid causal inference methods to answer fundamental causal questions in observational ecology.