Tip-level regression analyses
Information on the number of days taken to build a nest was available for
277 species (69 domed, 208 open). To test for differences in the time taken
to build a nest, we used linear models that account for evolutionary
history (PGLS; the approach is described below). The response variable was
the time taken to build in days (log-transformed), and the predictors were
log body size, the sex of the builder (female, both or male) and
the type of nest built (domed/open).
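As a sketch only, the building-time model can be written in standard PGLS notation. The symbols below are our own shorthand rather than notation from the original analysis, and we assume the residual covariance is scaled by Pagel's lambda, as in the PGLS models described below:

```latex
\log(\text{days}_i) = \beta_0 + \beta_1 \log(\text{body size}_i)
  + \beta_{\text{sex}[i]} + \beta_{\text{nest}[i]} + \varepsilon_i,
\qquad
\boldsymbol{\varepsilon} \sim \mathcal{N}\!\left(\mathbf{0},\; \sigma^2 \mathbf{V}_\lambda\right)
```

Here V_lambda denotes the phylogenetic covariance matrix among species with its off-diagonal elements multiplied by lambda, so that lambda = 0 corresponds to an ordinary least-squares model and lambda = 1 to the untransformed phylogenetic covariance.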
To test whether building different nest types is associated with current
range size and niche width we used linear models that account for
evolutionary history, with climatic niche width (PCTEMP or PCPRE), range size or presence in urban environments
(yes/no) as response variables. As predictors, we used the species nest
type and body size (log), because body size is known to explain
variation in range size in birds (Gaston & Blackburn 1996). In the case
of range size we also accounted for the absolute mid-latitude of the
species range, since tropical species are expected to have smaller
ranges (Gaston et al. 1998). For initial analyses, ‘nest type’ followed a
detailed classification with six categories
(domed, open, domed in cavity, open in cavity, pouch and both). Based on
non-significant differences across some categories, and for simplicity,
for subsequent analyses we used a more intuitive classification (domed,
open and cavity). In this classification, cavity-nesting species are those
that build either domes or cups inside cavities or crevices. We also
consider species building pouches, or both domed and open nests, to have
‘open’ nests, because pouch nests do not have a roof and open nests
seem to be the derived state. For diversification analyses we also
created a simpler category based exclusively on the structure built,
splitting species only between domed and open nests, since building
inside cavities (domes or cups) represents the nesting site preference
rather than nest structure per se. Analyses were run using multiple
categorisations to ensure that results were consistent regardless of how
we categorised nest type. We used the check_model function in the R
package ‘performance’ to look for outliers and to assess whether
there were any collinearity issues in our set of predictors (Lüdecke et al. 2019).
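The recoding of nest type described above can be illustrated with a minimal Python sketch. The category labels, dictionary names and the recode_nest_type function are hypothetical and are not taken from the original analysis scripts. For the two-category scheme, we assume that pouch and ‘both’ species are treated as open, following the rule stated for the three-category scheme:

```python
# Hypothetical labels for the six-category classification described in the text.
SIX_TO_THREE = {
    "domed": "domed",
    "open": "open",
    "domed in cavity": "cavity",
    "open in cavity": "cavity",
    "pouch": "open",   # pouch nests lack a roof
    "both": "open",    # open nests treated as the derived state
}

# Structure-only scheme: cavity nesters are assigned by the structure they build.
# Treating "pouch" and "both" as open here is an assumption carried over
# from the three-category rule.
SIX_TO_TWO = {
    "domed": "domed",
    "open": "open",
    "domed in cavity": "domed",
    "open in cavity": "open",
    "pouch": "open",
    "both": "open",
}


def recode_nest_type(detailed, scheme="three"):
    """Map a six-category nest label onto the simpler 'three' or 'two' scheme."""
    tables = {"three": SIX_TO_THREE, "two": SIX_TO_TWO}
    if scheme not in tables:
        raise ValueError(f"unknown scheme: {scheme!r}")
    return tables[scheme][detailed]
```

Keeping the mappings as explicit lookup tables makes it straightforward to rerun the same models under each categorisation and compare results.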
For the continuous response variables (time spent building nest,
PCTEMP, PCPRE and range size) we used a
phylogenetic generalised least squares regression (PGLS), using maximum
likelihood to estimate lambda, implemented in the R package ‘caper’
(Orme 2013). To control for phylogenetic relatedness among species we
generated a maximum clade credibility (MCC) tree from a set of 10000
phylogenies obtained from birdtree.org (Jetz et al. 2012), using the
package ‘phangorn’ (Schliep 2011). For models with
significant results using the MCC tree, we also performed PGLS analyses
across a set of 100 trees, using the LIEF HPC-GPGPU facility hosted at
the University of Melbourne. For each model using the MCC tree we report
the estimates and p-values calculated, and for the analysis on 100 trees
we generated highest posterior density intervals (HPD) for the estimates
using the R package ‘coda’ (Plummer et al. 2006). We highlight
that these tip-level regressions inform us about the associations among
variables and the independent origins of such associations, but they do not
inform us about the processes underlying them (see below
for such analyses).
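To illustrate how an interval can be summarised from the estimates obtained across the 100 trees, the following Python sketch computes an empirical highest-density interval as the shortest interval containing a given proportion of the sorted estimates. The function name hpd_interval is hypothetical; this is an illustration of the shortest-interval idea and not the ‘coda’ code used in the analyses:

```python
def hpd_interval(samples, prob=0.95):
    """Return (lower, upper) of the shortest interval holding ~prob of samples."""
    vals = sorted(samples)
    n = len(vals)
    if n < 2:
        raise ValueError("need at least two samples")
    # Number of steps between the lower and upper bound in the sorted values.
    gap = max(1, min(n - 1, round(n * prob)))
    # Width of every candidate interval spanning `gap` steps.
    widths = [vals[i + gap] - vals[i] for i in range(n - gap)]
    start = widths.index(min(widths))
    return vals[start], vals[start + gap]
```

In this setting, samples would hold one coefficient estimate per tree (for example, the nest-type effect from each of the 100 PGLS fits).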
For the binary response variable (urban or not) we used a Bayesian
approach in the R package MCMCglmm (Hadfield 2010). Predictors were nest
type (open, domed, cavity) and log body size. This nest category was
used based on the results of the analysis described in the previous
paragraph. We ran one model using the MCC tree as a random effect until
convergence was reached. To account for phylogenetic uncertainty, we
followed Ross et al. (2013). Briefly, we ran the model across 1300
different trees; for each tree we ran 10000 iterations and saved the
last iteration before moving on to the next tree. We discarded the first 300
saved iterations (i.e. 300 trees) as burn-in and assessed model convergence,
ensuring that the effective sample size was above 900. We report the
credible intervals for each predictor in each model.
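The tree-by-tree sampling scheme can be sketched in Python as follows. All names here (sample_across_trees, run_chain, toy_chain) are hypothetical, and toy_chain is only a placeholder for a real MCMC run such as one performed with MCMCglmm. We also assume that each tree's run starts from the state saved for the previous tree, which the text does not state explicitly:

```python
import random


def sample_across_trees(trees, run_chain, n_iter=10000, burn_in_trees=300,
                        init_state=None):
    """Run a chain on each tree in turn and keep one saved state per tree."""
    state = init_state
    saved = []
    for tree in trees:
        # run_chain must return the last iteration of the run on this tree.
        # Assumption: that state seeds the run on the next tree.
        state = run_chain(tree, state, n_iter)
        saved.append(state)
    # Discard the states saved for the first `burn_in_trees` trees.
    return saved[burn_in_trees:]


def toy_chain(tree, state, n_iter):
    """Placeholder for a real sampler: returns a fake 'last iteration'."""
    rng = random.Random(tree)
    return {"tree": tree, "estimate": rng.gauss(0.0, 1.0)}
```

With 1300 trees and the first 300 discarded, this leaves 1000 saved iterations, from which effective sample sizes and credible intervals would then be computed.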