Statistical analyses
Statistical analyses were conducted using R (3.5.2; R Development Core
Team, R Foundation for Statistical Computing, Vienna, Austria). In our
first analysis, we used a generalized linear mixed model (lme4 package;
Bates et al. 2015) to test whether the number of discrete songs
per hour varied among the six breeding stages. We included the discrete
song rate from a given recording session as the response variable, the
breeding stage (i.e., no nesting duties, nest building, egg stage,
nestling care, fledgling care, non-breeding) observed that same day as a
fixed factor, and subject identity (1–12) as a random effect to account
for possible dependencies among multiple recording sessions of the same
male. The response was modeled with a negative binomial distribution and
log link. The overall statistical significance of breeding stage was
tested using the Anova function of the car package (Fox
and Weisberg 2019). Post-hoc linear contrasts of estimated marginal
means (emmeans package; Lenth 2021) were then used to compare
discrete song rate between the breeding (i.e., mean of no nesting
duties, nest building, incubation, nestling care, and fledgling care)
and non-breeding seasons, between the no nest duty and nest duty stages
(i.e., mean of nest building, incubation, nestling care, and fledgling
care) of the breeding season, and between the nestling care stage and
the other nesting stages (i.e., mean of nest building, incubation, and
fledgling care). We could not repeat this analysis on rambling song
because preliminary inspection of the data revealed that only 5% of all
songs were rambling song, thus precluding reliable estimates of rambling
song rates from our short recording sessions. For example, only 11
rambling songs were detected during the entire nestling care period.
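The three planned contrasts described above can be written as weight vectors over the six breeding stages. The following is a minimal illustrative sketch in Python, not the R code used for the analysis. It assumes that "incubation" in the contrast definitions is the same level as "egg stage" in the stage list. The names STAGES, CONTRASTS, and linear_contrast are hypothetical, and the marginal means and covariance matrix are made-up placeholder values, not results from this study. Because the model uses a log link, each contrast is formed on the log scale; exponentiating it gives a ratio of (geometric) mean song rates.

```python
import math

# Stage order assumed for this sketch (the six levels named in the text).
STAGES = ["no nesting duties", "nest building", "egg stage",
          "nestling care", "fledgling care", "non-breeding"]

# Contrast weights over STAGES; each set of weights sums to zero.
CONTRASTS = {
    # mean of the five breeding-season stages minus non-breeding
    "breeding vs non-breeding": [1/5, 1/5, 1/5, 1/5, 1/5, -1],
    # no nest duty minus mean of the four nest duty stages
    "no nest duty vs nest duty": [1, -1/4, -1/4, -1/4, -1/4, 0],
    # nestling care minus mean of the three other nesting stages
    "nestling care vs other nesting stages": [0, -1/3, -1/3, 1, -1/3, 0],
}

def linear_contrast(weights, emm, vcov):
    """Return (estimate, standard error) of sum_i w_i * emm_i."""
    k = len(weights)
    estimate = sum(w * m for w, m in zip(weights, emm))
    variance = sum(weights[i] * weights[j] * vcov[i][j]
                   for i in range(k) for j in range(k))
    return estimate, math.sqrt(variance)

# Hypothetical log-scale marginal means and covariance matrix
# (placeholders for illustration only, NOT values from the study).
emm_log = [math.log(x) for x in (40, 30, 20, 10, 20, 5)]
vcov = [[0.04 if i == j else 0.0 for j in range(6)] for i in range(6)]

for name, w in CONTRASTS.items():
    est, se = linear_contrast(w, emm_log, vcov)
    # exp(est) is the ratio of (geometric) mean rates between the two groups
    print(f"{name}: estimate={est:.3f}, SE={se:.3f}, ratio={math.exp(est):.2f}")
```

Dividing each estimate by its standard error gives the test statistic for that contrast. In practice, the marginal means and their covariance matrix would come from the fitted model rather than from placeholder values.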
In our second analysis, we used a generalized linear mixed model to test
whether song perch height was associated with breeding stage or song
type. The song perch height (m) of each song was included as the
dependent variable, with breeding stage and song type as fixed factors
and recording session (1–32) nested within subject identity (1–12) as
a random effect to account for possible dependencies among multiple
perch heights estimated from the same recording session of the same
male. The response was modeled using a Poisson distribution with log
link. After testing the overall significance of breeding stage and song
type, post-hoc linear contrasts of estimated marginal means were used to
compare song perch height between the breeding and non-breeding seasons
and between the no nest duty and nest duty stages of the breeding
season.
Results were considered statistically significant where P < 0.05. We
used the DHARMa package (Hartig 2020) to validate
the two statistical models. Its diagnostic tests, combined with visual
inspection of scaled residual plots, indicated adequate model fit. We
also simulated the responses of each model and compared the simulated
data to the original data by overlaying semi-transparent histograms of
each; in all cases, we found strong agreement between the simulated data
and the original data.
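The idea behind simulation-based scaled residuals can be sketched as follows. This is a simplified illustration in Python of a randomized quantile residual, not the DHARMa implementation and not the code used for the analysis. The function name scaled_residual and all counts below are hypothetical. Each observation is compared with responses simulated from the fitted model for that same observation; if the model is adequate, the resulting residuals are approximately uniform on [0, 1].

```python
import random

def scaled_residual(observed, simulated, rng):
    """Randomized quantile residual of one observation.

    Draws a value uniformly between the share of simulated responses
    strictly below the observation and the share at or below it, so that
    ties in count data are broken at random.
    """
    n = len(simulated)
    lower = sum(s < observed for s in simulated) / n
    upper = sum(s <= observed for s in simulated) / n
    return rng.uniform(lower, upper)

rng = random.Random(1)  # fixed seed so the sketch is reproducible

# Hypothetical observed counts and, for each one, a handful of responses
# simulated from a fitted model (placeholders, NOT data from the study).
observed_counts = [3, 0, 7]
simulated_counts = [
    [2, 4, 3, 5, 1, 3],
    [0, 1, 0, 2, 0, 1],
    [6, 8, 5, 9, 7, 10],
]

residuals = [scaled_residual(o, s, rng)
             for o, s in zip(observed_counts, simulated_counts)]
print(residuals)
```

With many simulations per observation, a histogram or quantile-quantile plot of these residuals against the uniform distribution shows departures such as overdispersion or zero inflation.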