2.4. Statistical analysis
We used mixed-effects negative binomial regression models to generate
incidence rate ratios (IRRs) and 95% confidence intervals for each
predictor in the model. We modeled case and death outcomes separately
for each Phase, and we fit case and death models inclusive and exclusive
of cases/deaths identified as institutional (yielding eight models in
total). A random effect of town (351 towns in MA) was included to
address within-town spatial autocorrelation of residuals for nearby
tracts. We used counts of cases or deaths in each census tract as the
outcome variable, with census tract population included as an offset
term so that the models estimated rates rather than raw counts.
Predictors that significantly affected model estimates or that
demonstrated changes between the Phases were retained in the models, as
were predictors of a priori interest for health disparities or as
specific COVID-19 risk factors, regardless of statistical significance
(e.g., housing unit density and proportion of
AIAN residents). All statistical analyses were conducted in R (version
4.0.3) using the “glmmTMB” function from the glmmTMB package
(version 1.0.2.9).
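
The model structure described above can be sketched as follows. The
notation is introduced here for illustration and does not appear in the
original text: Y_ij is the case or death count in census tract i of
town j, N_ij is the tract population, x_ijk are the predictors, and u_j
is the town-level random effect. Several details are assumptions rather
than statements from the text: that the random effect is a random
intercept, that the link is the log link with the offset entering as the
log of population, that the variance follows the quadratic negative
binomial form, and that the confidence intervals are Wald intervals.

```latex
\begin{align}
Y_{ij} \mid u_j &\sim \mathrm{NegBin}(\mu_{ij}, \theta), \\
\log \mu_{ij} &= \log N_{ij} + \beta_0 + \sum_{k} \beta_k x_{ijk} + u_j,
  \qquad u_j \sim \mathcal{N}(0, \sigma_u^2), \\
\mathrm{Var}(Y_{ij} \mid u_j) &= \mu_{ij} + \frac{\mu_{ij}^2}{\theta}, \\
\mathrm{IRR}_k &= \exp(\beta_k), \qquad
  95\%\ \mathrm{CI} = \exp\!\left(\hat{\beta}_k \pm 1.96\,\mathrm{SE}(\hat{\beta}_k)\right).
\end{align}
```

Under this form, an IRR of 1.10 for a predictor corresponds to a 10%
higher expected rate per one-unit increase in that predictor, holding
the other predictors and the town effect fixed.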
3. RESULTS
Total cases, deaths, and community characteristics differed between
Phase 1 and Phase 2 of the COVID-19 pandemic in Massachusetts (Table 1).
Phase 1 had substantially fewer cases than Phase 2 (99,051 vs. 407,525),
but more deaths (7,285 vs. 6,207). Compared to Phase 1,
non-institutional outcomes in Phase 2 accounted for greater shares of
total cases (96.6% vs. 80.1%) and deaths (57.0% vs. 37.0%).
Geocoding was highly successful at matching individuals with census
tracts of residence, with each outcome group having at least a 99.7%
match rate; in total, 1,360 cases (0.27%) were excluded from the models
due to inability to geocode to a census tract.