Study design and data sources
We conducted a two-sample Mendelian randomization (MR) analysis to
assess the causal associations of sex hormones with COVID-19
susceptibility and severity. MR is a causal inference approach that
uses germline genetic variants as instrumental variables (IVs) to
estimate the possible causal effects of modifiable risk factors on
health outcomes. Compared with conventional observational analyses, this
approach is less prone to non-genetic confounding and reverse causation
bias 15,16.
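As an illustration of how two-sample MR combines summary statistics, the following minimal sketch computes per-variant Wald ratios and a fixed-effect inverse-variance weighted (IVW) estimate. This section does not state which estimator was used, so the choice of IVW is an assumption made for illustration only. The function names and the toy effect sizes are hypothetical.

```python
import math


def wald_ratio(beta_exposure, beta_outcome, se_outcome):
    """Per-variant causal estimate (outcome effect / exposure effect).

    The standard error uses the common first-order approximation that
    ignores uncertainty in the variant-exposure association.
    """
    estimate = beta_outcome / beta_exposure
    se = se_outcome / abs(beta_exposure)
    return estimate, se


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW estimate across independent variants.

    Each Wald ratio is weighted by the inverse of its approximate
    variance, i.e. beta_exposure**2 / se_outcome**2.
    """
    ratios = [by / bx for bx, by in zip(beta_exposure, beta_outcome)]
    weights = [bx ** 2 / s ** 2 for bx, s in zip(beta_exposure, se_outcome)]
    total_weight = sum(weights)
    estimate = sum(w * r for w, r in zip(weights, ratios)) / total_weight
    se = math.sqrt(1.0 / total_weight)
    return estimate, se


# Hypothetical summary statistics for three variants (not from the study).
bx = [0.10, 0.20, 0.05]   # variant-exposure effects (e.g. hormone level)
by = [0.02, 0.04, 0.01]   # variant-outcome effects (log odds of COVID-19)
se_by = [0.01, 0.01, 0.01]

estimate, se = ivw_estimate(bx, by, se_by)
print(f"IVW estimate: {estimate:.3f} (SE {se:.3f})")
```

In this toy input every variant has the same Wald ratio, so the pooled estimate equals that shared ratio. With real data the ratios differ, and variants with stronger exposure associations and more precise outcome estimates receive more weight.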
We used data from the UK Biobank (UKB) and the COVID-19 Host Genetics
Initiative (HGI). Summary statistics on sex hormone levels (estradiol,
total testosterone (TT), bioavailable testosterone (BT), and sex
hormone-binding globulin (SHBG)) were obtained from the largest
genome-wide association studies (GWASs) of sex hormones 17,18, which
included up to 230,454 women and 194,453 men of European ancestry in
the UKB.
In the estradiol GWAS, individuals’ estradiol levels were analyzed as a
binary phenotype, with values at or above the detection limit (175
pmol/L) forming one group and values below the limit forming the other
18. In addition, for the quantitative analysis, individuals with
estradiol levels below the detection limit were retained by using
censored regression (a type I Tobit model) 19. This approach allowed
estradiol levels to be analyzed as a continuous phenotype in a total of
163,985 women and 147,690 men 18.
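To make the censored-regression idea concrete, the sketch below evaluates a left-censored (type I Tobit) log-likelihood for a single variant. It is a generic illustration, not the cited GWAS pipeline: the linear model in genotype dosage, the absence of covariates or transformations, and all names and values are assumptions. Measured values contribute a normal density term, whereas values below the detection limit contribute the probability of falling below that limit.

```python
import math

LOG_SQRT_2PI = 0.5 * math.log(2.0 * math.pi)


def normal_logcdf(z):
    """Log of the standard normal CDF, computed via erfc for tail accuracy."""
    return math.log(0.5 * math.erfc(-z / math.sqrt(2.0)))


def tobit_loglik(y, below_limit, genotype, intercept, slope, sigma, limit=175.0):
    """Log-likelihood of a left-censored (type I Tobit) linear model.

    y           : measured levels (ignored where below_limit is True)
    below_limit : True if the measurement fell below the detection limit
    genotype    : allele dosage per individual (hypothetical predictor)
    limit       : detection limit; 175 pmol/L is the value given in the text
    """
    total = 0.0
    for yi, censored, gi in zip(y, below_limit, genotype):
        mu = intercept + slope * gi
        if censored:
            # Only "level < limit" is known: use P(Y < limit).
            total += normal_logcdf((limit - mu) / sigma)
        else:
            # Fully observed value: use the normal density.
            z = (yi - mu) / sigma
            total += -math.log(sigma) - LOG_SQRT_2PI - 0.5 * z * z
    return total


# Hypothetical data: two measured values and two below the detection limit.
levels = [310.0, 220.0, 0.0, 0.0]
censored = [False, False, True, True]
dosage = [2.0, 1.0, 0.0, 1.0]

print(tobit_loglik(levels, censored, dosage,
                   intercept=180.0, slope=40.0, sigma=60.0))
```

Maximizing this function over the intercept, slope, and sigma (for example with a numerical optimizer) would give the Tobit estimates. The sketch stops at the likelihood to keep the censoring logic visible.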
Testosterone and SHBG levels were measured and analyzed as continuous
phenotypes. The original GWAS of SHBG levels was run both with and
without adjustment for body mass index (BMI) in order to assess the
potential impact of collider bias 20. In this study, we took potential
collider bias into account by using summary data from both the
BMI-unadjusted and the BMI-adjusted GWAS of SHBG levels to estimate the
causal effects of genetically predicted SHBG on COVID-19 susceptibility
and severity.
For the outcomes in this study (i.e., COVID-19 susceptibility and
severity), summary statistics were obtained from the latest and largest
GWAS of COVID-19 outcomes in individuals of European ancestry, conducted
by the HGI (data freeze 6, excluding UKB and 23andMe participants) 21.
Three COVID-19-related phenotypes were selected as
the outcomes: (1) severe acute respiratory syndrome coronavirus 2
(SARS-CoV-2) infection (as cases) and the general population (as
controls) (74,614 cases and 1,803,529 controls); (2) COVID-19
hospitalization (as cases) and the general population (as controls)
(14,925 cases and 1,393,029 controls); and (3) COVID-19 critical illness
(as cases) and the general population (as controls) (4,297 cases and
378,521 controls) (Table 1). Because no European-ancestry GWAS of
COVID-19 critical illness was available in data freeze 6, summary
statistics on this outcome were obtained from data freeze 5 instead. This
study used publicly available data and was not subject to institutional
review board approval.