Data analyses and statistics
Descriptive statistics were used to describe the patient demographic,
clinical and laboratory characteristics. Continuous variables (e.g. age)
were expressed as median ± SD and were compared using the Mann-Whitney
U test. Categorical variables were expressed as numbers and percentages
and were compared using the χ² or Fisher’s exact test. Correlation
and agreement between RAT and RT-qPCR results were calculated using
Pearson’s correlation (r) and Cohen’s kappa (κ), respectively
(Watson & Petrie, 2010). Measurements of
diagnostic performance of RAT (sensitivity, specificity, positive
predictive value, negative predictive value, accuracy and likelihood
ratio) for all subjects and for subgroups of subjects were calculated on
contingency tables containing the numbers of each outcome. The
confidence intervals (CI) were calculated using the Wilson-Brown method
(Brown, Cai, & DasGupta, 2001).
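As an illustration of the contingency-table calculations described above, the following is a minimal Python sketch, not the GraphPad Prism implementation used in this study. It assumes a 2 × 2 table with RT-qPCR as the reference standard and uses a plain Wilson score interval, which may differ in detail from the Wilson-Brown computation in Prism. The function names and the example counts are hypothetical and are not study data.

```python
import math


def wilson_ci(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (default ~95%)."""
    if total == 0:
        return (float("nan"), float("nan"))
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return (max(0.0, centre - half), min(1.0, centre + half))


def diagnostic_metrics(tp, fp, fn, tn):
    """Diagnostic performance of an index test against a reference test.

    tp, fp, fn, tn are the cell counts of the 2 x 2 contingency table.
    Assumes every row and column total is non-zero.
    """
    n = tp + fp + fn + tn
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_obs = (tp + tn) / n
    p_exp = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    return {
        "sensitivity": sens,
        "sensitivity_ci": wilson_ci(tp, tp + fn),
        "specificity": spec,
        "specificity_ci": wilson_ci(tn, tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": p_obs,
        "lr_positive": sens / (1 - spec) if spec < 1 else math.inf,
        "lr_negative": (1 - sens) / spec,
        "kappa": (p_obs - p_exp) / (1 - p_exp),
    }


# Hypothetical counts for illustration only (not data from this study).
example = diagnostic_metrics(tp=45, fp=5, fn=5, tn=45)
for name, value in example.items():
    print(name, value)
```

The same function can be applied to each subgroup (for example, each Ct category) by passing that subgroup's four cell counts.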
Participant categories based on Ct values were defined
following a previous report (Nalumansi et
al., 2020). A receiver operating characteristic (ROC) curve was generated
to provide another assessment of the diagnostic power of the RAT. These
two analyses were done using GraphPad Prism version 8.0.0 for Windows
(GraphPad Software, San Diego, California, USA;
www.graphpad.com). To investigate
whether combining measurements of blood parameters would
enhance the predictive accuracy of the RAT and thus raise its clinical
utility, a support vector machine (SVM) model with Monte-Carlo
cross-validation was applied as described previously
(de Araujo et al., 2019), and the
performance of the top-ranked combination (best model) was evaluated for
sensitivity, specificity and accuracy using class probability analyses.
This analysis was done on data from 68 subjects (the other 15 subjects
had no data on any of the laboratory features). Random forest
classification was utilized to reveal the demographic and clinical
parameters that are most important in distinguishing individuals with
positive and negative results for both RAT and RT-qPCR. In both the SVM and
random forest models, the singular value decomposition method was used to
impute the missing values (Stacklies,
Redestig, Scholz, Walther, & Selbig, 2007). These analyses were done
using the MetaboAnalyst online server (Pang,
Chong, Li, & Xia, 2020).
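The Monte-Carlo cross-validation scheme referred to above can be sketched as repeated random train/test splits with the accuracy averaged over repeats. The Python sketch below is illustrative only and is not the MetaboAnalyst procedure. To stay dependency-free it substitutes a simple nearest-centroid classifier for the SVM, and the number of repeats, the test fraction, the function names and the toy data are all assumptions.

```python
import random
from statistics import mean


def fit_centroids(X, y):
    """Stand-in classifier (NOT an SVM): per-class mean of the features."""
    groups = {}
    for row, label in zip(X, y):
        groups.setdefault(label, []).append(row)
    return {label: [mean(col) for col in zip(*rows)]
            for label, rows in groups.items()}


def predict(centroids, row):
    """Assign the class whose centroid is nearest (squared Euclidean)."""
    def dist(label):
        return sum((a - b) ** 2 for a, b in zip(row, centroids[label]))
    return min(centroids, key=dist)


def monte_carlo_cv(X, y, n_splits=100, test_fraction=0.3, seed=0):
    """Mean test accuracy over repeated random train/test splits."""
    rng = random.Random(seed)
    indices = list(range(len(X)))
    n_test = max(1, round(test_fraction * len(X)))
    accuracies = []
    for _ in range(n_splits):
        rng.shuffle(indices)  # a fresh random partition on every repeat
        test, train = indices[:n_test], indices[n_test:]
        model = fit_centroids([X[i] for i in train], [y[i] for i in train])
        correct = sum(predict(model, X[i]) == y[i] for i in test)
        accuracies.append(correct / n_test)
    return mean(accuracies)


# Toy, well-separated data for illustration only (not study data).
X = [(0.1 * i, 0.1 * i) for i in range(10)] + \
    [(10 + 0.1 * i, 10 + 0.1 * i) for i in range(10)]
y = [0] * 10 + [1] * 10
print(monte_carlo_cv(X, y))
```

Unlike k-fold cross-validation, the test sets drawn in different repeats may overlap, so a given sample can be tested several times or not at all.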