Methods
Data Source, Sample Selection and Variables
All data are as of April 17th, 2020. Data on confirmed cases and deaths come from the Johns Hopkins Coronavirus Resource Center (5). Data on population, geography, income and expenditures, and Hepatitis B, Measles and Diphtheria/Pertussis/Tetanus (DPT) vaccines come from the World Development Indicators database (https://databank.worldbank.org/source/world-development-indicators) and from the United Nations Comtrade statistics (https://comtrade.un.org/). Data on foreign direct investment to and from China come from the International Monetary Fund Coordinated Direct Investment Survey (https://data.imf.org/?sk=40313609-F037-48C1-84B1-E1F1CE54D6D5). Detailed information on tuberculosis vaccination policies comes from the Bacille Calmette-Guérin (BCG) vaccine Atlas (14), last updated in 2017 in the online version (http://www.bcgatlas.org/index.php). Data on human freedom come from the 2019 Human Freedom Report by the Fraser Institute (https://www.cato.org/sites/cato.org/files/human-freedom-index-files/human-freedom-index-2018-revised.pdf).
We used data from 121 of the 209 countries that reported cases of Covid-19; together these countries account for about 99% of both confirmed cases and deaths. The countries in the analysis, listed in the supplementary appendix, were chosen on the basis of the availability of observations for the covariates.
The set of dependent and independent variables is reported in Table 1. In particular, we used confirmed cases per million inhabitants as a proxy for the intensity of contagion; the number of cases 15 days earlier as a proxy for the stage of diffusion of the virus; population in the largest city as a proxy for density and the degree of urbanization; life expectancy at birth as a comprehensive health indicator and as a proxy for the share of aged people in the population; and latitude to define both the season as of April 17th (above or below the Equator) and tropical countries (those whose latitude, as defined by the corresponding variable in the World Development Indicators, lies between the two tropics). As for BCG vaccination policy, two alternative continuous measures were constructed and used for robustness checks: BCG coverage, as reported in national surveys in various years, and the number of years without mandated vaccination, up to 2020.
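As an illustration of how the two latitude-based indicators can be derived, the following is a minimal Python sketch. The tropic latitude of 23.44 degrees, the function names and the example latitudes are assumptions made for illustration; they are not taken from the study data or code.

```python
# Approximate latitude of the Tropics of Cancer and Capricorn, in degrees
# (assumed value for illustration).
TROPIC_LATITUDE = 23.44


def is_tropical(latitude_deg: float) -> bool:
    """Return True when the latitude lies between the two tropics."""
    return abs(latitude_deg) <= TROPIC_LATITUDE


def hemisphere(latitude_deg: float) -> str:
    """Return 'north' above the Equator and 'south' below it."""
    return "north" if latitude_deg >= 0 else "south"


# Hypothetical latitudes for three unnamed countries (not study data).
examples = {"A": 41.9, "B": -15.8, "C": -33.9}
flags = {name: (hemisphere(lat), is_tropical(lat)) for name, lat in examples.items()}
```

The hemisphere indicator stands in for the season as of April 17th, and the tropical indicator flags countries lying between the two tropics.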
Coverage rates for other vaccines (Hepatitis B, Measles and DPT) were also used, to disentangle the effects of BCG from those of a more general vaccination policy.
Among the variables proxying for economic ties with China, where the epidemic first appeared, we include imports from China and the levels of inward and outward Foreign Direct Investment (FDI) relative to China. Finally, to proxy for compliance with the lockdown measures implemented by the various governments, we use the Human Freedom Index (HFI), a weighted average of 79 distinct indicators (37 for the personal freedom subindex and 42 for the economic freedom subindex), each ranging from 0 to 10, with 10 representing the most freedom. The HFI therefore ranges from 0 to 10, in increasing order of freedom (https://www.cato.org/sites/cato.org/files/human-freedom-index-files/human-freedom-index-2018-revised.pdf).
We used Gross Domestic Product (GDP) per capita, and private and general government health expenditure, to proxy for the countries’ level of development (both in general and of their health systems) and for their testing capability (higher income and a richer health system should be positively correlated with more Covid-19 testing).
Statistical Analysis
To model our dependent variables, we used both ordinary least squares, as a reference estimator, and nonlinear estimation methods. In particular, Tobit regressions, which estimate both the impact of covariates on the probability of a country reporting more than 100 cases as of April 17th and their effect on relative diffusion, were our preferred estimation method.
The reported coefficients in the Tobit regression represent the marginal effects of the explanatory variables on the outcome variable, after accounting for the inclusion of countries in the high incidence group.
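To make the Tobit setup concrete, the following Python sketch writes out the log-likelihood of a left-censored Tobit model and one common definition of the marginal effect, namely the effect of a covariate on the expected observed outcome. The censoring point `lower`, the function names and the choice of marginal effect are assumptions for illustration; the analysis itself was run in Stata and may use a different definition.

```python
import math


def norm_pdf(z: float) -> float:
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def tobit_loglik(y, xb, sigma, lower):
    """Log-likelihood of a Tobit model left-censored at `lower`.

    y  : observed outcomes (values at or below `lower` count as censored)
    xb : linear predictor x'beta for each observation
    """
    ll = 0.0
    for yi, mi in zip(y, xb):
        if yi <= lower:
            # Censored observation: probability that the latent value is <= lower.
            ll += math.log(norm_cdf((lower - mi) / sigma))
        else:
            # Uncensored observation: normal density of the latent value.
            ll += math.log(norm_pdf((yi - mi) / sigma) / sigma)
    return ll


def expected_observed(xb: float, sigma: float, lower: float) -> float:
    """E[y | x] for the censored outcome y = max(y*, lower)."""
    z = (xb - lower) / sigma
    return lower * (1.0 - norm_cdf(z)) + xb * norm_cdf(z) + sigma * norm_pdf(z)


def marginal_effect(beta_k: float, xb: float, sigma: float, lower: float) -> float:
    """dE[y | x]/dx_k: the coefficient scaled by the probability of being uncensored."""
    return beta_k * norm_cdf((xb - lower) / sigma)
```

Under this definition the marginal effect is always smaller in absolute value than the raw coefficient, because the coefficient is scaled by the probability that an observation is not censored.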
The second and third outcome variables, i.e. case fatality rates (CFRs) and mortality rates (MRs), were first modelled by ordinary least squares to obtain benchmark estimates, and then by fractional Probit regression to account for the fractional nature of the dependent variables (15). When the dependent variable is a fraction, as with CFRs and MRs, using a log-odds transformation or Tobit regressions with lower and upper limits set to 0 and 1 may yield biased results (15, 16). Fractional regression is therefore our preferred estimation method for fatality and mortality rates.
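The idea behind fractional Probit regression can be sketched as follows: the conditional mean of the fraction is modelled as the standard normal distribution function of the linear predictor, and the parameters maximize a Bernoulli quasi-log-likelihood. The Python sketch below covers only the intercept-only case, where the maximizer has a closed form; the example rates and all names are hypothetical and are not taken from the study.

```python
import math
from statistics import NormalDist

_std_normal = NormalDist()  # standard normal, mean 0 and standard deviation 1


def quasi_loglik(y, xb):
    """Bernoulli quasi-log-likelihood with mean function Phi(x'beta).

    y  : fractional outcomes in [0, 1]
    xb : linear predictor x'beta for each observation
    """
    ll = 0.0
    for yi, mi in zip(y, xb):
        p = _std_normal.cdf(mi)
        ll += yi * math.log(p) + (1.0 - yi) * math.log(1.0 - p)
    return ll


# Hypothetical fatality rates for four unnamed countries (not study data).
rates = [0.01, 0.03, 0.05, 0.07]

# With an intercept only, the quasi-likelihood is maximized when the fitted
# mean Phi(b) equals the sample mean of the fractions.
mean_rate = sum(rates) / len(rates)
b_hat = _std_normal.inv_cdf(mean_rate)
fitted_mean = _std_normal.cdf(b_hat)
```

Because the fitted mean is a value of the normal distribution function, predictions always lie strictly between 0 and 1, which a linear model does not guarantee.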
For consistency and comparability, all models included the same set of explanatory variables. Moreover, the ordinary least squares and fractional regressions account for heteroskedasticity through robust variance-covariance estimators.
All statistical analyses were performed using Stata/MP 16 for Windows.
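For readers unfamiliar with robust variance-covariance estimators, the following Python sketch computes a heteroskedasticity-robust (sandwich) covariance matrix for an ordinary least squares regression with an intercept and a single regressor. The small-sample scaling n/(n-k) is an assumption here, since the text does not state which variant was used, and the function name and data are hypothetical.

```python
def ols_robust(x, y):
    """OLS of y on an intercept and x, with a sandwich covariance matrix.

    Returns ((b0, b1), cov), where cov is the 2x2 robust covariance matrix
    scaled by n / (n - k) (assumed small-sample correction).
    """
    n, k = len(x), 2
    sx = sum(x)
    sxx = sum(v * v for v in x)
    sy = sum(y)
    sxy = sum(a * b for a, b in zip(x, y))

    # Inverse of X'X for the design matrix [1, x].
    det = n * sxx - sx * sx
    inv = [[sxx / det, -sx / det], [-sx / det, n / det]]

    # OLS coefficients: (X'X)^{-1} X'y.
    b0 = inv[0][0] * sy + inv[0][1] * sxy
    b1 = inv[1][0] * sy + inv[1][1] * sxy
    resid = [yi - b0 - b1 * xi for xi, yi in zip(x, y)]

    # "Meat" of the sandwich: sum of e_i^2 * x_i x_i'.
    m00 = sum(e * e for e in resid)
    m01 = sum(e * e * xi for e, xi in zip(resid, x))
    m11 = sum(e * e * xi * xi for e, xi in zip(resid, x))
    meat = [[m00, m01], [m01, m11]]

    def matmul(a, b):
        return [[sum(a[i][t] * b[t][j] for t in range(2)) for j in range(2)]
                for i in range(2)]

    scale = n / (n - k)
    cov = [[scale * c for c in row] for row in matmul(matmul(inv, meat), inv)]
    return (b0, b1), cov


# Hypothetical data, for illustration only.
coef, cov = ols_robust([0.0, 1.0, 2.0, 3.0], [1.0, 3.2, 4.9, 7.1])
robust_se = [cov[0][0] ** 0.5, cov[1][1] ** 0.5]
```

The coefficient estimates are identical to those of plain OLS; only the standard errors change, since each squared residual enters the covariance matrix individually instead of being pooled into a single error variance.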