Discussion
This study sought to convert a tertiary institution’s EMR database into
the OMOP-CDM to assess the value of conversion for post-market
regulatory purposes. This is part of Singapore’s pioneering effort to
assess the feasibility of CDM conversion to obtain insights that help
improve drug safety assessments. We find that while data conversion is
laborious, there are inherent benefits to undertaking the exercise. CDM
conversion is a collaborative effort involving multiple parties, such as
data scientists and clinical experts, and must be undertaken judiciously,
as the data will inevitably be transformed by several cleaning and
editing steps during the conversion. For instance,
an appreciation of the coding and medication supply practices and
inclinations of an institute can have significant implications for any
future analyses. The importance of understanding the provenance and
underlying constructs that the data represent cannot be overstated in
the context of repurposing transactional healthcare data for drawing
reliable insights.
Once converted, however, the set architecture of the CDM and the OHDSI
tools and opportunities available (i.e. past and ongoing study protocols,
analytic code templates) form a fertile ecosystem that can accelerate
analyses, although some modifications and extensions to previously
written code are likely required for specific use cases.
After making the necessary amendments to the code by Hripcsak et al, we
applied it to the OMOP-CDM converted data. The original code allowed us
to easily specify the inclusion and exclusion criteria as well as the
observation period of interest.12 Built into the
OMOP-CDM is a derived table (termed as the ‘Drug Era’ table) that
aggregates all drug exposures. This consolidated drug exposure table
allows analysts to define and apply the appropriate drug exposure
conditions required for the study (e.g. permitted gap days in between
prescription fills and stockpiling of previously filled prescriptions).
The ‘Drug Era’ table in the OMOP-CDM therefore simplifies precise
exposure specification, a key component in any pharmacoepidemiological
analysis. Notably, this derived-data feature is unavailable in
other CDMs such as the PCORnet, Sentinel and i2b2 CDMs, which organize
medication data at the transaction level 13-16.
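The gap-merging logic that underlies a Drug Era-style derived table can be sketched as follows. This is our own minimal illustration, not the OMOP specification: the 30-day persistence window, function name, and example fill dates are assumptions chosen for demonstration.

```python
from datetime import date, timedelta

# Permitted gap between prescription fills before a new era begins
# (30 days here; an assumed value, configurable per study).
PERSISTENCE_GAP = timedelta(days=30)

def build_drug_eras(exposures):
    """Collapse (start_date, end_date) exposures for one patient and one
    drug into continuous eras, merging fills separated by no more than
    the permitted gap."""
    eras = []
    for start, end in sorted(exposures):
        if eras and start - eras[-1][1] <= PERSISTENCE_GAP:
            # Gap is within the permitted window: extend the current era.
            eras[-1] = (eras[-1][0], max(eras[-1][1], end))
        else:
            eras.append((start, end))
    return eras

# Two fills 20 days apart merge into one era; a 60-day gap starts a new era.
fills = [(date(2015, 1, 1), date(2015, 1, 30)),
         (date(2015, 2, 19), date(2015, 3, 20)),
         (date(2015, 5, 19), date(2015, 6, 17))]
eras = build_drug_eras(fills)
```

Stockpiling of previously filled prescriptions could be accommodated in the same structure by carrying unused days' supply forward before computing the gap.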
In the illustrative (uncontrolled) analysis of AF patients, most
(76.5%) were warfarin users, followed by rivaroxaban (19.7%) and
apixaban (3.8%) users. This is not surprising given that warfarin had
been registered in Singapore since 1995 and would be more commonly used
in patients requiring anticoagulation 17. The DOACs,
however, were more recent entrants to the market. Rivaroxaban was
registered in 2008, and apixaban in 2012. Since the dataset analysed
only included drug dispensing data from 2013 to 2016, this explains the
smaller number of patients on rivaroxaban and apixaban. Dabigatran had
been registered since 2009. However, we did not find any patients on
dabigatran who fulfilled our inclusion criteria.
The three agents compared appear to have differences in event rates for
efficacy and safety. Nonetheless, this was meant to be a descriptive
analysis only, and no adjustment for baseline differences in patient
populations was performed. The relative proportions of adverse events
among warfarin and DOACs may be somewhat expected given previous
findings in clinical trials 18-20, though our study
showed higher numerical percentages for all events, possibly because
clinical trials often select healthier populations through stringent
inclusion/exclusion criteria that exclude patients at higher risk of
AEs. This study also included a
wide variety of ICD codes involving haemorrhagic or thromboembolic
events, whereas most studies were confined to ICD codes relating to
ischaemic stroke, gastrointestinal bleeds, and intracranial haemorrhage.
Additionally, the number of patients using apixaban from our final
cohort was very small, hence it was challenging to identify any pattern
of usage or trend in events. The higher numerical percentages of events
in our study population may also indicate the real-world prevalence of
adverse events with OACs in Singapore 21.
To visualize the results, we extended the work by Hripcsak et al and
proposed the use of a 100% horizontally stacked, weighted bar chart
(Figure 5) that amalgamates utilization data with efficacy and safety
event information to facilitate multiple comparisons between agents; we
also make available the code used to derive Figure 5. The figure
potentially facilitates comparisons of the overall prevalence of
thromboembolic and bleeding events across the agents at the end of
follow up. However, the chart does not account for differences in
exposure times that would influence the number of patients experiencing
the events of interest. Indeed, patients on warfarin had been exposed to
warfarin for a longer period of time compared to those on rivaroxaban
and apixaban and were therefore more likely to experience events.
Therefore, incidence rates could not be represented even though this was
a longitudinal study. To resolve the imbalances in exposure times, a
fixed time-point analysis was undertaken (Figure 6). Comparing Figures 5
and 6 reveals that while bleeding events occurred at any time during
follow-up, the majority of thromboembolic events appeared to occur
within 6 months of initiating therapy.
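The data preparation behind such a 100% stacked chart can be sketched as below. The cohort and event counts are hypothetical illustrations, not the study's results; only the normalisation step is the point.

```python
# For each agent, outcome counts are normalised to proportions so the
# stacked bars are directly comparable despite unequal cohort sizes.
# All counts below are hypothetical, chosen only to illustrate the method.
cohorts = {
    "warfarin":    {"no event": 700, "thromboembolic": 60, "bleeding": 40},
    "rivaroxaban": {"no event": 180, "thromboembolic": 12, "bleeding": 8},
    "apixaban":    {"no event": 30,  "thromboembolic": 5,  "bleeding": 3},
}

def stacked_proportions(cohorts):
    """Return each agent's segment widths as fractions summing to 1.0."""
    out = {}
    for agent, counts in cohorts.items():
        total = sum(counts.values())
        out[agent] = {k: v / total for k, v in counts.items()}
    return out

props = stacked_proportions(cohorts)
```

When rendering (e.g. with a horizontal bar plot), bar heights can additionally be weighted by cohort size, which is the 'weighted' element of the proposed chart.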
Our study is not without limitations. Firstly, our data was obtained
from a limited period and may be insufficient for studying chronic drug
exposures and adverse events. The data was from only one hospital, which
could skew the findings depending on the usage preferences of OAC in
that hospital. It is precisely for this reason that a formal controlled
analysis was not undertaken, as it could have yielded a biased
comparative assessment of the agents; we opted instead to perform an
illustrative analysis only. To fully leverage the potential benefits of a CDM and
provide a more accurate overview of patient journeys, EMR data from more
healthcare institutions could be included in future analyses. The
application of propensity score matching prior to generating the
weighted bar charts could equalize the cohorts and their risk factors to
render comparable charts, but this was not viable given that there were
few patients at the outset. INR was used as a surrogate measure for
duration of drug exposure to warfarin. This is a blunt method of
inferring drug exposure, as it assumes that the presence of an INR test
indicates that the patient was taking warfarin. Additionally, the data
sources could only capture INR tests performed in public healthcare
institutions contributing to the EMR database. Point-of-care testing of
INR at satellite care sites or at patients’ homes was not captured, and
hence the frequency of testing may be under-represented. Furthermore, in
comparison, DOACs do not have a similar indirect measure to rely upon.
Nonetheless, the primary purpose of our study was to evaluate the
potential value of adopting a CDM. While our proposed bar chart does not
account for other confounding factors that may influence haemorrhagic
and thromboembolic event rates observed, it provides an overview of the
relative utilisation and adverse event incidence in a specified
population of interest, an important first step toward identifying
insights that warrant more rigorous observational studies. Within the OHDSI
community, similar tools such as the Cohort Characterisation tool in
ATLAS have been developed, which allows for viewing of incidence rates
of selected events in cohorts of interest. Our stacked bar chart is a
quick visual that allows for similar comparison of incidence rates,
albeit across users of different drug types. The sharing of code written
for the same CDM format could also enable other researchers to conduct
the same analysis on their own databases.