Discussion
This study sought to convert a tertiary institution’s EMR database into the OMOP-CDM to assess the value of conversion for post-market regulatory purposes. It forms part of Singapore’s pioneering effort to assess the feasibility of CDM conversion and to obtain insights that can improve drug safety assessments. We find that while data conversion is laborious, there are inherent benefits to undertaking the exercise. CDM conversion is a collaborative effort involving multiple parties, such as data scientists and clinical experts, and must be undertaken judiciously because the numerous data cleaning and editing steps inevitably transform the data during conversion. For instance, an appreciation of an institution’s coding and medication supply practices can have significant implications for any future analyses. When repurposing transactional healthcare data to draw reliable insights, the importance of understanding the provenance of the data and the underlying constructs it represents cannot be overstated.
Once converted, however, the standardised architecture of the CDM and the OHDSI tools and opportunities available (i.e. past and ongoing study protocols, analytic code templates) form a fertile ecosystem that can accelerate analyses, although some modification and extension of previously written code is likely required for specific use cases.
After making the necessary amendments to the code by Hripcsak et al, we applied it to the OMOP-CDM converted data. The original code allowed us to easily specify the inclusion and exclusion criteria as well as the observation period of interest.12 Built into the OMOP-CDM is a derived table (termed the ‘Drug Era’ table) that aggregates all drug exposures. This consolidated drug exposure table allows analysts to define and apply the drug exposure conditions required for the study (e.g. permitted gap days between prescription fills and stockpiling of previously filled prescriptions). The ‘Drug Era’ table therefore simplifies precise exposure specification, a key component of any pharmacoepidemiological analysis. Notably, this derived-data feature is unavailable in other CDMs such as the PCORnet, Sentinel and i2b2 CDMs, which organize medication data at the transaction level.13-16
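To illustrate the era-building logic described above, the following minimal sketch collapses consecutive dispensings separated by no more than a permitted gap into a single era. It is written in Python/pandas rather than the OHDSI tooling, and the records and column names are hypothetical, not the actual OMOP DRUG_EXPOSURE schema:

```python
import pandas as pd

# Hypothetical dispensing records for one patient (illustrative column
# names; not the real OMOP DRUG_EXPOSURE schema).
exposures = pd.DataFrame({
    "person_id": [1, 1, 1],
    "drug_start": pd.to_datetime(["2014-01-01", "2014-02-05", "2014-06-01"]),
    "days_supply": [30, 30, 30],
})
exposures["drug_end"] = exposures["drug_start"] + pd.to_timedelta(
    exposures["days_supply"], unit="D")

GAP_DAYS = 30  # permitted gap between fills before an era is broken


def build_eras(df, gap_days=GAP_DAYS):
    """Collapse consecutive exposures separated by <= gap_days into eras.

    Real data would need this applied per person (groupby person_id);
    a single patient keeps the sketch short.
    """
    df = df.sort_values("drug_start").reset_index(drop=True)
    prev_end = df["drug_end"].cummax().shift()  # latest end date seen so far
    # a new era starts when the gap since the previous end exceeds gap_days
    new_era = df["drug_start"] > prev_end + pd.Timedelta(days=gap_days)
    df["era_id"] = new_era.cumsum()
    return (df.groupby(["person_id", "era_id"], as_index=False)
              .agg(era_start=("drug_start", "min"),
                   era_end=("drug_end", "max")))


eras = build_eras(exposures)
```

Here the second fill (5 days after the first supply runs out) extends the first era, while the third fill starts a new era because the gap exceeds 30 days; stockpiling rules would further adjust `drug_end` before collapsing.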
In the illustrative (uncontrolled) analysis of AF patients, most (76.5%) were warfarin users, followed by rivaroxaban (19.7%) and apixaban (3.8%) users. This is not surprising given that warfarin has been registered in Singapore since 1995 and would be more commonly used in patients requiring anticoagulation.17 The DOACs, however, were more recent entrants to the market: rivaroxaban was registered in 2008 and apixaban in 2012. Since the dataset analysed only included drug dispensing data from 2013 to 2016, this explains the smaller numbers of patients on rivaroxaban and apixaban. Dabigatran has been registered since 2009; however, we did not find any patients on dabigatran who fulfilled our inclusion criteria.
The three agents compared appear to differ in event rates for efficacy and safety. Nonetheless, this was intended as a descriptive analysis only, and no adjustment for baseline differences in patient populations was performed. The relative proportions of adverse events among warfarin and DOAC users are broadly consistent with previous clinical trial findings,18-20 though our study showed higher numerical percentages across all events. This may reflect clinical trials selecting healthier populations through stringent inclusion/exclusion criteria, which would exclude patients at higher risk of AEs. Our study also included a wide variety of ICD codes for haemorrhagic and thromboembolic events, whereas most studies were confined to ICD codes for ischaemic stroke, gastrointestinal bleeds and intracranial haemorrhage. Additionally, the number of apixaban users in our final cohort was very small, making it challenging to identify any pattern of usage or trend in events. The higher numerical percentages of events in our study population may also indicate the real-world prevalence of adverse events with OACs in Singapore.21
To visualize the results, we extended the work by Hripcsak et al and propose the use of a 100% horizontally stacked, weighted bar chart (Figure 5) that amalgamates utilization data with efficacy and safety event information to facilitate multiple comparisons between agents; we also make available the code used to derive Figure 5. The figure facilitates comparison of the overall prevalence of thromboembolic and bleeding events across the agents at the end of follow-up. However, the chart does not account for differences in exposure times, which influence the number of patients experiencing the events of interest. Indeed, patients on warfarin had been exposed for a longer period than those on rivaroxaban and apixaban and were therefore more likely to experience events. Incidence rates could therefore not be represented even though this was a longitudinal study. To resolve the imbalance in exposure times, a fixed time-point analysis was undertaken (Figure 6). Comparing Figures 5 and 6 reveals that while bleeding events occur at any time during follow-up, the majority of thromboembolic events appear to occur within 6 months of initiating therapy.
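The construction of such a chart can be sketched in a few lines. The example below (Python/matplotlib, with purely hypothetical counts rather than the study’s data) normalises each bar to 100% across outcome categories while weighting bar thickness by cohort size, which is how the chart conveys both relative utilisation and event proportions at once:

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; no display needed
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical counts per agent (illustrative only, NOT the study's data):
# columns = thromboembolic events, bleeding events, no event.
agents = ["warfarin", "rivaroxaban", "apixaban"]
outcomes = ["thromboembolic", "bleeding", "no event"]
counts = np.array([[30.0, 50.0, 320.0],
                   [5.0, 8.0, 90.0],
                   [1.0, 1.0, 18.0]])

totals = counts.sum(axis=1)
props = counts / totals[:, None]        # each bar normalised to 100%
heights = totals / totals.max() * 0.8   # bar thickness weighted by cohort size

fig, ax = plt.subplots(figsize=(8, 3))
left = np.zeros(len(agents))
for j, outcome in enumerate(outcomes):
    # stack each outcome segment to the right of the previous ones
    ax.barh(agents, props[:, j], left=left, height=heights, label=outcome)
    left += props[:, j]
ax.set_xlim(0, 1)
ax.set_xlabel("proportion of patients")
ax.legend(loc="lower right", fontsize=8)
fig.savefig("stacked_utilisation.png", dpi=150)
```

As noted above, a chart built this way shows proportions at end of follow-up only; it does not encode person-time, which is why the fixed time-point analysis was needed as a complement.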
Our study is not without limitations. Firstly, our data were obtained over a limited period and may be insufficient for studying chronic drug exposures and adverse events. The data came from only one hospital, and the findings could be skewed by that hospital’s OAC usage preferences. For precisely this reason, we did not undertake a formal controlled analysis, which could have produced a biased comparative assessment of the agents, and opted instead for an illustrative analysis only. To fully leverage the potential benefits of a CDM and provide a more accurate overview of patient journeys, EMR data from more healthcare institutions could be included in future analyses. Applying propensity score matching before generating the weighted bar charts could equalize the cohorts and their risk factors to render comparable charts, but this was not viable given the small number of patients at the outset. INR was used as a surrogate measure for the duration of warfarin exposure. This is a blunt proxy, as it assumes that the presence of an INR test indicates that the patient was taking warfarin. Additionally, the data sources could only capture INR tests performed in public healthcare institutions contributing to the EMR database; point-of-care INR testing at satellite care sites or at patients’ homes was not captured, so the frequency of testing may be under-represented. Furthermore, the DOACs lack a comparable indirect measure of exposure.
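Had the cohorts been larger, the matching step could look like the following sketch. It uses entirely hypothetical cohorts, a single covariate (age), and a greedy 1:1 nearest-neighbour match on the estimated propensity score, standing in for a full propensity-score workflow; it is numpy-only, not the OHDSI tooling:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical cohorts (illustrative only): DOAC users skew older here.
treated_age = rng.normal(72, 8, size=20)    # e.g. DOAC users
control_age = rng.normal(65, 10, size=200)  # e.g. warfarin users


def fit_propensity(x, z, lr=0.1, steps=2000):
    """Logistic regression of treatment z on covariate x via gradient ascent."""
    X = np.column_stack([np.ones_like(x), x])
    w = np.zeros(2)
    for _ in range(steps):
        p = 1 / (1 + np.exp(-X @ w))
        w += lr * X.T @ (z - p) / len(z)  # ascend the log-likelihood
    return 1 / (1 + np.exp(-X @ w))


ages = np.concatenate([treated_age, control_age])
z = np.concatenate([np.ones(20), np.zeros(200)])
ps = fit_propensity((ages - ages.mean()) / ages.std(), z)

# Greedy 1:1 nearest-neighbour matching on the propensity score.
treated_ps, control_ps = ps[:20], ps[20:]
available = np.ones(200, dtype=bool)
matches = []
for p in treated_ps:
    dist = np.abs(control_ps - p)
    dist[~available] = np.inf  # each control used at most once
    j = int(np.argmin(dist))
    available[j] = False
    matches.append(j)

matched_control_age = control_age[matches]
```

After matching, the mean age of the selected controls sits much closer to that of the treated group than the full control pool does, which is exactly the balance the weighted bar charts would need to be comparable.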
Nonetheless, the primary purpose of our study was to evaluate the potential value of adopting a CDM. While our proposed bar chart does not account for other confounding factors that may influence the observed haemorrhagic and thromboembolic event rates, it provides an overview of the relative utilisation and adverse event incidence in a specified population of interest, an important first step towards identifying insights that warrant more rigorous observational studies. Within the OHDSI community, similar tools such as the Cohort Characterisation tool in ATLAS have been developed, allowing incidence rates of selected events to be viewed in cohorts of interest. Our stacked bar chart is a quick visual that allows a similar comparison of event rates, albeit across users of different drug types. The sharing of code written for the same CDM format could also enable other researchers to conduct the same analysis on their own databases.