Abstract:
Objective: High data quality is essential to ensure the validity of the clinical and research inferences drawn from the data. However, data quality assessments are often missing even though the data are used in daily practice and research. Our objective was to evaluate the data quality of the high-resolution electronic database (HRDB) implemented in our pediatric intensive care unit (PICU).
Design: A prospective validation study of a HRDB.
Setting: A 32-bed pediatric medical, surgical and cardiac PICU in a tertiary care freestanding maternal-child health center in Canada.
Population: All patients admitted to the PICU with at least one vital sign monitored using a cardiorespiratory monitor connected to the central monitoring station.
Interventions: None
Measurements and Main Results: Between June 2017 and August 2018, data from 295 patient days were recorded from medical devices and 4,645 data points were video recorded and compared to the corresponding data collected in the HRDB. Statistical analysis showed an excellent overall correlation (R2=1), accuracy (100%), agreement (bias=0, limits of agreement=0), completeness (2% missing data) and reliability (ICC=1) between recorded and collected data within clinically significant pre-defined limits of agreement. Divergent points could all be explained.
Conclusions: This prospective validation of a representative sample showed an excellent overall data quality.
Key words: Pediatrics; Critical care; Database; Electronic Health Record; Big data
INTRODUCTION
Over the past two decades, technological and computer advances have been used extensively to modernize medicine and assist medical teams in daily practice, as shown by the widespread use of electronic medical records (EMR) and connected biomedical devices. While the primary purpose of these systems in health care services is patient management, many scientists have seen them as a way of improving clinical research efficiency and data analysis (1–4). As a result, many medical databases (DB) have been built since the beginning of the twenty-first century (4–6). To optimize research quality in our fields of expertise, such as respiratory physiology and the development of clinical decision support systems (CDSS) (7), we implemented in 2015 an automated electronic data gathering process in our pediatric intensive care unit (PICU) (8). This DB was designed to develop and validate virtual or synthetic patients for cardiorespiratory physiology as well as for CDSS and data-driven learning systems (8). However, the collected data must be validated before this DB can be considered suitable for research purposes (9–11). Indeed, the value of research findings depends on data quality (12,13). Several guidelines or frameworks have been developed to evaluate and report the quality of DBs and national registries and to guide designers of DBs at each step of data collection (12,14,15). These documents highlighted the need to evaluate data quality and to compare quality performance across datasets, and raised the question of data validity that every scientist or clinician, as a data user, deals with, whether in day-to-day clinical care decision-making or in medical research (16,17). However, none of these guidelines provides a detailed validation process that is entirely suitable for a high-resolution electronic DB (HRDB), defined as a database that collects more than one data point per minute per variable and per patient.
Besides, to our knowledge, no published HRDB has reported a detailed validation procedure and evaluation of data quality (18–20). This article constitutes the final part of the validation process of our HRDB (8,11). The purpose of this study was to assess the quality of the data included in our HRDB and to provide a generalizable validation method for all HRDBs.
METHODS
This study was a prospective data quality assessment conducted in the PICU of Sainte-Justine hospital (Montreal, Canada), a pediatric 32-bed medical, surgical and cardiac ICU in a free-standing tertiary maternal-child health center. The study was performed between June 2017 and August 2018.
Population
Eligible patients were those admitted to the PICU with at least one vital sign monitored using a cardiorespiratory monitor connected to the central monitoring station. Patients were excluded if the presence of one study observer in the patient room was considered incompatible or inappropriate by the physician or the nurse in charge.
Standard management
As previously reported (8), as a standard of practice in our PICU, all physiological, therapeutic and clinical data from the medical devices available at the bedside of all children admitted to the PICU were continuously collected, from admission to discharge from the PICU, in an organized HRDB linked to the EMR. Biomedical signals from the monitors were sampled and recorded every 5 seconds, while data from ventilators and infusion pumps were recorded every 30 seconds. The full details of the HRDB structure were previously reported (8).
Study protocol
The study was divided into three periods of 14, 16 and 17 days respectively (convenience samples): the first was dedicated to data from the monitors, the second to data from the ventilators and the third to data from the infusion pumps. During the first period, data were collected on devices that displayed the monitored data outside of the patient’s room, whereas both the second and third periods took place at the bedside. On every study day, a sample of 20% of the children hospitalized in the PICU who met the inclusion criteria was randomly selected. A patient could be included more than once. The data displayed on the medical devices (monitors, ventilators and infusion pumps) and available at the bedside, such as heart rate or positive inspiratory pressure, were recorded on video (Figure 1). Each day, the video recorder was time-synchronized with the automatically calibrated clocks of the hospital. Each monitor (IntelliVue MP60, MP70 and MX800, Koninklijke Philips Electronics, Amsterdam, the Netherlands) was video recorded for 30 seconds, each ventilator (Servo-I®, Maquet, Getinge, Sweden) for 90 seconds, and each infusion pump (Infusomat®, B. Braun Medical Inc, Bethlehem, Pennsylvania, U.S.) was simply photographed. Since ventilator data are recorded every 30 seconds in the HRDB, 90 seconds was enough to capture at least two consecutive records in the HRDB. Because the infusion pump parameters are only set, and not measured, static pictures were considered sufficient. The data displayed on the devices were then manually extracted into a spreadsheet, from the pictures or at every second from the videotape. Data were periodically screened for aberrant values. These data, collected by one independent observer (AM) who was not involved in patient care, were considered the reference data.
Three types of data from medical devices were collected (Figure 1): 1) physiologic signals from patient monitors (heart rate, oxygen saturation and systolic, diastolic and mean blood pressure); 2) respiratory and ventilator parameters from the ventilator (positive end-expiratory pressure, peak inspiratory pressure, respiratory rate, respiratory minute volume); 3) pharmacotherapy from the infusion pumps (e.g., drug names and infusion rates). The corresponding HRDB data were extracted using structured query language (SQL) and used for comparison (Figure 1).
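As an illustration of the sampling step described above, the following minimal Python sketch draws a daily random sample of 20% of eligible patients and checks how many periodic HRDB records are guaranteed to fall within a recording window. The function names, the fixed seed and the rounding rule for the sample size are assumptions made for illustration; the source does not specify how the random selection was implemented.

```python
import random

def select_daily_sample(eligible_ids, fraction=0.20, seed=None):
    """Randomly pick a fraction of the eligible patients for one study day."""
    rng = random.Random(seed)
    ids = list(eligible_ids)
    if not ids:
        return []
    # Rounding to the nearest integer (minimum 1) is an assumption.
    n = max(1, round(fraction * len(ids)))
    return sorted(rng.sample(ids, n))

def guaranteed_records(window_s, interval_s):
    """Minimum number of periodic records falling inside a recording window."""
    return window_s // interval_s

# Hypothetical day with 25 eligible patients, identified 1..25.
todays_sample = select_daily_sample(range(1, 26), seed=2017)
print(todays_sample)

# Ventilators: 90 s of video against one HRDB record every 30 s.
print(guaranteed_records(90, 30))  # 3, so at least two consecutive records
# Monitors: 30 s of video against one HRDB record every 5 s.
print(guaranteed_records(30, 5))   # 6
```

Because selection is repeated independently on each study day, the same patient can appear in several daily samples, which is consistent with the protocol.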
Endpoints
The primary endpoints were the absolute values of the selected variables (heart rate (HR) and pulse oximetry (SpO2)) recorded from the monitors. The secondary endpoints were:
Reference data were compared to the experimental data simultaneously collected in the PICU HRDB at a specific time point for each patient. Variables were expressed as mean ± standard deviation or median [minimal – maximal value] for continuous variables, depending on whether they followed a normal distribution (Shapiro-Wilk normality test) and count (percentage) for categorical variables. Comparisons between experimental and reference data were made by dependent tests as appropriate.
Under the concept of quality lie several features that tend to delineate the degree to which the HRDB is a true representation of the reality of the PICU’s data (14,21).
All analyses were performed after the exclusion of the paired measurements when one of the experimental or reference data was missing. Thus, we intended to differentiate inaccurate data from missing data. A p-value < 0.05 was considered statistically significant. Statistical analyses were performed using open access R software (version 3.5.1, 2018-07-02, http://cran.r-project.org/).
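The pairwise exclusion of missing values followed by an agreement analysis can be sketched as below. This is a minimal Python illustration, not the R code used in the study: the function names and the sample values are hypothetical, the difference is assumed to be experimental minus reference, and the limits of agreement are assumed to follow the conventional bias ± 1.96 standard deviations.

```python
from statistics import mean, stdev

def complete_pairs(reference, experimental):
    """Keep only pairs where both values are present; count excluded pairs."""
    pairs = [(r, e) for r, e in zip(reference, experimental)
             if r is not None and e is not None]
    n_missing = len(reference) - len(pairs)
    return pairs, n_missing

def bland_altman(pairs):
    """Return bias and 95% limits of agreement (experimental minus reference)."""
    diffs = [e - r for r, e in pairs]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Hypothetical heart rate readings (beats/minute); None marks a missing value.
reference    = [120, 121, 119, 118, None, 122]
experimental = [120, 121, 119, 118, 117, None]

pairs, n_missing = complete_pairs(reference, experimental)
bias, lower, upper = bland_altman(pairs)
print(n_missing, bias, lower, upper)
```

Counting the excluded pairs separately keeps completeness (missing data) distinct from accuracy (agreement among the pairs that are present).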
Ethics: The study was approved by the institutional review board of Sainte-Justine Hospital (reference number 2016-1210, 4061). The exploitation of the HRDB is regulated by a DB policy validated by the institutional review board, and no protected health information was stored in the HRDB or in the video recordings. No patients or caregivers were recorded in the videos.
RESULTS
Between June 1, 2017, and August 30, 2018, 1378 patients were admitted to the PICU and 100% were included in the HRDB. During the 47 effective study days, 81 patients were hospitalized in the PICU and 81 (100%) were included in the HRDB. Data from 70 patients (86%), representing 295 patient-days, were recorded from medical devices (Table 1), and 4645 data points were video recorded and compared to the corresponding data collected in the HRDB (Table 2).
Monitor data validity
Statistical analysis showed an overall excellent correlation, agreement and reliability, as shown in Table 2. ICCs were considered excellent for all the tested variables (Table 2). Bland-Altman analysis showed excellent accuracy and precision between recorded and collected data within clinically significant pre-defined limits of agreement (Supplemental Digital Content 1). A single heart rate measurement in the experimental data (0.03%) was considered clinically different from the reference data (Figures 2 and 3). We documented 74 data points (2%) that were missing, as detailed in Table 2.
Ventilators’ data validity
Statistical analysis showed an excellent overall correlation, agreement and reliability (Table 2, Supplemental Digital Content 2). A small but statistically significant difference was found for the positive inspiratory pressure (mean difference of -0.022 cmH2O, p = 0.02). This difference was observed only for a minority of the data (95.5% of all values were equal). Agreement remained over 90% with an excellent correlation between reference and experimental data. ICCs were considered excellent for all the tested variables (Table 2). Bland-Altman analysis showed excellent accuracy and precision (Supplemental Digital Content 2). No data were missing (Table 2).
Infusion pumps data validity
The comparison with the data displayed on the infusion pumps showed an overall excellent correlation, agreement and reliability (Table 2), with Bland-Altman analysis showing excellent accuracy and precision between recorded and collected data for all the tested variables (Supplemental Digital Content 2). ICCs were considered excellent for all the tested variables (Table 2). Twenty-three infusions (9%) were not retrieved in the HRDB (Table 2). Nine episodes were related to six patients without any pharmacological data collected in the HRDB, and 14 episodes were related to pump dysfunction. Other minor discrepancies were noticed between the HRDB and the EMR (Table 3). The correlation between the HRDB and the EMR regarding drugs of interest over the study period is depicted in Figure 4.
Timestamps
A delay was observed between the time-synchronized videotapes and the data collected from the monitors and the ventilators. This delay was less than 28 seconds and remained stable among patients. Regarding infusion pump data, we discovered that the data were not collected in the HRDB every 30 seconds as expected, but at variable intervals of between 10 and 40 seconds or when a modification was made. No delays were observed between the source and the HRDB.
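One possible way to quantify a stable delay between per-second reference values and timestamped HRDB records is to search for the lag that maximizes exact agreement. The Python sketch below illustrates this idea only; the source does not describe how the delay was measured, and the function name, the 30-second search range and the sample values are assumptions.

```python
def estimate_delay(reference, hrdb_records, max_lag_s=30):
    """Find the lag (seconds) at which HRDB values best match the reference.

    reference: dict mapping time in seconds to the value read from the video.
    hrdb_records: list of (time in seconds, value) tuples from the database.
    """
    best_lag, best_hits = 0, -1
    for lag in range(max_lag_s + 1):
        # Count HRDB records equal to the reference value 'lag' seconds earlier.
        hits = sum(1 for t, v in hrdb_records if reference.get(t - lag) == v)
        if hits > best_hits:
            best_lag, best_hits = lag, hits
    return best_lag, best_hits

# Hypothetical reference signal: one distinct value per second for 60 s.
reference = {t: 100 + t for t in range(60)}
# Hypothetical HRDB records every 5 s, stamped 12 s after the displayed value.
hrdb_records = [(t + 12, 100 + t) for t in range(0, 40, 5)]

lag, hits = estimate_delay(reference, hrdb_records)
print(lag, hits)  # 12 8
```

With real signals, long runs of identical values would make several lags equally plausible, so a rapidly changing variable gives a sharper estimate.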
DISCUSSION
Whether in day-to-day clinical care decision-making or in medical research, evaluating data quality is essential to ensure the reliability of a DB (9,21,28,29). To our knowledge, this is the first study to validate PICU data contained in a specific HRDB (20,30). This article is indissociable from our two previous reports (8,11). The first report described the gathering process of our HRDB (8) and the second gave a comprehensive description of the HRDB’s architecture and process (11). This third article completes the set: it contributes to the quality assurance phase and to the quality control phase of the HRDB (14,31).
As there were no guidelines specifically designed to guarantee high-resolution data quality (9,14), we elaborated the first complete validation procedure. Our validation procedure was inspired by previously published experiences (9,10,30,32–34) and guidelines (13–15,28,35) regarding data quality assessment in the field of medical DB collected at a lower rate or in a restricted area. To evaluate the quality of the data, we chose to perform an external validation procedure. We compared our extracted results with the information displayed on the monitor or the biomedical device (21). Our study showed an excellent overall accuracy, completeness and reliability of our HRDB when compared to displayed data at the bedside at the same time.
Regarding the accuracy of the dataset, we noticed only one clinically significant difference, in a heart rate value. This error was due to a rapid acceleration of the heart rate (Figure 2). In the video, the heart rate increased from 118 beats/minute to 154 beats/minute, and the HRDB recorded one single value at 135 beats/minute during the transition. This suggests that the monitors processed those data and only refreshed the display at a specific interval (probably between one and two seconds) without showing intermediate data. The HRDB, in contrast, recorded an intermediate value, which explains the size of the difference between the reference value and the experimental value. Differences between the HRDB’s data and the reference data were also observed regarding PIP. Although statistically significant, these disagreements were not clinically significant (the maximal difference was 0.5 cmH2O and concerned only 4.5% of all the collected PIP values; the remaining 95.5% were strictly equal), as shown by a mean difference of -0.022 cmH2O. Only integers are displayed on the ventilator screen, and the data processing algorithm applied to the raw values measured by the ventilator is unpublished. Thus, we suspect that these very minor differences may be due to a rounding process.
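The rounding hypothesis can be illustrated with a short Python sketch: if the stored value keeps a finer resolution than the integer shown on screen, the two differ by at most 0.5 cmH2O, and only for values that are not already integers. The round-half-up rule, the function name and the sample values are assumptions, since the ventilator's actual display algorithm is unpublished.

```python
import math

def displayed_value(raw_cmh2o):
    """Integer shown on screen, assuming a round-half-up rule (an assumption)."""
    return math.floor(raw_cmh2o + 0.5)

# Hypothetical raw PIP values (cmH2O) as they might be stored at 0.5 resolution.
raw_pip = [20.0, 20.5, 22.0, 18.0]

# Difference between what an observer reads on screen and the stored value.
diffs = [displayed_value(x) - x for x in raw_pip]
max_abs_diff = max(abs(d) for d in diffs)
share_equal = sum(1 for d in diffs if d == 0) / len(diffs)
print(diffs, max_abs_diff, share_equal)
```

Under this assumption the maximal discrepancy can never exceed 0.5 cmH2O, which matches the maximal difference reported for PIP.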
Regarding the completeness of the dataset, 2% of the data were missing. Although lower than previously reported (9,14,30), this amount of missing data did not meet our expectations for this HRDB, as we had aimed for 0% missing data. This loss of data was mainly caused by a systematic error in the data processing. Indeed, we discovered that the original HRDB structure could only record nine parameters simultaneously. When more than nine parameters were sent, the additional data were not registered. Once this issue was identified, we changed our database to an entity-attribute-value structure, where each data point is stored as an independent row (36,37).
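To show why an entity-attribute-value layout removes a fixed limit on the number of simultaneous parameters, the sketch below stores twelve parameters sent at the same instant as twelve independent rows. It uses Python's built-in sqlite3 module with an in-memory database; the table name, column names and parameter names are hypothetical and do not reflect the actual HRDB schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One row per data point: the number of parameters is not fixed by the schema.
conn.execute("""
    CREATE TABLE observation (
        patient_id  INTEGER NOT NULL,
        recorded_at TEXT    NOT NULL,
        attribute   TEXT    NOT NULL,
        value       REAL,
        PRIMARY KEY (patient_id, recorded_at, attribute)
    )
""")

# Hypothetical message carrying 12 parameters at the same timestamp,
# more than the nine that the original structure could record.
message = {f"param_{i}": float(i) for i in range(1, 13)}
rows = [(1, "2018-01-01T00:00:00", name, val) for name, val in message.items()]
conn.executemany("INSERT INTO observation VALUES (?, ?, ?, ?)", rows)

stored = conn.execute("SELECT COUNT(*) FROM observation").fetchone()[0]
print(stored)  # 12
```

The trade-off of this layout is that reconstructing one patient's full state at a given time requires gathering several rows rather than reading a single wide row.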
Regarding infusion pumps and pharmacological data, the discrepancies between the experimental data and the reference data or the EMR appeared to be associated with variability in care more than with a failure of the gathering process. Regarding the 23 missing infusion pump records, we established that the corresponding infusion pumps were disconnected from the network, so the data were not sent to the HRDB. This disconnection of the infusion pumps explained these discrepancies between the EMR and the experimental data, with all the pharmacological data missing in six patients. In addition, the large majority of inconsistencies between the EMR and the experimental data were due to a time difference at the beginning or the end of the drug administration. In the EMR, a drug needs to be ordered before the drug rate can be registered, while in the HRDB, the rate starts to be registered as soon as the pump is connected to the network. Furthermore, some medications were not registered in the patient EMR, probably because the physician did not order them. However, nursing notes confirmed that the drug was given. In these situations, the HRDB could be considered more accurate than the EMR. On two occasions, the name of the fluid differed between the EMR and the HRDB. However, the name recorded on the pump and the one in the HRDB were the same, suggesting that the infusion pump drug name was not modified when the medication was replaced. Finally, it happened twice that no data were recorded over a period when they should have been. These intervals occurred just before the patient was moved to another room, and the procedure is to disconnect the pumps before moving the patient. Although these four situations altered the HRDB accuracy, they were not due to a HRDB limitation. Last, timestamp asynchronies were due to a server setting that was corrected after this study.
This study’s main limitation lies in the lack of validation of the complete dataset (10,14,30). We considered several procedures to apply either during or after the gathering of the HRDB. Given the very high data gathering rate (about 10,000 data points per minute), it is humanly impossible to validate the data while the DB is being collected, or even to validate the entire database retrospectively. Thus, we decided to perform a point-by-point data analysis on a randomly chosen sample of patients considered representative of the HRDB (30). Besides, some could rightly argue that we were not able to correct abnormal values or undisplayed data. But, as this dataset is supposed to reproduce the patient’s entire course in the PICU, abnormal values and undisplayed data should be considered part of the patient’s course as much as a true value (19). Furthermore, this is a study in one institution with an excellent understanding of the value of data quality. Even if the methodology is transferable to other data, this study only validates these particular data in this particular HRDB, and its results should not be generalized to other clinically collected data. Finally, although its impact is probably limited because most of the analyzed data were electronically captured, we must consider the possibility of a Hawthorne effect. The observational methodology might have modified the quality of the data being entered in the EMR by the bedside personnel.
CONCLUSION
This study showed an excellent overall quality of the data included in the HRDB of our PICU, based on validation procedures performed on a representative sample. We consider that this study provides assurance of data quality for future HRDB users, especially regarding monitor and respirator data. By reporting and detailing this data quality validation process, we make it reproducible by any research team and set a reference for future validation studies of similar datasets.