Research Article Open Access
Data Validity in the Society for Vascular Surgery Vascular Quality Initiative Registry
Jens Eldrup-Jorgensen1* and Jim Wadzinski BA2
1Division of Vascular Surgery, Maine Medical Center, Portland, Maine, Department of Surgery, Tufts University School of Medicine, Boston, Massachusetts
2Society for Vascular Surgery Patient Safety Organization, Chicago, IL
*Corresponding author:

Jens Eldrup-Jorgensen, Maine Medical Partners Vascular Care, 887 Congress St, Suite 400, Portland, ME 04102, Tel: 207-662-7032, Fax 207-774-9388, E-mail: @

Received: February 12, 2019; Accepted: August 21, 2019; Published: August 26, 2019
Citation: Eldrup-Jorgensen J, Wadzinski J (2019) Data Validity in the Society for Vascular Surgery Vascular Quality Initiative Registry. SOJ Surgery 6(2): 1-5. DOI: http://dx.doi.org/10.15226/2376-4570/6/2/00165
Abstract
Objective: The Vascular Quality Initiative (VQI) was formed in 2011 to collect demographic and clinical data on vascular patients to improve care. The VQI has thousands of providers and hundreds of centers contributing data on over 557,000 procedures. The purpose of this article is to describe an evaluation of data validity and a quality improvement initiative within a clinical medical registry.

Methods: After discovery of data discrepancies, the VQI undertook a data review in 2017. A statistical review was performed to identify variances in the data distribution for each variable reported in the data sets in each registry. Using a retrospective audit strategy, the internal consistency and accuracy of the data were investigated.

Results: There are 12 different registries within the VQI. The error rate for variables was generally less than 5%, mostly involving descriptive variables rather than key outcome variables. Less than 2% of data points were found to be in error.

Discussion: Clinical registries contain large amounts of data subject to multiple error types. The audit allowed discovery of coding and other systematic errors resulting in improved quality control measures and data checks.

Conclusion: Although the audit was time consuming and labor intensive, it led to improved data accuracy as well as new and enhanced audit strategies.

Key words: data validity, audit, data quality, registry
Introduction
The Vascular Quality Initiative (VQI) was created in 2011 and has been collecting patient data since that time. The mission of the VQI is to "improve the quality, safety, effectiveness and cost of vascular health care by collecting and exchanging information" [1]. The VQI collects information on different procedures in 12 registries, including infrainguinal bypass, carotid endarterectomy, peripheral vascular intervention, open abdominal aortic aneurysm repair, suprainguinal bypass, endovascular aneurysm repair (EVAR), carotid artery stent, lower extremity amputation, varicose vein operations, hemodialysis arteriovenous access procedures, inferior vena cava (IVC) filter placement, and thoracic endovascular aneurysm repair. Currently, over 3000 physicians from 536 sites throughout the United States and Canada have entered data on over 557,000 procedures. Patient data is entered at the individual centers and transmitted securely to M2S®, where it is warehoused. The VQI operates under the auspices of the Society for Vascular Surgery (SVS) Patient Safety Organization (PSO), which allows hospitals and providers to engage in quality improvement efforts in a confidential and protected environment [2].

VQI data has been used for multiple quality improvement initiatives. Analysis of VQI data has impacted how clinicians provide care which has undoubtedly prevented strokes, heart attacks, limb loss and mortality. VQI has provided information that allows providers and centers to benchmark their care and performance. The VQI registry has served as a robust source of data for dozens of scientific analyses and publications. As well as preventing complications, VQI data have also allowed centers and providers to reduce their resource utilization and length of stay with a subsequent reduction in costs.

There are over 160,000,000 data points in the VQI registries and, as is inevitable, data errors have been encountered. Almost 2 years ago, while reviewing the EVAR registry, an apparently disproportionate number of complications was encountered. This was readily appreciated by the investigators, prompting review of the data set and its code. Further analysis identified the responsible coding error, which was corrected.
Objective
The purpose of this article is to describe an evaluation of data validity and a quality improvement initiative within a clinical medical registry.
Methods
One of the fundamental principles of our registry is privacy and confidentiality of Protected Health Information (PHI). It is helpful to understand how a data set is released to investigators for analysis. Patient information that is sent into the registry contains PHI, or individually identifiable information, whose public dissemination is prohibited by law (https://www.hipaa.com/hipaa-protected-health-information-what-does-phi-include/). In order to be HIPAA compliant, the registry data is aggregated and de-identified, creating a Blinded Data Set (BDS). When an investigator is approved to do a study with VQI data, a BDS is created that is HIPAA compliant; no information is released that would violate patient, provider or center confidentiality or allow identification.
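The de-identification step can be illustrated with a minimal sketch. This is not the PSO's actual code; the field names and the `make_blinded_dataset` helper are hypothetical, chosen only to show the principle of stripping direct identifiers and replacing real center identifiers with opaque codes:

```python
# Minimal sketch of building a Blinded Data Set (BDS) from registry records.
# All field names are hypothetical; the actual VQI schema is not public.
PHI_COLUMNS = {"patient_name", "mrn", "date_of_birth", "ssn", "address"}

def make_blinded_dataset(records):
    """Strip direct identifiers and replace center IDs with opaque codes."""
    center_codes = {}
    blinded = []
    for rec in records:
        # Drop every column known to contain PHI.
        row = {k: v for k, v in rec.items() if k not in PHI_COLUMNS}
        # Replace the real center ID with a stable but meaningless code.
        cid = row.pop("center_id")
        row["center_code"] = center_codes.setdefault(
            cid, f"C{len(center_codes) + 1:03d}")
        blinded.append(row)
    return blinded

records = [
    {"patient_name": "Doe, J", "mrn": "12345", "center_id": "MMC", "procedure": "EVAR"},
    {"patient_name": "Roe, A", "mrn": "67890", "center_id": "MMC", "procedure": "CEA"},
]
bds = make_blinded_dataset(records)
```

Both records from the same center map to the same opaque code, preserving the ability to analyze by center without identifying it.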

When the EVAR registry was found to be corrupted, the code for creating its BDS was subjected to further quality control analysis and coding errors were discovered. The errors were felt to be unique to this registry, as it had recently undergone a major revision. When a registry is revised, new data fields are created, some data fields are eliminated, some are changed and some are given new definitions. When the revised registry is merged with the previous registry data, a mapping process incorporates these modifications, and that process can be subject to data variation.

Shortly thereafter, data discrepancies were discovered in another registry. In this instance, a center found that its reports from the IVC filter registry contained fewer patients than it had entered into the registry. The BDS from the IVC filter registry did not contain inaccurate data but was found to be incomplete, as some patients were not included: the code used to create the BDS truncated the data set. As a result of these two events (one registry with inaccurate data and one with incomplete data), the PSO informed the SVS and embarked on a data audit. During this period, the PSO did not release any BDS for investigation or analysis until the audit for the relevant registry had been completed and approved by the PSO and VQI.
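A completeness check of the kind that would catch such a truncation can be sketched as follows. This is an illustration only, not the PSO's code; the field names and `completeness_report` function are hypothetical:

```python
def completeness_report(registry_rows, bds_rows):
    """Compare per-center patient counts between the source registry and the BDS.

    Returns a dict mapping each center whose counts disagree to a tuple of
    (patients entered, patients released in the BDS).
    """
    def counts(rows):
        c = {}
        for r in rows:
            c[r["center"]] = c.get(r["center"], 0) + 1
        return c

    entered, released = counts(registry_rows), counts(bds_rows)
    return {center: (entered[center], released.get(center, 0))
            for center in entered
            if released.get(center, 0) != entered[center]}

registry = [{"center": "A"}, {"center": "A"}, {"center": "B"}]
bds = [{"center": "A"}, {"center": "B"}]   # one center-A patient lost to truncation
missing = completeness_report(registry, bds)  # → {"A": (2, 1)}
```

Run routinely after each BDS build, a count reconciliation like this surfaces silent truncation before a data set is released to investigators.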

Data is the foundation of everything that the VQI (or any clinical registry) does. All reports, analyses, findings, benchmarks, etc., are based upon patient data. As such, it is imperative that the data be of the highest quality and beyond reproach. The data audit of VQI was begun with our corporate partner, M2S® (a subsidiary of Medstreaming®), which is responsible for the secure transmission and warehousing of the data as well as for creating the BDS that is provided to investigators. New quality controls were developed, including a standardized methodology for creating the BDS. After a new uniform code was created for the BDS, it was subjected to analysis. At the outset, the PSO requested that an independent auditor be brought in to assess the effectiveness of the recently written code for creating the BDS. The external audit confirmed that the data displayed in the BDS matched the data entered into the registry. After review, it was felt that the new code was accurate and reliable. Using the new standardized code, a new BDS was created for each of the 12 registries. Each registry was then subjected to further auditing by analyzing the mean or the categorical distribution of each variable by year and flagging any unexpected changes as suggestive of potential data errors. Variations could be due to changes in practice, coding or other systematic errors. If significant variation was encountered, the fields with data variation were subjected to further analysis. Some degree of data variation was due to clinical variation. Coding or transcription errors were corrected. This "variation check" was performed on each registry, after which the registry was cleared for release of a BDS.
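The year-over-year "variation check" described above can be sketched in a few lines. This is a simplified illustration under stated assumptions: the 25% relative-change threshold, the field names and both function names are hypothetical, not the PSO's actual methodology:

```python
from collections import defaultdict

def yearly_means(rows, field):
    """Mean of a numeric variable per procedure year (missing values skipped)."""
    sums, counts = defaultdict(float), defaultdict(int)
    for r in rows:
        if r.get(field) is not None:
            sums[r["year"]] += r[field]
            counts[r["year"]] += 1
    return {y: sums[y] / counts[y] for y in sums}

def flag_variation(means, rel_threshold=0.25):
    """Flag consecutive-year pairs whose mean shifts more than the threshold."""
    flagged = []
    years = sorted(means)
    for prev, cur in zip(years, years[1:]):
        base = means[prev]
        if base and abs(means[cur] - base) / abs(base) > rel_threshold:
            flagged.append((prev, cur))
    return flagged

rows = [{"year": 2015, "age": 70}, {"year": 2015, "age": 72},
        {"year": 2016, "age": 71}, {"year": 2016, "age": 69},
        {"year": 2017, "age": 40}, {"year": 2017, "age": 42}]  # suspicious drop
m = yearly_means(rows, "age")
flags = flag_variation(m)  # the 2016→2017 shift exceeds 25% and is flagged
```

A flagged pair does not prove an error; as the text notes, it only marks a field for further review, since the shift may reflect genuine changes in clinical practice.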

The audit of the 12 registries was a tedious, time-consuming and labor-intensive process that took thousands of hours of PSO and M2S personnel time before data accuracy was confirmed. There were multiple reasons for the development of data discrepancies, including transcription (some data had been collected in 2003 on paper forms and migrated through multiple databases before arriving in the current format), improper mapping of old variables to new ones during prior registry revisions, and inadequacy of the data dictionaries.
Results
The audit focused on internal consistency, required over one year to complete and consumed thousands of personnel hours. Among currently active variables, the proportion requiring correction ranged from 0% to 5% per registry, and corrections were mostly due to transcription errors and data mapping (changes in definition or criteria) (Table 1).
Table 1: Summary Statistics on Blinded Data Set (BDS) Variables (INFRA – Infrainguinal bypass; CEA – Carotid Endarterectomy; PVI – Peripheral Vascular Intervention; OAAA – Open Abdominal Aortic Aneurysm Repair; SUPRA – Suprainguinal bypass; EVAR – Endovascular Aneurysm Repair; CAS – Carotid Artery Stent; AMP – Amputation; VV – Varicose Vein; AVACESS – Hemodialysis Arteriovenous Access; IVC – Inferior Vena Cava Filter; TEVAR – Thoracic Endovascular Aneurysm Repair)

Registry | Total # Variables | # (%) Corrected: Transition from Paper Forms | # (%) Corrected: Data Mapping | # (%) Prior Calculated Variables Corrected (Now Retired)
INFRA    | 187  | 8 (4%)  | 5 (3%)  | 3 (2%)
CEA      | 221  | 4 (2%)  | 9 (4%)  | 3 (1%)
PVI      | 286  | 0 (0%)  | 0 (0%)  | 5 (2%)
OAAA     | 144  | 1 (1%)  | 5 (4%)  | 0 (0%)
SUPRA    | 187  | 0 (0%)  | 5 (3%)  | 2 (1%)
EVAR     | 759  | 15 (2%) | 33 (4%) | 3 (0%)
CAS      | 189  | 1 (1%)  | 2 (1%)  | 1 (1%)
AMP      | 161  | 0 (0%)  | 8 (5%)  | 3 (2%)
VV       | 437  | 0 (0%)  | 1 (0%)  | 0 (0%)
AVACESS  | 159  | 0 (0%)  | 5 (3%)  | 1 (1%)
IVC      | 152  | 0 (0%)  | 1 (1%)  | 14 (9%)
TEVAR    | 1038 | 0 (0%)  | 35 (3%) | 5 (1%)
Among the retired (no longer recorded) variables, the error rate varied from 0% to 9% (Table 1). Less than 2% of the data points were found to be in error, and almost all were in descriptive variables (e.g., PCI vs. CABG). Very few involved critical outcome fields such as CVA, MI, or return to the OR. Although outcomes are recorded only during the inpatient stay and at one year, the Social Security Death Index is used to calculate long-term mortality for all registries. Because there was heterogeneity in the code for each registry, there was variability in the mortality calculations (< 5%), which has subsequently been corrected.

When each of the registries had gone through the audit process and been cleared for release, a new BDS was generated. The BDS for each registry was sent to all investigators who had used that BDS for study. Each investigator was asked to compare the results of the same analysis on the new BDS with those from the previous BDS that he or she had received. As yet, no investigator has reported that the data errors significantly impacted their outcomes or analysis.
Discussion
In this effort to validate the accuracy of VQI data, multiple data discrepancies were identified but almost all were in descriptive variables rather than in key variables or outcomes. The data dictionaries have been improved making them more useful to users and investigators. There is a new standardized methodology for creation of BDS which should ensure consistency between the original data and the BDS. New quality control measures have been developed and implemented.

The foundation of any clinical registry is data accuracy. Unless there is absolute confidence in the data integrity, any analyses or conclusions based upon the registry are questionable [3]. Even small degrees of error can significantly impact outcomes or analyses. At every scientific congress and in most medical journals, there are multiple presentations and manuscripts based on administrative and/or clinical databases. Clinical registries have long been felt to be superior and more accurate than administrative databases due to better coding [4, 5]. The primary purpose of a clinical registry is to provide aggregate clinical data for analysis, whereas administrative databases function primarily for financial or administrative reasons. Clinical databases are felt to have improved accuracy compared to administrative databases because they are compiled by data collectors with clinical expertise, are used for clinical rather than administrative or financial purposes, are prospectively maintained and are subject to audit [6].

All registries are subject to errors in coding, transcription, and interpretation. Multiple efforts are made to minimize these errors by training, certification, data checks, and other quality control techniques. However, it is inherent to any large complex registry that there will be some degree of data variation. The Society of Thoracic Surgeons (STS) National Adult Cardiac Surgery Database (NCD) is recognized as a premier high-quality clinical registry. Years ago, the validity of the NCD data was questioned by investigators, who found variable agreement (89%; range 42-100%) in the data when subjected to review by untrained abstractors [7]. Most of the variability was noted in "subjective fields" such as etiology of aortic valve disease or NYHA classification. More critical fields, such as number of internal mammary grafts or absence of significant valvular disease, had high degrees of concordance. However, patient status at 30 days (dead or alive) was in agreement in only 83% of cases in 2008. Grunkemeier and Furnary outlined the issues of a large complex clinical registry, emphasizing the need for awareness of incomplete and inconsistent data and how to utilize internal quality controls [8].

The STS has made major efforts at improving data validity, including developing its own program for improving data quality (STS Adult Cardiac Surgery Database v2.61 Consistency Edits and Checks, February 17, 2009) [9].

The National Cardiovascular Data Registry (NCDR) has multiple registries with a comprehensive data audit and quality assurance program. Average accuracy rates within 3 of their registries ranged from 90-93% (center range 85-97%) [10]. Review of a high-quality orthopedic registry showed that in 2 hospitals the percentages of records that were error free were 8% and 10% [11]. Setting the benchmark for accuracy at 95%, 57% of variables at one hospital and 74% at the other met this threshold. The majority of discrepancies were in medical history, although agreement for the variable "postoperative complications" was 73%.

With the increasing number and utility of medical registries, there has been increased focus on defining and improving data quality [12]. The European Society of Thoracic Surgeons (ESTS) used task-independent metrics to assess quality within their database [13]. They measured completeness, correctness and consistency, with a threshold for good quality set at 0.8. Although 0.8 does not appear to be a high threshold for accuracy, it was applied to all variables, including non-critical fields. Correctness was defined as syntactic accuracy (i.e., plausibility against clinical values, e.g., FEV1 > 25 and < 150), not semantic accuracy (i.e., checked against source data). Consistency was measured only as internal consistency (or feasibility, e.g., DATE OF ADMISSION < DATE OF OPERATION < DATE OF DISCHARGE). There were few parameters that could be measured for correctness and consistency. The ESTS has gone on to create the Aggregate Data Quality (ADQ) score to further measure data quality [14].
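The two ESTS-style checks described above, syntactic correctness as a plausibility range and internal consistency as date ordering, can be sketched briefly. The field names here are hypothetical illustrations, not the ESTS schema:

```python
from datetime import date

def check_correctness(record):
    """Syntactic correctness: FEV1 (% predicted) must fall in a plausible
    clinical range (> 25 and < 150, per the ESTS example); missing passes."""
    fev1 = record.get("fev1_pct")
    return fev1 is None or 25 < fev1 < 150

def check_consistency(record):
    """Internal consistency: admission < operation < discharge."""
    return record["admission"] < record["operation"] < record["discharge"]

rec = {"fev1_pct": 85,
       "admission": date(2017, 3, 1),
       "operation": date(2017, 3, 2),
       "discharge": date(2017, 3, 6)}
ok = check_correctness(rec) and check_consistency(rec)
```

Note that both checks are task-independent: they flag implausible or infeasible values without ever consulting source charts, which is exactly why, as the text observes, they cannot establish semantic accuracy.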
Significance
The bottom line for any database is accuracy in correlation with source data. The NIH Collaboratory statement defines three quality dimensions: completeness, accuracy and consistency [15]. Completeness is defined as not only all required variables being entered but also all appropriate patients being included. Most importantly, accuracy, as opposed to correctness, is defined as closeness of agreement between a data value and the true value. This is a rigorous standard and one toward which the VQI strives. Both the STS and the NCDR use regular audits to confirm data accuracy, provide valuable feedback to centers and enhance data collection. Audits identify variables with poor reliability and allow modification of data fields and definitions [16]. The STS annually audits 10% of its sites for data accuracy on key variables. Currently, the VQI audits 30% of its sites annually for completeness of patient entry. The recently completed VQI audit evaluated internal consistency but did not review source data. The PSO is currently developing a more comprehensive audit strategy that would include regular review of source data, based on statistical indicators as well as random audits.
Conclusion
All large, complex registries, whether clinical or administrative, are subject to some degree of error. There will inevitably be errors due to keypunching, transcription, and miscoding. Clinical registries contain huge amounts of information (over 160 million data points in VQI). Data points are subject to interpretation, revision, anonymization, and aggregation, and each of these steps is a potential source of variation. After discovery of data discrepancies, the VQI undertook a robust and intensive effort to review our data in 2017. Using a retrospective audit strategy, the internal consistency and accuracy of the data were investigated. Less than 2% of data points were found to be in error. The data errors that were uncovered were felt to be minimal, without any significant impact on clinical analyses or outcomes. The degree of data error encountered was felt to be comparable to that of other high-quality clinical registries. In an effort to minimize future data error, new and enhanced quality control measures have been introduced. VQI is also developing an audit strategy that reviews not only internal consistency but also source data accuracy in the registry.
Declarations
Conflict of Interest: Jim Wadzinski is an employee and is compensated by the Society for Vascular Surgery Patient Safety Organization.

Ethical Approval: NA
Clinical Trial Registration: NA
References
1. Society for Vascular Surgery. VQI: Vascular Quality Initiative. 2018. https://www.vqi.org/. Accessed January 2, 2018.
2. Society for Vascular Surgery. Patient Safety Organization. https://vascular.org/research-quality/vascular-quality-initiative/patient-safety-organization. Accessed January 2, 2018.
3. Gallivan S, Stark J, Pagel C, et al. Dead reckoning: can we trust estimates of mortality rates in clinical databases? Eur J Cardiothorac Surg. 2008;33(3):334-340.
4. Prasad A, Helder MR, Brown DA, et al. Understanding Differences in Administrative and Audited Patient Data in Cardiac Surgery: Comparison of the University HealthSystem Consortium and Society of Thoracic Surgeons Databases. J Am Coll Surg. 2016;223(4):551-557 e554.
5. Mack MJ, Herbert M, Prince S, et al. Does reporting of coronary artery bypass grafting from administrative databases accurately reflect actual clinical outcomes? J Thorac Cardiovasc Surg. 2005;129(6):1309-1317.
6. Shahian DM, Jacobs JP, Edwards FH, et al. The Society of Thoracic Surgeons National Database. Heart. 2013;99(20):1494-1501.
7. Brown ML, Lenoch JR, Schaff HV. Variability in data: the Society of Thoracic Surgeons National Adult Cardiac Surgery Database. J Thorac Cardiovasc Surg. 2010;140(2):267-273.
8. Grunkemeier GL, Furnary AP. Data variability and validity: the elephant in the room. J Thorac Cardiovasc Surg. 2010;140(2):273-275.
9. Welke KF, Ferguson TB Jr, Coombs LP, et al. Validity of the Society of Thoracic Surgeons National Adult Cardiac Surgery Database. Ann Thorac Surg. 2004;77(4):1137-1139.
10. Messenger JC, Ho KK, Young CH, et al. The National Cardiovascular Data Registry (NCDR) Data Quality Brief: the NCDR Data Quality Program in 2012. J Am Coll Cardiol. 2012;60(16):1484-1488.
11. Seagrave KG, Naylor J, Armstrong E, et al. Data quality audit of the arthroplasty clinical outcomes registry NSW. BMC Health Serv Res. 2014;14:512.
12. Arts DG, De Keizer NF, Scheffer GJ. Defining and improving data quality in medical registries: a literature review, case study, and generic framework. J Am Med Inform Assoc. 2002;9(6):600-611.
13. Salati M, Brunelli A, Dahan M, et al. Task-independent metrics to assess the data quality of medical registries using the European Society of Thoracic Surgeons (ESTS) Database. Eur J Cardiothorac Surg. 2011;40(1):91-98.
14. Salati M, Falcoz PE, Decaluwe H, et al. The European Thoracic Data Quality Project: an Aggregate Data Quality score to measure the quality of international multi-institutional databases. Eur J Cardiothorac Surg. 2015;49(5):1470-1475.
15. Smerek MM. Assessing Data Quality for Healthcare Systems Data Used in Clinical Research (Version 1.0). 2015.
16. Willis CD, Jolley DJ, McNeil JJ, et al. Identifying and improving unreliable items in registries through data auditing. Int J Qual Health Care. 2011;23(3):317-323.