Monitoring Surgical Performance: Current Models and Limitations

Agostino Pozzi1*, Laetitia F Colombo2, Francesca De Stefano3, Alessandra Sala4, Marco Bartolucci2 and Fabio Villa5
1Department of Gastrointestinal Surgery, San Raffaele Hospital, Milan, Italy.
2Department of Dermatology, San Raffaele Hospital, Vita-Salute San Raffaele University, Milan, Italy.
3Department of Gynaecology, San Raffaele Hospital, Milan.
4Department of Cardiothoracic Surgery, San Raffaele Hospital, Milan, Italy.
5The Fletcher School of Law and Diplomacy, Boston, MA, USA.
SOJ Surgery, Open Access, Opinion Article


Introduction
Cost-effectiveness and patient satisfaction are crucial factors for every healthcare system. Recently there has been growing interest in evaluating and publishing individual outcomes related to surgical performance. Current drawbacks in developing an efficient performance evaluation system involve the various independent methods detecting consistent 10 percent performers [1]. Despite many attempts to apply proficiency assessment in surgery, substantial disagreement exists regarding its use. Several questions remain: is it possible to define objective parameters to classify surgeons' abilities and increase the effectiveness and efficiency of surgical care? Is it possible to create a model that includes all the patient- and environment-related factors that may affect a surgeon's performance? Would using such a model be effective in improving surgical outcomes? Non-specific methods that include feedback on the major complications related to a surgical operation have a strong impact on surgical performance and costs. However, the non-randomized design of the studies and their limited number reduce the significance of the findings [2].
The American Aggregate Physician societies and United States government agencies identified useful criteria to standardize surgical performance evaluation, and healthcare quality consequently improved. These guidelines were delineated in the National Quality Forum and mainly applied to cardiac surgery [3]. Despite these commendable efforts by the National Quality Forum staff, many patient-centered aspects still require improvement. Surgical performance measurement ought to take into consideration: (i) patient care, and not just mortality rates; and (ii) the value of the surgical decision-making processes [4].
Another important point is to unify patient, institutional and scientific perspectives. One of the most crucial benefits of such a system is the reduction of complications. With regard to methodology, mortality rate is among the most used measures. This may be misleading, as it rests on the presumption that all mortality in surgery can be prevented; however, proper risk adjustments are lacking [5]. Mortality as a result of a surgical procedure is rather rare. Focusing mainly on this aspect does not allow enough consideration of complications arising from variations in surgical technique and surgical performance.
Another issue related to assessing mortality rates is the incongruence between expected and observed rates. The variable life adjusted display (VLAD) is a commonly used tool; however, it has limitations with regard to timing and the retrospective nature of its monitoring [6]. Data collection is based on arbitrary time intervals, so negative trends in surgical performance may be identified only after a considerable delay, causing inevitable inefficiency in quality improvement interventions.
Outputs accumulate evidence over time. Poor surgical performance, or surgical errors that hinder the final result, are most likely to be present when the value lies close to the extremities of the control range rather than adjacent to the average. Although VLAD provides an easily understandable display, the model's limitations lead to a loss of significance and accuracy [7].
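As a rough illustration of the display discussed above, a VLAD curve is commonly computed as the running sum of expected minus observed deaths: each survivor adds the predicted risk of death, and each death subtracts one minus that risk. The sketch below is only a minimal example under that assumption; the function name, the input format and the sample risks are hypothetical and are not taken from the cited studies.

```python
from itertools import accumulate


def vlad_curve(predicted_risks, deaths):
    """Return the cumulative sum of (expected - observed) deaths per case.

    predicted_risks: per-case probability of death from some risk model
                     (assumed to be supplied externally).
    deaths: per-case outcome, 1 for death and 0 for survival.
    """
    if len(predicted_risks) != len(deaths):
        raise ValueError("one outcome is required for every predicted risk")
    # A survivor (y = 0) moves the curve up by p; a death (y = 1) moves it
    # down by 1 - p, which is the "penalty" assigned for every death.
    return list(accumulate(p - y for p, y in zip(predicted_risks, deaths)))


# Hypothetical consecutive cases: the second patient dies.
curve = vlad_curve([0.1, 0.2, 0.5], [0, 1, 0])
print(curve)
```

A downward drift in the curve suggests more deaths than the risk model expected, but, as noted above, the plot is read retrospectively and carries no built-in signalling rule.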
An alternative evaluation method is the risk-adjusted Bernoulli cumulative sum (RA-CUSUM), which is based on real-time prospective observation and so avoids the need for retrospective methods [8]. The RA-CUSUM is an attempt to apply the cumulative sum graphical method, used for quality monitoring in industrial settings, to the surgical field. It becomes significant across multiple cases and uses the run length distribution to evaluate performance changes. A small cohort may increase the discrepancy between the observed and the expected mortality rate [9]. In contrast to VLAD, which is a mortality-scoring system (i.e., a penalty is assigned for every death) based on perioperative death risk, the RA-CUSUM method identifies patients with improved surgical outcomes with respect to the expected ones. With this method, any failure in surgical performance can be reliably detected when compared to previous successful ones. Any deterioration in surgical performance is expressed as a positive slope, and a signal is sent when values above the upper control limit appear [10]. Risk adjustment is obtained through a model that compares the outputs with a statistical risk of mortality and adverse events for the same operation, according to patient comorbidities, demographic characteristics (age, sex) and other factors related to an increase in operative complications [7,11]. Adverse events are captured by an improvised risk score that involves many variables; however, it is not based on the single case and represents only a rough approximation of the real risk for the individual patient. It is not wise to assume that risk data extracted from the literature fully support the prognosis in uncertain contexts, such as complications attributed to poor surgical performance or events occurring during hospitalization. This approach also neglects pre-operative, environmental and team-related features, as the complexity of the surgical outcome does not necessarily show a linear correlation with procedural quality. Hence, the variability related to the learning curve and the need for continued monitoring make this method less practical.
Forced ranking distribution methods are useful only if applied constructively to attenuate subjective and discretionary performance evaluation in different surgical departments. An apparently safe tool for ranking surgeons and medical staff fails to grasp service complexity. It should not be the only method used to classify surgical skills.
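Returning to the RA-CUSUM described above, its update rule can be sketched in a few lines. The example assumes a widely used log-likelihood-ratio formulation in which each case contributes a weight that depends on the patient's predicted risk and on the odds ratio the chart is tuned to detect. The function name, the odds ratio of 2 and the control limit of 4.5 are illustrative assumptions, not values taken from the cited studies.

```python
import math


def ra_cusum(predicted_risks, deaths, odds_ratio=2.0, control_limit=4.5):
    """Return the RA-CUSUM path and the index of the first signal (or None).

    predicted_risks: per-case probability of death from a risk model.
    deaths: per-case outcome, 1 for death and 0 for survival.
    odds_ratio: increase in the odds of death the chart should detect
                (illustrative default).
    control_limit: upper control limit that triggers a signal
                   (illustrative default).
    """
    statistic = 0.0
    path = []
    first_signal = None
    for index, (p, y) in enumerate(zip(predicted_risks, deaths)):
        denominator = 1.0 - p + odds_ratio * p
        # Log-likelihood ratio of the observed outcome under the
        # deteriorated odds versus the odds expected from the risk model.
        if y:
            weight = math.log(odds_ratio / denominator)
        else:
            weight = math.log(1.0 / denominator)
        # The chart is held at zero from below, so it only accumulates
        # evidence of deterioration (a positive slope).
        statistic = max(0.0, statistic + weight)
        path.append(statistic)
        if first_signal is None and statistic >= control_limit:
            first_signal = index
    return path, first_signal


# Hypothetical run: three consecutive deaths in low-risk patients,
# with a deliberately low control limit so that the signal is visible.
path, first_signal = ra_cusum([0.1, 0.1, 0.1], [1, 1, 1], control_limit=1.0)
print(path, first_signal)
```

A death in a low-risk patient raises the chart more than a death in a high-risk patient, which is how the risk adjustment enters. In practice the control limit is usually chosen from the desired run length distribution, which this sketch does not attempt to compute.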
Healthcare competition has existed for some time now and correlates positively with the rise in life expectancy. The forced ranking distribution system has the potential to incentivize best practice in surgery. However, the system has potential drawbacks; for example, there may be a tendency to deliberately avoid high-risk procedures in order to limit uncertain outcomes.
Further complications may be caused by tension between competing team members and by a delay in recognizing adverse events occurring after the procedure. The above-mentioned factors are prone to delay the improvement of surgical service and productivity, which is the goal of the forced ranking system. Michael Schrage, a detractor of the vitality curve, argues that this system leads to 'dishonest and unfair evaluations by management' [12].

Conclusion
Given measurement accuracy and the healthcare system's peculiarity in an industrial scenario, work performance monitoring strategies must be revised, and rankings should be a complementary performance index. Among the current methods available,