Abstract
Background:
Despite the complexity of the SYNTAX score (SS), guidelines recommend this tool to help choose between coronary artery bypass grafting (CABG) and percutaneous coronary intervention (PCI) in patients with left main or three-vessel coronary artery disease. The aim of this study was to compare the inter-observer variation in SS calculated by clinical cardiologists (CC), interventional cardiologists (IC), and cardiac surgeons (CS).
Methods:
Seven coronary angiographies from patients with left main and/or three-vessel disease, selected by a heart team, were analyzed by 10 CC, 10 IC, and 10 CS. SS was calculated using the SYNTAX calculator website.
Results:
Kappa concordance was very low between CC and CS (k = 0.176), moderate between CS and IC (k = 0.563), and moderate between CC and IC (k = 0.553). There was a statistically significant difference between CC, who classified more cases as low complexity (70%), and CS, who classified more cases as moderate complexity (80%; p = 0.041).
Conclusion:
Concordance between SS analyzed by CC, CS, and IC is low. The usefulness of SS in decision-making on revascularization strategy is undeniable, and evidence supports its use. However, this study highlights the importance of well-trained professionals in calculating the SS, which could help avoid misclassification of borderline cases.
Keywords
Introduction
Studies comparing coronary artery disease treatment with coronary artery bypass grafting (CABG) and percutaneous coronary intervention (PCI) showed superiority of CABG, mainly in the prevention of further interventions,1,2 especially in patients with anatomically complex lesions. With the continuous development of new techniques and devices used in PCI, it became necessary to compare CABG and PCI in patients with more severe disease. The SYNTAX study compared the two treatment methods in patients with left main or three-vessel disease.3 A score created to grade the anatomical complexity of coronary lesions showed that patients with less severe disease could be treated with either CABG or PCI, with similar results.
The 2014 European Society of Cardiology/European Association for Cardio-Thoracic Surgery (ESC/EACTS) guidelines for myocardial revascularization included the SYNTAX score (SS) for the first time to help choose the intervention method.4 Despite its unquestionable usefulness,5–7 there may be some degree of variability among physicians in the interpretation of angiographies, because scoring involves some subjectivity. Interpretations can differ between more and less conservative professionals, between more and less experienced professionals, and even for the same professional at different times.8,9 The aim of this study was to compare the inter-observer variation in SS calculation performed by clinical cardiologists (CC), interventional cardiologists (IC), and cardiac surgeons (CS).
Methods
We conducted a cross-sectional study among CS, CC, and IC. All were recognized specialists in their medical societies, and all worked in tertiary care/teaching hospitals in southern Brazil. In all, 94 specialists were invited to participate either personally or by email, and 30 completed the study.
Seven coronary angiographies from patients with left main and/or three-vessel disease were selected by a heart team comprising a CS, a CC, and an IC. The same heart team analyzed the films together to determine the gold standard score for each film. None of the heart team members participated as research subjects in the study. We created a website to which we uploaded the coronary angiographies, so that the research subjects could analyze the films from any computer. After analyzing each film, they were redirected to the SYNTAX calculator website,10 and the results were sent directly to our database.
This project was submitted to and approved by the local Ethics Committee. Informed consent was provided when the physician logged into the website. There was no violation of the confidentiality of data regarding the patients whose angiographies were analyzed.
Statistical analysis
SS was analyzed as both a continuous and a categorical variable. Categories followed the original article: low complexity was a score from 0 to 22, medium complexity a score from 23 to 32, and high complexity a score above 32. Variables were expressed as mean and standard deviation (SD) or as median and interquartile range (IQR). One-way ANOVA was used to compare mean SS among the three groups. Variation between groups and reproducibility of SS were assessed with the Kappa index. Fisher’s exact test was used for the analysis of contingency tables. A two-tailed p value <0.05 was considered to indicate statistical significance. Data were analyzed using SPSS (version 18.0.0; IBM Company, Chicago, IL, USA) and MedCalc (version 12.5.0.0; MedCalc Software bvba, Mariakerke, Belgium).
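As an illustration of the categorization and the Kappa index described above, the following minimal Python sketch maps scores to the three complexity categories and computes Cohen's kappa between two sets of categorical ratings. It is not the authors' analysis, which was run in SPSS and MedCalc. The function names, the handling of fractional scores between 22 and 23, and the example scores are all assumptions made for illustration; the source does not state exactly how ratings were paired when Kappa was computed between groups.

```python
from collections import Counter


def syntax_category(score):
    """Map a SYNTAX score to the study's complexity categories.

    Assumption: fractional scores between 22 and 23 are treated as medium.
    """
    if score <= 22:
        return "low"
    if score <= 32:
        return "medium"
    return "high"


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two equally long sequences of categorical ratings."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Observed proportion of cases on which both raters agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    count_a = Counter(ratings_a)
    count_b = Counter(ratings_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical category throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical per-case scores for two rater groups (NOT the study data).
group_a_scores = [18.0, 21.5, 24.0, 30.0, 35.5, 20.0, 27.0]
group_b_scores = [23.0, 26.0, 25.5, 33.5, 38.0, 19.0, 29.0]

group_a_cat = [syntax_category(s) for s in group_a_scores]
group_b_cat = [syntax_category(s) for s in group_b_scores]
kappa = cohen_kappa(group_a_cat, group_b_cat)
print(group_a_cat)
print(group_b_cat)
print(f"kappa = {kappa:.3f}")
```

Because kappa corrects the raw agreement for agreement expected by chance, two groups can agree on more than half of the cases and still obtain a low kappa when their category distributions differ.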
Sample size was calculated based on an inter-observer and intra-observer concordance of 0.45 and 0.59, respectively.8 Assuming a Kappa coefficient of 0.75 between the evaluators, we estimated that a sample of 294 evaluations would be enough to determine the correlation coefficient with a power of 80% and a significance level of 5%.
Results
We included 10 CC, 10 CS, and 10 IC. When SS was analyzed as a continuous variable, mean SS did not differ among the specialist groups (Figure 1). Mean SS for each group of professionals is presented in Table 1.

Figure 1. Mean/median* SYNTAX score among different cardiology professionals.
Table 1. SYNTAX score among different professionals and the Heart Team’s gold standard.
CC, clinical cardiologist; CS, cardiac surgeon; IC, interventional cardiologist; SD, standard deviation.
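The comparison of mean SS across the three specialist groups relies on the one-way ANOVA named in the statistical analysis. The sketch below shows, under stated assumptions, how the F statistic and its degrees of freedom are obtained; the function name and the scores are hypothetical and do not reproduce the study data. The p value is not computed here, since it requires the F distribution, which a statistical package would supply.

```python
from statistics import fmean


def one_way_anova_f(groups):
    """Return (F statistic, df between, df within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two groups and more values than groups")
    grand_mean = fmean([x for g in groups for x in g])
    group_means = [fmean(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of individual values around their own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )
    if ss_within == 0:
        raise ValueError("no within-group variation; F is undefined")
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical scores for one angiography (NOT the study data).
cc_scores = [20.0, 22.0, 24.0]
cs_scores = [26.0, 28.0, 30.0]
ic_scores = [23.0, 25.0, 27.0]

f_stat, df_between, df_within = one_way_anova_f([cc_scores, cs_scores, ic_scores])
print(f"F({df_between}, {df_within}) = {f_stat:.2f}")
```

A large F means that the differences between group means are large relative to the scatter within each group. With only 10 observers per group, within-group scatter can easily mask a real difference between groups.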
Kappa concordance was very low between CC and CS (k = 0.176), moderate between CS and IC (k = 0.563), and moderate between CC and IC (k = 0.553; Table 2).
Table 2. Mean/median* SYNTAX score and (risk category) by case.
Kappa index: CC versus CS = 0.176 (p = 0.392); IC versus CS = 0.563 (p = 0.032); CC versus IC = 0.533 (p = 0.024).
CC, clinical cardiologist; CS, cardiac surgeon; IC, interventional cardiologist; SD, standard deviation.
There was a statistically significant difference between CC, who classified more cases as low complexity (70%), and CS, who classified more cases as moderate complexity (80%; p = 0.041). Agreement rates between the categorized SS calculated by the specialists and the gold standard ranged from 10% to 100%, depending on the specialist group and the coronary angiography (Table 3).
Table 3. Agreement rate in categorized SYNTAX score among specialists.
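The agreement rate with the gold standard can be read as the share of raters in a group whose SS category matches the category of the heart team's score for the same angiography. The sketch below illustrates that reading; it is an assumption about how the rate was derived, and the function names and scores are hypothetical rather than taken from the study.

```python
def syntax_category(score):
    """Map a SYNTAX score to the study's complexity categories.

    Assumption: fractional scores between 22 and 23 are treated as medium.
    """
    if score <= 22:
        return "low"
    if score <= 32:
        return "medium"
    return "high"


def agreement_rate(rater_scores, gold_score):
    """Fraction of raters whose category equals the gold standard category."""
    if not rater_scores:
        raise ValueError("rater_scores must not be empty")
    gold_category = syntax_category(gold_score)
    matches = sum(syntax_category(s) == gold_category for s in rater_scores)
    return matches / len(rater_scores)


# Hypothetical scores from 10 raters for one angiography (NOT the study data).
rater_scores = [19, 21, 22, 24, 20, 18, 23.5, 21, 22, 17]
gold_score = 21  # hypothetical heart team score

rate = agreement_rate(rater_scores, gold_score)
print(f"agreement with gold standard: {rate:.0%}")
```

When the gold standard score lies close to a category boundary, small scoring differences between raters are enough to move the agreement rate substantially, which is one way borderline cases become misclassified.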
Discussion
The present study indicates that, in patients with left main and/or three-vessel disease, concordance between SS analyzed by CC, CS, and IC ranges from moderate to very low. Moreover, IC tend to evaluate cases as being of lower severity. To our knowledge, this is the first study to compare the variability of SS calculated by different cardiology specialists.
The SYNTAX trial represented a paradigm shift in coronary artery disease by allowing patients with complex coronary anatomy to choose between different treatment strategies depending on their complexity score. This finding was so striking that the 2014 ESC/EACTS guidelines for myocardial revascularization included SS as a method to help choose the intervention method.4
Accurate determination of coronary anatomy is essential, and SS was created to make the decision process as objective as possible. However, a certain degree of subjectivity exists, and characteristics such as conservativeness and experience in interpreting angiographies may change the final result of the analysis. Généreux et al. found poor inter-observer agreement (k = 0.33) among three IC who analyzed 30 multivessel angiograms, with a substantial improvement (k = 0.76) after advanced training.11 Two other studies analyzing inter-observer variability between two IC showed only moderate agreement (k = 0.56 and 0.58).12,13
The aim of our study was to compare the analysis of SS mainly between CS and IC. Our hypothesis was that CS could be less conservative and overestimate the result, whereas IC could be more conservative because of possible professional bias. The CC group was expected to serve as a balance between the other two. Interventional bias is likely the manifestation of two well-recognized forms of bias: self-interest bias and confirmation bias.14 This is a reality of modern medicine, and recognition is the first step towards overcoming it.
We did not find the differences we expected. Mean SS was not statistically different between groups, although there was a trend toward significance in some of the cases analyzed. This may be explained by the small sample of 10 observers in each group, which may have led to a type II (beta) error. Kappa concordance ranged from moderate to poor. Moreover, there was a clear difference between and within groups in the categorical classification of SS as low, moderate, or high disease severity. In case 5 (Figure 2), for example, the average scores of CC and IC classified SS as moderate, which would make angioplasty acceptable. CS, however, classified SS as high, which would prohibit PCI in this case. This indicates that SS may be a valuable tool for a global assessment of the severity of the lesions in a particular patient, but the final decision should rest with the Heart Team and take into account local expertise (in treating more complex left main stenosis, for example) and the patient's condition.

Figure 2. Case 5 coronary angiography images. (a) Right anterior oblique caudal projection of left coronary artery. (b) Posteroanterior cranial projection of left coronary artery. (c) Left anterior oblique cranial projection of left coronary artery. (d) Left anterior oblique projection of right coronary artery.
A guideline recommendation on such an important decision needs external validity, and our findings corroborate the literature showing poor agreement of the score in a real-world situation. Two recent studies with very similar methodology compared CABG and PCI in patients with left main coronary disease and showed opposite results.15,16 Several theories have been proposed to explain such differences, and differences in SS calculation may be one of them.
Limitations
The main limitation of our study is the small number of observers in each group, although we reached the calculated sample size. SS is not a user-friendly tool for determining coronary complexity, which made enrollment difficult. Nevertheless, our findings indicate that SS has poor reproducibility, and recommendations regarding its use in treatment decisions should be made with caution.
Conclusion
In conclusion, SS concordance between CC, CS, and IC is low. The usefulness of SS in decision-making on coronary revascularization strategy is undeniable, and evidence supports its use. However, this study highlights the importance of well-trained professionals in calculating the SS, which could help avoid misclassification of borderline cases.
Footnotes
Approval number
2010-0044
Conflict of interest statement
The authors declare that there is no conflict of interest.
Ethics Committee
Comitê de Ética em Pesquisa do Hospital de Clínicas de Porto Alegre (CEP-HCPA).
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This study was supported by grants from the Fundo de Incentivo à Pesquisa do Hospital de Clínicas de Porto Alegre (FIPE-HCPA), Porto Alegre, Brazil.
