Abstract
Introduction:
The inter- and intra-observer reproducibility of measuring the Wound Ischemia foot Infection (WIfI) score is unknown. The aims of this study were to compare the reproducibility, completion times and ability to predict 30-day amputation of the WIfI, University of Texas Wound Classification System (UTWCS), Site, Ischemia, Neuropathy, Bacterial Infection and Depth (SINBAD) and Wagner classification systems using photographs of diabetes-related foot ulcers.
Methods:
Three trained observers independently scored the diabetes-related foot ulcers of 45 participants on two separate occasions using photographs. The inter- and intra-observer reproducibility were calculated using Krippendorff’s α. The completion times were compared with Kruskal-Wallis and Dunn’s post-hoc tests. The ability of the scores to predict 30-day amputation was assessed using receiver operating characteristic curves and the area under the curve.
Results:
There was excellent intra-observer agreement (α>0.900) and substantial agreement between observers (α=0.788) in WIfI scoring. There was moderate, substantial, or excellent agreement within the three observers (α>0.599 in all instances except one) and fair or moderate agreement between observers (α of UTWCS=0.306, α of SINBAD=0.516, α of Wagner=0.374) for the other three classification systems. The WIfI score took significantly longer (P<.001) to complete compared to the other three scores (medians and inter-quartile ranges of the WIfI, UTWCS, SINBAD, and Wagner being 1.00 [0.88-1.00], 0.75 [0.50-0.75], 0.50 [0.50-0.50], and 0.25 [0.25-0.50] minutes, respectively). None of the classifications were predictive of 30-day amputation (P>.05 in all instances).
Conclusion:
The WIfI score can be completed with substantial agreement between trained observers but was not predictive of 30-day amputation.
Keywords
Background
People with diabetes-related foot ulcers (DFUs) are at high risk of major complications such as minor and major amputation. 1 DFU is a leading cause of global disability and requirement for hospital admission.1-3 Grading the severity of DFUs using a classification system is of potential value for predicting the risk of these complications. 4 Commonly used DFU classification systems include the Wagner, 5 University of Texas Wound Classification System (UTWCS), 6 the Site, Ischemia, Neuropathy, Bacterial Infection, and Depth (SINBAD) score, 7 and the Wound Ischemia foot Infection (WIfI) score. 8 These systems are typically designed to aid treatment decisions, communication between health professionals, audits, benchmarking between services and outcome prediction.9,10 It is important that any DFU classification system can be applied reproducibly by different clinicians, can be completed rapidly and yields findings that predict outcome. 11
The International Working Group on Diabetic Foot (IWGDF) guideline recommends the use of the WIfI classification system. 10 Whilst the reproducibility of a number of other DFU classification systems, such as the UTWCS, SINBAD, and Wagner, has been previously reported,12-15 to our knowledge the reproducibility of the WIfI score has not been assessed or compared to other systems. 16 Furthermore, whilst studies have compared the ability of these different classification systems to predict one-year risk of amputation, none to our knowledge have investigated their ability to predict 30-day amputation risk.9,17-19
The primary aim of this study was to compare the inter- and intra-observer reproducibility of the WIfI, UTWCS, SINBAD and Wagner classifications using photographs of DFUs. Secondary aims were to compare completion times and the ability of these scoring systems to predict 30-day risk of amputation.
Methods
This was a prospective single-centre observational cohort study of patients who were admitted to the Townsville University Hospital (TUH) in North Queensland, Australia, for inpatient treatment of a DFU. Recruitment occurred from January 1, 2020 to June 30, 2020. Inclusion criteria were diagnosis with type I or II diabetes, an active DFU, age over 18 years and written informed consent. Patients who presented with gangrene or who had wound debridement or amputations before they could be recruited to the study were excluded. Ethical approval for the study was granted by the Townsville Hospital and Health Services Ethics Committee (HREC/12/QTHS/202 and HREC/12/QTHS/203) and all participants provided written informed consent.
The following data were self-reported by the patients on study entry and later verified against the medical records: age, time since diagnosis of diabetes, height, weight, smoking history, and previous history of hospital admission for the treatment of DFU or amputation. Examination was performed to assess DFU location and the presence of peripheral neuropathy using a 10-g Semmes-Weinstein monofilament and 128-Hz tuning fork. Peripheral neuropathy was defined as insensitivity to the monofilament or tuning fork at one or more of four sites in the foot (plantar surfaces of the great toe and the 1st, 2nd and 3rd metatarsal head areas). 20 Participants’ heart rate, temperature and respiratory rate were recorded by the admitting doctors and were obtained from the medical records. Signs of systemic infection were defined to include high pulse rate (>90 beats per minute), high respiratory rate (>20 per minute), and abnormal temperature (>38°C or <36°C). White blood cell count and circulating concentrations of C-reactive protein and fasting glucose were also measured at the time of hospital admission. Ankle brachial pressure index (ABPI) was measured in all participants as previously described 21 and the toe pressure (TP) was measured in participants who did not have an ulcer or prior amputation of the hallux using a Huntleigh Dopplex S/W-V1.6 kit according to the manufacturer’s instructions (Huntleigh Healthcare Ltd, UK). Ischaemia was defined as ABPI <0.8 or TP <60 mmHg, as per the definitions given in the WIfI classification. 6 ABPI was also categorised as high (>1.40), normal (0.90-1.40) and low (<0.90). The ABPI measurements were performed by a single investigator (first author) and were comparable with those measured by vascular sonographers (intraclass correlation coefficient=0.883, n=16). 22
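The thresholds stated in this paragraph can be written down compactly. The Python sketch below is illustrative only and is not the authors' code: the function names and the handling of a missing toe pressure are assumptions made for the example, while the cut-offs are those given in the text.

```python
from typing import List, Optional


def abpi_category(abpi: float) -> str:
    """Categorise ABPI as high (>1.40), normal (0.90-1.40) or low (<0.90)."""
    if abpi > 1.40:
        return "high"
    if abpi >= 0.90:
        return "normal"
    return "low"


def is_ischaemic(abpi: Optional[float], toe_pressure_mmhg: Optional[float]) -> bool:
    """Ischaemia as defined in the text: ABPI < 0.8 or toe pressure < 60 mmHg.

    Assumption: a missing measurement (None) does not count towards ischaemia.
    """
    low_abpi = abpi is not None and abpi < 0.8
    low_tp = toe_pressure_mmhg is not None and toe_pressure_mmhg < 60
    return low_abpi or low_tp


def systemic_infection_signs(pulse_bpm: float,
                             resp_rate_per_min: float,
                             temperature_c: float) -> List[str]:
    """Return the signs of systemic infection that are present."""
    signs = []
    if pulse_bpm > 90:
        signs.append("high pulse rate")
    if resp_rate_per_min > 20:
        signs.append("high respiratory rate")
    if temperature_c > 38 or temperature_c < 36:
        signs.append("abnormal temperature")
    return signs


# Hypothetical participant, not taken from the study data.
print(abpi_category(0.85), is_ischaemic(0.85, 55))
print(systemic_infection_signs(95, 18, 38.5))
```

Note that an ABPI between 0.8 and 0.9 falls in the "low" category but does not by itself meet the ischaemia threshold, so the two definitions are kept as separate functions.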
In order to standardise the assessment of DFUs, photographs were taken of the affected foot and these were used for grading using previously described methods.23,24 All three assessors classified all ulcers using one system before moving on to the next system. The photographs were taken using both a Silhouette star camera (The SilhouetteStar, Aranz Medical Ltd.) and an iPhone XR (iOS 12.0 software, Apple Inc.). These photographs, along with clinical data and information on ischemia, were used to classify ulcers according to the different grading systems.5-8 This allowed for the remote assessment of DFUs while following appropriate infection control protocols during the COVID-19 pandemic and minimising patient-clinician contact. 25
Three assessors (a vascular surgeon [CG], a podiatrist [MF] and a medical physician [CA]) independently graded the DFUs. All had extensive prior experience in assessing DFUs in clinical practice. Prior to starting the study, each assessor attended a two-hour training session on a standardised method of using the classification systems and grading wounds, with the aim of optimising consistency in grading. This session involved independent evaluation and grading of three examples of DFUs using each system by each assessor, followed by a discussion of scores. Once training was completed, the three observers independently graded each DFU using each classification system and then repeated the scoring at least seven days later (using the same image) to assess the intra-observer agreement. The time taken to complete each score for each participant was recorded using a stopwatch.
The main outcome measure was the reproducibility of the different classification systems and the secondary outcome was requirement for any lower limb amputation, defined to include amputation of the toes or forefoot, or below or above knee amputation (either minor or major amputations) within 30 days of hospital admission. The patients were followed up while they were in hospital and then via out-patient review for 30 days. 26
The sample size was calculated based on the assumption that three observers scoring the ulcer photographs independently would have a substantial inter-observer agreement (80%), with a relative error of 10% (11). The required sample size (80% power; α = 0.05) was 45 patients. 27
The continuous variables were not normally distributed, as evidenced by the Shapiro-Wilk test, and were therefore presented as median and inter-quartile range (IQR). Nominal and ordinal data were summarised as percentages. The inter-observer and intra-observer reproducibility of the different classification systems were measured using Krippendorff’s α for ordinal data. 28 Values were interpreted as ≤0=no agreement; 0.01-0.20=slight agreement; 0.21-0.40=fair agreement; 0.41-0.60=moderate agreement; 0.61-0.80=substantial agreement; and 0.81-1.00=excellent agreement 28 and were calculated using R software (R Core Team [2020]. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. Version 4.0.2, using rel: Reliability Coefficients. R package, version 1.4.2 and irr: Various Coefficients of Interrater Reliability and Agreement. R package version 0.84.1). The time taken to grade each ulcer was compared between the different scoring systems using the Kruskal-Wallis test and post hoc comparisons were performed using Dunn’s test. The median of the six scores of each DFU was compared between participants who did and did not subsequently undergo amputation within 30 days of admission using the Mann-Whitney U test. The scores were also used to construct receiver operating characteristic (ROC) curves to assess the predictive ability of each classification system for amputation. 29 Area under the curve (AUC) was calculated and interpreted as >0.90=excellent, 0.80-0.89=good, 0.70-0.79=fair, and 0.60-0.69=poor. 29 Analyses were performed using SPSS (released 2020, IBM SPSS statistics for Windows, Version 26.0. Armonk, NY, IBM Corp). ROC curves were drawn using GraphPad PRISM software, version 7.03 (GraphPad software, Inc, La Jolla, CA).
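To make the agreement statistic concrete, the sketch below computes Krippendorff's α with the ordinal difference metric and maps the result onto the interpretation bands listed above. It is a minimal pure-Python illustration of the standard coincidence-matrix definition, not the R code (rel and irr packages) used in the study. The function names and the toy ratings are hypothetical, scores are assumed to be coded as sortable numbers, and values falling between two stated bands (for example 0.205) are assigned to the lower band as an assumption.

```python
from collections import Counter
from itertools import permutations


def krippendorff_alpha_ordinal(ratings):
    """Krippendorff's alpha for ordinal data.

    ratings: list of units (ulcers); each unit is a list of numeric ordinal
    scores, one per rating, with None marking a missing rating.
    """
    units = [[v for v in unit if v is not None] for unit in ratings]
    units = [u for u in units if len(u) >= 2]  # only pairable units count

    # Coincidence matrix: each ordered pair within a unit weighs 1/(m - 1).
    coincidences = Counter()
    for u in units:
        m = len(u)
        for a, b in permutations(range(m), 2):
            coincidences[(u[a], u[b])] += 1.0 / (m - 1)

    values = sorted({v for u in units for v in u})
    n_c = {c: sum(coincidences[(c, k)] for k in values) for c in values}
    n = sum(n_c.values())

    def delta2(c, k):
        # Ordinal metric: squared count of ratings lying between c and k.
        lo, hi = sorted((c, k))
        between = sum(n_c[g] for g in values if lo <= g <= hi)
        return (between - (n_c[lo] + n_c[hi]) / 2.0) ** 2

    observed = sum(coincidences[(c, k)] * delta2(c, k)
                   for c in values for k in values)
    expected = sum(n_c[c] * n_c[k] * delta2(c, k)
                   for c in values for k in values)
    if expected == 0:
        return float("nan")  # no variation in the ratings: alpha undefined
    return 1.0 - (n - 1) * observed / expected


def interpret_alpha(alpha):
    """Map alpha onto the agreement bands quoted in the text."""
    if alpha <= 0:
        return "no agreement"
    for upper, label in [(0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")]:
        if alpha <= upper:
            return label
    return "excellent"


# Toy data: four ulcers, two ratings each (hypothetical, not study data).
toy = [[1, 1], [2, 2], [3, 3], [1, 2]]
alpha = krippendorff_alpha_ordinal(toy)
print(round(alpha, 3), interpret_alpha(alpha))
```

For inter-observer agreement each unit would hold the scores of the three observers for one ulcer; for intra-observer agreement each unit would hold one observer's first and second score for that ulcer.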
Results
A total of 45 patients were recruited. The baseline demographic characteristics and risk factors of the participants are summarised in Table 1. The median (IQR) age of the participants was 68.1 (56.1-74.1) years and 80% were male. The median (IQR) duration of diabetes was 19.0 (10.5-25.0) years.
Table 1. Demographic Characteristics of the Included Patients.
Note. Data shown are numbers (percentage) or median (inter-quartile range).
Time to Complete DFU Grading
The median time taken to classify each ulcer varied significantly between all four grading systems (P<.001; Table 2). The Wagner score had the lowest median time for completion, and this progressively increased for the SINBAD, UTWCS, and WIfI scores (P values for bivariate comparisons shown in Table 2).
Table 2. Median Time Taken to Assess the Severity of Diabetes-Associated Foot Ulcers Using Different Classification Systems.
Abbreviations: NA, not applicable; SINBAD, Site, Ischemia, Neuropathy, Bacterial Infection, and Depth; UTWCS, University of Texas Wound Classification System; WIfI, Wound Ischemia foot Infection.
Note. Completion time shown as median (inter-quartile range). P values were obtained from Dunn’s test in post hoc comparisons following Kruskal-Wallis test.
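As an illustration of the omnibus comparison of completion times, the sketch below computes the tie-corrected Kruskal-Wallis H statistic and its P value for four groups. It is a hedged pure-Python example, not the SPSS analysis performed in the study. The function names and the timing data are hypothetical, the P value helper uses the closed-form chi-square survival function that holds only for 3 degrees of freedom (four groups), and Dunn's post hoc comparisons are not shown.

```python
import math
from collections import Counter


def average_ranks(values):
    """Rank values from 1..N, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis(groups):
    """Return (H, degrees of freedom) with the usual correction for ties."""
    pooled = [v for g in groups for v in g]
    n_total = len(pooled)
    ranks = average_ranks(pooled)

    total = 0.0
    start = 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n_total * (n_total + 1)) * total - 3.0 * (n_total + 1)

    tie_sizes = Counter(pooled).values()
    correction = 1.0 - sum(t ** 3 - t for t in tie_sizes) / (n_total ** 3 - n_total)
    if correction == 0:
        raise ValueError("all observations are identical")
    return h / correction, len(groups) - 1


def chi2_sf_df3(x):
    """Upper-tail probability of a chi-square variable with exactly 3 df."""
    if x <= 0:
        return 1.0
    return (math.erfc(math.sqrt(x / 2.0))
            + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0))


# Hypothetical completion times in minutes for four systems (not study data).
times = {
    "WIfI":   [1.00, 1.00, 0.75, 1.00, 1.25],
    "UTWCS":  [0.75, 0.50, 0.75, 0.75, 0.50],
    "SINBAD": [0.50, 0.50, 0.50, 0.75, 0.50],
    "Wagner": [0.25, 0.25, 0.50, 0.25, 0.25],
}
h, df = kruskal_wallis(list(times.values()))
print(f"H={h:.2f}, df={df}, P={chi2_sf_df3(h):.4f}")
```

Because completion times recorded in quarter-minute steps contain many tied values, the tie correction is included rather than omitted.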
Reproducibility
The WIfI classification had substantial inter-observer agreement (α=0.788) and excellent intra-observer agreement (α>0.900) based on Krippendorff’s α values (Table 3). Inter-observer agreement for SINBAD scores was moderate (α=0.516). Inter-observer agreements for Wagner and UTWCS scores were fair (α=0.374 and 0.306, respectively). Intra-observer agreement for all classification systems was at least moderate (α>0.599) except for observer 3, where the agreement was fair for the UTWCS score (Table 3).
Table 3. Krippendorff’s α Values for the Inter- and Intra-Observer Agreement of Different Classification Systems for Assessing Severity of Diabetes-Associated Foot Ulcers.
Abbreviations: SINBAD, Site, Ischemia, Neuropathy, Bacterial Infection, and Depth; UTWCS, University of Texas Wound Classification System; WIfI, Wound Ischemia foot Infection.
Note. Data shown are the Krippendorff’s α values for agreement between two different observers (as listed), all three observers or within observers. Observer 1: General Practitioner/Physician: Chanika Alahakoon (CA). Observer 2: Podiatrist: Malindu Fernando (MF). Observer 3: Vascular Surgeon: Charith Galappaththy (CG).
Prediction of Amputations Within 30 Days
Eighteen (40.0%) participants had a minor amputation and one (2.2%) had a major amputation within 30 days of hospital admission. The median scores for the different classification systems in participants who did and did not require an amputation are summarised in Table 4. The median scores for the Wagner (P=.041), but not the UTWCS, SINBAD and WIfI classifications, were significantly more severe for participants who had an amputation compared to those who did not (Table 4). However, based on the AUC, none of the classifications were significantly predictive of the requirement for amputation (Table 4).
Table 4. Median Scores and Area Under the Curve for Different Classifications of the Severity of Diabetes-Associated Ulcers in Patients That Did and Did Not Require an Amputation.
Abbreviations: AUC: area under the curve; CI: confidence interval; ROC: receiver operating characteristic; SINBAD, Site, Ischemia, Neuropathy, Bacterial Infection, and Depth; UTWCS, University of Texas Wound Classification System; WIfI, Wound Ischemia foot Infection.
Note. Data shown are median (inter-quartile range) of scores. Bold indicates statistical significance.
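The AUC reported for each classification can be read as the probability that a randomly chosen participant who had an amputation received a more severe score than a randomly chosen participant who did not, with ties counted as one half. The sketch below computes the AUC in that way and applies the interpretation bands given in the Methods. It is an illustrative Python example rather than the SPSS or GraphPad analysis used in the study; the function names and scores are hypothetical, an AUC of exactly 0.90 is treated as excellent by assumption, and values below 0.60 are left unlabelled because the source does not name that band.

```python
def auc_from_scores(scores_amputation, scores_no_amputation):
    """AUC as the proportion of (amputation, no amputation) pairs in which
    the amputation case has the higher score; ties count as half a pair."""
    wins = 0.0
    for a in scores_amputation:
        for b in scores_no_amputation:
            if a > b:
                wins += 1.0
            elif a == b:
                wins += 0.5
    return wins / (len(scores_amputation) * len(scores_no_amputation))


def interpret_auc(auc):
    """Bands quoted in the Methods; values below 0.60 are not labelled there."""
    if auc >= 0.90:
        return "excellent"
    if auc >= 0.80:
        return "good"
    if auc >= 0.70:
        return "fair"
    if auc >= 0.60:
        return "poor"
    return "below the bands defined in the text"


# Hypothetical ordinal severity scores (not study data).
amputation = [3, 2, 3, 4, 2]
no_amputation = [2, 3, 1, 2, 3, 2]
auc = auc_from_scores(amputation, no_amputation)
print(round(auc, 3), interpret_auc(auc))
```

An AUC near 0.5 means the score separates the two groups no better than chance, which is the sense in which a classification is described as not predictive.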
Discussion
Many classification systems are available for grading the severity of a DFU.5-7,30 The ideal clinical grading system for DFUs would be rapid to complete, reproducible within and between different health professionals and reliably predict important clinical outcomes. In the current study, the reproducibility, completion time and ability of four commonly used grading systems to predict 30-day amputation were assessed. Photographs of DFUs were examined, which simulates assessments that are commonly needed in clinical practice due to the increasing use of telehealth to assess DFUs. 25 It was found that the WIfI system had substantial inter-observer and excellent intra-observer reproducibility. The SINBAD system had moderate inter-observer and excellent intra-observer reproducibility. The UTWCS and Wagner classifications had only fair inter-observer and moderate intra-observer reproducibility. None of these scoring systems were able to reliably predict 30-day amputation rates. The median time to complete each of the four ulcer grading systems was one minute or less, making them highly feasible to use in routine clinical practice by busy clinicians.
A number of previous studies have examined the reproducibility of DFU classification systems. The Wagner, SINBAD and UTWCS classifications have previously been reported to have moderate agreement.12-14 These findings are similar to those of the current study. The current study is the first to report the reproducibility of the WIfI classification system, which had substantial agreement between different observers and excellent intra-observer agreement. 10 Although prior studies have reported the reproducibility and external validity of DFU grading systems, these systems were not good at predicting the likelihood of amputation within 30 days in the current study. 31 The WIfI classification system has, however, been previously reported to predict the risk of major amputation within one year for people both with and without diabetes.32-35 The WIfI score has previously been used predominantly in people with peripheral arterial disease.10,36-38 No prior reports of any of the scoring systems predicting early requirement for any amputation were identified.
A recent retrospective study that classified ulcers based on photographs using five ulcer classification systems reported that the Wagner and UTWCS classifications were better predictors of amputation over an unspecified follow-up time. 39 In the current study, it was found that the Wagner classification had significantly different median scores between those participants who did and did not require any amputation within 30 days. Based on AUC, however, the Wagner classification was not a good predictor of 30-day amputation likelihood. Ankle brachial pressure index <0.5, toe pressure <30 mmHg, and transcutaneous oxygen pressure <25 mmHg have been reported to be associated with a risk of major amputation of greater than 25%. 40 It is noteworthy that WIfI is the only scoring system that objectively assesses ischemia, but it was not predictive of 30-day amputation rate in the current study.
A number of limitations of the current study should be noted, including the inability of the observers to assess DFUs in-person during a global pandemic, the use of two types of cameras to photograph the foot, the small sample size and the limited number of assessors. Given the increasing role of remote assessment of DFUs, the results of this study are highly relevant and topical within the field. 25 The study was not designed to test whether the classification systems were predictive of 30-day major amputation alone. Furthermore, the outcomes of patients were only assessed up to 30 days and none of the classification systems have been previously validated for the prediction of 30-day amputation incidence. It is therefore possible that the grading systems may have had better predictive ability for outcomes assessed over a longer period, as has been previously reported,32-35 and this should be the focus of future studies.
Conclusion
This study suggests that of the four classification systems examined, the WIfI score has the best inter-observer agreement. The time taken to complete the WIfI score was slightly longer than the other classification systems and WIfI did not predict immediate requirement for any amputation.
Footnotes
Acknowledgements
The authors would like to acknowledge all staff of QRCPVD and all the health professionals involved in managing the patients admitted to the Townsville University Hospital for accommodating the research personnel and for their support.
Abbreviations
ABPI, Ankle Brachial Pressure Index; AUC, area under the curve; DFU, diabetes-related foot ulcer; IQR, inter-quartile range; IWGDF, International Working Group on Diabetic Foot; NA, not applicable; ROC curves, receiver operating characteristic curves; SINBAD score, Site, Ischemia, Neuropathy, Bacterial Infection and Depth score; TP, toe pressure; TUH, Townsville University Hospital; UTWCS, University of Texas Wound Classification System; WIfI score, Wound Ischemia foot Infection score.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was supported by grants from the Townsville Hospital and Health Services (SERTA), James Cook University (SRIF) and Queensland Government. JG holds a Practitioner Fellowship from the NHMRC (1117061) and a Senior Clinical Research Fellowship from the Queensland Government.
