Abstract
Introduction
Urgent radiological studies obtained during on-call hours are often preliminarily read by on-call residents before consultant radiologists finalise the reports at a later time. Such provisional radiology reports provide important information to guide initial patient management. This study aims to determine discrepancy rates between provisional reports and final interpretations, and to assess the clinical significance of such discrepancies.
Methods
This retrospective quality assurance project reviewed a total of 1218 cross-sectional imaging studies of the body (thorax, abdomen and pelvis) done between July 2015 and May 2016 during on-call hours. The studies included 1201 computed tomography (CT) scans and 17 magnetic resonance imaging (MRI) scans. Studies with incomplete or unavailable reports were excluded. The conclusions of the provisional and final reports of each study were reviewed for concordance, with reference to the full report if needed. Discrepancies were graded according to the ACR 2016 RADPEER scoring system.
Results
There were 1210 studies with complete reports. Discrepant reports were noted in 183 (15.1%) studies. Of all studies reviewed, 89 (7.3%) had discrepancies assessed to be clinically significant, and the majority of these (55) involved interpretations which should be made most of the time. CT scans of the abdomen and pelvis were the most prone to discrepant reports, accounting for 148 (80.9%) of the discrepant reports.
Conclusion
The majority of preliminary reports for on-call body scans were concordant with final interpretations. The discrepancy rates for provisional body scan reports provided by residents while on call were comparable to those previously reported in the literature.
Introduction
Most major hospitals and trauma centres provide round-the-clock services to ensure uninterrupted delivery of patient care. As not all hospitals have the resources to provide round-the-clock subspeciality expertise, many hospitals have on-call radiology residents providing preliminary reads of urgent radiological studies done after usual working hours. 1 By providing such coverage, residents learn independent responsibility in scan interpretation which is integral to their training. 2 Consultant radiologists then finalise the reports after variable time intervals, depending on institution practice. Medical teams use provisional radiology reports to guide time-sensitive interventions. Inadequate radiological reports can lead to unnecessary investigations, delays in diagnosis and possible harm to the patient. 3 Hence, such preliminary reports need to be accurate.
This study aims to determine the rate of discrepancy between provisional reports and final interpretations, and to assess the clinical significance of such discrepancies.
Methods
The ACR 2016 RADPEER Discrepancy Scoring System.

Summary of the review process.
Results
Of the 1218 scans done during on-call hours from 1 July 2015 to 31 May 2016, 8 had incomplete records. A total of 1210 scan reports (1195 CT scans and 15 MRI scans) were then reviewed. Of these, 653 (54.0%) scans were done after office hours on weekdays from 6 p.m. to 8.30 a.m., while 557 (46.0%) were done on weekends and public holidays.
Breakdown of types of scans done during on-call hours.
a CT: computed tomography.

Type of scans done during on-call hours.
Breakdown of discrepancy rates based on ACR 2016 RADPEER Classification System.
There were 73 preliminary scan reports with discrepancies in interpretations which should be made most of the time (RADPEER score 3). Of these, 55 were assessed to be likely clinically significant discrepancies (category 3b).
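The counts per RADPEER category implied by the figures reported in this study can be recovered by subtraction. The following is a minimal sketch in Python, assuming that every one of the 183 discrepant reports was graded 2a, 2b, 3a or 3b. The derived counts for 2a, 2b and 3a follow from that assumption and are not figures quoted in the text, and all variable names are illustrative.

```python
# Counts stated in the text.
total_reviewed = 1210        # studies with complete reports
total_discrepant = 183       # all discrepant provisional reports
clinically_significant = 89  # discrepancies graded 2b or 3b
score3_total = 73            # should be made most of the time (3a + 3b)
score3b = 55                 # score 3, likely clinically significant

# Derived counts (assumption: all discrepancies are graded 2a, 2b, 3a or 3b).
score3a = score3_total - score3b
score2_total = total_discrepant - score3_total
score2b = clinically_significant - score3b
score2a = score2_total - score2b

breakdown = {"2a": score2a, "2b": score2b, "3a": score3a, "3b": score3b}
not_significant = score2a + score3a
not_significant_rate = f"{100 * not_significant / total_reviewed:.1f}"

print(breakdown)
print(f"non-clinically significant rate: {not_significant_rate}%")
```

Under this assumption, the non-clinically significant discrepancies (2a plus 3a) work out to 94 of 1210 studies, or 7.8%.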
Just over half of the discrepant reports (97 reports or 53.0%) were done between 6 p.m. and 8.30 a.m. on weekdays, while the remainder (86 reports or 47.0%) were done on weekends and public holidays. The vast majority of discrepant preliminary scan reports (148 reports or 80.9%) involved the abdomen and pelvis.
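The percentages above can be reproduced from the reported counts. The sketch below is illustrative only: the helper `pct` and the dictionary names are assumptions introduced for this example, and the discrepancy rate within each time period is derived here from the reported counts, not quoted from the text.

```python
def pct(part: int, whole: int) -> float:
    """Return part/whole as a percentage, rounded to one decimal place."""
    return round(100 * part / whole, 1)

# Counts stated in the Results.
reviewed = {"weekday after hours": 653, "weekends and public holidays": 557}
discrepant = {"weekday after hours": 97, "weekends and public holidays": 86}
abdomen_pelvis_discrepant = 148

total_reviewed = sum(reviewed.values())      # 1210
total_discrepant = sum(discrepant.values())  # 183

overall_rate = pct(total_discrepant, total_reviewed)
share_of_discrepant = {k: pct(v, total_discrepant) for k, v in discrepant.items()}
abdomen_pelvis_share = pct(abdomen_pelvis_discrepant, total_discrepant)

# Derived here, not quoted in the text: discrepancy rate within each period.
rate_within_period = {k: pct(discrepant[k], reviewed[k]) for k in reviewed}

print(overall_rate, share_of_discrepant, abdomen_pelvis_share, rate_within_period)
```

On these counts, the discrepancy rate within each period is similar: about 14.9% for weekday after-hours scans and 15.4% for weekend and public holiday scans.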
Discrepancies graded under RADPEER 2a (understandable miss, unlikely to be clinically significant).
Discrepancies graded under RADPEER 2b (understandable miss, likely to be clinically significant).
a More common discrepancies are highlighted in
Discrepancies graded under RADPEER 3a (correct interpretation should be made most of the time, but discrepancy is unlikely to be clinically significant).
Discrepancies graded under RADPEER 3b (correct interpretation should be made most of the time, and discrepancy is likely to be clinically significant).
a More common discrepancies are highlighted in

(a and b) Coronal CT images. Graded as RADPEER 2b - Gall bladder mass with enlarged necrotic lymph nodes, suspicious for gall bladder carcinoma, which was provisionally reported as ‘perforated acute cholecystitis’.

Axial CT image. Graded as RADPEER 2b - Enhancing caecal and appendiceal tumour (adenocarcinoma) initially reported as ‘abscess’.

Coronal CT image. Graded as RADPEER 2b - Perforated appendicitis with adjacent abscess provisionally reported as ‘ileocolic intussusception’.

Axial CT image. Graded as RADPEER 3b - Missed dilated appendix with adjacent fat stranding due to acute appendicitis.

Coronal CT image. Graded as RADPEER 3b - Missed site of small bowel perforation with peri-enteric contrast leak.

Contrast-enhanced axial CT image. Graded as RADPEER 3b - Missed enhancing lesion in pancreatic head.
Discussion
On-call residents play an important role in the provision of uninterrupted patient care for public sector hospitals in Singapore. As part of the healthcare team on duty, radiology residents are responsible for providing timely and accurate provisional scan reports to guide the clinical teams.
It is therefore important for radiology educators to understand common ‘misses’ that residents may make while on call. Armed with information on common discrepancies in preliminary scan reports, educators can enhance training programmes to improve reporting accuracy.
Kim and Mansfield suggested 12 types of errors in diagnostic radiology, ranging from issues related to the person doing the interpretation (e.g. lack of knowledge, faulty reasoning and complacency) to factors involving the scan (e.g. limitations of scan or technique). 5 In our series, we noticed the following broad categories of ‘misses’.
Firstly, the lesion was simply not detected or was misinterpreted as not significant. Residents may have missed the lesion entirely, or picked it up but dismissed it as a normal finding or an artefact.
Secondly, the anatomical region in question may be tricky to evaluate. Some areas are notoriously difficult to interpret, such as the pancreas, bowel and post-transplant organs.
Thirdly, findings were missed in ‘blind spots’ that are commonly overlooked, such as bones, blood vessels and lung bases in studies focussing on the abdomen and pelvis.
Fourthly, residents may encounter difficult studies due to the patient’s condition or the scan protocol. Examples include post-operative cases with substantially altered anatomy, acutely unwell individuals with multiple pathologies and scans acquired in multiple phases. Such cases may be more common in tertiary or quaternary hospitals where complicated subspeciality surgical services are available.
Lastly, residents may find themselves in a contextual conundrum during interpretation. Subtle findings such as focal fat stranding, mild mural thickening and gas pockets may or may not be relevant depending on the clinical context. One example is pneumoperitoneum, which can be either a surgical emergency or an unremarkable expected finding (in the context of recent peritoneal drainage, biopsy or laparotomy). Subtle but relevant findings in a given clinical presentation may be dismissed, leading to an inaccurate interpretation.
There is a paucity of literature on discrepancies in the interpretation of body imaging studies. Most prior studies used the terms ‘major’ and ‘minor’ discrepancies, which are analogous to our classification of whether a discrepancy is clinically significant or not. Our clinically significant (major) discrepancy rate (DR) was 7.3%, while our non-clinically significant (minor) DR was 7.8%. In a study by Howlett et al. on the accuracy of interpretation of emergency abdominal CT in patients who presented with non-traumatic abdominal pain in the United Kingdom, the major and minor DRs were 4.6% and 8.4%, respectively, for both surgical and non-surgical scans provisionally reported by registrars. 6
A similar study by Tieng et al. on the interpretation of Emergency Department body CT scans by radiology residents reported a major DR of 10% and a minor DR of 20%. 1 Another study, published in 1996 by Wechsler et al., on the effects of training and experience on the interpretation of emergency body CT scans reported a major DR of 7.8% and a minor DR of 5.9% among senior residents. 7
In a more recent study by Wu et al. on discrepancy rates in acute abdominal CT, 3 the major DR for registrars was 6.86% while the minor DR was 1.82%. The lower DRs in this study may be because it only involved patients who presented with non-traumatic abdominal pain and subsequently underwent emergency laparotomy. As these patients went on to emergency laparotomy, their CT scans may have shown more overt radiological signs that were readily identified, which could account for the lower DRs.
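The rates quoted in the preceding paragraphs can be placed side by side. The sketch below is a rough illustration only: it checks whether our major and minor DRs fall within the range of the published values quoted above, and it assumes the figures are directly comparable even though the studies differ in population, setting and grading criteria. The dictionary and variable names are illustrative.

```python
# (major DR %, minor DR %) as quoted in the text.
published = {
    "Howlett et al.": (4.6, 8.4),
    "Tieng et al.": (10.0, 20.0),
    "Wechsler et al. (senior residents)": (7.8, 5.9),
    "Wu et al.": (6.86, 1.82),
}
this_study = (7.3, 7.8)

major_rates = [major for major, _ in published.values()]
minor_rates = [minor for _, minor in published.values()]

major_range = (min(major_rates), max(major_rates))
minor_range = (min(minor_rates), max(minor_rates))

major_within = major_range[0] <= this_study[0] <= major_range[1]
minor_within = minor_range[0] <= this_study[1] <= minor_range[1]

print(f"published major DR range: {major_range[0]}% to {major_range[1]}%")
print(f"published minor DR range: {minor_range[0]}% to {minor_range[1]}%")
print(f"this study within both ranges: {major_within and minor_within}")
```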
Our study has several limitations. Firstly, only the conclusions of the provisional and final reports were reviewed in the first instance. Full provisional reports were not reviewed unless there was a discrepancy between the conclusions of the provisional and final reports. We assessed that most clinical decisions would be made based on report conclusions, and thus felt that limiting the initial assessment to the conclusions was reasonable.
Secondly, actual images from the studies were not re-read and interpreted for all cases. Hence, the “ground truth” was the final report and not a separate independent review of all the images. It is therefore plausible that some discrepancies may have been missed by our review if they were not evident in either the provisional or final report.
Thirdly, the scope of our study was limited to scans from the body subspeciality. To fully assess the performance of radiology residents on call, other studies beyond the thorax, abdomen and pelvis should also be reviewed. We chose to focus on body scans as these were anecdotally identified by residents as being among the more challenging scans encountered while on call. The overall DR is likely to be different, and possibly lower, if studies from all subspecialities were assessed together.
Fourthly, assessment of clinical significance was based on a change of diagnosis (if any) in the final report and the likely clinical management. The full electronic medical records for these patients were not reviewed. Future studies may consider reviews of the electronic medical records for a more robust assessment of the clinical impact of the discrepancies.
Finally, the discrepancies found were not systematically analysed for the root causes. For example, the respective reporting residents were not identified and approached to evaluate whether the discrepancy was due to an error in detection or interpretation. Further classification of discrepancies may help highlight useful lessons for educating future batches of radiology residents.
Notwithstanding the abovementioned limitations, our study shows that the overall discrepancy rate and, in particular, the clinically significant discrepancy rate for provisional body imaging reports by our residents are low and comparable to those of previously published studies. This suggests that the skills acquired by our residents during their training programmes allow them to perform at a similar level while on call to their counterparts from other countries. Further studies of such discrepancies and their trends over time may also provide useful information for refinements to existing training programmes.
Conclusion
The majority of preliminary reports for on-call body scans were concordant with final interpretations. Our study revealed an overall DR of 15.1% and a clinically significant DR of 7.3%, which are comparable to those previously reported in the literature. Further studies can be done to assess DRs in different subspecialities across radiology, to identify commonly missed regions or conditions and to further tailor training programmes for future batches of radiology residents.
Footnotes
Acknowledgements
We would like to thank the Department of Diagnostic Radiology for the assistance rendered in this research.
Author Contributions
Lionel Tim-Ee Cheng was involved in conceptualisation, protocol development and data analysis.
Jonathan Kia-Sheng Phua was involved in data collection, data analysis and drafting of the manuscript.
All authors reviewed and edited the manuscript and approved the final version of the manuscript.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
Ethical approval
This was a department quality assurance project which qualified for IRB exemption.
Informed Consent
This was a retrospective department quality assurance project with no direct or indirect patient involvement.
Availability of data and materials
The datasets generated and/or analysed during the current study are available from the Department of Diagnostic Radiology.
