Abstract
Background
Quality control (QC) in mammography is crucial for breast imaging, yet it relies predominantly on subjective assessment of image quality (IQ) in phantom images.
Purpose
To validate the efficacy of automated software assessment in comparison with human observers.
Material and Methods
A total of 80 processed images of the ACR DM phantom were collected from mammography systems supplied by five different vendors. IQ was assessed by software developed in-house and by 11 human observers (five experts and six non-experts). Three types of target object of various sizes and shapes were scored in each image: six fibers, six speck groups, and six masses.
Results
The software assessment demonstrated moderate to good agreement with the human observers' scoring of target objects, especially with expert observers. The intraclass correlation coefficient (ICC) values between the software and all observers were 0.77, 0.61, and 0.78 for fibers, speck groups, and masses, respectively. Scoring of low-contrast objects was variable, especially among non-expert observers, whereas high-contrast objects (e.g. specks) showed the highest visibility rate and received more consistent scores. The mean differences between the software and all observers were 0.53 for fibers, 0.27 for specks, and 0.36 for masses.
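The two agreement statistics reported above can be illustrated with a minimal sketch. The abstract does not state which ICC form was used, so the code assumes the two-way random-effects, absolute-agreement, single-measurement form, ICC(2,1). The function names, the sign convention (software minus observer), and the example scores are hypothetical and are not data from the study.

```python
from statistics import mean


def icc_2_1(scores):
    """ICC(2,1) for a table of scores: one row per image, one column per rater.

    Assumed form: two-way random effects, absolute agreement, single measurement.
    """
    n = len(scores)        # number of rated images
    k = len(scores[0])     # number of raters (e.g. software and one observer)
    grand = mean(v for row in scores for v in row)
    row_means = [mean(row) for row in scores]
    col_means = [mean(row[j] for row in scores) for j in range(k)]

    # Two-way ANOVA sums of squares
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in scores for v in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)                # between-image mean square
    msc = ss_cols / (k - 1)                # between-rater mean square
    mse = ss_err / ((n - 1) * (k - 1))     # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def mean_difference(software, observer):
    """Mean of paired differences (software minus observer); sign is an assumption."""
    return mean(s - o for s, o in zip(software, observer))


# Hypothetical fiber scores for five images (illustrative only)
software_scores = [4.0, 4.5, 5.0, 3.5, 4.0]
observer_scores = [3.5, 4.0, 4.5, 3.5, 3.5]

pairs = [list(p) for p in zip(software_scores, observer_scores)]
print("ICC(2,1):", round(icc_2_1(pairs), 3))
print("Mean difference:", mean_difference(software_scores, observer_scores))
```

Because this ICC form measures absolute agreement, a systematic offset between the software and the observers lowers the coefficient even when the two rank the images identically, which is why the mean difference is worth reporting alongside it.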
Conclusion
The software assessment effectively scored phantom IQ and was comparable to assessment by human observers, regardless of image acquisition factors and manufacturers' design specifications. This software should therefore ensure consistent IQ assessment that mitigates the limitations of subjective human assessment in mammography quality testing.
