Audit feedback on reading performance of screening mammograms: An international comparison

Abstract

Objective

Providing feedback to mammography radiologists and facilities may improve interpretive performance. We conducted a web-based survey to investigate how and why such feedback is undertaken and used in mammographic screening programmes.

Methods

The survey was sent to representatives in 30 International Cancer Screening Network member countries where mammographic screening is offered.

Results

Seventeen programmes in 14 countries responded to the survey. Audit feedback was aimed at readers in 14 programmes, and facilities in 12 programmes. Monitoring quality assurance was the most common purpose of audit feedback. Screening volume, recall rate, and rate of screen-detected cancers were typically reported performance measures. Audit reports were commonly provided annually, but more frequently when target guidelines were not reached.

Conclusion

The purpose, target audience, performance measures included, form and frequency of the audit feedback varied amongst mammographic screening programmes. These variations may provide a basis for those developing and improving such programmes.

Keywords

Audit feedback, mammography, breast screening

Introduction

The effectiveness of mammographic screening depends on the performance of equipment and technologists, and the screen reader’s ability to perceive and accurately interpret mammographic abnormalities.¹ Screening programmes show substantial variation in the sensitivity and specificity of reader performance,² and in organization and delivery, including single versus double and independent double reading, with and without consensus, computer-aided detection (CAD), and real time versus batch reading. Higher recall rates are more common in programmes using single versus double reading, leading to increased false positive rates and decreased specificity.³ The critical reading challenge is to obtain an optimal balance between sensitivity and specificity. Continuous education to maintain and improve interpretative skills is required, and in some countries audit feedback systems have been developed to help mammographic screen readers both assess and improve their reading skills.^1,4,5

Audit feedback has been shown to improve performance in many areas of medicine, including mammography,^1,5–8 but no studies have comprehensively compared audit feedback procedures and practice among mammographic screening programmes. We conducted a web-based survey among member countries of the International Cancer Screening Network (ICSN)⁹ that offer mammographic screening, to investigate aspects of the reader and facility audit feedback undertaken, examining the purpose, target audience, performance measures, and form and frequency of the audit feedback, in addition to actions taken if the recommendations or requirements were not met.

Methods

An ICSN working group developed the survey and the web version was created by the United States’ National Cancer Institute. Access details were emailed in 2012 to the ICSN mammographic screening programme contact in 30 countries. Representatives from non-responding countries received a further request to participate, at the ICSN Conference in Australia in October 2012, and a reminder was emailed to those still non-responding in March 2013. Representatives could delegate response to the survey. For countries with a nationwide organized screening programme, we considered the response to represent the country. For countries with opportunistic screening (e.g. the United States (US)), or a mix between organized and opportunistic screening (e.g. Switzerland), the response was considered to represent a group of screening sites. We refer to both organized screening programmes and clusters of opportunistic screening sites (e.g. the Breast Cancer Surveillance Consortium (BCSC) in the US) as screening programmes.¹⁰

The four-part web survey consisted of 50 questions, some allowing more than one response. Part A included questions about the characteristics of the programme (e.g. standard number of views used and ratio of analogue/digital mammographic screening examinations), the reading procedure (e.g. single reading; double reading where the second reader knows the first reader’s interpretation; independent double reading where the second reader does not know the first reader’s interpretation; use of computer-aided detection (CAD)), and the number of readers interpreting screening mammograms in 2010, or the most recent year in the programme. Part B sought information about recommendations and requirements to start and continue reading, the content of any such recommendations and requirements, and who had made them. Parts C and D covered audit feedback directed to the individual reader and to the facility, respectively. The same questions were asked about reader and facility audit feedback, and included general questions as well as a section focusing on which performance measures were included in the audit feedback. General information about the screening programmes (year in which the programme was implemented, age groups covered, etc.) had been collected in a general survey in 2012 developed by another ICSN working group.⁹

Descriptive statistics (numbers and percentages) report the responses, for each question and for combinations of responses. STATA, version 12.1 (StataCorp, College Station, Texas, USA) was used for the descriptive analyses, and Microsoft Excel 10 for the figures. Screening volume was defined as the number of screening examinations interpreted by one reader, recall as the number of call backs for further evaluation after a positive screening exam, and positive predictive value (PPV) as the number of screen-detected breast cancers (ductal carcinoma in situ (DCIS) or invasive cancer) divided by the number of recall examinations due to positive screening mammography. A screen-detected cancer was a breast cancer detected as a result of screening participation, and an interval cancer was a breast cancer diagnosed within a defined time span after a negative screening result. Tumour characteristics included information such as tumour size and lymph node involvement.
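The performance measures defined above reduce to simple ratios. As a minimal sketch, assuming per-reader tallies for a single audit period (the `ReaderAudit` class, its field names, and the example counts are hypothetical and not taken from any surveyed programme), they could be computed as follows. The denominator used for recall rate and cancer detection rate (all screening examinations read) is an assumption, since only recall counts and PPV are defined explicitly in the survey.

```python
import math
from dataclasses import dataclass


@dataclass
class ReaderAudit:
    """Hypothetical per-reader tallies for one audit period."""

    screens_read: int             # screening volume
    recalls: int                  # call backs after a positive screening exam
    screen_detected_cancers: int  # DCIS or invasive cancers found after recall

    def recall_rate(self) -> float:
        """Recalls divided by screening examinations read (assumed denominator)."""
        if self.screens_read == 0:
            return math.nan
        return self.recalls / self.screens_read

    def ppv(self) -> float:
        """Screen-detected cancers divided by recall examinations."""
        if self.recalls == 0:
            return math.nan
        return self.screen_detected_cancers / self.recalls

    def detection_rate_per_1000(self) -> float:
        """Screen-detected cancers per 1000 screens read (assumed denominator)."""
        if self.screens_read == 0:
            return math.nan
        return 1000 * self.screen_detected_cancers / self.screens_read


# Illustrative numbers only, not survey data.
example = ReaderAudit(screens_read=5000, recalls=200, screen_detected_cancers=25)
print(example.recall_rate(), example.ppv(), example.detection_rate_per_1000())
```

For these illustrative tallies the recall rate is 4%, the PPV is 12.5%, and the detection rate is 5 per 1000 screens.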

Results

Of the 30 countries invited to participate, 26 responded to the general ICSN survey, and 14 (Australia, Canada, Denmark, France, Israel, Japan, Luxembourg, the Netherlands, Norway, Spain, Sweden, Switzerland, United Kingdom (UK), and the US (from the BCSC¹⁰)) responded to one or more parts of the audit feedback survey (Table 1). Two programmes responded from Spain (Catalonia and Navarra), and three from Canada (Ontario, Quebec, and Saskatchewan). The survey presents information on 17 programmes.

Table 1.

Basic characteristics of the screening programmes.

Countries	Provision of audit feedback	Year programme began	Number of women screened (2010)	Centralized breast centres	Screening facilities (n)	Age group included (years)	Recommended screening interval (years)	Participation rate (%)	Main method of recruitment	Source of payment for exam	Recommendations to start reading	Recommendations to continue reading	Requirements to start reading	Requirements to continue reading
Australia	Readers and facilities	1991	–	Yes (n = 35)	–	40–75+	2	na	PI³	G-ment	Yes	Yes	Yes	Yes
Catalonia^a	Readers and facilities	1995	527,000	–	–	50–69	2	67	–	G-ment	Yes	Yes	Yes	Yes
Denmark	–	1991	663,398	Yes (n = 5)	–	50–69	2	78	PI	G-ment	Yes	No	Yes	No
France	Reader only	1989	2,343,980	No	2500	50–74	2	52	PI	G-ment	Yes	No	Yes	Yes
Israel	Readers and facilities	1997	220,000	Yes (n = 20)	65	50–74	2	72	PI	G-ment	Yes	Yes	Yes	Yes
Japan	Readers and facilities	1977	2,492,863	Yes (n = 1)	4423	50–69	2	19	PI	Woman	Yes	Yes	No	No
Luxembourg	Readers and facilities	1992	1459	No	6	50–69	2	64	PI	G-ment	Yes	Yes	Yes	Yes
Navarra^a	Reader only	1990	40,016	–	–	45–69	2	87	PI	G-ment	Yes	No	Yes	No
The Netherlands	Facility only	1989	961,766	No	66	50–74	2	81	PI	G-ment	Yes	Yes	Yes	Yes
Norway	Readers and facilities	1996	199,818	Yes (n = 16)	28	50–69	2	75	PI	G-ment	Yes	Yes	Yes	Yes
Ontario^b	Readers and facilities	1990	467,531	No	161		2	43	Self-referral physician	G-ment	No	No	Yes	Yes
Saskatchewan^b	Reader only	1990	38,628	Yes (n = 5)	9	50–69	2	46	PI	G-ment	Yes	Yes	Yes	No
Quebec^b	Readers and facilities	1998	315,784	Yes (n = 96)	96	50–69	2	58	PI	G-ment	No	Yes	Yes	Yes
Sweden	Facility only	1989	1,414,000	Yes	–	40–74	1.5 (40–49) 2 (50–74)	70–92	PI	G-ment woman	Yes	No	Yes	No
Switzerland	Reader only	1999	60,700	No	50	50–69	2	48	PI	Woman insurance	Yes	Yes	Yes	No
United Kingdom	Readers and facilities	1988	1,957,124	Yes (n = 93)	93	50–69	3	73	PI	G-ment insurance	Yes	No	Yes	Yes
United States	Readers and facilities	1994	416,000	No	102	40–75+	1–2 (40+)	67	Physician	G-ment insurance	Yes	No	Yes	Yes

^a Catalonia and Navarra represent Spain.

^b Ontario, Saskatchewan, and Quebec represent Canada.

G-ment: government; PI: personally invited.

The number in the target group of the screening programmes varied from 55,000 (Luxembourg) to more than 38 million women (Japan) (Table 1). Participation rates varied from 19% (Japan) to 81% (the Netherlands). Seven programmes reported independent double reading (Australia, France, Luxembourg, the Netherlands, Norway, Switzerland, and UK) (data not shown), two (Sweden and Japan) reported double reading, one independent double or double (Catalonia), and one independent double reading with or without CAD/other (Denmark). We defined independent double reading in the survey as “two readers, the second reader does not know the interpretation from the first reader”. Two programmes reported single reading only (Navarra and Ontario). The other programmes (Saskatchewan, Quebec, and the US) reported a mix of different reading procedures. Data were not provided about reading procedures in Israel.

All programmes had recommendations or requirements for individuals to be eligible to start and/or to continue reading screening mammograms (Table 1). These were set by the national screening programme for six programmes (Australia, Luxembourg, Norway, Switzerland, UK, Catalonia) and by the government agency or professional organizations for three (Japan, the Netherlands, and Quebec) (not in table). Eight programmes did not respond to the question. To start reading screening mammograms, evidence of a training course was recommended for eight programmes and required for six (Table 2). Review by mentor was recommended for eight programmes and required for seven, and academic skills were recommended for six and required for seven programmes. Reading screening test sets was recommended in three programmes and required in one. To continue reading, achieving standards set in guidelines/benchmarks was recommended in eight programmes and required in five (Table 2). Reading a specific number of mammograms annually (varying from 400 in Luxembourg to 5000 in Norway and UK) was recommended in four programmes and required in 10.

Table 2.

Recommendations and requirements to start and continue reading screening mammograms.

	Start reading												Continue reading
	Academic		Shadow reads		Screen read test set		Review by mentor		Evidence of training course		Other		Screen read test set		Participate in formal audit of reading performance		Read specific number of mammograms		Number of mammograms		Take part in screen and diagnostic mammography		Achieve standard of reading according to guidelines		Other
	Rec	Req	Rec	Req	Rec	Req	Rec	Req	Rec	Req	Rec	Req	Rec	Req	Rec	Req	Rec	Req	Rec	Req	Rec	Req	Rec	Req	Rec	Req
Australia	–	X	X	–	X	–	X	–	X	–	–	–	–	–	X	–	–	X	–	2000	–	–	X	–	–	–
Catalonia^a	X	X	X	X	–	–	–	–	–	–	–	–	–	–	X	X	X	X	2000	2000	X	–	X	X	–	–
Denmark	X	X	–	–	–	–	X	X	–	–	X	X
France	–	–	–	–	–	–	–	–	X	X	–	–		–		–		X		500		–		X		–
Israel	–	–	–	–	–	–	–	–	–	–	–	–	–	–	–	–	–	–	–	–	–	–	–	–	–	–
Japan	–		–		X		–		X		–		X		–		–		–		–		–		–
Luxembourg	–	–	–	–	–	–	X	X	X	X	–	–	–	–	–	–	X	X	400	400	X	X	X	X	–	–
Navarra^a	–	–	–	–	–	–	–	–	–	–	–	–
The Netherlands	–	–	–	–	–	–	X	–	–	X	–	–	X	–	–	X	–	X	–	3000	–	X	X	–	–	X
Norway	X	X	–	–	–	–	–	–	–	–	–	–	–	–	–	–	X	X	5000	5000	X	X	X	X	–	–
Ontario^b		X		–		–		X		–		–		–		X		X		1000		–		–		–
Saskatchewan^b	X	–	X	–	–	–	X	X	X	–	–	–	–		X		X		2000		–		X		–
Quebec^b	–	–	–	–	–	–	–	–	–	X	–	–	–	–	–	–	–	X	–	500	–	–	X	–	–	X
Sweden	–	–	X	X	–	–	X	X	X	X	–	–
Switzerland	–	–	–	–	–	–	X	X	X		–	X	–		–		–		–		–		X		–
United Kingdom	X	X	X	–	X	X	X	X	X	X	–	X		X		X		X		5000		–		X		X
United States	X	X	–	–	–	–	–	–	–	–	–	–		–		–		X		–		–		–		X

^a Catalonia and Navarra represent Spain.

^b Ontario, Saskatchewan, and Quebec represent Canada.

No mark means that the programme has not responded to that part of the survey.

(–) means that the programme has not checked for that alternative in the questionnaire. Rec: recommendations; Req: requirements.

Monitoring performance as part of continuous quality assurance (QA) was the most frequent purpose for the reader and facility audit feedback (13 and 11 programmes, respectively) (Table 3). The purpose of reader feedback was identification of outliers in eight programmes and determining readers who needed intervention or specific training in six programmes.

Table 3.

Purpose of the reader and facility audit feedback.

	Identify result outliers		Monitor performance for quality assurance		Identify readers or facilities that need special training		Compare between readers		Document the results
	Reader	Facility	Reader	Facility	Reader	Facility	Reader	Facility	Reader	Facility
Australia	X	X	X	X	X	X	–	–	–	–
Catalonia^a	X	X	X	X	–	–	X	–	X	X
Denmark
France	–		X		X		–		X
Israel	–	–	–	X	–	X	–	–	–	–
Japan	–	–	X	X	–	–	–	–	–	–
Luxembourg	–	–	X	X	X	–	X	X	X	X
Navarra^a	–		X		–		–		–
The Netherlands		X		X		X		X		X
Norway	X	–	X	X	–	–	–	–	–	–
Ontario^b	X	–	X	X	X	–	X	–	X	–
Saskatchewan^b	X		X		X		X		X
Quebec^b	–	X	X	X	–	X	X	X	–	–
Sweden		–		–		–		X		X
Switzerland	X		X		–		X		X
United Kingdom	X	X	X	X	X	X	X	X	–	X
United States	X	–	X	X	–	–	–	–	–	X

^a Catalonia and Navarra represent Spain.

^b Ontario, Saskatchewan, and Quebec represent Canada.

No mark means that the programme has not responded to that part of the survey.

(–) means that the programme has not checked for that alternative in the questionnaire.

The readers were the target audience in 13 of the 14 programmes that responded to part C of the survey (i.e. those that provide reader audit feedback) (data not shown). Facilities and health administrators were the target audiences for Australia, Catalonia, France, and Luxembourg, while Norway and Saskatchewan reported the target audience as just the facility. The main target audience for facility level feedback was readers in eight programmes (Israel, Japan, Luxembourg, the Netherlands, Norway, Quebec, Sweden, and UK), facilities in seven programmes (Australia, Luxembourg, the Netherlands, Quebec, Sweden, US, and UK), and health administrators in four programmes (the Netherlands, Quebec, Sweden, and US).

Readers were responsible for running the analyses of their own performance in two programmes, while analyses were performed at the local level for six programmes, at the regional level for two, and at the national level for one programme (Table 4). Analyses for facility level audit feedback were run at regional (n = 3) and national (n = 3) level, by independent units (n = 3), by the medical leader (n = 3), and by others (n = 1).

Table 4.

Responsible for running the analysis for reader and facility audit feedback.

	The readers^a	Local level	Regional level		National level		Independent unit		Medical leader		Other
	Reader	Reader	Reader	Facility	Reader	Facility	Reader	Facility	Reader	Facility	Reader	Facility
Australia	–	X	X	X	–	X	–	–	–	–	–	–
Catalonia^b	–	X	–	–	–	X	–	–	–	–	–	–
Denmark
France	X	–	X		–		–		–		–
Israel	–	–	–	–	–	X	–	–	–	–	–	–
Japan	–	X	–	–	–	–	–	–	–	–	–	–
Luxembourg	–	–	–	–	X	–	–	–	–	–	–	–
Navarra^b	–	–	–		–		–		–		X
The Netherlands				–		–		X		–		X
Norway	–	–	–	–	–	–	–	–	X	X	–	–
Ontario^a	–	–	–	–	–	–	X	X	–	–	X	–
Saskatchewan^a	–	X	–		–		–		X		X
Quebec^a	–	–	–	–	–	–	X	X	–	–	–	–
Sweden				X		–		–		X		–
Switzerland	–	X	–		–		–		–		–
United Kingdom	X	X	–	X	–	–	–	–	X	–	X	–
United States	–	–	–	–	–	–	X	X	–	–	–	–

Ontario, Saskatchewan, and Quebec represent Canada.

Catalonia and Navarra represent Spain.

No mark means that the programme has not responded to that part of the survey.

(–) means that the programme has not checked for that alternative in the questionnaire.

Reader audit feedback reports included various measures (Figure 1). Fourteen programmes provided screening volume and recall rate on a reader level, 13 provided screen-detected cancers (Figure 1(a)), eight included interval cancer rate, and three gave characteristics of the interval tumours. Of the 12 programmes that reported performance measures at the facility level, 11 provided information about recall rate and 10 reported screening volume and rate of screen-detected cancer (Figure 1(b)). Six programmes reported the interval cancer rate. All but Australia, Israel, Japan, Ontario, and the US reported PPV. Histopathologic tumour characteristics were given for screen-detected cancers in nine programmes and for interval cancers in five.

Figure 1.

Performance measures included in (a) reader and (b) facility audit feedback. NB: Black box with programme name: the alternative is representative for the programme. White box with programme name: the alternative is not representative for the programme. White box without programme name: No response to that part of the survey. 1: Catalonia (CAT) and Navarra (NAV) represent Spain; 2: Ontario (ONT), Saskatchewan (SAS), and Quebec (QUE) represent Canada.

Individual audit feedback was given annually or more frequently for 10 programmes (Australia, France, Israel, Japan, Luxembourg, Ontario, Quebec, Sweden, US, and UK). Four programmes (Australia, Navarra, Norway, and UK) presented the results ad hoc. In the US, web-based data were accessible all the time. Facility audit feedback was given annually for seven programmes (Australia, Israel, Japan, Quebec, Sweden, US, and UK). Ad hoc presentation was reported by Norway, infrequently by Luxembourg, and “other” interval by the Netherlands.

Only Quebec and the US reported confidence intervals for the performance measures in the reader audit feedback. Confidence intervals were included in the facility audit feedback for Australia, Sweden, and Quebec.
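The survey did not ask how the reported confidence intervals were calculated. As a purely illustrative sketch, one common choice for a proportion such as PPV or recall rate is the Wilson score interval. The function name `wilson_interval` and the example counts are assumptions, and the method shown is not claimed to be the one used by any of the programmes named above.

```python
import math


def wilson_interval(events: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a proportion (z = 1.96 gives roughly 95%)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    if not 0 <= events <= trials:
        raise ValueError("events must lie between 0 and trials")
    p = events / trials
    denom = 1 + z ** 2 / trials
    centre = (p + z ** 2 / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z ** 2 / (4 * trials ** 2)) / denom
    # Clamp to [0, 1] to absorb floating-point rounding at the extremes.
    return max(0.0, centre - half), min(1.0, centre + half)


# Illustrative numbers only: the same observed PPV (12.5%) at two audit sizes.
large = wilson_interval(events=25, trials=200)
small = wilson_interval(events=5, trials=40)
print(large, small)
```

With the same observed PPV, the reader with fewer recalls receives a wider interval, which illustrates why the small number of cancer cases in a screening setting makes individual-level measures imprecise.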

If guidelines/benchmarks were not achieved, four programmes offered remedial support/training for readers; one offered remedial training for the facility (Table 5); and Switzerland, US, and Quebec reported no action for the reader if the targeted guidelines were not met. Australia and Luxembourg removed readers from the programme (data not shown), Norway provided more skills, and Navarra offered meetings to discuss actions. Actions regarding the facility were remedial training (n = 1), more frequent monitoring (n = 2), and further investigation (n = 7). Norway reported providing more skills, and the Netherlands reported identification of facilities that need intervention with a subsequent decision on the action that should be taken by the National Institute for Public Health and the Environment. A written report was the most common format for reader level feedback (Figure 2). The UK and US offered web-based information in addition to a written report. Norway offered web-based information only. Eleven out of the 12 programmes responding to this question used a written report for facility level feedback.

Figure 2.

Format of (a) reader and (b) facility level audit feedback. Black box with programme name: the alternative is representative for the programme. White box with programme name: the alternative is not representative for the programme. White box without programme name: No response to that part of the survey. 1: Catalonia (CAT) and Navarra (NAV) represent Spain; 2: Ontario (ONT), Saskatchewan (SAS), and Quebec (QUE) represent Canada.

Table 5.

Actions if targeted guidelines are not achieved for readers and facilities.

	Remedial training		More frequent monitoring		Investigate further		Other
	Reader	Facility	Reader	Facility	Reader	Facility	Reader	Facility
Australia	X	–	X	X		X	X	–
Catalonia^a	–	–	X	–	–	X	–	–
Denmark
France	–		–		X		–
Israel	–	–	–	–	–	–	–	–
Japan	–	–	–	–	–	–	–	–
Luxembourg	X		–	–	–	X	X	–
Navarra^a	–		–		–		X
The Netherlands		X		X		X		X
Norway	–	–	–	–	–	–	X	X
Ontario^b	X	–	–	–	X	X	–	–
Saskatchewan^b	–		–		–		X
Quebec^b	–	–	–	–		X	–	–
Sweden		–		–		–		–
Switzerland	–		–		X		X
United Kingdom	X	–	–	–	X	X	–	–
United States	–	–	–	–	–	–	–	–

^a Catalonia and Navarra represent Spain.

^b Ontario, Saskatchewan, and Quebec represent Canada.

No mark means that the programme has not responded to that part of the survey.

(–) means that the programme has not checked for that alternative in the questionnaire.

Discussion

This is the first international survey to explore the provision of audit feedback to readers and facilities in mammographic screening programmes, with results from 17 programmes in 14 countries across four continents. Programmes varied substantially in organization, size, and participation rate. The most common requirement to start reading screening mammograms was reviewing mammograms with a mentor. The most frequent purpose of reader and facility audit feedback was monitoring performance for continuous QA. The readers were the most common target audience for the audit feedback, but facilities and health administrators were also targeted. The most common performance measures included in reader audit feedback reports were screening volume, recall rates, and rates of screen-detected and interval cancer.

Knowledge about the best ways to provide audit feedback is limited,^1,4,5 and studies have shown only modest effects. A meta-analysis of audit feedback studies concluded that feedback was most effective when delivered by a supervisor or respected colleague, presented frequently, and included specific goals and actions.¹ Our survey did not identify who delivered the feedback. We only asked who was responsible for running the analyses, and at three programmes this was the medical leader. Most programmes provided audit feedback annually, but some measures that require cancer outcomes may be delayed by as much as two or more years, so annual feedback would not be timely. In the majority of the programmes, audit results were compared with benchmarks and/or guidelines, which can be considered a goal.

When audit feedback showed deficits in the interpretive scores, the three most common actions for the readers were remedial training (four programmes), more frequent monitoring (two programmes), and further investigation (four programmes). Four programmes reported other actions, including meetings to discuss possible causes, and removing readers. Several programmes reported no action if results from subsequent audit feedback showed that targeted guidelines were not met. It would be interesting to understand why these programmes have no official action for underperformers. Our survey did not include questions to determine if there was a measured effect of using audit feedback in the screening programmes. Other studies that measured the results of audit feedback showed mixed and small effects,¹¹ but many of them combined audit feedback with other interventions, so it is difficult to determine the extent to which the positive effects were directly related to the audit feedback.

How feedback is delivered may also be important for the learning process. To give and receive feedback on an individual level might be more sensitive than at facility level. However, individual feedback enables targeted actions, which then affects the facility. Individual feedback might be difficult in some cases due to the small number of cancer cases in a screening setting. Reporting results both on individual and on facility level to compare with colleagues and other screening centres might therefore be preferable. In a US study, radiologists perceived their performance to be better than it actually was, and at least as good as their peers, and they had particular difficulty estimating their false positive rates and PPV.¹² Radiology is practical work, and learning by mentoring may be the best way to achieve the applied skills needed to fulfil the criteria in the recommendations and requirements. In our survey, all programmes had recommendations or requirements to start screen reading, consisting of a combination of academic and practical training. Fourteen programmes also had recommendations or requirements to continue screen reading, most including reading a minimum number of mammograms.

Clinical audit in mammographic screening suffers from the low prevalence of breast cancer. Readers may see a limited number of cases from which to gain experience, and it may take over two years to obtain sufficient data from clinical audit to identify falling clinical performance.¹³ To address this problem, screen reading test sets enriched with cancer cases have been used as training and QA tools. Reasonable levels of agreement between clinical performance and test set performance can be achieved.¹⁴ UK screen readers are required to participate in PERFORMS (Personal Performance in Mammographic Screening), an educational self-assessment and training scheme.¹⁵ In Australia (BreastScreen Australia) and New Zealand (BreastScreen Aotearoa), readers are encouraged to participate in BREAST (BreastScreen Reader Assessment Strategy).¹⁶ Statistically significant moderate positive correlations have been demonstrated between reader performance on BREAST test sets and performance demonstrated by clinical audit.¹⁷ In the US, a digital versatile disc (DVD) intervention with enhanced cancer cases that also provided immediate feedback to the radiologist¹⁸ was found in a randomized controlled trial to significantly improve interpretive performance in a post-test set. The effect was not tested in clinical practice. In a test set study from the Netherlands,¹⁹ the areas under the receiver operating characteristic curve, case sensitivity, and lesion sensitivity were satisfactory, and recall agreement was substantial, although agreement in lesion type and Breast Imaging Reporting and Data System (BI-RADS) category was not satisfactory.

A qualitative study conducted amongst American radiologists, which gathered their feedback on audit content and presentation, suggested that they liked seeing their audit data compared with both their peers and benchmarks.¹¹ A web-based audit feedback tool was developed and tested based on this feedback,²⁰ and radiologists who used this tool found it very useful, although there has been no investigation into whether the tool made a difference in their clinical work.

Mammography screening relies on multiple disciplines, all of which can affect interpretation accuracy, and failure to achieve targeted guidelines may not necessarily reflect screen-reading skills. QA is required for all aspects of the screening process,²¹ and schemes exist for other professions involved in mammography screening,²² as well as for other cancer screening programmes.¹⁸

In this survey, responses were received from less than half of ICSN member countries. The ICSN has only one contact in each country, to whom the survey invitation was sent. We do not know if the low response rate was a result of not having audit feedback or not being able to identify an appropriate person to complete the survey. Having only one person representing a programme respond to the survey might also bias the results. Direct contact with the person responsible for the audits would have been the best way to obtain the most accurate responses to the survey questions. A check box indicating whether audit feedback is or is not performed would have assured us that the survey was received. In addition, the survey was conducted in English and some countries may not have participated because the designated responder was not fluent in English. Where parts of the survey were not completed, we cannot tell whether this is because the programme does not provide that kind of audit feedback, whether data were not available, or whether that part was simply not filled in.

Our survey did not include questions about reviewing false negative or false positive screening exams. Such reviews are valuable for learning and are recommended in the European guidelines for breast cancer screening and treatment.²¹

Most mammographic screening programmes that responded to this survey use audit feedback, which is expected to increase the quality of the performance. This study presents different methods of audit feedback and may assist programmes to develop new and improved systems.

Footnotes

Acknowledgement

Data from the NCI-funded Breast Cancer Surveillance Consortium (BCSC) co-operative agreement [U01CA63740, U01CA86076, U01CA86082, U01CA63736, U01CA70013, U01CA69976, U01CA63731, U01CA70040] was provided by a special competitive supplement to U01CA70013.

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

References

1. Davis, Thomson, Oxman. Changing physician performance. A systematic review of the effect of continuing medical education strategies. JAMA 1995; 274: 700–705.

2. Hofvind, Geller, Skelly. Sensitivity and specificity of mammographic screening as practised in Vermont and Norway. Br J Radiol 2012; 85: e1226–e1232.

3. Hofvind, Vacek, Skelly. Comparing screening mammography for early breast cancer detection in Vermont and Norway. J Natl Cancer Inst 2008; 100: 1082–1091.

4. Hysong, Teal, Khan. Improving quality of care through audit and feedback. Implement Sci 2012; 2: 45–45.

5. Ivers, Grimshaw, Jamtvedt. Growing literature, stagnant science? Systematic review, meta-regression and cumulative analysis of audit and feedback interventions in health care. J Gen Intern Med 2014; 29: 1534–1541.

6. Perry N. Interpretive skills in the National Health Service Breast Screening Programme: Performance indicators and remedial measures. Semin Breast Dis 2003; 6: 6.

7. Adcock. Initiative to improve mammogram interpretation. Permanente J 2004; 8: 12–18.

8. Kiefe, Allison, Williams. Improving quality improvement using achievable benchmarks for physician feedback: a randomized controlled trial. JAMA 2001; 285: 2871–2879.

9. Webpage ICSN, http://appliedresearch.cancer.gov/icsn/ (accessed 9 August 2015).

10. Ballard-Barbash R, Taplin SH, Yankaskas BC, et al. Breast cancer surveillance consortium: A national mammography screening and outcomes database. Am J Roentgenol 1997; 169: 1001–1008.

11. Aiello Bowles, Geller. Best ways to provide feedback to radiologists on mammography performance. Am J Roentgenol 2009; 193: 157–164.

12. Cook, Elmore, Zhu. Mammographic interpretation: radiologists' ability to accurately estimate their performance and compare it with that of their peers. Am J Roentgenol 2012; 199: 695–702.

13. Soh, Lee, Kench. Assessing reader performance in radiology, an imperfect science: lessons from breast screening. Clin Radiol 2012; 67: 623–628.

14. Soh, Lee, McEntee. Screening mammography: test set data can reasonably describe actual clinical reporting. Radiology 2013; 268: 46–53.

15. Scott, Gale. Breast screening: PERFORMS identifies key mammographic training needs. Br J Radiol 2006; 79: S127–S133.

16. Webpage BREAST, http://sydney.edu.au/health-sciences/breastaustralia/ (accessed 9 August 2015).

17. Soh, Lee, Mello-Thoms. Certain performance values arising from mammographic test set readings correlate well with clinical audit. J Med Imaging Radiat Oncol 2015; 59: 403–410.

18. Geller, Bogart, Carney. Educational interventions to improve screening mammography interpretation: a randomized controlled trial. Am J Roentgenol 2014; 202: 586–596.

19. Timmers, Verbeek, Pijnappel. Experiences with a self-test for Dutch breast screening radiologists: lessons learnt. Eur Radiol 2014; 24: 294–304.

20. Geller, Ichikawa, Miglioretti. Web-based mammography audit feedback, Vienna, Austria: European Congress of Radiology, 2011.

21. Perry N, Broeders M, deWolf C, et al. European Guidelines for Quality Assurance in Breast Cancer Screening and Diagnosis. Belgium: European Communities, 2006.

22. Webpage: EQA Scheme for Breast Screening Histopathology, http://www.esqa.nhs.uk/esqa/sites/default/files/5_Pathology_EQA_Scheme.pdf (accessed 9 August 2015).