The validity of the Annual Review of Competence Progression: a qualitative interview study of the perceptions of junior doctors and their trainers

Abstract

Objective

To investigate trainee doctors’ and trainers’ perceptions of the validity of the Annual Review of Competence Progression (ARCP) using Messick’s conceptualisation of construct validity.

Design

Qualitative semi-structured focus groups and interviews with trainees and trainers.

Setting

Postgraduate medical training in London, Kent Surrey and Sussex, Yorkshire and Humber, and Wales in November/December 2015. Part of a larger study about the fairness of postgraduate medical training.

Participants

Ninety-six trainees and 41 trainers, comprising UK and international medical graduates from Foundation, General Practice, Medicine, Obstetrics and Gynaecology, Psychiatry, Radiology, and Surgery, at all levels of training.

Main outcome measures

Trainee and trainer perceptions of the validity of the ARCP as an assessment tool.

Results

Participants recognised the need for assessment, but were generally dissatisfied with ARCPs, especially UK graduate trainees. Participants criticised the perceived tick-box nature of ARCPs as measuring clerical rather than clinical ability and as detrimental to learning. Trainees described being able to populate their e-portfolios with just positive feedback; they also experienced difficulty getting assessments signed off by supervisors. ARCPs were perceived as poor at identifying struggling trainees and/or as discouraging excellence by focussing on minimal competency. Positive experiences of ARCPs arose when trainees could discuss their progress with interested supervisors.

Conclusions

Trainee and trainer criticisms of ARCPs can be conceptualised as evidence that ARCPs lack validity as an assessment tool. Ongoing reforms to workplace-based assessments could address negative perceptions of the ‘tick-box’ elements, encourage constructive input from seniors and allow trainees to demonstrate excellence as well as minimal competency, while keeping patients safe.

Keywords

Qualitative, medical training, medical education, assessment, validity, ARCP, workplace-based assessment

Introduction

All doctors in training in the UK (‘trainees’) are assessed annually by the Annual Review of Competence Progression (ARCP). The ARCP is a formal and structured way of monitoring trainees at each stage of training. It is intended to protect patients and ensure that doctors gain suitable and sufficient experience and training to progress.¹ Using an electronic portfolio (‘e-portfolio’), trainees collate evidence of their learning and experience, including evidence that they have undertaken a set number of workplace-based assessments such as direct observations of procedural skills, case-note reviews and self-reflective learning logs. The e-portfolio is reviewed against a relevant curriculum by a panel that decides whether the trainee can progress.²

There is little research on the ARCP panel; however, workplace-based assessments have received more attention, with findings showing they are not always positively received by trainees. A narrative review of medical and dental workplace-based assessments found negative perceptions arising from assessments having an unclear purpose, too little good-quality feedback and a lack of time to carry them out.³ Formative workplace-based assessments are felt to be more educationally valuable than summative ones,^4,5 and there are calls to move away from the tick-box culture of workplace-based assessments.^6–8 Opinions on specific summative workplace-based assessments are mixed: mini-clinical evaluation exercises are viewed as beneficial for development but difficult to implement,^9,10 and multi-source feedback can be effective but also unhelpful.¹¹ Relatively few studies have looked specifically at ARCPs in the UK, and those few report mixed views, with some trainees finding the e-portfolio confusing and lacking in educational value¹² and others feeling confident using the system and finding the panel fair but lacking individual feedback.¹³ There are also concerns about the fairness of ARCP outcomes because doctors who graduated from medical schools outside the UK are at increased risk of poorer outcomes than UK graduates.¹⁴

The quality of an assessment or test is typically considered in terms of psychometric validity and reliability. Most assessments in medical education measure constructs¹⁵ such as ‘educational achievement’ and ‘educational ability’, so all validity can be considered construct validity.^15,16 Construct validity relates to how an assessment is constructed and administered in practice, and how its results are interpreted and used. Multiple sources of evidence are required before an assessment can be considered valid for use in a specific context for a specific purpose;¹⁵ there are five main sources of such evidence (Table 1).^16,17

Table 1.

Five sources of validity evidence based on Messick¹⁶ and Standards for Educational and Psychological Testing,¹⁷ adapted from Cook et al.¹⁸ and Cook and Beckman.¹⁹

1. Content	Examines the relationship between the construct being measured and the content of the test used to do so.¹⁸
2. Response process	Looks at the processes connecting what is being observed to the documentation of the observation.¹⁸
3. Internal structure	The reliability of the assessment and the coherence between the assessment components.¹⁹
4. Relationship with other variables	Considers how well the assessment correlates with other assessments testing the same construct.¹⁹
5. Consequences	Refers to the impact of the assessment on the trainee, the institution, patients and any other affected parties.¹⁸

In this study, we examined the validity of ARCPs by exploring how ARCPs are perceived by trainees and trainers, using Messick¹⁶ and Cook et al.¹⁸ to guide our analysis. The data were gathered as part of a study of perceptions of the fairness of postgraduate medical training, commissioned by the General Medical Council.²⁰

Method

Participants and data gathering

During November and December 2015, 96 trainees and 41 trainers were interviewed individually or in focus groups about their experiences of postgraduate medical training, by AR (health psychologist), RV (linguist), KW (academic psychologist) and SN (clinical teaching fellow and trainee). We spoke to trainees and trainers in Foundation and six specialties: General Practice, Medicine, Obstetrics and Gynaecology, Psychiatry, Radiology and Surgery. Participants worked in London, Wales, Yorkshire and Humber, or Kent Surrey and Sussex. We asked about aspects of teaching, learning and fairness, with two questions on ARCPs (Table 2). Ethical approval was provided by University College London Ethics Committee (ref: 0511/011). Information sheets explaining the research were provided to participants before they agreed to take part; written consent was obtained at focus groups and face-to-face interviews, with verbal consent obtained for telephone interviews.

Table 2.

Questions specific to ARCPs asked in interviews and focus groups.

Trainees	We are particularly interested in assessments, including ARCPs and Royal College examinations.
	• What comes into your head when I say ‘ARCP’?
	• How fair do you think ARCPs are?
	• Anyone failed an exam or an ARCP? (prompt: if exam, was it written or clinical)
	• Why do you think you failed?
	• Do you think failing affected you in any way? (if necessary: How?)
Trainers	We are very interested in assessments, including ARCPs and Royal College examinations.
	• What comes into your head when I say ‘ARCP’?
	• How fair do you think ARCPs are?

Analysis

Data were transcribed professionally. The research team (RV, AR and KW) examined the data to identify emerging themes, using thematic analysis²¹ guided by Mountford-Zimdars et al.’s²² analytic framework. A final coding framework was refined by KW, AR and RV after discussion. The whole dataset was coded by RV, with portions of the dataset second coded by the rest of the research team; consistency was ensured by discussing the framework with all team members and agreeing descriptors for each code; coding discrepancies were resolved through discussion. Coding was conducted using QSR International’s NVivo 10© software.²³ Two primary themes emerged around ARCPs: ‘ARCPs are fair’ and ‘ARCPs are not fair’. On closer examination, further subthemes emerged around why ARCPs were fair or unfair; these concerned the validity of ARCPs and were then analysed using Messick’s¹⁶ framework as a guide (Figure 1).

Figure 1.

Subthemes within the five sources of construct validity evidence.

Results

Trainee and trainer perceptions of the validity of ARCPs are presented according to the five main sources of validity evidence described in Table 1. Subthemes are shown in Figure 1. Although the data for this analysis are participants’ perceptions and experiences of ARCPs, the various themes raised relate to all five sources of validity evidence and not just participants’ engagement with the process and its consequences.

Overall there was general dissatisfaction with ARCPs, especially among UK graduates – international medical graduates were more positive. Trainers tended to view the process more positively although they did voice negative views.

Content

‘Tick-box exercise’

ARCPs were described as a ‘tick-box exercise’ in 27 of the 65 interviews and focus groups^a; this was generally a criticism of populating the e-portfolio. ARCPs were felt to test clerical ability rather than clinical ability, which some believed were inversely correlated:

I’d say that people I’ve found who are very good at filling all their logs and do extremely well in the e-portfolio are actually the ones who are not very good clinically. (Trainee/GP/ST1-3/UKG/white/female)^b

Many trainees felt that competence should not be a function of the number of times a trainee has performed a procedure but whether they can perform it unsupervised.

Another common criticism was that the competencies being assessed were irrelevant to trainees' current or future work and a waste of time:

With the practical, procedural skills that we have to get done, I feel like they’re more a tick-box exercise, and they’re not actually that useful because a lot of the skills are becoming more done by Radiology. (Trainee/Medicine/ST1-3/UKG/white/female)

Conversely, a few trainees and trainers felt that ARCPs covered a wide range of skills and competencies and thus provided a good sense of overall ability:

I think it is fair because it looks at ‘Have they passed the exams? … Have they done their workplace based assessment?’ … So I think it is fair, it does look at a large aspect of a broad training scheme. (Trainer/GP/UKG/white/male)

Changing goalposts

Some trainees described assessment criteria changing with little or no notice. One trainee said that after completing half of the necessary workplace-based assessments she discovered that the criteria had changed and the assessments she had completed were now redundant (Trainee/Medicine/ST4+/UKG/white/female). In another instance, a trainee reported that miscommunication caused an entire cohort to fail for having an incorrect number of supervisor reports (Trainee/Medicine/ST4+/UKG/white/female).

Response process

Trainee choice about what to include

There was a perception that trainees could exclude anything negative from their e-portfolio. For example, trainees could carefully choose which seniors signed them off, and seniors might give positive feedback because the trainee fitted in socially:

All my [Case Based Discussions], everything has been from registrars who have generally said, ‘Yeah, I’ll just do one for you’. It’s not been a formalised thing. It’s basically been the same as the rugby tie, but rather than wearing a tie, I’ve just known them and get on with them, and then they’ll do the thing for me. (Trainee/GP/ST1-3/UKG/white/male)

Difficulties getting assessments signed off

Several trainees described difficulties getting seniors to sign off assessments due to lack of engagement or system difficulties. Other trainees were unable to complete all the necessary supervised procedures, either due to the unavailability of clinical opportunities or because they were deemed competent to carry them out unsupervised. In extreme cases the failure of supervisors to sign off affected ARCP outcomes:

Trainee 1: I’ve sat in an ARCP and been told I’ve not got enough assessments signed, I’ve just pulled out the list of all the tickets that I’d sent out that hadn’t been completed by consultants. … You can’t defend yourself in that situation. …

Trainee 2: Yeah, the assumption is that the lack of work-based assessments is a reflection on the laziness or the lack of motivation or the lack of-

Trainee 1: ‘Failure to engage with the portfolio’ is the phrase they use here. (Trainee 1/Medicine/ST4+/UKG/white/male) (Trainee 2/Medicine/ST1-3/UKG/white/female)

Standardisation

International medical graduates were most likely to say that ‘ticking boxes’ resulted in a standardised approach making the process fairer, although other trainees felt that standardisation did not make up for a lack of content validity:

I think [ARCPs] are fair in the sense that they are a piece of standardised paperwork which anyone can learn to get filled out. I think if you ask ‘Are they an effective assessment of any practical measure of doctoring ability?’, that one I'm probably less certain about their quality. (Trainee/Medicine/ST4+/UKG/BME/male)

Internal structure

Reliability

There was concern about a lack of consistency in ARCPs across specialties, regions and training grades. One trainee described completing extra documentation implemented by her training programme director, which was not used in other subspecialties or regions (Trainee/Medicine/ST4+/UKG/white/female). Some voiced concern about panel reliability and fairness. For example, one trainee described how a black colleague received a lower outcome and more ‘hassle’ than another equally experienced white trainee, which he thought was due to the colleague’s ethnicity (Trainee/Surgery/ST4+/UKG/BME/male).

Several GP trainers voiced frustration at the apparent disconnect between their assessment of a trainee and that of the panel. GP trainees attend the panel only if their progress raises concerns, which was compared to ‘Sending them to the headmaster’s office for a telling off’ (Trainer/GP/UKG/white/male). Yet if the panel then passed the trainee, this undermined the trainer and could damage the trainer–trainee relationship irrevocably (Trainer/GP/IMG/BME/female).

Integration of different aspects of the assessment

Some trainees felt that the ARCP’s different criteria were poorly integrated; trainees could fail an ARCP for something trivial while significant achievements were ignored (see also ‘Discouraging excellence’ section):

… it had been a big achievement and it sort of felt like in any process that’s supposed to be about your achievements and what you’ve done it’s just completely bonkers that they hadn’t mentioned that and they had mentioned the really basic thing that I hadn’t got a tick-box signature to do. (Trainee/O&G/ST4+/UKG/white/female)

Poor psychometric discrimination

Several trainees felt that ARCPs were poor at discriminating between trainees of different abilities:

They’re not fit for purpose because they don’t identify poorly performing trainees. They don’t identify excellent trainees. (Trainee/Medicine/ST4+/UKG/white/male)

Relationships with other variables

Only one participant spoke about the relationship between ARCPs and other assessments. An internationally trained trainee knew two internationally trained colleagues who progressed well in their GP training (presumably passing their ARCPs) but failed their GP exit exam (Trainee/Psychiatry/ST4+/IMG/black/male).

Consequences

Influence on learning

In general, trainees did not feel that populating their e-portfolios encouraged learning; instead completing a large number of assessments impeded learning, either by demotivating trainees as ‘the more you do the less value you attach to each one’ (Trainee/Surgery/ST4+/UKG/BME/male) or by taking up time that could be better spent on another educational activity:

Our e-portfolio seems to be ever expanding and sprawling and it gets to the point you wonder what the actual benefit of it is from an educational point of view. You find that you spend more time filling in boxes than you do reading about a subject. (Trainee/O&G/ST4+/UKG/white/female)

There were mixed views about attending ARCP panels. Several hospital medicine trainees disliked attending panels, finding them stressful, whereas others would prefer to attend the panel and have the opportunity to raise issues and get individualised feedback.

A few trainees commented that trainers’ understanding of the system and their willingness to engage with it influenced how useful the ARCP process was for learning:

It all boils down to who’s your supervisor and whether they understand the system, whether they’re committed to you as a trainee. (Trainee/Medicine/ST1-3/UKG/white/female)

International medical graduates were more likely than UK graduates to speak about the ARCP as a supportive mechanism to ensure trainees are ready for their post-training roles:

Obviously people need to make sure that you are where you should be. That’s just it, they’re just trying to ensure that you’re getting the support you should be getting. (Trainee/Psychiatry/ST4+/IMG/BME/female)

Trainers were generally more positive than trainees. Some had been active in improving ARCPs to make them more supportive and useful for trainees, in addition to their role of checking trainees’ progress.

Discouraging excellence

Some trainees felt that ARCPs encouraged minimal competency at the expense of excellence, and that trainees could effectively be penalised for being competent when they started a placement:

There’s the expectation to show development through the year, so you’re supposed to start off bad and end up better. But if you start off good you’re in real trouble. (Trainee/Medicine/ST4+/UKG/white/male)

Quality control

Quality control was seen as an important purpose of ARCPs by trainees and trainers, but the concerns described above made trainees and some trainers question whether ARCPs were able to prevent poor trainees progressing and protect patients.

Discussion

Statement of principal findings

Many trainees and trainers felt that ARCPs could be useful and that assessment is necessary to check progress; however, the way that ARCPs are currently conducted is problematic. Viewed in terms of psychometric validity, participants’ – especially trainees’ – views suggested a lack of evidence for the validity of ARCPs as a means of assessing progress. In particular, there was poor evidence for the e-portfolio’s content validity with its ‘tick-box’ nature viewed as assessing trainees on clerical rather than clinical ability and concerns that trainees could select only positive assessments for their e-portfolio. Other major concerns were that ARCPs encourage minimal competency instead of excellence while not being sensitive enough to identify poorly performing trainees, and that ARCPs discourage learning and disengage trainees. Attending the panel could be stressful but also an opportunity to gain individualised feedback. Positive experiences of ARCPs arose when trainees could discuss their progress with interested supervisors. International medical graduates felt more positive about having standardised boxes to tick which they felt was fair.

Strengths and weaknesses of the study

This was a large study across four regions in England and Wales, involving doctors from six specialties plus foundation, from all stages at trainee or trainer level. Participants included female and male UK and international medical graduates from various ethnic backgrounds. The scale of the research resulted in a rich qualitative dataset with 137 participants from across England and Wales. However, we spoke to more GPs and fewer radiologists, limiting our ability to examine differences between specialties. The research was conducted during negotiations between the British Medical Association and the UK government regarding the new junior doctors’ contract in England; however, the negative opinions expressed by participants reflect those reported in earlier research^24,25 and so the political climate did not appear to influence participants’ reports unduly. Response bias is possible, as participants volunteered in response to circulated information about the research and those with negative experiences may have been more interested in taking part; however, most participants shared both negative and positive experiences.

Strengths and weaknesses in relation to other studies

The findings presented here reflect prior research on trainees’ opinions about ARCPs and workplace-based assessments; for example, that trainees feel disillusioned with the process and ARCPs discourage excellence,²⁵ and that ARCP outcomes are not a useful evaluation measure of a curriculum.²⁴ Our research suggests that ‘tick-boxes’ are often perceived as reductionist and that assessments which provide more quality formative feedback during training as well as at the annual review would be beneficial. Much previous research has focused on limited geographical areas or specific specialties;^12,13,26 as this study involved participants from across England and Wales, and across specialties, its findings may have a greater reach.

Meaning of the study: possible mechanisms and implications for clinicians or policymakers

In September 2016, Health Education England acknowledged that the ‘tick box culture of the ARCP’ has become problematic and announced a review of ARCPs to begin in October 2016.²⁷ Similarly, the Joint Royal Colleges of Physicians Training Board has outlined the move within UK internal medicine training to a model of entrustable professional activities,²⁸ with the outcome of training being that trainees are ‘trusted to undertake all the key critical tasks needed to work as a consultant’.⁶ To make a trust judgement, supervisors will need a holistic view of trainees’ abilities and our findings suggest that the input from trainers required to accurately form this holistic view (such as frequent formative assessment) is something trainees would value. Indeed, in related work, we found that good relationships with trainers are a key influence on trainees’ learning,²⁰ a point echoed in a call to return to an apprenticeship model in surgery training.²⁹ More opportunity for constructive feedback from the ARCP panel could be similarly beneficial. The 2008 Tooke report³⁰ emphasised the need for excellence in selection into postgraduate training, and our findings suggest that the revisions to workplace-based assessments should similarly allow trainees to demonstrate excellence as well as minimal competency, while keeping patients safe.

Unanswered questions and future research

Some participants mentioned differences in the ARCP process across specialties, grades, and regions; however, as ARCPs were not the sole focus of the research from which this paper stemmed, we were unable to include more detailed questions on this. It would be useful to establish if there are differences along these lines, and if so to investigate what they are and why they exist. Further work on the ARCPs’ components, including the panel, would also be of interest, as from our data it appears that different types of workplace-based assessment are prone to different problems. With the anticipated changes to workplace-based assessments⁶ and a call to improve the training system for trainees, including strengthening the trainee–trainer relationship,²⁹ it will be useful to monitor trainees’ and trainers’ perceptions of the ARCP (or its replacement) after this change.

References

1. The UK Foundation Programme Office. Guide to the Foundation Annual Review of Competence Progression (ARCP) Progress – 2016, Birmingham, http://www.foundationprogramme.nhs.uk/pages/home/foundation-ARCP (2016, accessed 9 June 2016).

2. National Health Service. The gold guide: a reference guide for postgraduate specialty training in the UK. 6th edn., London, http://www.copmed.org.uk/publications/the-gold-guide (2016, accessed 10 June 2016).

3. Massie, Ali. Workplace-based assessment: a review of user perceptions and strategies to address the identified shortcomings. Adv Health Sci Educ 2016; 21: 455–473.

4. O’Leary, Al-Taiar, Brown, Bajorek, Ghazirad, Shaddel. Workplace assessment in crisis? The way forward. BJPsych Bull 2016; 40: 61–63.

5. Rees, Cleland, Dennis, Kelly, Mattick, Monrouxe. Supervised learning events in the Foundation Programme: a UK-wide narrative interview study. BMJ Open 2014; 4: e005980.

6. Black. An end to box ticking: an overhaul of competency based education. BMJ Careers. http://careers.bmj.com/careers/advice/An_end_to_box_ticking%3A_an_overhaul_of_competency_based_education (2016, accessed 12 August 2016).

7. Talbot. Monkey see, monkey do: a critique of the competency model in graduate medical education. Med Educ 2004; 38: 587–592.

8. Miller, Archer. Impact of workplace based assessment on doctors’ education and performance: a systematic review. BMJ 2010; 341: c5064.

9. Castanelli, Jowsey, Chen, Weller. Perceptions of purpose, value, and process of the mini-Clinical Evaluation Exercise in anesthesia training. Can J Anesth 2016; 63: 1345–1356.

10. Brazil, Ratcliffe, Zhang, Davin. Mini-CEX as a workplace-based assessment tool for interns in an emergency department – does cost outweigh value? Med Teach 2012; 34: 1017–1023.

11. Brown, Lowe, Fillingham, Murphy, Bamforth, Shaw. An investigation into the use of multi-source feedback (MSF) as a work-based assessment tool. Med Teach 2014; 36: 997–1004.

12. Vance, Williamson, Frearson, O’Connor, Davison, Steele. Evaluation of an established learning portfolio. Clin Teach 2013; 10: 21–26.

13. Eynon-Lewis, Price. Reviewing the ARCP process: experiences of users in one English deanery. BMJ Careers. http://careers.bmj.com/careers/advice/view-article.html?id=20008262 (2012, accessed 4 April 2016).

14. Tiffin, Illing, Kasim, McLachlan. Annual Review of Competence Progression (ARCP) performance of doctors who passed Professional and Linguistic Assessments Board (PLAB) tests compared with UK medical graduates: national data linkage study. BMJ 2014; 348: g2622.

15. Downing. Validity: on the meaningful interpretation of assessment data. Med Educ 2003; 37: 830–837.

16. Messick S. Validity. In: Linn RL, ed. Educational measurement. 3rd edn. New York, NY: American Council on Education and Macmillan, 1989:13–103.

17. American Educational Research Association, American Psychological Association and National Council on Measurement in Education. Standards for educational and psychological testing. Washington, DC: American Educational Research Association, 2014.

18. Cook, Kuper, Hatala, Ginsburg. When assessment data are words: validity evidence for qualitative educational assessments. Acad Med 2016; 91: 1359–1369.

19. Cook, Beckman. Current concepts in validity and reliability for psychometric instruments: theory and application. Am J Med 2006; 119: 166.e7–166.e16.

20. Woolf, Rich, Viney, Rigby, Needleman, Griffin. Fair training pathways for all: understanding experiences of progression, London: General Medical Council, 2016.

21. Braun, Clarke. Using thematic analysis in psychology. Qual Res Psychol 2006; 3: 77–101.

22. Mountford-Zimdars A, Sabri D, Moore J, Sanders J, Jones S and Higham L. Causes of differences in student outcomes. Higher Education Funding Council for England, Bristol, 2015.

23. QSR. NVivo qualitative data analysis Software Version 10. Version 10 ed. QSR International Pty Ltd. See http://www.qsrinternational.com/ (2012, accessed 4 November 2015).

24. Laskaratos F-M, Gkotsi, Panteliou. A critical review of the core medical training curriculum in the UK: a medical education perspective. JRSM Open 2014; 5: 2042533313514049.

25. Dormandy, Laycock. Triumph of process over practice: changes to assessment of physicians. BMJ Careers. http://careers.bmj.com/careers/advice/Triumph_of_process_over_practice:_changes_to_assessment_of_physicians (2015, accessed 4 April 2016).

26. Goodyear, Wall, Bindal. Annual review of competence: trainees’ perspective. Clin Teach 2013; 10: 394–398.

27. Reid. Transforming postgraduate medical training in the NHS. BMJ Careers. http://careers.bmj.com/careers/advice/Transforming_postgraduate_medical_training_in_the_NHS (2016, accessed 12 September 2016).

28. ten Cate. Nuts and bolts of entrustable professional activities. J Grad Med Educ 2013; 5: 157–158.

29. Lavelle-Jones M. Letter to Secretary of State for Health Jeremy Hunt. Royal College of Surgeons of Edinburgh. See https://www.rcsed.ac.uk/media/414502/hunt.pdf (2016, accessed 12 August 2016).

30. Tooke. Aspiring to excellence: final report of the independent inquiry into modernising medical careers, London: MMC Inquiry, 2008.