
Dear Editor,
We read the publication by Amos et al. [1] with surprise and concern. Changing the College's assessment program is complex and requires informed and nuanced conversation. The College has engaged stakeholders through several stakeholder forums (SHFs).
We were, therefore, surprised to see the Journal publish a paper that discusses only the measurement aspects of assessment. That does a disservice to the complexity of the matter and to the College's careful approach, while at the same time publicly casting doubt on the competence of recently 'fellowed' psychiatrists.
The publication is rife with misunderstandings and incorrect assumptions about assessment. For an extensive review of these issues, see Sidhu and Fleming's critical narrative review [2]. In this short letter, we can only indicate the main errors.
For example, the authors confuse predictive validity with construct validity. Predictive validity is not useful in assessment, for reasons identified by Cronbach and Meehl in 1955 [3]; construct validity has been used universally instead [4]. Predictive validity fails because there is no single, measurable gold standard: competence, like 'health', is far too complex to be captured in a single number ('you are 42% healthy').
In the context of assessment, the notion of a false positive or false negative is useful mainly as an illustration, not as an actual calculation. The OSCE data presented at the SHFs were, therefore, about measurement imprecision and were based on actual rather than assumed data.
As a result, the authors confuse the Standard Error of Measurement (SEM) used in the SHF communiqué with a conventional comparison of distribution means. Comparing the means of vastly different assessments (one based on a single measurement, the other on longitudinal assessment and feedback) is not informative and can lead to harmful misconceptions. A more meaningful comparison would have been between the proportions of measurement error of the OSCE and the AAP.
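For readers less familiar with the statistic, the classical test theory definition of the SEM (a standard formulation, not necessarily the exact calculation used in the SHF communiqué) is

$$\mathrm{SEM} = SD_x \sqrt{1 - r_{xx}},$$

where $SD_x$ is the standard deviation of observed scores and $r_{xx}$ is the reliability coefficient of the assessment. The SEM describes the imprecision around an individual candidate's observed score; it is not a statement about differences between the mean scores of two cohorts, which is why the two quantities cannot be compared directly.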
The authors also cite selectively. For example, Prentice et al. published not only a literature review but also a meta-regression of national data showing a large positive effect of early assessment and intervention on learning outcomes [5].
Finally, the authors confuse the learning effects of cramming for a single examination with learning for long-term retention and application in practice. A substantial body of research shows a large difference in favour of the latter [6]. Longitudinal assessment with feedback leads to better learning outcomes.
Simply comparing pass rates is, therefore, comparing apples and oranges.
