Abstract
Background
Faecal immunochemical tests (FITs) in the assessment of patients presenting with symptoms have generally used a single sample. Little evidence pertains to the use of replicate FIT, where a number of tests are done prior to decision-making, or repeat FIT, where additional FITs are performed following clinical decision-making. Overwhelmingly, research has focussed on FIT to help identify colorectal cancer (CRC). The aim of this review is to assess the available literature concerning replicate and repeat FIT in symptomatic patients to help generate consensus and guide future research.
Methods
The terms ‘faecal immunochemical test’ or ‘FIT’ were combined with ‘multiple’ or ‘repeat’. EMBASE, Medline and PubMed database and other searches were conducted. All papers published in English were included with no exclusion date limits until November 2021.
Results
Of the 161 initial papers screened, seven were included for review. Qualitative and quantitative FIT outcomes were assessed in the studies. The primary aims of most related to whether replicate FIT increased diagnostic yield of CRC, with colonoscopy used as the reference standard. One publication assessed the impact of a new COVID-adapted pathway on CRC detection. No consensus on replicate FIT was apparent. Some concluded that replicate FITs may help minimise missed CRC diagnoses; others showed no increase in diagnostic yield of CRC.
Conclusions
Current evidence on replicate and repeat FIT is both minimal and conflicting. FIT is a superb clinical tool, but significant gaps surrounding application remain. Further studies relating to replicate and repeat FIT are required.
Introduction
The advent of faecal immunochemical tests (FITs) has revolutionised asymptomatic colorectal cancer (CRC) screening since becoming the most widely used non-invasive investigation. 1 Further, as a clinical adjunct, FITs have proved invaluable in the assessment of patients presenting in primary (and secondary) care with lower gastrointestinal symptoms, which account for 10% of consultations with general practitioners. 2 Through measuring faecal haemoglobin concentration (f-Hb), numerous studies have highlighted quantitative FIT as a highly sensitive test that can be used to identify those symptomatic patients with a higher risk of significant colorectal disease (SCD). 3 Evidence from prospective studies has demonstrated the successful incorporation of FIT in referral pathways for patients with low and high-risk symptoms.4–6
The finding of a f-Hb below a chosen threshold can be reassuring that a symptomatic patient is unlikely to have SCD. Mowat et al. concluded that FIT may have a role as a risk assessment tool, stating that, ‘FIT in conjunction with clinical assessment, can safely and objectively determine individual risk of CRC and decide on simple reassurance or urgent or routine referral’. 7 Studies have demonstrated high negative predictive values, which indicate that there may be a role for FIT to be employed as a rule-out investigation. 8 However, a f-Hb below the threshold applied in practice does not totally exclude the risk of a patient having SCD.
The use of faecal immunochemical tests in the assessment of symptomatic patients has centred on a single test. Little evidence pertains to the use of replicate FIT, where a number of tests are done prior to decision-making, or repeat FIT, where one or more additional FITs are performed following clinical decision-making. In their comprehensive recent review on FIT, Pin-Vieito et al. document that there is little evidence to date on replicate and repeat FITs and simply comment, ‘there is the option of using more than one FIT determination’. 2
To date, research has focussed primarily on FIT to help identify CRC in the screening population. 9 Although of considerable value in the assessment of SCD, some gaps in knowledge remain as to whether replicate or repeat FITs can play a reassuring role to those with undetectable f-Hb or f-Hb below the threshold applied. As Benton and Fraser 10 state, in their editorial, ‘a current problem is that there is no objective evidence to support or refute the use of repeat FITs in patients with f-Hb <10 μg Hb/g faeces’. Questions remain as to whether a single f-Hb result below the set threshold is sufficient to guide clinical practice and offer reassurance to those patients who have presented with symptoms. Limited evidence currently exists in assessing whether replicate and/or repeat FITs can improve diagnostic accuracy. The impact of the heterogeneity of haemoglobin in faeces in the context of multiple FIT samples from the same patient is also not fully understood.
With ever-mounting numbers of two-week wait (2-WW) referrals (using NICE NG12 guidelines 11 ) to secondary care in England without parallel increases in CRC detection pre-pandemic, strategies to address this important issue are required. 12 The aim of this review is to assess the available literature concerning replicate and repeat FITs in symptomatic patients to help generate consensus and guide future research.
Methods
The terms ‘faecal immunochemical test’ or ‘FIT’ were combined with ‘multiple’ or ‘repeat’. EMBASE, Medline and PubMed database and other searches of websites, citations and abstracts (highlighted from relevant publications) were conducted. The searches were performed to find publications in all languages up until November 2021, with no exclusion date limits; all papers published in English relating to humans were included. Only studies relating to symptomatic patients were included for analysis; studies involving screening were excluded. Papers using either qualitative or quantitative FIT methods were included. Where studies made anecdotal reference to small numbers (<20) of replicate or repeat tests, these were included in our Discussion rather than the main Results section. The PRISMA diagram 13 (Figure 1) outlines the searches undertaken.
Figure 1. Database and other searches PRISMA diagram.
Searches relating to ‘faecal haemoglobin’ and ‘faecal occult blood test’ and ‘multiple’ or ‘repeat’ did not yield any additional studies for inclusion to those including ‘faecal immunochemical test’ or ‘FIT’. The terms used for the searches were selected as they were considered to represent the most common nomenclature and would therefore highlight the greatest number of relevant works. ‘Replicate’ was not searched for, as this expression in the context of FIT is novel.
Two independent reviewers (NGF and CGF) undertook the searches, prior to evaluation and the establishment of a final list of articles that met the inclusion criteria. Given the limited data relating to the subject, one relevant poster abstract identified via the database searches was incorporated.
Results
Data from publications included from searches for detailed analysis.
ACRN: advanced colorectal neoplasia; HGD: high-grade dysplasia; CRC: colorectal cancer; f-Hb: faecal haemoglobin concentration.
Auge et al. 14 investigated a cohort of 208 symptomatic patients awaiting colonoscopy. FIT was performed on two consecutive bowel motions prior to this investigation. Samples were collected (using Kyowa-Medex Co, Ltd devices) and analysed using the fully automated analyser HM-JACKarc (Kyowa-Medex Co, Ltd, Tokyo, Japan). The limit of detection of this analyser is 0.6 μg Hb/g faeces, whilst the limit of quantitation is 7 μg Hb/g faeces. The authors evaluated the first f-Hb result (FIT/1) and the higher f-Hb result of the two FIT samples (FIT/max). The study demonstrated that FIT/1 and FIT/max were both significantly higher in patients with advanced colorectal neoplasia (CRC plus advanced adenomas) than in those with low-risk adenoma, other less significant colorectal findings, or normal colonoscopy. At a threshold of 10 μg Hb/g faeces, sensitivity and specificity were 34.5% and 87.2%, respectively. It was concluded that the diagnostic yield of two FIT samples could be achieved with one, although a lower f-Hb threshold of 10 μg Hb/g faeces rather than 20 μg Hb/g faeces would be required.
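As an illustration of how FIT/1 and FIT/max can be compared at a fixed threshold, the following minimal Python sketch computes sensitivity and specificity for each. The paired f-Hb values and outcome labels are invented purely for illustration and are not data from Auge et al.; the function and variable names are likewise hypothetical.

```python
def sens_spec(values, has_disease, threshold):
    """Return (sensitivity, specificity), counting values >= threshold as positive."""
    tp = sum(1 for v, d in zip(values, has_disease) if d and v >= threshold)
    fn = sum(1 for v, d in zip(values, has_disease) if d and v < threshold)
    tn = sum(1 for v, d in zip(values, has_disease) if not d and v < threshold)
    fp = sum(1 for v, d in zip(values, has_disease) if not d and v >= threshold)
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical paired results (first sample, second sample), in ug Hb/g faeces.
pairs = [(4, 25), (150, 90), (0, 0), (12, 3), (0, 8), (30, 45), (2, 2), (9, 22)]
# Hypothetical reference standard: True = advanced colorectal neoplasia at colonoscopy.
acn = [True, True, False, False, False, True, False, True]

fit_1 = [first for first, _ in pairs]     # first result only
fit_max = [max(pair) for pair in pairs]   # higher of the two results

sens_1, spec_1 = sens_spec(fit_1, acn, 10)
sens_max, spec_max = sens_spec(fit_max, acn, 10)
```

At any fixed threshold, taking the higher of two results can only keep or increase the number of positive patients, so sensitivity cannot fall and specificity cannot rise relative to the first sample alone. This is why a single FIT at a lower threshold is the natural comparator for two FITs at a higher one.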
Högberg et al. 15 evaluated all CRC and adenomas with high-grade dysplasia from 2005 to 2009 in a county in Sweden. Three separate qualitative FITs were performed in 160 cases. A qualitative FIT, visually read dipstick-type test (Oy Medix Biochemica Ab, Kauniainen, Finland) with a manufacturer quoted detection limit of 25–50 μg Hb/g faeces was used. The samples were taken from three consecutive bowel motions. Of the 160 patients included, 139 had at least one positive result. Within this group, the likelihood of finding a positive test result when only one FIT was performed was 0.91. This would potentially mean a missed diagnosis in 13 of 139 cases. Four cases could potentially have been missed when two FITs were assessed and none when three FITs were assessed. Equally, in 21 of 160 cases, all three FIT results were negative, leading the authors to suggest that relying on a single FIT is insufficient.
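The arithmetic behind these figures can be checked with a short Python sketch. It uses only the numbers quoted above and assumes that the reported likelihood of 0.91 applies to the 139 cases with at least one positive result; because that figure is rounded, this is a consistency check rather than a re-analysis, and the variable names are illustrative.

```python
cases_total = 160          # cases in which three qualitative FITs were performed
cases_any_positive = 139   # cases with at least one positive result
p_single_positive = 0.91   # reported likelihood of a positive result with one FIT

# Cases in which all three FITs were negative.
all_negative = cases_total - cases_any_positive

# Expected number of the 139 cases missed if only one FIT had been used.
expected_missed_single = round(cases_any_positive * (1 - p_single_positive))

# Fraction of all 160 cases detected by at least one of three FITs.
three_sample_positive_fraction = cases_any_positive / cases_total
```

Under that assumption the sketch gives 21 all-negative cases and about 13 cases missed by a single FIT, matching the text, and shows that roughly 87% of the 160 cases had at least one positive result across three samples.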
Turvill et al. 16 conducted a prospective blinded single centre observational study. Two separate faecal samples were assessed with FIT and faecal calprotectin with a view to determining whether repeat biomarkers improved diagnostic accuracy for CRC or clinically significant disease (CSD). A cohort of 474 patients was recruited. Samples were collected using devices and examined with a turbidimetric analytical system (Kyowa-Medex). The limit of quantitation of the analyser was 7 μg Hb/g faeces. The optimal threshold for CRC diagnosis was ≥12 μg Hb/g faeces (sensitivity of 84.6% and specificity of 88.5%) for a single FIT. For two FITs, with either sample positive, a threshold of ≥43 μg Hb/g faeces was advocated; however, with both samples positive, a threshold of f-Hb ≥2 μg Hb/g faeces was noted as optimal. No benefit was shown using two FITs, two faecal calprotectin estimations, or a combination of both in improving diagnostic accuracy when compared to a single FIT. The conclusions drawn support the use of single FIT rather than replicate FITs.
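The three decision rules described here (single FIT, either of two samples positive, both samples positive) can be written as a minimal Python sketch. The thresholds are those quoted above; the function names and the example pair of results are hypothetical and are not taken from the study.

```python
def single_positive(first, second, threshold):
    """Single-FIT rule: only the first result is considered."""
    return first >= threshold

def either_positive(first, second, threshold):
    """'Either sample positive' is equivalent to testing the higher result."""
    return max(first, second) >= threshold

def both_positive(first, second, threshold):
    """'Both samples positive' is equivalent to testing the lower result."""
    return min(first, second) >= threshold

# Hypothetical patient with f-Hb of 5 and 50 ug Hb/g faeces on two samples.
first, second = 5, 50
by_single = single_positive(first, second, 12)   # threshold reported for one FIT
by_either = either_positive(first, second, 43)   # threshold reported for either positive
by_both = both_positive(first, second, 2)        # threshold reported for both positive
```

Because the 'either' rule tests the maximum and the 'both' rule tests the minimum, it is unsurprising that their optimal thresholds lie on opposite sides of the single-FIT threshold. In this invented example the patient is negative on the single-FIT rule but positive on both two-sample rules.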
Douglas et al. 17 assessed the value of replicate FITs in 746 symptomatic patients referred to a tertiary colorectal centre. The median interval of collection between the two FIT samples was 17 days and a threshold f-Hb of ≥80 μg/g faeces, taken from the Scottish Bowel Cancer Screening Programme, was used. In total, 86.3% of patients had two FIT results <80 μg/g faeces, with 5.2% having an initial sample ≥80 μg/g faeces and a second <80 μg/g faeces and 3.1% having an initial FIT <80 μg/g faeces and a second ≥80 μg/g faeces. This descriptive study demonstrates heterogeneity of results, with variability observed between first and second FIT results but without improving pathology diagnosis. The authors report that ‘Pearsons’ correlation coefficient between the first and second FIT result was 0.538, showing “moderate test-retest reliability.”’ It should be noted that these data have been taken from a poster abstract rather than a peer-reviewed publication.
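To make the two summary measures used in this abstract concrete, the sketch below shows one way a Pearson correlation coefficient and the threshold-based concordance categories could be computed from paired results. The paired values are invented for illustration and do not reproduce the study data; the function names and category labels are hypothetical.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def concordance_counts(pairs, threshold):
    """Count patients in each of the four first/second result categories."""
    counts = {"both_below": 0, "first_only": 0, "second_only": 0, "both_at_or_above": 0}
    for first, second in pairs:
        if first >= threshold and second >= threshold:
            counts["both_at_or_above"] += 1
        elif first >= threshold:
            counts["first_only"] += 1
        elif second >= threshold:
            counts["second_only"] += 1
        else:
            counts["both_below"] += 1
    return counts

# Hypothetical paired f-Hb results (first, second), in ug Hb/g faeces.
pairs = [(0, 5), (10, 0), (120, 40), (30, 95), (200, 310), (0, 0), (60, 20), (15, 85)]

r = pearson_r([p[0] for p in pairs], [p[1] for p in pairs])
counts = concordance_counts(pairs, 80)
```

A correlation coefficient summarises agreement across the whole range of f-Hb values, whereas the concordance counts describe only whether the two results fall on the same side of the chosen threshold; the two measures can therefore give different impressions of test-retest reliability.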
A further study by Högberg et al. 18 assessed the usefulness of three qualitative FITs in primary care when diagnosing CRC in 2027 patients with and without a history of rectal bleeding. Qualitative FIT Actim Fecal Blood tests (Actim, Espoo, Finland), with a manufacturer quoted detection limit of 25–50 μg Hb/g faeces, were used with samples taken from consecutive bowel motions. This retrospective study demonstrated that the diagnostic performance of qualitative three-sample FITs was similar in those patients with and without a history of rectal bleeding. It was stated that FITs may be useful to prioritise patients for further investigations since they gave better outcomes than a history of rectal bleeding alone.
Miller et al. 19 evaluated their centre’s COVID-adapted pathway to help diagnose CRC in symptomatic patients during the COVID pandemic. This study evaluated the impact of a change in practice to the triage pathway with 442 patients included. Both replicate and repeat FITs were incorporated into a novel complex investigative safety-netting strategy alongside colonoscopy and CT. The three arms in this study were ‘High risk symptoms, FIT + CT’, ‘Rectal mass, OP clinic’ and ‘Low risk symptoms, FIT only’. The HM-JACKarc analytical system (Hitachi Chemical Diagnostics Systems Co, Ltd, Tokyo, Japan) was used to analyse samples with a quantitation limit of 10 μg Hb/g faeces. Replicate FITs were used in patients with no ‘high risk’ symptoms where the initial FIT was ≤79 μg Hb/g faeces and in those patients with an initial f-Hb of 80–399 μg Hb/g faeces alongside a CT. Repeat FIT was utilised in patients with ‘high-risk’ symptoms (iron deficiency anaemia, persistent diarrhoea and abdominal mass) who had already returned a FIT sample and undergone CT which did not demonstrate pathology. It was reported that seven of 13 CRC from the cohort were diagnosed via the high-risk symptoms (CT and FIT) arm, two from the low-risk symptoms (FIT only) arm and four from the rectal mass (outpatient clinic) arm.
Chapman et al. 20 highlight 114 patients who returned two FIT samples from the same bowel motion within their study assessing the heterogeneity of results between two FIT analyser systems (OC-Sensor and HM-JACKarc). This subgroup analysis was to enable sensitivity analysis for the OC-Sensor FIT results. With thresholds of greater than or equal to 4, 10 and 150 μg Hb/g faeces for both tests, an agreement of 90.4% (Cohen’s kappa = 0.80), 96.5% (Cohen’s kappa = 0.91) and 100% (Cohen’s kappa = 1.00), respectively, was noted. The authors stated that two OC-Sensor results on a single sample agreed much more closely in f-Hb than OC-Sensor versus HM-JACKarc results.
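For readers less familiar with these statistics, the following minimal Python sketch shows how percentage agreement and Cohen’s kappa could be derived from paired results once each is classified as positive or negative at a threshold. The paired values are invented for illustration and are not data from Chapman et al.; the function name is hypothetical.

```python
def agreement_and_kappa(results_a, results_b, threshold):
    """Return (observed agreement, Cohen's kappa) for paired results at a threshold."""
    a = [v >= threshold for v in results_a]
    b = [v >= threshold for v in results_b]
    n = len(a)
    observed = sum(1 for x, y in zip(a, b) if x == y) / n
    p_a_pos = sum(a) / n
    p_b_pos = sum(b) / n
    # Agreement expected by chance if the two classifications were independent.
    expected = p_a_pos * p_b_pos + (1 - p_a_pos) * (1 - p_b_pos)
    if expected == 1:
        return observed, float("nan")  # kappa is undefined when chance agreement is total
    return observed, (observed - expected) / (1 - expected)

# Hypothetical paired f-Hb results from two samples of the same bowel motion.
sample_a = [0, 3, 5, 12, 40, 200, 8, 2, 15, 0]
sample_b = [0, 5, 3, 9, 55, 180, 11, 2, 20, 1]

agree_10, kappa_10 = agreement_and_kappa(sample_a, sample_b, 10)
agree_150, kappa_150 = agreement_and_kappa(sample_a, sample_b, 150)
```

Kappa corrects the raw agreement for the agreement expected by chance, which is why it is lower than the percentage agreement whenever the two results disagree at all. In both the invented data and the reported figures, agreement improves as the threshold rises, because fewer results lie close enough to the threshold to be reclassified by sampling variation.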
Discussion
It is evident from the comprehensive searches performed that there are few studies published on the topic of repeat and/or replicate FITs and that no consensus exists. The notion that a single FIT offers sufficient accuracy in aiding the diagnosis of CRC is refuted by Högberg et al. 15 who conclude that ‘use of a one-sample POC FIT instead of three-sample POC FIT as a diagnostic aid in primary care may result in the missing of one-tenth of symptomatic CRCs and adenomas with HGD’. However, as highlighted previously, the utilisation of qualitative FITs with a higher threshold (of 50 μg Hb/g faeces) in a small, defined population with known pathology has significant drawbacks. The authors do acknowledge the alternative of a single FIT with a ‘very low’ threshold, but this would require quantitative FIT away from primary care.
In contrast, the publications by Auge et al. 14 and Turvill et al. 16 support the use of a single rather than replicate FIT. Both studies aimed to determine whether replicate FITs impact the diagnostic yield of CRC and concluded that multiple FITs did not provide any additional benefit. Utilisation of lower thresholds of ≥10 μg Hb/g faeces and ≥12 μg Hb/g faeces, respectively, for single FIT delivered comparable results to two FITs. Similarly, the conclusions of Douglas et al. 17 support these works; however, with so few (six) CRC detected, it is difficult to determine the true impact of replicate FITs from this study.
This evidence should be contextualised since both studies (Auge et al. 14 and Turvill et al. 16 ) incorporate small numbers of patients and CRC. When comparing outcomes to the results of Högberg et al. 15 the differing methodology requires consideration given the unmatched cohort groups. Only patients with confirmed pathology (CRC and high-grade adenomas) were included in that study, whereas all symptomatic patients were represented in the study groups in the other publications. Equally, the qualitative tests used in the Högberg et al. 15 study have higher detection limits of 25–50 μg Hb/g faeces compared to the thresholds of the quantitative FITs used in other studies. Similarly, the settings differed, with primary care, point of care qualitative testing being utilised by Högberg et al. 15 whereas quantitative FIT was undertaken in the other studies. Such factors inherently reduce the ability to compare in a like for like fashion given the mismatched aims, methodology and groups of the respective studies.
Given the paucity of data in symptomatic patients concerning replicate and repeat FITs, it is worth considering the literature from screening studies. Hernandez et al. 21 highlight a methodology similar to Auge et al., 14 in which two separate samples were taken from consecutive bowel motions prior to colonoscopy, with the first result and the higher of the two results recorded. Varying thresholds ranging from ≥50 to ≥200 μg Hb/g faeces were assessed. The authors determined that two FITs did not improve diagnostic accuracy whilst the number needed to scope 14 and cost were increased. Such thresholds are similar to those utilised by Högberg et al. 15 in their qualitative FIT study; however, conflicting outcomes are noted. Other studies such as those by Kapidzic et al. 22 and Schreuders et al. 9 support the conclusions of Hernandez et al., with single FIT preferable to multiple in the screening population.
The advent of COVID has required a dramatic shift away from previously established pathways in the triage and work up of symptomatic patients, particularly with endoscopy being rendered impractical and unsafe for a number of months. Consequently, healthcare systems have had to adapt and look for alternative approaches to overcome these challenges. FIT has proved a valuable adjunct during this time with a number of studies using FIT as part of strategies to triage patients.23–25
Miller et al. 19 demonstrated novel use of FIT with both replicate and repeat FITs incorporated into their triage pathway. The study concluded that the COVID-adapted pathway proved an effective tool to help mitigate risk during a time when access to endoscopy services was severely limited. Despite a substantial reduction in referrals, detection of neoplasia did not drop. The authors acknowledge various limitations to their study including the inability to validate FIT since there is no reference standard. In addition, the authors recognised that the threshold generally used to define a ‘negative’ result in the symptomatic population is lower than the value of <80 μg/g faeces used in their pathway. Consequently, the authors state, ‘the use of FIT alone could miss cancers in some patients’. Maeda et al., 26 from the same group, considered further assessment of this pathway in correspondence published in the British Journal of Surgery. The article highlighted the limited validity of FITs as a risk stratification tool, citing the ‘considerable variation in interval double-test FIT values’ resulting in a five percent potential enrichment when compared to single FIT.
The COVID pandemic has increased interest in repeat testing strategies involving FIT on national levels and this has been reflected in guidelines issued by governments.27,28 Such guidance is particularly valuable when results are equivocal, indeterminate or in safety-netting strategies. NHS England 27 stated that ‘symptomatic patients with a FIT <10 μg Hb/g faeces may be observed for the time being without colonoscopy but should have a repeat test within three months’. The Scottish Government advocated that ‘If a GP has a patient with a FIT result <10 μg Hb/g faeces, but with persistent symptoms, a primary care review within six weeks is recommended, or if there is still doubt as to whether or not to refer. A repeat FIT may be of value’. 28 The formulation of such guidance during the pandemic demonstrates how healthcare providers have had to adapt to cope with additional strain on services whilst attempting to mitigate risk. Nevertheless, the subtle differences in recommendations merely highlight the lack of clear universal evidence to support such policies currently. There is evidence that the pandemic has resulted in fewer referrals, CRC diagnoses and patients being treated. 29 Thus, these strategies require contextualisation and careful evaluation before determining whether such policies can be deemed successes.
A few publications mention the anecdotal use of repeat FITs in routine practice regarding the assessment of patients presenting with symptoms.30,31 Chapman et al. 30 highlight 11 patients with repeat FITs, all of whom returned two samples <4 μg Hb/g faeces. Byun et al. highlight seven patients on whom FIT was performed more than once from their small retrospective audit; however, data relating to time intervals between tests were not provided. 31 Five patients received two tests and two patients received three tests. One patient had f-Hb above the threshold initially only to have f-Hb below the threshold three weeks later. Byun et al. state that the reason for repeating FITs was for reassurance. This sub-group of patients undergoing repeat FITs in both primary and secondary care is of particular interest. Both studies represent a very small subgroup of repeat tests from within larger studies; therefore, it is difficult to draw any valid conclusions as to their value in this context. The notion that test repetition helps offer clinicians and patients added assurance warrants further exploration.
Jung et al. 32 concluded in a recent review and meta-analysis assessing the impact of antiplatelet agents and anticoagulants on the performance of FITs that their use adversely affected the positive predictive value (PPV). They stated that the ‘false-positive’ rate may be high in individuals receiving anti-thrombotic agents. It is suggested that, for individuals with positive FIT results, efforts should be taken to reduce the number of unnecessary colonoscopies by improving the accuracy of FIT by repeating tests. However, there appears to be no evidence to support this proposal.
Large screening population studies have demonstrated variation in repeat FIT results in participants found to have advanced neoplasia. 33 Discordance in FIT results is also seen in symptomatic patients resulting in varying suggested thresholds to investigate. Given that faeces is a heterogeneous matrix that will vary from one sample to another, or even within a single sample, multiple tests carried out in the same manner from the same patient may differ.
As evidenced by the studies discussed in this review, different FIT systems with different thresholds for a positive test result were employed; consequently, inter-study comparisons are limited by these factors when interpreting outcomes. The impact of differences in specimen testing method and FIT system are highlighted by Chapman et al. 20 The authors report discrepancies between the different manufacturers, stating ‘differences (between OC-Sensor and HM-JACKarc, the two most commonly used FIT systems in the United Kingdom) are not simply due to sampling variation within the bowel motion. Wide variations did however still occur, in both settings, highlighting the importance of repeat testing if concerns still exist’. Such findings are not unexpected, given the lack of FIT method harmonisation and pre-analytical variations. The call for repeat testing in this context relates to the diagnostic accuracy of each device rather than clinical questions posed by the studies included in this review.
Some doubt has been expressed over the reliability of repeat FITs. Such concerns have been raised specifically within the bowel cancer screening community and their significance to symptomatic patient groups currently remains unknown. It has been suggested that the addition of another biomarker, such as urinary volatile organic compounds (VOC), might offer a better strategy. 34 Widlak et al. 35 suggest a two-stage process using VOC testing in patients with negative FIT results may be superior to repeat FIT results since sensitivity for CRC is improved.
As evidenced by the current body of work concerning replicate and repeat FITs, the primary objective for these studies has been CRC detection. However, the role of replicate and repeat FITs may best serve those patients with negative initial results. There is uncertainty at present about the further management of high-risk symptomatic patients who have f-Hb below the threshold applied. Current options include, in England, sending a 2-WW referral, with the FIT result used to guide decision making in secondary care or safety netting of these patients in primary care. Similar strategies could be adopted in other countries. By returning additional samples, a more comprehensive safety-netting process could be implemented or additional reassurance provided to patients in primary or secondary care. Such practice could have wide reaching impact particularly on 2-WW pathways, referral numbers, endoscopy demands and other imaging investigations.
There is currently no body of evidence as to whether repeat FITs have value as part of a safety-netting approach, or to support such decision-making. In the editorial by Benton and Fraser, 10 a number of valuable suggestions are made regarding investigation of this group, in particular the time-frame between samples, the number of repeat FITs undertaken and the threshold for providing reassurance. Such areas will offer researchers important topics to guide future studies and address key questions relating to FIT and its ongoing application.
It is already known that returning a FIT with f-Hb below the threshold can effectively rule out CRC. 8 However, further guidance relating to acceptable thresholds for reassurance alone, non-urgent follow up and urgent assessment is needed. As Miller et al. 19 have demonstrated, multiple FITs can be used practically and safely even when high thresholds are employed.
Patients with detectable f-Hb, but below the threshold for referral, may represent a sub-population of FIT negative patients in whom a repeat f-Hb may be of value as part of safety-netting. 36 Such a scenario may occur when higher thresholds are set to meet escalating demand for endoscopy services. Turvill et al. 16 identified an optimal threshold value of ≥2 μg Hb/g faeces for replicate FITs in two samples (with both samples FIT positive) which is at the limit of detection. However, further work evaluating the economic feasibility and practicality of such a pathway is needed.
Currently, the application of FIT is recommended in the recent Get It Right First Time (GIRFT) report on gastroenterology. 37 It is possible that a different FIT based strategy may be needed for patients with different symptomatology and this should be a focus of future work to inform guidance and clinical practice. Consideration of factors such as anaemia, rectal bleeding, rectal examination and weight loss could be incorporated to help further stratify patients and associated risk. However, additional evaluation is required to determine whether such strategies are feasible, safe and risk averse.
As evidenced by the studies discussed in this review, the application of replicate and repeat FITs has been diverse, from retrospective qualitative point of care studies, to prospective quantitative clinical studies to risk mitigating novel pathways. Future work is required relating to both replicate and repeat FITs in symptomatic patients. There is a need for larger more robust clinical trials to be established to help determine whether multiple FITs can help reassure those symptomatic patients with a negative (single) FIT result and further triage the low positive (2–100 μg Hb/g faeces) FIT result patients. Such data will help contribute towards national guidance and impact on referral pathways, healthcare budgets and provide clarity to primary care physicians referring onwards.
Conclusion
This review highlights an important area within the topic of CRC and SCD detection in symptomatic patients. Current evidence on replicate and repeat FITs is both minimal and conflicting, resulting in more questions than answers. It is evident that FIT is a superb additional clinical tool, but undoubtedly there are significant gaps in understanding surrounding its application.
Looking to the future, further studies on replicate and repeat FITs are required to assess the potential of this approach to streamline pathways, reduce pressure on endoscopy and provide reassurance to patients and healthcare providers.
Footnotes
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
Ethical approval
No approval required.
Guarantor
SCB.
Contributorship
NGF, CGF and SCB conceived the study. NGF, CGF and SCB researched the literature. NGF and CGF undertook data analysis. NGF and CGF wrote drafts of the manuscript. All authors (NGF, CGF, WM, IJ, TR and SCB) reviewed and edited the manuscript and approved the final version of the manuscript.
