Abstract
Introduction:
Some studies suggest that the accuracy of
Methods:
A systematic search was carried out in seven databases until November 2019. We collected or calculated true and false positive and negative values, and constructed 2×2 diagnostic contingency tables with reference standards including histology, rapid urease test, urea breath test, serology, stool antigen test, culture, and polymerase chain reaction. We ranked the index tests by the superiority indices (SI) and calculated pooled sensitivity and specificity of each test.
Results:
Our search yielded 40 eligible studies with 27 different diagnostic strategies for
Conclusion:
Use of combined tests may have a rationale in clinical practice due to their higher sensitivities. The differences between the included DTA studies limited the comparison of the testing strategies.
Introduction
Peptic ulcer bleeding (PUB) is the most frequent cause of acute nonvariceal upper gastrointestinal (GI) bleeding,1–4 and has a reported mortality of between 11% and 13.1%.5,6
An optimal testing strategy would be desirable; however, the international guidelines on PUB do not provide clear guidance for clinicians concerning Helicobacter pylori infection (HPI) testing in the acute setting. The American College of Gastroenterology (ACG) guidelines recommend biopsy-based testing, 4 whereas the European Society of Gastrointestinal Endoscopy (ESGE)1,2 does not specify a diagnostic method. The American Society for Gastrointestinal Endoscopy (ASGE) recommends the eradication of HPI, but it does not specify the method of testing. 3
There is a lack of guidance in the Maastricht/Florence V guideline on the ideal timing of eradication therapy after an acute episode of PUB, 11 and there are multiple logistical factors that hamper early testing and eradication.
Multiple previous studies and reviews reported a decreased diagnostic performance of HPI tests in PUB. The reasons include recent proton pump inhibitor (PPI) use, which can change the number and load of detectable organisms. In addition, intragastric blood contains albumin and plasma with bactericidal factors, which can interfere with the bacteria.11–13 A meta-regression reported that testing on the index admission underestimates the true prevalence of HPI in PUB, likely due to the decreased diagnostic performance. 14 Therefore, an optimal strategy or preference for the available tests is needed.
The meta-analysis of Gisbert
We aimed to assess the diagnostic performance of all HPI testing strategies in PUB in a diagnostic test accuracy (DTA) network meta-analysis.
Materials and methods
Protocol
A diagnostic accuracy meta-analysis and systematic review were planned using the Preferred Reporting Items for Systematic Review and Meta-Analysis (PRISMA) protocols for DTA studies. 18 The analysis was registered in advance on PROSPERO with registration number CRD42019113083, and the protocol was later updated as a network meta-analysis due to the significant variation between the comparisons of testing strategies. 19
Data sources and searches
We included studies from adult populations with PUB where index tests were compared with reference tests for identifying HPI. The outcomes were the diagnostic performance measures of the different diagnostic tests.
A systematic search was conducted in seven databases: Medline
Keywords for the computer-aided search were (bleed* OR haemorrhage OR hemorrhage OR haematemesis OR hematemesis OR melaena OR melena) AND (‘upper gastrointestinal’ OR ‘upper GI’ OR nonvariceal OR peptic OR gastric OR duodenal OR gastroduodenal OR antrum OR antral OR pylorus OR pyloric OR GU OR DU OR PU OR ulcer OR stomach OR curling) AND (helicobacter OR pylori). Additional articles were identified from the reference lists of primarily eligible studies.
Study selection
Records were managed with EndNote X7.4 software (Clarivate Analytics, Philadelphia, PA, USA). After exclusion of duplicates, the remaining studies were screened by title, abstract and finally full text by two independent authors (NV, ME). We calculated Cohen’s kappa coefficient to measure the agreement between the two raters (NV, ME) at the three levels (title, abstract and full text) of the selection process. 20 Disagreements were resolved by consensus and the involvement of the senior reviewer (BE).
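The agreement statistic described above can be illustrated with a minimal sketch. The function name `cohens_kappa` and the screening decisions below are hypothetical and serve only to show the arithmetic; they are not the review's data or the software the authors used.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who labelled the same records."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of records")
    n = len(rater_a)
    # Observed agreement: share of records with identical decisions.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical title-screening decisions for ten records.
reviewer_1 = ["include", "exclude", "exclude", "include", "exclude",
              "exclude", "include", "exclude", "exclude", "exclude"]
reviewer_2 = ["include", "exclude", "exclude", "exclude", "exclude",
              "exclude", "include", "exclude", "exclude", "exclude"]

kappa = cohens_kappa(reviewer_1, reviewer_2)
print(f"kappa = {kappa:.2f}")
```

In this made-up example the raters agree on nine of ten records, but because most records are excluded by both, chance agreement is already 0.62, so kappa is noticeably lower than the raw agreement.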
Eligibility criteria
All prospective, cross-sectional DTA studies with relevant information about the accuracy of any HPI diagnostic test without language restriction were included in our analysis. Articles without direct or indirect information on the true positives (TP), true negatives (TN), false positives (FP) and false negatives (FN) were excluded from the analysis. Conference abstracts were excluded after we discovered they did not contain enough information.
Data extraction
Data were extracted independently by two investigators (NV, ME) and populated manually into a purpose-designed Excel 2016 sheet (Office 365, Microsoft, Redmond, WA, USA). Data were collected on the year of publication, geographical location, study type, number of enrolled patients, and basic demographics (age, sex ratio). Most importantly, the raw data (TP, TN, FP, FN), the name, manufacturer, cut off value, biopsy site and timing of both index tests and reference standards were collected. Data about therapy after admission before a diagnosis of HPI, the timing of endoscopic examination, the bleeding source and the risk factors (smoking, alcohol consumption, nonsteroidal anti-inflammatory drug and aspirin use, history of GI bleeding and PUB) were also collected. Other relevant findings were mentioned in an additional column as free text. Disagreements were resolved by consensus and the involvement of the corresponding author.
Risk of bias and applicability
The revised tool for the Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2) was used for the quality assessment of the DTA studies, and the results of the assessment were presented graphically. 21 Risk of bias was assessed independently by two investigators (NV, ME). Disagreements were resolved by consensus and the involvement of the corresponding author.
Statistical analysis
We performed a DTA network meta-analysis to investigate which diagnostic method can be the best choice to detect HPI in PUB. This method allows us to make direct as well as indirect comparisons through a common comparator (i.e. the reference standard).
We collected the raw data of diagnostic tests, TP, TN, FP, and FN values and created 2×2 tables for each study. If raw data on the diagnostic accuracy were not provided, but detailed indirect data on the diagnostic performance were available, TP, TN, FP, FN were calculated. To assess the relative performance of a diagnostic test, we calculated pooled sensitivity and specificity for the index tests compared with the reference standard, and ranked them according to superiority indices (SI). The larger the SI, the more accurately a test is expected to predict the targeted condition compared with other tests. The network meta-analytical calculations were performed by the R programming language (R Core Team 2019, Vienna, Austria, R version 3.6.1) developed by Nyaga
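As a rough per-study illustration of the 2×2 tables described above, the sketch below computes sensitivity and specificity from raw counts and back-calculates the counts when only summary measures are reported. The class `TwoByTwo`, the helper `table_from_summary` and all numbers are hypothetical; the pooled estimates in the review came from the network meta-analysis model, not from this arithmetic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Index test result cross-tabulated against the reference standard."""
    tp: int  # index test positive, reference positive
    fp: int  # index test positive, reference negative
    fn: int  # index test negative, reference positive
    tn: int  # index test negative, reference negative

    @property
    def sensitivity(self):
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self):
        return self.tn / (self.tn + self.fp)


def table_from_summary(n_total, n_infected, sensitivity, specificity):
    """Back-calculate counts from reported summary measures.

    Counts are rounded to whole patients, so the result is only as
    precise as the reported sensitivity and specificity.
    """
    n_uninfected = n_total - n_infected
    tp = round(sensitivity * n_infected)
    tn = round(specificity * n_uninfected)
    return TwoByTwo(tp=tp, fp=n_uninfected - tn, fn=n_infected - tp, tn=tn)


# Hypothetical study reporting raw counts directly.
direct = TwoByTwo(tp=45, fp=5, fn=15, tn=35)
print(direct.sensitivity, direct.specificity)

# Hypothetical study reporting only sample sizes and accuracy measures.
indirect = table_from_summary(n_total=100, n_infected=60,
                              sensitivity=0.75, specificity=0.875)
print(indirect)
```

The sketch assumes at least one infected and one uninfected patient per study; a table with an empty row would need separate handling.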
To display the network, we constructed a graph where nodes represent different screening methods, and edges represent head-to-head comparisons. The size of nodes correlates with the number of studies. The thickness of edges represents the number of comparisons between the two tests. The potential nodes of the network were the single tests that had enough connections with other tests to allow statistical analysis in networks.
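The bookkeeping behind such a network graph can be sketched as follows: node size counts the studies that evaluated a test, and edge weight counts the studies that compared two tests head to head. The function `build_network` and the study list are hypothetical and do not reproduce the networks in the supplementary files.

```python
from collections import Counter
from itertools import combinations


def build_network(studies):
    """Count studies per test (nodes) and per pair of tests (edges).

    `studies` is an iterable in which each item lists the tests that
    were evaluated together in one study.
    """
    node_size = Counter()
    edge_weight = Counter()
    for tests in studies:
        unique = sorted(set(tests))  # ignore repeats, fix the pair order
        node_size.update(unique)
        for pair in combinations(unique, 2):
            edge_weight[pair] += 1
    return node_size, edge_weight


# Hypothetical set of three studies.
studies = [
    ["histology", "rapid urease test"],
    ["histology", "rapid urease test", "urea breath test"],
    ["histology", "serology"],
]

nodes, edges = build_network(studies)
for test, size in nodes.most_common():
    print(f"{test}: {size} studies")
for (a, b), weight in edges.items():
    print(f"{a} vs {b}: {weight} comparison(s)")
```

The two counters can be passed to any graph-drawing library to scale node sizes and edge thicknesses.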
Since we performed a network meta-analysis of diagnostic accuracy studies in R software, inconsistency should have been calculated instead of heterogeneity. However, since the conditions of node splitting analysis were not met, we were unable to assess inconsistency.
Results
Study selection
Our final statistical analyses included 40 observational cross-sectional studies.12,22–60 The study selection process with the Cohen’s kappa values is shown in Figure 1. Eligible studies were reported between 1998 and 2016 from four continents. The number of study participants ranged between 32 and 324. Characteristics of the studies included in our analysis are shown in Table 1.

Figure 1. PRISMA flow chart for the study selection procedure. K is Cohen’s kappa coefficient; a value between 0.81 and 1.00 indicates almost perfect or perfect agreement.
Table 1. Main characteristics of included studies.
Results of meta-analysis
The 40 included studies used 27 different definitions of the reference standard. In 28 articles, the reference standard was a combination of multiple tests. In 12 studies, the index tests were compared with a single testing method. We could form seven networks, each with one of the single tests (histology, rapid urease test, urea breath test, serology, stool antigen test, culture, polymerase chain reaction) serving as its gold standard.
The top three index tests in each of the seven networks, ranked by SI, are shown in Table 2. None of the index tests had significantly better diagnostic accuracy (SI between 2.17 and 9.94) compared with the other index tests, as all the confidence intervals included 1. Combined testing strategies had higher sensitivities (0.62–0.92) and lower specificities (0.46–0.85), while single tests had higher specificities (0.77–0.83) and lower sensitivities (0.42–0.73). Among the single tests, only the urea breath test against histology and culture against polymerase chain reaction had SI values ranked within the top three. The graphically displayed networks, the results of the full analysis and the rankings are detailed in Supplementary Files S1–S7. When we ranked the index tests by pooled sensitivity, only combinations of tests ranked in the top three in all seven networks. Ranking the index tests by pooled specificity placed nine single tests among the 21 top-three ranks. However, all sensitivity and specificity values had wide 95% confidence intervals (CIs), ranging between 0 and 1. The pooled specificity values with the corresponding CIs, with the top three specificity and sensitivity values highlighted, are shown in Supplementary Files S1–S7.
Table 2. Summary of results. CI, confidence interval.
Risk of bias and applicability assessment
With the use of QUADAS-2 assessment tool, overall, only two studies56,58 proved to be free of an unclear or high risk of bias. Five studies42,43,45,47,51 were found to carry an unclear risk of bias. The remaining 33 articles carried a high risk of bias. Detailed results of the risk assessment are shown in Table 3.
Table 3. Results of the study quality assessment. Legend: low risk; high risk; unclear risk of bias.
Discussion
Our results from the network meta-analysis of the eligible DTA studies could order the index tests for the detection of HPI in PUB based on their superiority index. Still, their wide confidence intervals could not prove this order beyond doubt. Combined index tests showed a tendency of higher sensitivity, while single index tests had higher specificity values when ranked.
Reasons for combined tests as diagnostic gold standards in the included studies
The majority of the included DTA studies (28 of 40) used a combination of multiple testing methods for HPI as a gold standard. None of these studies gave specific reasons for this approach, which seems to be an established strategy across the studies without sufficient evidence. Combining tests can increase the sensitivity of the testing. The use of this strategy in DTA studies is controversial, as it identifies more patients with HPI but compromises the validity of the results of the DTA, even more so as 15 of 28 DTA studies included the index test in their combined gold standard.
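The sensitivity gain from combining tests can be illustrated with a simple calculation. The sketch below assumes a combination that is called positive when any component test is positive and assumes the component tests err independently given the true infection status; neither assumption is taken from the included studies, and the accuracy values are hypothetical.

```python
def parallel_combination(tests):
    """Accuracy of an 'any positive counts' combination of tests.

    `tests` is a list of (sensitivity, specificity) pairs. Assumes the
    tests are conditionally independent given the true infection status.
    """
    prob_all_miss = 1.0   # probability that every test misses an infected patient
    prob_all_clear = 1.0  # probability that every test is negative in an uninfected patient
    for sensitivity, specificity in tests:
        prob_all_miss *= 1.0 - sensitivity
        prob_all_clear *= specificity
    return 1.0 - prob_all_miss, prob_all_clear


# Two hypothetical tests: (sensitivity, specificity).
combined_sens, combined_spec = parallel_combination([(0.70, 0.90), (0.60, 0.90)])
print(f"sensitivity = {combined_sens:.2f}, specificity = {combined_spec:.2f}")
```

Under these assumptions the combination detects more infected patients than either test alone, at the cost of more false positives, which mirrors the trade-off discussed in this section.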
We believe that HPI in PUB should be detected as soon as possible, and that this is preferable to delayed testing. With this approach, clinicians can maximise the number of identified patients who genuinely need eradication; at the same time, a small but increased proportion of patients will receive eradication unnecessarily (FPs). Given the significant risks of untreated HPI after PUB and the low risks of potential side effects (diarrhoea 1.6%; bloating or abdominal pain 1.3%; nausea or vomiting 0.4%) from the unnecessary antibiotic therapy, 61 the clinical approach of combined testing seems reasonable.
Another reason for the combined and more aggressive testing approach can be the risk of loss to follow up after hospitalisation for PUB. The previous Maastricht/Florence IV guideline recommended the initiation of eradication at the time of introduction of oral feeding, 62 arguing that a proportion of patients would be lost to follow up. In the study of Yoon
Reasons for missing the opportunity to test HPI during admission for PUB
Invasive tests requiring tissue sampling during endoscopy
The endoscopic procedures for patients presenting with PUB are often stressful and are done out of hours, when access to diagnostic tools may be limited. Endoscopic management of acute GI bleeding is likely associated with patient and operator fatigue, and results in poorer adherence to guidelines. After the successful endoscopic termination of an acute PUB, the endoscopist may feel that histological sampling could contribute to a recurrent episode of bleeding. Finally, poor visualisation due to residual blood during the intervention may prevent safe sampling.
Another essential clinical problem leading to reluctance to take biopsies is PUB aggravated by anticoagulant and antiplatelet treatment. A recent multicenter retrospective study from France on upper GI bleeding found that 475 of 2498 patients (19%) took oral anticoagulants, either vitamin K antagonists or direct oral anticoagulants. 65 A French prospective multicenter study in 2011 reported 8.1% antiplatelet use in upper GI bleeding. 66
Non-invasive tests
The urea breath test has very low feasibility in the acute setting of PUB, as patients have to fast before and often after the index endoscopy.
Stool antigen testing has a similar problem concerning feasibility; the opportunity of stool sampling for HPI testing in the admission department is often missed. Also, a Dutch study revealed a high rate of false-positive results in PUB patients explained by a cross-reaction with the blood. 56
Also, patients with PUB receive, and are committed to, long-term PPI treatment before a urea breath test and stool antigen test are performed. 1 The current guideline suggests a 14-day PPI-free period before a urea breath test can be performed. 11 Even 3 days of high-dose PPI treatment reduces the detection of
Serology would seem the most feasible out of all non-invasive tests, but the presence of antibodies may only indicate a previous infection instead of an acute one. 68
Strengths of the study
The update of the most recent DTA meta-analysis on the same topic was published in 2006, 15 with the then most recent study published in 2004; therefore, an update was necessary. We used a comprehensive and rigorous search strategy in seven databases. Detailed data extraction covered 70 items, which are shown in Supplementary File S8. We used a new statistical method of network meta-analysis developed for DTA studies. The risk of bias was assessed with the purpose-designed QUADAS-2 tool. Due to the clear study designs of the included reports, the patient population matched the review question.
Limitations of the study
As detailed in Table 3, the overall quality of the included studies was suboptimal, with high and unclear risks of bias. In many studies, it was unclear whether enrollment was consecutive, and inappropriate exclusion of subjects often occurred. Blinding of the interpretation of the index tests and the cut-off threshold were not defined in many studies. Blinding of the interpretation of the reference tests was not pre-specified. Another significant limitation was the unclear or prolonged interval between the index and reference tests. Not all patients received the same reference standard test when a combination was used. In some studies, a few participants were excluded from the final analysis.
Not only the combination of tests and index tests, but also the actual tests differed (manufacturer, methodology, etc.). In some articles, PPI use preceded either or both index and reference tests. The tissue sampling was not uniform across studies: some used antral, others antral and gastric body mucosal samples.
Implications for clinical practice
Combined tests may have a role in HPI testing in PUB as they have higher sensitivities. Endoscopic and gastroenterology units should have a tailored approach based on the availability of the individual tests.
Implication for research
Future DTA studies should use uniform gold standards. Also, they should focus on the feasibility and cost-effectiveness of the combined testing strategies.
In conclusion, our network meta-analysis demonstrated that none of the individual tests or the strategy of combined tests is superior in the detection of HPI. The combined tests have an increased sensitivity, which can translate to an optimized eradication strategy as it can result in the identification of most patients needing eradication therapy.
Supplemental Material
sj-pdf-1-tag-10.1177_1756284820965324 – Supplemental material for Accuracy of the Helicobacter pylori diagnostic tests in patients with peptic ulcer bleeding: a systematic review and network meta-analysis
Supplemental material, sj-pdf-1-tag-10.1177_1756284820965324 for Accuracy of the Helicobacter pylori diagnostic tests in patients with peptic ulcer bleeding: a systematic review and network meta-analysis by Nóra Vörhendi, Alexandra Soós, Marie Anne Engh, Benedek Tinusz, Zsolt Szakács, Dániel Pécsi, Alexandra Mikó, Patrícia Sarlós, Péter Hegyi and Bálint Eröss in Therapeutic Advances in Gastroenterology
Research Data
sj-xlsx-2-tag-10.1177_1756284820965324 – Research Data for Accuracy of the Helicobacter pylori diagnostic tests in patients with peptic ulcer bleeding: a systematic review and network meta-analysis
Research Data, sj-xlsx-2-tag-10.1177_1756284820965324 for Accuracy of the Helicobacter pylori diagnostic tests in patients with peptic ulcer bleeding: a systematic review and network meta-analysis by Nóra Vörhendi, Alexandra Soós, Marie Anne Engh, Benedek Tinusz, Zsolt Szakács, Dániel Pécsi, Alexandra Mikó, Patrícia Sarlós, Péter Hegyi and Bálint Eröss in Therapeutic Advances in Gastroenterology
Footnotes
Author contributions
NV, BT, AS and PH conceptualized and designed the study in cooperation with BE; DP, SP, and ZS constructed the search query. NV, AM, and ZS carried out the search process. NV and ME screened the articles for eligibility. NV, ME, and BE performed the data extraction; NV, ME and BE conducted the quality assessment. NV, BE, BT and ZS wrote the article. AS carried out the statistical analysis. DP, PS, AM, BT, ME, and PH provided valuable feedback after critically reviewing the first drafts of the manuscript. All authors contributed and approved the final manuscript for publication.
Conflict of interest statement
The authors declare that there is no conflict of interest.
Ethical approval
No ethical approval was required for this review as all data were already published in peer reviewed journals. No patients were involved in the design, conduct or interpretation of our review.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The research was supported by Project Grants (K131996 to PH and FK131864 to MA), an Economic Development and Innovation Operative Programme Grant (GINOP 2.3.2-15-2016-00048 to PH), The Grant of the Hungarian Science Foundation (FK 132834 to PS) and a Human Resources Development Operational Programme Grant (EFOP-3.6.2-16-2017-00006 to PH) of the National Research, Development and Innovation Office.
Guarantor
Bálint Eröss
Supplemental material
Supplemental material for this article is available online.
References
