Abstract
Background
Patients with persistent gastroesophageal reflux disease symptoms despite proton pump inhibitors are increasingly encountered. It remains controversial whether proton pump inhibitors should be stopped before functional oesophageal tests.
Aim
This meta-analysis compares the positive yield of oesophageal studies performed off versus on proton pump inhibitors.
Methods
PubMed, Embase and the Cochrane Library were searched for eligible studies. Outcomes assessed were the number of subjects with: elevated oesophageal acid exposure time when studied off versus on proton pump inhibitors; positive symptom index (≥50%) and/or positive symptom association probability (≥95%) for acid reflux; and/or non-acid reflux events off versus on proton pump inhibitors. The random effects model was applied.
Results
Fifteen studies (n = 5033 individuals; 33% on proton pump inhibitors; 32% men; mean age 52.1 years) were analysed. Pooled risk ratio for the comparison of high oesophageal acid exposure time off versus on proton pump inhibitors was 2.16 (95% confidence interval (CI) 1.42–3.28). The risk ratio of a positive symptom index (acid reflux) was 2.64 (95% CI 1.52–4.57) and the risk ratio of a positive symptom association probability (acid reflux) was 2.94 (95% CI 2.31–3.74). Conversely, the risk ratio of a positive symptom index (non-acid reflux) was 0.96 (95% CI 0.49–1.88) and risk ratio of a positive symptom association probability (non-acid reflux) was 0.54 (95% CI 0.30–0.99).
Conclusions
Oesophageal studies after proton pump inhibitor cessation improve the positive yield for acid reflux-related events but reduce the detection of symptomatic non-acid reflux events.
Introduction
Gastroesophageal reflux disease (GERD) is a chronic disease that affects 10–20% of adults in the USA and Europe. 1 It is characterised by the presence of troublesome symptoms and signs attributed to the reflux of gastric contents into the oesophagus. 2 A therapeutic trial of once daily acid suppressant therapy with proton pump inhibitor (PPI) medication has become a cost-effective standard of care in patients who present with GERD symptoms. 3 The role of acid reflux (AR) in symptom generation is demonstrated by the immediate relief of symptoms following a course of PPI therapy in the majority of patients. However, patients with typical and/or atypical GERD symptoms that persist despite PPI therapy are increasingly encountered in clinical practice. 4 Approximately 40% of patients with erosive oesophagitis and up to 60% of patients with non-erosive reflux disease were reported to suffer from persistent symptoms. 5 The challenge faced by the gastroenterologist is to determine if these symptoms are related to GERD. Causes of persistent symptoms include: (a) ongoing AR with high oesophageal acid exposure; (b) acid and/or non-acid reflux (NAR) into a hypersensitive oesophagus; and (c) symptoms unrelated to AR and NAR events.
The armamentarium of diagnostic tools available for GERD evaluation includes the 24-hour nasopharyngeal pH catheter, the 48-hour wireless oesophageal pH capsule, the 24-hour combined multichannel intraluminal impedance-pH (MII-pH) catheter system, which can be combined with ambulatory oesophageal manometry, and the Bilitec system, which measures light absorbance in the bilirubin spectrum to identify non-acid bile reflux events.6–8 In the original description by Johnson-DeMeester, 9 prolonged 24-hour ambulatory oesophageal pH monitoring with the nasopharyngeal pH catheter was used to measure the acid exposure time (AET), defined as the percentage of total recording time during which distal oesophageal pH was less than 4. An abnormal AET was defined as pH less than 4 for more than 4.2% of the recording time. With technological advances, the combined MII-pH monitoring system, which allows for the characterisation of acid (pH < 4), weakly acidic (4 ≤ pH < 7) and weakly alkaline (pH ≥ 7) reflux episodes and for the timed correlation of oesophageal pH changes with reflux events, is now considered the most sensitive tool for characterisation of GERD.10–12 A simple classification of acid (pH < 4) and non-acid (pH ≥ 4, including weakly acidic and weakly alkaline) reflux provides a more pragmatic separation of reflux. 3 The diagnosis of NAR is based on a number of different parameters obtained on 24-hour impedance-pH monitoring, including bolus exposure time, bolus clearance time, numbers of NAR episodes10–12 and the symptom association profiles, which include the symptom index (SI) 13 and symptom association probability (SAP). 14
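The pH-based definitions above can be illustrated with a minimal sketch. It assumes each reflux episode is summarised by a single nadir pH value and that the recording is summarised by the number of seconds spent below pH 4; the function names and this data layout are hypothetical and are not taken from any monitoring software.

```python
def classify_reflux(nadir_ph: float) -> str:
    """Classify one reflux episode by its nadir oesophageal pH (assumed input)."""
    if nadir_ph < 4:
        return "acid"
    if nadir_ph < 7:
        return "weakly acidic"
    return "weakly alkaline"


def is_non_acid(nadir_ph: float) -> bool:
    """Pragmatic two-way split: non-acid covers weakly acidic and weakly alkaline."""
    return classify_reflux(nadir_ph) != "acid"


def acid_exposure_time(seconds_below_ph4: float, total_seconds: float) -> float:
    """AET as the percentage of total recording time with pH < 4."""
    if total_seconds <= 0:
        raise ValueError("total_seconds must be positive")
    return 100.0 * seconds_below_ph4 / total_seconds


def is_abnormal_aet(aet_percent: float, cutoff_percent: float = 4.2) -> bool:
    """Abnormal if AET exceeds the cut-off (4.2% in the original description)."""
    return aet_percent > cutoff_percent


# Hypothetical 24-hour recording with 90 minutes spent below pH 4.
aet = acid_exposure_time(90 * 60, 24 * 60 * 60)
print(round(aet, 2), is_abnormal_aet(aet))  # 6.25 True
```

The cut-off is left as a parameter because the included studies used different AET thresholds.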
Theoretically, studying patients off PPI therapy identifies predominantly AR events, while studying patients on PPI therapy identifies NAR events, as PPIs convert AR to NAR events. 15 In a recent systematic review, 16 persistent reflux symptoms in patients on PPI were attributed to weakly acidic reflux events. Prior to the latest definition of GERD 17 patients who had a normal oesophageal AET but a positive symptom association (SI/SAP) for AR (acid-hypersensitive oesophagus) and NAR (non-acid hypersensitive oesophagus) were classified under the GERD spectrum.18,19
Management algorithms have proposed ambulatory MII-pH monitoring under PPI therapy in patients with suspected GERD with insufficient treatment response to evaluate the role of ongoing AR or NAR.20,21 Conversely, the use of MII-pH monitoring on PPIs has been challenged because of the reportedly low yield.22,23 Hence, it remains controversial if ambulatory oesophageal studies should be conducted on PPIs or after PPI washout. The aim of this study was to perform a systematic review and meta-analysis to compare the positive yield of ambulatory oesophageal studies conducted off versus on PPI.
Methods
Literature search and eligibility criteria
A comprehensive literature search was performed in PubMed/MEDLINE (1946 to December 2016), Embase (1974 to December 2016) and the Cochrane Library (1992 to December 2016). The specific concepts used in the search strategy were ‘pH monitoring’, ‘proton pump inhibitors (PPIs)’ and ‘gastroesophageal reflux’. The detailed search strategies are listed in Appendix 1. We used both medical subject headings (MeSH)/Emtree and free text searches. In addition, we manually reviewed the reference lists of included papers, relevant review articles and practice guidelines to identify additional studies of interest. Two reviewers (DA and QZ) independently screened for eligible studies based on predefined eligibility criteria. Clinical trials and cohort studies which reported outcomes comparing patients who were studied off versus on PPIs for GERD evaluation were included. For studies that had published duplicate results with accumulating numbers of patients or increased lengths of follow-up, only the most recent or complete reports were included. Studies that did not provide sufficient information on acid exposure profiles or symptom association profiles, or did not provide sufficient data for these proportions to be calculated, were excluded. In addition, mechanistic studies that evaluated particular time frames, such as post-prandial periods or periods in supine position, instead of continuous 24-hour measurements were excluded. Review articles, technical reports, editorials, letters to the editor, case reports and abstracts not published as a full text paper were excluded. Any discrepancies regarding whether articles met inclusion criteria were resolved by consensus.
Gastroesophageal reflux parameters and symptom reflux association analysis
Total 24-hour AET was defined as the total time oesophageal pH was less than 4 divided by the time of monitoring. SI 13 was defined as the number of symptoms associated with reflux divided by the total number of symptoms. A positive SI was defined by SI of 50% or greater (i.e. at least half of symptoms associated with reflux). The SAP 14 involves dividing the 24-hour recording period into 2-minute segments. For each 2-minute segment, it is determined whether reflux or symptoms occurred. A 2 × 2 contingency table is constructed in which the numbers of 2-minute segments with/without symptoms and with/without reflux are tabulated. Fisher’s exact test is used to calculate the probability (P) that the observed distribution occurred by chance. The SAP was calculated as (1 – P) × 100% and was considered positive if it was 95% or greater.
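As an illustration of these definitions, the following sketch computes the SI and the SAP from counts. It is a standard-library implementation written for this example, not the software used by the included studies; the function names and the table layout (rows: reflux yes/no, columns: symptom yes/no) are assumptions, and the two-sided Fisher's exact P value is taken as the sum of the probabilities of all tables no more likely than the observed one.

```python
from math import comb


def symptom_index(reflux_related_symptoms: int, total_symptoms: int) -> float:
    """SI (%) = reflux-associated symptoms / all symptoms."""
    if total_symptoms <= 0:
        raise ValueError("total_symptoms must be positive")
    return 100.0 * reflux_related_symptoms / total_symptoms


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact P value for the 2 x 2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-7))
    return min(1.0, total)


def symptom_association_probability(a: int, b: int, c: int, d: int) -> float:
    """SAP (%) = (1 - P) x 100, with P from Fisher's exact test.

    a: segments with reflux and symptom     b: reflux, no symptom
    c: segments with symptom, no reflux     d: neither
    """
    return (1.0 - fisher_exact_two_sided(a, b, c, d)) * 100.0


# Hypothetical patient: 6 of 10 symptoms reflux-related; 720 two-minute segments.
si = symptom_index(6, 10)
sap = symptom_association_probability(6, 44, 4, 666)
print(si >= 50, sap >= 95)
```

Note that the SI ignores how many reflux episodes occurred, whereas the SAP accounts for all four cells of the table.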
Data extraction and risk of bias assessment
Data were extracted by two independent reviewers (DA and QZ) for: (a) study characteristics (publication year, country of population, nature of studies and study design); (b) baseline characteristics (mean age, numbers and proportion of men); and (c) outcome events including 24-hour AET, SI and SAP classified by reflux type (i.e. AR and NAR).
The quality of each study was evaluated using the Cochrane Collaboration’s tool for assessing risk of bias for randomised controlled trials and the risk of bias in non-randomised studies – of interventions (ROBINS-I) tool for non-randomised trials and cohort studies, by two independent reviewers (QZ and LS). The Cochrane Collaboration’s tool addresses seven specific domains, which are sequence generation, allocation concealment, blinding of participants and personnel, blinding of outcome assessment, incomplete outcome data, selective outcome reporting and ‘other issues’. The judgement was made as ‘low risk’, ‘high risk’ or ‘unclear risk’ of bias. ROBINS-I also assesses seven domains, covering confounding effects, selection of participants into the study, classification of the interventions, deviations from intended interventions, missing data, measurement of outcomes and selection of the reported result. The risk of bias assessment by each domain was informed by the responses to the relevant signalling questions, which guided the formulation of domain-specific and overall judgement of risk of bias: ‘low risk’, ‘moderate risk’, ‘serious risk’ and ‘critical risk’ of bias. Any disagreement in quality assessment was resolved by discussion and consensus.
Statistical analysis
Statistical analyses were performed using Review Manager 5.3. 24 A random effects model was applied to synthesise the current evidence, using the risk ratio (RR) with 95% confidence interval (95% CI) to summarise efficacy for dichotomous outcomes (i.e. AET, SI and SAP). Statistical heterogeneity was assessed by the chi-square test and I2 value. Subgroup analyses on potential sources of effect modification were conducted for AET events, including study design, AET cut-off, pH monitoring tools and regions of study population. Sensitivity analyses, by excluding studies with ‘serious’ to ‘critical’ risk of bias, were conducted to check the robustness of results for AET events.
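For readers unfamiliar with random effects pooling, the sketch below shows one common approach: inverse-variance weighting of log risk ratios with the DerSimonian-Laird estimate of between-study variance, together with the I2 statistic. This is an assumption made for illustration; it is not claimed to reproduce the exact computation performed by Review Manager, and it omits the continuity correction needed for studies with zero events. The study counts are hypothetical.

```python
from math import exp, log, sqrt


def log_rr(events_off, total_off, events_on, total_on):
    """Log risk ratio (off versus on PPI) and its variance for one study."""
    rr = (events_off / total_off) / (events_on / total_on)
    var = 1 / events_off - 1 / total_off + 1 / events_on - 1 / total_on
    return log(rr), var


def pool_random_effects(studies):
    """Return pooled RR, 95% CI bounds and I2 (%) using DerSimonian-Laird."""
    effects = [log_rr(*s) for s in studies]
    y = [e for e, _ in effects]
    v = [var for _, var in effects]
    w = [1 / vi for vi in v]  # fixed-effect (inverse-variance) weights
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))  # Cochran's Q
    df = len(studies) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0  # between-study variance
    w_star = [1 / (vi + tau2) for vi in v]  # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, y)) / sum(w_star)
    se = sqrt(1 / sum(w_star))
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return exp(pooled), exp(pooled - 1.96 * se), exp(pooled + 1.96 * se), i2


# Hypothetical studies: (events off PPI, n off PPI, events on PPI, n on PPI).
example = [(30, 100, 15, 100), (45, 150, 20, 140), (12, 60, 10, 80)]
rr, low, high, i2 = pool_random_effects(example)
print(f"RR {rr:.2f} (95% CI {low:.2f}-{high:.2f}), I2 = {i2:.0f}%")
```

When the studies agree closely, the between-study variance is estimated as zero and the result coincides with the fixed-effect estimate.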
Results
Characteristics of included studies
Summary of study characteristics.
NR: not reported; D: days; H: hours.

PRISMA flowchart of search results.
Summary of studies.
Oesophageal AET
Oesophageal AET was reported in all 15 studies (n = 1672 subjects studied on PPI therapy; n = 3361 subjects studied off PPIs) (Figure 2). Our meta-analysis showed an overall RR of 2.16 (95% CI 1.42–3.28) of detecting a high AET when patients were studied off versus on PPIs. This translates to a 116% increased rate of detecting a high AET off versus on PPIs for all studies combined and a 170% higher chance based on results from prospective cohort studies only.
Forest plot of risk ratios (RRs) and 95% confidence intervals (CIs) for the detection of raised oesophageal acid exposure time (AET) off or on proton pump inhibitors (PPIs).
Summary of results from sensitivity analysis and subgroup analysis based on cut-off levels for oesophageal acid exposure time.
SI for AR and NAR events
Five studies reported the SI for AR25,26,28,35,37 (n = 291 subjects studied on PPIs; n = 473 subjects off PPI) (Figure 3). For AR, the RR was 2.64 (95% CI 1.52–4.57), indicating that off PPI therapy was associated with a 164% higher rate of detecting a positive SI for AR events compared to on PPI therapy. Three studies reported the SI for NAR events.26,28,35 Compared to AR events, the RR for a positive SI for NAR events decreased to 0.96 (95% CI 0.49–1.88), showing a non-significant decreased diagnostic yield of detecting NAR while off PPI. Both analyses showed moderate heterogeneity with I2 = 48% and 44%, respectively.
Forest plot of risk ratios and 95% confidence intervals (CIs) for positive symptom index (SI) off or on proton pump inhibitors (PPIs), stratified by reflux types.
SAP for AR and NAR events
Four studies26,32,35,37 compared SAP for AR events in 414 subjects on PPI and 513 subjects off PPI therapy. When studies were performed off PPI therapy, there was a 194% higher rate of detecting a positive SAP for AR events compared to on PPI therapy (RR 2.94, 95% CI 2.31–3.74). However, the RR for NAR events decreased to 0.54 (95% CI 0.30–0.99), showing a significantly improved detection of symptomatic NAR events on PPI compared to off PPI (P < 0.05) (Figure 4).
Forest plot of risk ratios and 95% confidence intervals (CIs) for positive symptom association probability (SAP) off or on proton pump inhibitors (PPIs), stratified by reflux types.
Sensitivity analysis
In the sensitivity analysis, excluding one study with ‘serious’ risk of bias gave a similar result for a raised oesophageal AET (RR 2.33, 95% CI 1.57–3.46) (Table 3). The excluded study therefore had limited impact on the overall results.
Discussion
Patients with persistent GERD-like symptoms who do not respond to therapy comprise a significant proportion of gastroenterology referrals. The challenge is to determine if patients indeed have GERD, and if persistent symptoms are attributed to AR and/or NAR. Current guidelines 40 recommend ambulatory reflux monitoring in the following circumstances: (a) to document reflux in endoscopy negative patients who are being considered for anti-reflux surgery; (b) to determine if persistent symptoms are due to reflux in patients who have undergone prior surgical or endoscopic anti-reflux procedures; (c) to assess adequacy of acid control in patients with GERD complications (e.g. Barrett’s oesophagus); and (d) to evaluate symptoms in patients with PPI refractory symptoms. The last of these is the most common indication for ambulatory reflux monitoring.
Should diagnostic tests for GERD evaluation be performed on or off PPIs? As the studies were widely heterogeneous, we included only controlled studies; i.e. studies that included separate cohorts of patients who were studied either on or off PPIs. We defined a positive study based on a raised oesophageal AET, the SI of 50% or greater 13 and/or SAP of 95% or greater. 14 Our data provide a pooled analysis of all studies performed in patients who had typical and/or atypical GERD symptoms based on the Montreal consensus 2 and patients who had a positive symptom association (SI/SAP) for reflux events based on the earlier definition of GERD. 41
The three main areas to address when GERD-like symptoms persist despite PPIs include: (a) inadequate acid suppression; 42 (b) NAR events;42–45 and (c) an erroneous diagnosis of GERD.42,46
Inadequate acid suppression
The AET remains the most robust measure for detecting AR,21,41 and our meta-analysis confirms the increased chance of detecting a high AET when studies were performed off PPIs. Based on the overall RR of 2.16 of detecting a high AET off versus on PPI from pooled studies, the RR of detecting a high AET on versus off PPI was 0.46 (1/2.16), i.e. a 54% lower chance of detecting a high AET on PPIs.
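The arithmetic behind reversing the direction of a comparison can be sketched as follows. The helper names are hypothetical; the only inputs are the pooled AET estimate and confidence interval reported in this review. Inverting an RR takes reciprocals, and the confidence bounds swap places.

```python
def invert_rr(rr: float, ci_low: float, ci_high: float):
    """Reverse the comparison (off vs on -> on vs off): reciprocals, bounds swapped."""
    return 1 / rr, 1 / ci_high, 1 / ci_low


def percent_change(rr: float) -> float:
    """Relative change in rate implied by a risk ratio, in percent."""
    return (rr - 1) * 100


# Pooled RR for a high AET, off versus on PPI.
rr_on, low_on, high_on = invert_rr(2.16, 1.42, 3.28)
print(round(rr_on, 2), round(low_on, 2), round(high_on, 2))  # 0.46 0.3 0.7
print(round(percent_change(2.16)))              # 116
print(round(percent_change(round(rr_on, 2))))   # -54
```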
NAR events
Many studies defined NAR events based on positive symptom indices. In our analysis, the pooled RRs for a positive SI (2.64, 95% CI 1.52–4.57) and a positive SAP (2.94, 95% CI 2.31–3.74) for AR-related events off versus on PPIs were both significant. Among studies that reported the SAP for NAR events, our pooled analysis confirmed an improved yield of a positive SAP for NAR events when patients were studied on versus off PPI therapy (RR 1.85, 95% CI 1.01–3.33; the reciprocal of the off versus on RR of 0.54). However, this effect was not observed when the SI was used (RR 1.04, 95% CI 0.53–2.04). This may be attributed to the statistically more robust nature of the SAP compared with the SI. 47
Misdiagnosis of GERD
Ambulatory oesophageal tests are useful to exclude GERD as a cause of ongoing symptoms despite PPIs.33, 48–50 Apart from ongoing AR and NAR, persistent symptoms may be attributed to functional dyspepsia. 33
Our study had limitations. Data for this review were obtained from retrospective or prospective observational studies. We acknowledge that studies differed in the definitions for high AET, hence we performed subgroup analysis based on different cut-off values. We observed the highest RR of 3.26 (95% CI 1.75–6.06) using an AET cut-off value of 4.0–4.2% compared with a higher AET cut-off value of 5.0–6.3% (RR 1.82, 95% CI 0.81–4.05). With both cut-off ranges, the point estimates favoured performing oesophageal studies after PPI cessation. With the more restrictive cut-off value of 5.0–6.3%, the confidence interval included 1, so we observed only a trend towards a higher positive yield; the lack of significance may be attributed to inadequate participant numbers.
Apart from different AET cut-off values, the patients who were studied included those with typical and/or atypical symptom profiles, and the decision to perform studies on or off PPIs was often left to the treating physician. From the subgroup analysis, the wireless Bravo capsule was highly sensitive for detecting abnormal AET with the RR of 3.20 (95% CI 1.42–3.28) when subjects were studied off versus on PPIs. Similarly, studies conducted in North America presented higher RRs compared to those in Europe, which may be attributed to different patient characteristics, operation procedures or different brands of PPIs used.
We acknowledge the limitations of the SI and SAP in GERD diagnosis. 51 The SI and SAP rely on precise timing of symptom recording by patients, together with accurate reflux detection by the test device. Hence, symptoms should be short-lived, with a definite start and end point. Both typical and extra-oesophageal reflux symptoms were included in the studies which reported the symptom association profile, although heartburn remains the only symptom for which a positive symptom association has been validated in studies performed off PPI. Furthermore, despite a positive symptom association profile, we are mindful that the causal association between reflux events and symptoms cannot be established, especially in the absence of outcome studies. To date, there remains a paucity of data supporting an association between a positive SI and/or SAP and favourable clinical outcomes.19,52–54 However, even for a robust parameter such as the oesophageal AET, there are very few studies that have shown that high AET is a predictor of response to PPIs. 47 All symptom reflux association indices have their shortcomings, but in the studies conducted to date, these indices have proved to be useful in the overall clinical evaluation of patients with suspected GERD. We are cognisant that a ‘positive SI or SAP’ does not translate directly to a positive diagnostic yield. In the absence of outcome studies, these measures of a positive study were used as a surrogate for GERD diagnosis prior to the latest Rome IV definition of GERD.
In view of the heterogeneous population of patients studied, our analysis provides a global comparison of the overall number of positive studies across all indications when patients were studied on versus off PPIs. Intuitively, studying patients with typical GERD symptoms off PPIs would enhance the chance of a positive diagnostic study, while studying patients with PPI refractory symptoms on PPIs would determine if ongoing GERD is the cause of symptoms. This distinction was made in only two studies.26,27 The AET off PPIs remains the most robust measure for detecting AR,21,41 and our meta-analysis confirms this. For NAR, the symptom indices (SI and SAP) were the most commonly used parameters and a positive SAP for NAR occurred more frequently in subjects studied on PPIs. What is the impact of our findings on overall clinical care? Recognising the limitations of our meta-analysis, the decision to continue or stop PPIs prior to diagnostic tests should be individualised. In a patient with typical symptoms of heartburn and regurgitation despite a normal gastroscopy who is being considered for surgical treatment, ambulatory oesophageal pH monitoring off therapy would suffice. If symptoms persist despite PPIs, acid exposure is less likely to be the cause of symptoms, and hence combined MII-pH monitoring on therapy would provide an improved diagnostic yield. Documenting negative findings on MII-pH monitoring is as important as documenting a positive study, as it directs the clinician to search for a non-GERD cause and avoid unnecessary anti-reflux treatment.
Study highlights
What is the current knowledge
GERD is a chronic disease that affects a significant proportion of adults in the USA and Europe. Symptoms that persist despite PPIs are commonly encountered. Ambulatory oesophageal tests are useful in objectively quantifying AR and/or NAR, but it remains controversial if PPIs should be continued or stopped prior to performing these tests.
What is new here
Performing oesophageal studies after PPI cessation improves the diagnostic yield for AR events based on AET and the symptom association. Detection of NAR is improved if PPIs are continued during oesophageal tests.
Footnotes
Author contribution
DA: study design, data acquisition, analysis and interpretation of data, drafting of manuscript. QSZ and LMS: analysis and interpretation of data, critical review of manuscript. JT: study concept and design, analysis and interpretation of data, critical revision of manuscript. DA: Guarantor of the article. All authors approved the final version of the manuscript.
Declaration of conflicting interests
The authors declared no conflicting interests.
Funding
The study received funding support from the Singapore National Medical Research Council (NMRC) Centre grant awarded to Changi General Hospital.
Ethics approval
As this is a meta-analysis, no ethics approval was required.
Informed consent
As this is a meta-analysis, informed consent was not required.
