Abstract
Background:
The British Thoracic Society (BTS) currently recommends pre-flight clinical assessment of all symptomatic patients with interstitial lung disease (ILD). This may include hypoxic challenge testing (HCT) to determine whether supplemental in-flight oxygen is required, but it is not universally available.
Objectives:
(1) To validate a previously published pre-flight assessment algorithm in predicting outcomes of HCT in ILD. (2) To compare the sensitivity and specificity of the original algorithm with those of an amended version published in the BTS clinical statement on air travel.
Design:
Single centre, cohort study.
Methods:
A single-centre retrospective cohort analysis of ILD patients attending for HCT between March 2017 and April 2023.
Results:
A total of 126 patients with a diagnosis of ILD underwent HCT. Median forced vital capacity 75.0% predicted (interquartile range (IQR) 24.8) and transfer factor for carbon monoxide 45.8% predicted (IQR 18.6). Diagnosis of idiopathic pulmonary fibrosis in 50.8% (
Conclusion:
In this validation study, the practical pre-flight algorithm demonstrates good specificity and moderate sensitivity for predicting HCT outcomes. The BTS modified algorithm demonstrates comparable sensitivity and specificity. Additional work is required to further develop practical guidance to reduce both the number of HCT advised and the proportion of patients incorrectly advised to arrange supplemental in-flight oxygen.
Introduction
Commercial airline travel poses a potential risk to patients with respiratory disease, with 10%–12% of all in-flight medical emergencies reported to result from respiratory conditions. 1 Although there is no high-quality evidence to determine who should have a formal respiratory review prior to air travel, the British Thoracic Society (BTS) clinical statement currently recommends pre-flight clinical assessment of all symptomatic patients with interstitial lung disease (ILD). 2 Hypoxic challenge testing (HCT) may be used as part of this assessment to determine whether selected patients should be advised to use supplemental oxygen during their flight.2,3 This test involves administering a controlled mixture of nitrogen/oxygen to simulate the lower ambient oxygen levels present within the aircraft cabin at altitude, alongside contemporaneous oxygen assessment. A fall of PaO2 to <6.6 kPa or oxygen saturations (SpO2) < 85% during the test indicates that the individual should be recommended supplemental in-flight oxygen (Failed HCT). 1
There is limited information on which patients with ILD should be referred for HCT.1,2,4 Improved patient selection for HCT may reduce the number of unnecessary tests being performed, reducing patient and health economic burden. Barratt et al. 4 undertook the largest retrospective study of ILD patients attending for HCT and evaluated physiological variables that might predict a hypoxaemic response to HCT (‘failed HCT’). Ground level PaO2 ⩽ 9.42 kPa and transfer factor for carbon monoxide (TLCO) < 50% predicted were independent predictors for ‘failing’ the HCT and were used to propose a practical pre-flight assessment algorithm for evaluation of ILD patients to support decision making on whether (a) supplemental in-flight oxygen was recommended, (b) whether the patient could fly without oxygen or (c) whether advice was given for further pre-flight assessment with HCT (Supplemental Figure 1). This algorithm had a sensitivity of 86% and specificity 84% when all patients with complete datasets for this information (
Objectives
The primary aim of this study was to validate the previously described pre-flight assessment algorithm by Barratt et al. 4 in its ability to predict outcomes of HCT in patients with ILD.
The secondary aim was to compare the sensitivity and specificity of this algorithm to an amended version proposed in the BTS clinical statement for air travel. 2
Methods
The clinical records of all consecutive ILD patients presenting to North Bristol NHS Trust specialist ILD centre for routine HCT (between March 2017 and April 2023) were retrospectively analysed. All patients had an ILD multidisciplinary team (MDT) consensus diagnosis. Baseline demographic data, oxygen saturations (SpO2) using pulse oximetry and capillary ear lobe partial pressure of oxygen (PaO2) were collected. Spirometry, transfer factor for carbon monoxide (corrected) (TLCO) and 6-min walk test (6MWT), performed according to BTS guidelines 5 and within 6 months of the HCT, were also evaluated. HCT was undertaken using the Ventimask method, whereby 100% nitrogen was delivered through a 40% Ventimask at a designated flow rate of 10.0 l/min, resulting in an equivalent inspired fraction of oxygen (FiO2) of 15%. 6 A fall of PaO2 to <6.6 kPa or SpO2 < 85% during the test indicated that the individual should be recommended supplemental in-flight oxygen (Failed HCT), according to BTS guidelines. 1
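The failure criterion described above can be written as a short check. This is a minimal sketch only: the function name `failed_hct` and its parameter names are illustrative assumptions rather than part of the study protocol, and only the two thresholds (PaO2 < 6.6 kPa, SpO2 < 85%) are taken from the text.

```python
def failed_hct(pao2_kpa=None, spo2_percent=None):
    """Return True if either measurement taken during the test meets the
    failure criterion quoted in the text (PaO2 < 6.6 kPa or SpO2 < 85%).

    Hypothetical helper for illustration; either value may be omitted,
    but at least one must be supplied.
    """
    if pao2_kpa is None and spo2_percent is None:
        raise ValueError("at least one of PaO2 or SpO2 is required")
    pao2_fail = pao2_kpa is not None and pao2_kpa < 6.6
    spo2_fail = spo2_percent is not None and spo2_percent < 85
    return pao2_fail or spo2_fail
```

Both comparisons are strict, so a reading exactly on a threshold (PaO2 of 6.6 kPa or SpO2 of 85%) counts as a pass in this sketch.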
Statistical analyses were performed using Statistical Product and Services Solution (SPSS) (IBM Corp, version 23, Armonk, N.Y., USA). Categorical variables were presented as counts and percentages, whilst continuous variables were presented as medians with interquartile range (IQR). Mann–Whitney
Results
Baseline demographics
Between March 2017 and April 2023, a total of 172 patients with a diagnosis of ILD underwent hypoxic challenge testing. Of 172 patients identified, 126 had complete datasets for subsequent analysis (Failed HCT

CONSORT diagram showing the flow of ILD patients completing HCT.
Table 1 shows the baseline demographics and multidisciplinary team consensus diagnoses of the overall cohort. Approximately half of all patients were male (
Demographic data of the ILD cohort undergoing HCT.
‘Other diagnoses’ include ANCA-associated ILD
Desaturation on 6MWT – includes those on ambulatory oxygen therapy (
FEV1, forced expiratory volume in 1 second; FVC, forced vital capacity; ILD, interstitial lung disease; IQR, interquartile range; KCO, transfer coefficient; kPa, kilopascals; 6MWT, 6-minute walk test; m, metres; MRC, Medical Research Council breathlessness scale;
Those individuals ‘failing’ the HCT had significantly more impaired lung function (FEV1, FVC, TLCO) at baseline (median FEV1 passing HCT: 1.90 L (IQR 0.79) versus failing HCT 1.95 L (IQR 0.79); median FVC passing HCT 2.35 L (IQR 1.0) versus failing HCT 2.4 L (IQR 0.67) and median TLCO: passing HCT 3.29 (IQR 1.6) versus failing HCT 3.34 (IQR 1.9,
Patients were excluded from the final dataset (
Validation of the practical pre-flight assessment algorithm
The primary aim of this study was to validate the ILD pre-flight algorithm previously developed by Barratt et al. 4 in a separate ILD cohort. Using a cohort of 126 patients with complete datasets, we compared the predicted outcome of the proposed algorithm with the actual HCT test results (Table 2(a)).
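The three-way decision applied in this validation can be sketched in code. This is an illustrative reading of the text, not a reproduction of Supplemental Figure 1: the names `Advice` and `original_algorithm` are hypothetical, and the cut-offs (PaO2 ⩽ 9.42 kPa, TLCO ⩽ 50% predicted) are those quoted in the Results of this paper.

```python
from enum import Enum


class Advice(Enum):
    OXYGEN = "arrange supplemental in-flight oxygen"
    NO_OXYGEN = "may fly without oxygen"
    HCT = "refer for hypoxic challenge testing"


# Cut-offs as quoted in this validation study.
PAO2_CUTOFF_KPA = 9.42
TLCO_CUTOFF_PERCENT = 50.0


def original_algorithm(pao2_kpa, tlco_percent_predicted):
    """Sketch of the practical pre-flight algorithm as described in the text.

    Both values at or below their cut-offs -> advise in-flight oxygen.
    Both values above their cut-offs       -> advise flying without oxygen.
    Exactly one at or below its cut-off    -> refer for HCT.
    """
    low_pao2 = pao2_kpa <= PAO2_CUTOFF_KPA
    low_tlco = tlco_percent_predicted <= TLCO_CUTOFF_PERCENT
    if low_pao2 and low_tlco:
        return Advice.OXYGEN
    if not low_pao2 and not low_tlco:
        return Advice.NO_OXYGEN
    return Advice.HCT
```

For example, `original_algorithm(10.1, 62.0)` returns `Advice.NO_OXYGEN`, whereas `original_algorithm(10.1, 45.0)` returns `Advice.HCT` because only one of the two criteria is met.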
Comparison of the predicted outcome of the proposed algorithm for individual patients with actual HCT test results for a cohort of
HCT, hypoxic challenge test;
Using the algorithm cut-offs of PaO2 ⩽ 9.42 kPa and TLCO ⩽ 50% predicted, 23.0% (29/126) of patients would have been advised to arrange in-flight supplementary oxygen, although only 10 of these patients ‘failed’ the HCT; therefore, 15.1% (19/126) of patients would have been unnecessarily advised to arrange in-flight oxygen for their travel abroad. Forty-five patients (35.7%) had both PaO2 > 9.42 kPa and TLCO > 50% predicted; using the algorithm, these patients would have been advised that they could fly without oxygen. Only 2 (4.4%) of these patients ‘failed’ the HCT upon formal testing. The algorithm therefore demonstrated moderate sensitivity of 69.4% (identifying most ‘passed’ cases correctly, 43/62) and good specificity of 83.3% (most ‘failed’ cases correctly identified, 10/12).
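The percentages in this paragraph can be checked from the stated counts. The snippet below is a worked arithmetic sketch using only the numbers quoted above; the variable names are illustrative, and the sensitivity and specificity labels follow the paper's own usage (sensitivity for correctly identified ‘passed’ cases, specificity for correctly identified ‘failed’ cases).

```python
def percent(numerator, denominator):
    """Express a count as a percentage of a total."""
    return 100.0 * numerator / denominator


# Counts quoted in the text for the original algorithm.
cohort = 126
advised_oxygen = 29           # both PaO2 and TLCO at or below cut-off
advised_oxygen_failed = 10    # of those, actually failed HCT
advised_no_oxygen = 45        # both PaO2 and TLCO above cut-off
advised_no_oxygen_failed = 2  # of those, actually failed HCT

# Patients given a direct recommendation, split by actual HCT result.
failed_total = advised_oxygen_failed + advised_no_oxygen_failed
passed_correct = advised_no_oxygen - advised_no_oxygen_failed
passed_total = (advised_oxygen - advised_oxygen_failed) + passed_correct

sensitivity = percent(passed_correct, passed_total)          # 43/62
specificity = percent(advised_oxygen_failed, failed_total)   # 10/12
unnecessary_oxygen = percent(advised_oxygen - advised_oxygen_failed, cohort)

print(round(sensitivity, 1), round(specificity, 1), round(unnecessary_oxygen, 1))
# prints: 69.4 83.3 15.1
```

Note that the two denominators (62 and 12) sum to 74, the number of patients given a direct recommendation; patients whom the algorithm would have referred for HCT do not enter either calculation.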
Fifty-two (41.3%) patients either had a PaO2 ⩽ 9.42 kPa or TLCO ⩽ 50% predicted and would have been referred for HCT. Of those referred for HCT, 11.5% (
Validation of the modified algorithm published in the BTS clinical statement
The first decision point in the BTS algorithm is whether the ILD patient desaturates on exercise testing to SpO2 < 95%. 2 In this cohort, only 9/126 (7.1%) patients did not desaturate on their 6MWT. Of these, 1/9 (11.1%) ‘failed’ HCT and required in-flight supplemental oxygen (Table 2(b)).
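The modified pathway can be sketched by placing the exercise filter in front of the original cut-offs. This is a hedged illustration: the function name and return strings are hypothetical, and the handling of patients who do not desaturate (advised that they may fly without oxygen, with no further assessment) is an assumption inferred from the counts reported in this paper rather than a transcription of the BTS algorithm itself.

```python
def bts_modified_algorithm(desaturates_below_95_on_exercise,
                           pao2_kpa, tlco_percent_predicted):
    """Sketch of the BTS-modified algorithm as described in the text.

    Assumption: a patient who does not desaturate to SpO2 < 95% on
    exercise testing exits at the first decision point and is advised
    to fly without oxygen. Everyone else follows the original cut-offs
    (PaO2 <= 9.42 kPa, TLCO <= 50% predicted).
    """
    if not desaturates_below_95_on_exercise:
        return "fly without oxygen"
    low_pao2 = pao2_kpa <= 9.42
    low_tlco = tlco_percent_predicted <= 50.0
    if low_pao2 and low_tlco:
        return "arrange in-flight oxygen"
    if not low_pao2 and not low_tlco:
        return "fly without oxygen"
    return "refer for HCT"
```

Under this reading, the filter only changes the outcome for non-desaturating patients who would otherwise have been referred for HCT or advised oxygen, for example `bts_modified_algorithm(False, 9.0, 60.0)` returns `"fly without oxygen"` rather than `"refer for HCT"`.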
As with the original algorithm, using cut-offs of PaO2 ⩽ 9.42 kPa and TLCO ⩽ 50% predicted, 23.0% (29/126) of patients would have been advised to arrange in-flight supplementary oxygen, although only 10 of these patients actually ‘failed’ the HCT, resulting in 15.1% (19/126) of patients arranging in-flight supplementary oxygen unnecessarily. Forty-seven patients (37.3%) would have been advised that they could fly without oxygen, although 2 (4.3%) of these patients ‘failed’ the HCT upon formal testing. The modified algorithm therefore demonstrated moderate sensitivity of 70.3% (identifying most ‘passed’ cases correctly, 45/64) and good specificity of 83.3% (most ‘failed’ cases correctly identified, 10/12).
Fifty (39.7%) patients would have been referred for HCT, with 11.5% (
Discussion
Although many patients with respiratory disease travel by commercial air every year without issue, there are potential risks, and the possible effects of hypobaric hypoxia are well documented.1,2 Current practice places emphasis on the use of HCT to support decision making for supplemental in-flight oxygen, but facilities to undertake this test are not widely available. 8 The present study attempted to validate a previously published practical algorithm to support decision-making surrounding the need for HCT and supplementary in-flight oxygen for patients with ILD. In a cohort of 126 patients with complete datasets, our findings suggest that this algorithm has good specificity and moderate sensitivity in stratifying ILD patients into those who do or do not require supplemental in-flight oxygen and those who require HCT for further assessment. Whilst the specificity is comparable to that of the derivation cohort (84%), the sensitivity in the validation cohort is lower (69.4% versus a derivation sensitivity of 86%). 4 Compared to not using an algorithm, implementing this algorithm reduces the number of HCTs needed, and may therefore lessen the economic burden placed on NHS organisations or on individuals who may need to pay privately (approximately £400 per test), 8 while also reducing, in a proportion of patients, the associated time and travel demands, psychological stress related to testing, and discomfort from blood sampling. Despite these potential benefits, the authors acknowledge that a non-negligible risk of false negatives remains – patients who are not referred for HCT but would in fact fail – alongside a substantial number of false positives, whereby patients are unnecessarily advised to arrange in-flight oxygen.
Travelling with oxygen requires extra planning and may involve additional costs (some airlines charge for medical clearance of portable oxygen concentrators), as well as medical clearance or fitness-to-fly documentation from the treating clinician, which can discourage patients from travelling. Further work is needed to explore patients’ experience of HCT, as well as the process of arranging and travelling with oxygen, and to weigh these factors against the economic costs and accessibility of HCT. This information could help inform future guidance on pre-flight assessments in ILD. The authors also highlight the ongoing need for less invasive and more widely accessible tools to evaluate air travel safety in ILD.
Algorithms incorporating pulse oximetry at rest and during exercise have been found to be a useful tool in differentiating chronic obstructive pulmonary disease (COPD) patients who are able to travel without in-flight oxygen, those who need supplemental in-flight oxygen and those who require further assessment with HCT.9,10 In our previous study, desaturation during exercise was non-discriminatory: 8 (20%) of the patients who passed HCT and 10 (30%) of those who failed HCT desaturated to SaO2 < 84% on 6MWT. 4 The recent BTS clinical statement on air travel uses desaturation to SaO2 < 95% on exercise as an initial filter in an amendment to our previously published algorithm. 2 In this cohort the majority of patients desaturated to SaO2 < 95% during 6MWT (
Comparing the original and BTS algorithm outcomes, only two divergences were identified, both arising from patients who did not desaturate on their 6MWT, resulting in two fewer HCTs being advised if using the modified BTS algorithm. Whilst this is a small difference, on a larger scale and particularly in the context of mild ILD, it might allow significant financial cost savings over time. This initial filter, therefore, seems to be a reasonable modification to the practical algorithm previously proposed, particularly for those with mild ILD/mild lung function impairment. However, as with any test, clinicians should recognise that this initial filter is not infallible, and a small proportion of patients would be misclassified as not needing oxygen or HCT. Replication of this work in other cohorts is required to understand if these values are generalisable. As with the original algorithm, 65.5% (19/29) of the patients advised to arrange supplemental oxygen would have been advised to do so incorrectly, and the broader implications of this have already been discussed.
There are several potential limitations to the current study. Firstly, this was a retrospective study that relied on the collation of data from medical notes, with potential bias relating to missing data and exclusion of patients with incomplete data sets (27%). Secondly, the cohort was derived from a single specialist ILD centre, and thus, results may not be generalisable to the wider ILD population. Our patient sample was subject to inherent selection bias by indication. Nonetheless, we feel that our sample is representative of a broad ILD cohort requiring pre-flight assessment. The HCT offers the ability to titrate and determine the flow rate of supplemental in-flight oxygen required, but it only simulates one aspect of altitude exposure, namely the inhalation of a low inspired fraction of oxygen (FiO2), and overlooks the possible effect of decreased barometric pressure. This might explain discrepancies previously reported between SpO2 obtained during HCT and actual in-flight SpO2. 11 The authors also recognise that there is little high-quality evidence to support the BTS recommendations for the proposed HCT thresholds (SpO2 85%, PaO2 6.6 kPa). 2 Whilst these may in theory ensure that the SpO2 remains above the steep part of the oxyhaemoglobin dissociation curve, they do not take into consideration how levels may be affected during long-haul air travel. Moreover, these thresholds also differ from long-term oxygen therapy (LTOT) guidance (SpO2 88%, PaO2 7.3 kPa, or 8.0 kPa with pulmonary hypertension). Further work could explore the performance of the BTS guidance using a higher HCT threshold. The authors acknowledge the inherent challenges in predicting how patients will respond to the numerous physiological demands of air travel, emphasising that any predictive model should serve solely as a clinical guide rather than a definitive tool. As previously suggested, a patient-centred approach to future guidance on air travel, considering both the benefits and risks to patients, will need to be adopted. 2
Conclusion
The study presents the largest retrospective validation of a pre-determined practical algorithm for the pre-flight assessment of patients with ILD. It demonstrates good specificity and moderate sensitivity for predicting HCT outcomes. The BTS modified algorithm demonstrates comparable sensitivity and specificity. Additional work is required to further develop practical guidance to attempt to (a) reduce the number of HCT advised and (b) reduce the proportion of patients incorrectly advised to arrange supplemental in-flight oxygen.
Supplemental Material
Supplemental material, sj-docx-1-tar-10.1177_17534666261431183 for Validation of pre-flight algorithms in predicting hypoxic challenge testing (HCT) outcomes in interstitial lung disease (ILD) by Cameron Bonthrone, Beyazit Durdu, Sarah Mulholland, Naomi Rippon, Louis Luckwell, Michelle Westlake, Giles Dixon, Matthew Wells, Huzaifa Adamali and Shaney L. Barratt in Therapeutic Advances in Respiratory Disease
