Abstract
Background
Mortality for non variceal upper gastrointestinal bleeding (UGIB) is clinically relevant in the first 12–24 hours of the onset of haemorrhage and therefore identification of clinical factors predictive of the risk of death before endoscopic examination may allow for early corrective therapeutic intervention.
Aim
1) To identify simple, early clinical variables predictive of the risk of death in patients with non variceal UGIB; 2) to assess the predictive gain of a model developed with conventional statistics vs. one developed with artificial neural networks (ANNs).
Methods and Results
Analysis was performed on 807 patients with nonvariceal UGIB (527 males, 280 females), as part of a multicentre Italian study. Mortality was considered “bleeding-related” if it occurred within 30 days of the index bleeding episode. A total of 50 independent variables were analysed: 49 clinico-anamnestic variables, all collected prior to endoscopic examination, plus the haemoglobin value measured on admission to the emergency department. Death occurred in 42 patients (5.2%). Conventional statistical techniques (linear discriminant analysis, LDA) were compared with ANNs (Twist® system-Semeion) using the same validation protocol, with random allocation of the sample into training and testing subsets and subsequent cross-over. ANNs proved significantly more accurate than LDA, with an overall accuracy rate close to 90%.
Conclusion
Artificial neural networks technology is highly promising in the development of accurate diagnostic tools designed to recognize patients at high risk of death for UGIB.
Introduction
Acute upper gastrointestinal haemorrhage (UGIH) remains a common reason for hospital admission, requiring high levels of assistance and health-care expenditure. The incidence of acute UGIH and hospital-related 30-day mortality vary widely, partly because of the changing epidemiology in Western countries, in which a number of factors are involved, such as older age, comorbidities, and prescriptions for proton pump inhibitors (PPI), low dose aspirin, oral anticoagulants and cyclooxygenase-2 inhibitors(1–6). Risk assessment is mainly based on endoscopic recognition of high-risk stigmata at the ulcer base(7,8), and controversy exists in the literature about the predictive value of clinical, biochemical or therapeutic variables for the risk of death.
Mortality for non variceal UGIB is clinically relevant in the first 12–24 hours of the onset of haemorrhage and therefore identification of clinical factors predictive of the risk of death before endoscopic examination may allow for early corrective therapeutic intervention.
Predictive models, like Blatchford risk score(9) or Rockall risk score(7), have been developed with conventional statistical procedures to quantify mortality risk linked to UGIB.
The application of these methods at individual level has been hampered by the high interdependence and complex interaction among clinical variables involved. Many clinical variables are inter-dependent and may potentially interact with each other with reciprocal enhancement.
Artificial neural networks, unlike classical statistical tests, are able to handle this complexity and have been employed with success in the prediction of lower GI bleeding(10).
The main aim of this prospective multi-centre database study was to describe the ultimate outcome of patients with non variceal UGIH in a contemporary “real-life” setting. Additional analysis assessed the impact of clinical, endoscopic or therapeutic factors on the risk of death in this patient population.
This study has two aims: 1) identify simple and early clinical variables predictive of the risk of death in patients with non variceal UGIB; 2) assess predictive gain of a predictive model developed with conventional statistics vs. that developed with artificial neural networks (ANNs).
Methods
The PNED Initiative and Data Collection
A dedicated software including an endoscopic reporting system (Cartella Clinica Endoscopia Digestiva, Bracco, Italy) linked to a project-specific research database was developed. This software was distributed to 23 participating sites across Italy, establishing a network of centers that received emergency admissions from which source data were collected–-the
Patient Population
Patients were considered for the study if they had clinical evidence of overt UGIH on admission; a history of haematemesis/coffee ground vomiting, melena, hematochezia or any combination of these within the 24 hours preceding admission; or clinical evidence of acute UGIH while hospitalized for any other reason (in-hospital bleeding), regardless of age. UGIH was confirmed only if haematemesis, melena, or dark, tarry material on rectal examination was documented and witnessed by nursing or medical staff. Patients were entered in the registry only if an upper GI endoscopy was performed. Patients bleeding from esophago-gastric varices were initially recorded as having UGIH but were then excluded from the prospective database.
An audit of all patients presenting over a fixed time period, at each participating institution, was performed to rule out any selection bias in the way in which the study population was enrolled. Patients initially assessed at another hospital for the bleeding episode and subsequently transferred to one of the participating centres were excluded from the analysis.
Study Variables
The following independent variables were included in the electronic form: demographics (age, sex, site and date of endoscopy); historical data (presenting signs or symptoms, any significant comorbidity, the patient's physical status on presentation using the American Society of Anesthesiologists (ASA) classification(12), relevant past medical history, any concomitant intake of medications in the 7 days preceding the bleeding episode, time elapsed from the onset of bleeding); physical examination findings and laboratory data (haemodynamic data, rectal exam, nasogastric tube use, complete blood count and coagulation parameters).
The outcomes evaluated were the frequency of death, recurrent bleeding and need for surgery. These outcomes were monitored from admission to hospital, or from the onset of bleeding for in-hospital patients, up to 30 days after the endoscopic examination. Investigators and nurses all worked with the same operational definitions of outcome. A priori definitions for all outcomes were adopted according to established definitions(13). Thirty-day mortality was the primary investigated outcome; a “bleeding-related” death was defined as any death occurring within 30 days of the index bleeding episode. To ensure the completeness of follow-up information, the study nurses called all patients or their families at 30 days. Furthermore, after PNED had been completed, administrative databases were consulted and all charts of included patients were reviewed for a full 30 days following admission or onset of bleeding while in hospital.
Data Analysis
Advanced intelligent systems based on novel coupling of artificial neural networks and evolutionary algorithms have been applied. The results obtained have been compared with those derived from the use of standard neural networks and classical statistical analysis.
In this study we applied supervised ANNs(14) in order to develop a model able to predict the outcome class (death or survival) with a high degree of accuracy, starting from pre-endoscopic clinical data alone.
Supervised ANNs are networks that learn by example, calculating an error function during the training phase and adjusting the connection strengths in order to minimize that error function. The learning constraint of supervised ANNs is that their output must coincide with the predefined target. The general form of these ANNs is: y = f(x,w*), where w* constitutes the set of parameters which best approximate the function.
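As a minimal illustration of this training principle, the following Python sketch fits a single logistic unit by gradient descent on a squared-error function. It is not the network used in this study: the toy data, the learning rate, the number of epochs and all function names are assumptions chosen for illustration only.

```python
import math

def f(x, w):
    """Logistic unit y = f(x, w); w[0] is the bias, w[1:] the connection strengths."""
    z = w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))
    return 1.0 / (1.0 + math.exp(-z))

def error(data, w):
    """Mean squared error between the unit's output and the predefined target."""
    return sum((f(x, t_w := w) - t) ** 2 for x, t in data) / len(data)

def train(data, n_inputs, rate=0.5, epochs=2000):
    """Repeatedly adjust w along the negative gradient of the error function."""
    w = [0.0] * (n_inputs + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, t in data:
            y = f(x, w)
            delta = 2.0 * (y - t) * y * (1.0 - y)  # derivative of squared error w.r.t. z
            grad[0] += delta
            for i, xi in enumerate(x):
                grad[i + 1] += delta * xi
        w = [wi - rate * g / len(data) for wi, g in zip(w, grad)]
    return w

# Hypothetical toy examples (inputs, target); not taken from the PNED registry.
toy = [([0.0, 0.0], 0), ([0.0, 1.0], 0), ([1.0, 0.0], 0), ([1.0, 1.0], 1)]
w_star = train(toy, n_inputs=2)  # approximation of w* for this toy problem
```

Here `w_star` plays the role of w* in y = f(x,w*): the parameter set found by minimizing the error on the training examples.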
As a benchmark, we employed linear discriminant analysis (LDA), applied to the same training and testing data sets used for the ANNs. LDA was performed with SAS version 6.04 (SAS Institute, Cary, NC, U.S.A.) using a forward stepwise procedure.
Preprocessing Methods and Experimental Protocols
Data preprocessing was performed using two different re-sampling criteria of the global dataset.
Random Criterion
We employed the so-called 5 × 2 cross-validation protocol(15). In this procedure the study sample is randomly divided five times into two sub-samples, different each time but containing a similar distribution of cases and controls: the training set (containing the dependent variable) and the testing set. During the training phase the ANNs learn a model of the data distribution and then, on the basis of this model, classify the subjects in the testing set in a blind way. The training and testing sets are then reversed, so that 10 analyses are conducted for every model employed.
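The splitting step of this protocol can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the function name, the fixed random seed and the simple per-class halving used to keep cases and controls balanced are illustrative choices, not the procedure actually implemented in the study.

```python
import random

def five_by_two_splits(labels, seed=0):
    """Yield 10 (train_idx, test_idx) pairs: 5 random halvings, each used in both directions."""
    rng = random.Random(seed)
    cases = [i for i, y in enumerate(labels) if y == 1]
    controls = [i for i, y in enumerate(labels) if y == 0]
    for _ in range(5):
        rng.shuffle(cases)
        rng.shuffle(controls)
        # Halve cases and controls separately so both halves have a similar case/control mix.
        half_a = cases[: len(cases) // 2] + controls[: len(controls) // 2]
        half_b = cases[len(cases) // 2:] + controls[len(controls) // 2:]
        yield half_a, half_b  # train on A, blind test on B
        yield half_b, half_a  # cross-over: train on B, blind test on A

# Outcome vector with the counts reported in this study: 42 deaths among 807 patients.
labels = [1] * 42 + [0] * 765
splits = list(five_by_two_splits(labels))
```

Each element of `splits` holds the record indices for one training/testing run, giving the 10 analyses per model described above.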
Optimized Criterion: TWIST System
The TWIST system consists of an ensemble of two previously described systems: T&T and IS(16). The T&T system is a robust data resampling technique that is able to arrange the source sample into sub-samples that all possess a similar probability density function. In this way, the data are split into two or more sub-samples in order to train, test and validate the ANN models more effectively. The IS system is an evolutionary wrapper system able to reduce the amount of data while conserving the largest amount of information available in the dataset. The combined action of these two systems allows us to solve two frequent problems in managing artificial neural networks.
Both systems are based on a Genetic Algorithm, the Genetic Doping Algorithm (GenD) developed at Semeion Research Centre(17).
The TWIST system is described in detail in the appendix.
After this processing, the features most significant for classification were selected and, at the same time, the training and testing sets were created with a probability distribution function similar to the one that provided the best classification results.
A supervised Multi Layer Perceptron, with four hidden units, was then used for the classification task.
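To make the architecture concrete, the sketch below shows the forward pass of a multilayer perceptron with four hidden units and one output unit. It is an illustration only: the sigmoid activation, the random weight initialisation, the choice of 17 inputs (matching the 17 variables finally selected by TWIST, as reported in the Results) and all names are assumptions, and no training is performed here.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def init_mlp(n_inputs, n_hidden=4, seed=0):
    """Random small weights; each weight vector stores its bias in position 0."""
    rng = random.Random(seed)
    hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)]
              for _ in range(n_hidden)]
    output = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    return hidden, output

def forward(x, hidden, output):
    """Propagate one input vector through the hidden layer to a single output in (0, 1)."""
    h = [sigmoid(w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))) for w in hidden]
    return sigmoid(output[0] + sum(wi * hi for wi, hi in zip(output[1:], h)))

hidden, output = init_mlp(n_inputs=17)          # 17 inputs is an illustrative choice
score = forward([0.0] * 17, hidden, output)     # untrained network, arbitrary score
```

In a trained network the output `score` would be compared with a threshold to assign each patient to one of the two outcome classes.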
Ethics
The registry was approved by the Institutional Review Board of all participating centers. In addition, all eligible patients were asked to sign a written informed consent.
Results
Study Population
A total of 807 cases with a complete data set were identified and entered in the ANN analysis. Patient characteristics are outlined in Table 1. Recent drug consumption was recorded in a high proportion of upper GI bleeders, mostly non-steroidal anti-inflammatory drugs (34%) and low dose aspirin (17.5%). One or more comorbidities were recorded at the time of presentation in 60.6% of cases, with 25% having more than two comorbidities. The median number of comorbid conditions per patient was 1.0 (IQR 1.0–2.0), mainly cardiovascular diseases, which affected almost half of the patients. Mean length of stay was 7.2 ± 6.3 days (median: 4.0, IQR: 2.0–9.0).
Table 1. Characteristics of the study population.
IQR: interquartile range; SD: standard deviation; CI: confidence interval; NSAIDs: non steroidal anti-inflammatory drugs; SRH: stigmata of recent haemorrhage.
Mean daily dose of aspirin 100 ± 25 mg.
ASA score refers to the American Society of Anaesthesiologists classification of a patient's severity and acuity of disease index.
Outcomes
Death was registered in 42 out of 807 patients (5.2%). The median time to death was 4 days (95% CI 2–6). In the absence of any comorbidity, mortality was 0.7% (1.1% if no severe comorbidity). Mortality rate increased to 8.4% if one severe comorbidity was recorded, and to 23.1% if two or more severe comorbidities were present (p < 0.01). The mean age of the patients who died was 76.6 ± 14.0 yr.
Classification Performances with ANNs
Table 2 summarizes the input variables used for modelling. The linear correlation index between the 50 input variables and the target variable ranged from -0.1 to +0.13. Results obtained with LDA were compared with those obtained with a simple Back Propagation approach using the 5 × 2 cross-validation protocol (Tables 3 and 4).
Table 2. Input variables.
Table 3. Results obtained with Linear Discriminant Analysis (LDA).
Table 4. Results obtained with Back Propagation artificial neural network.
The overall predictive accuracy obtained with LDA and standard ANNs ranged from 54% to 70% (average 62.31%) and from 57% to 69.5% (average 62.59%) respectively.
With the TWIST approach, every experiment was conducted in a blind and independent manner in two directions: training with sub-sample A and blind testing with sub-sample B vs training with sub-sample B and blind testing with sub-sample A. The variable selection results from the best 11 applications of the TWIST procedure are reported in Table 5. This advanced intelligent system, through the final selection of a subgroup of 17 variables that were selected most often across the eleven independent applications (at least 10 times), provided the highest predictive performance, with a sensitivity ranging from 81.48% to 93.33% (average 89.18%), a specificity ranging from 80.85% to 88.24% (average 82.98%) and an overall accuracy ranging from 81.17% to 89.03% (average 86.04%) (Table 6). The resulting ROC curve is reported in Figure 1, together with the ROC curve obtained with LDA.
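For readers less familiar with these performance measures, the sketch below shows how sensitivity, specificity and accuracy are commonly derived from the four cells of a confusion matrix. The counts are hypothetical and are not taken from Table 6. The text does not state which formula was used for "overall accuracy", so both the proportion of correctly classified subjects and the arithmetic mean of sensitivity and specificity are returned; treating either as the paper's definition is an assumption.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return common performance measures from confusion-matrix counts.

    tp/fn: deaths classified correctly/incorrectly;
    tn/fp: survivors classified correctly/incorrectly.
    """
    sensitivity = tp / (tp + fn)                  # share of deaths recognised
    specificity = tn / (tn + fp)                  # share of survivors recognised
    accuracy = (tp + tn) / (tp + fn + tn + fp)    # share of all subjects classified correctly
    mean_accuracy = (sensitivity + specificity) / 2.0
    return sensitivity, specificity, accuracy, mean_accuracy

# Hypothetical counts for one testing subset (not data from this study).
sens, spec, acc, mean_acc = classification_metrics(tp=18, fn=3, tn=320, fp=63)
```

With a rare outcome such as death, the two accuracy definitions can differ noticeably, because the proportion of correctly classified subjects is dominated by the much larger group of survivors.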

Figure 1. ROC curves obtained with the TWIST system (ANNs) and with linear discriminant analysis (LDA). The respective AUCs are 0.87 and 0.65 (P < 0.005).
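The area under a ROC curve (AUC) can be computed directly from classifier scores as the probability that a randomly chosen patient who died receives a higher score than a randomly chosen survivor, with ties counted as one half. The sketch below illustrates this rank-based calculation on hypothetical scores; it is not the procedure used to obtain the AUC values of this study, and the function name and data are assumptions.

```python
def auc(scores, labels):
    """AUC as the fraction of (death, survivor) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical predicted risks and outcomes (1 = death, 0 = survival).
scores = [0.9, 0.8, 0.35, 0.6, 0.3, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0, 0]
example_auc = auc(scores, labels)
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of the two outcome classes.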
Table 5. Variable selection in TWIST system applications. In yellow, variables selected in each round of TWIST applications; in purple, variables selected in at least 10 TWIST applications; in green, variables selected by LDA.
Table 6. Results obtained with different kinds of supervised neural networks applied to the 17 variables selected with the TWIST system.
SelfSABp 3: self-recurrent static-adaptive No.3.
TasmSABp 4: Temporal Associative Subjective Memory back propagation No.4.
SelfDABp 1: self-recurrent dynamic-adaptive No.1.
SelfSABp 3: self-recurrent static-adaptive No.4.
FF_Bp 5: Feed forward Back propagation No.5.
SelfSABp 2: self-recurrent static-adaptive No.2.
TasmSABp 4: Temporal Associative Subjective Memory dynamic adaptive No.1.
In the train-test order column, ab = train on subset a and test on subset b; ba = train on subset b and test on subset a.
The following variables were selected by both LDA and the TWIST system: age, cancer, cancer site and nitroderivates. Four other variables were never selected by any model: gastric surgery, hypertension, diabetes and time from symptoms to hospital.
Discussion
Acute UGIH remains a common medical problem that has significant associated morbidity, 30-day mortality, and health care resource use. The PNED data indicate that ulcers are by far the most common cause of nonvariceal UGIH, accounting for 66% of all diagnoses. More than half of the patients in the present study were taking at least one ulcerogenic drug, mainly NSAIDs, whereas the use of low dose aspirin was much lower than reported in North America(19–20), and this should be taken into account in terms of the generalizability of results. An important feature of the PNED study is that it represents a consortium of practice sites that use a structured endoscopy reporting system to collect information in a centralised endoscopic database. Such data are useful because they reflect “real world” endoscopic practice from a wide range of practice settings and minimize patient selection or potential referral bias. Of the 23 participating centers, in fact, only eight were tertiary institutions, unlike the Canadian RUGBE study, in which tertiary institutions accounted for two-thirds of the participating centers and 62% of patients were enrolled at six centers(21).
The results obtained with artificial neural network analysis allow the following considerations. First, the comparison of results obtained with three different analytical approaches (classical statistics, standard neural networks and advanced artificial neural networks) points out the need to employ systems that are really able to handle the complexity of the disease, instead of treating the data with reductionist approaches that are unable to detect the interaction effects of multiple variables in predisposing to the adverse outcome. The possibility of deriving high diagnostic accuracy from limited and selected information using these new analytical tools opens the way to developing application software for individual risk stratification on the hospital ward.
Because patients in trials, as in clinical practice, have many attributes that can affect the likelihood of treatment being beneficial or harmful, exploring each of these attributes “one variable at a time” (e.g. male vs female, old vs young) risks spurious false-positive subgroup results from chance fluctuations. Furthermore, although patients have simultaneous multiple characteristics that can affect the likelihood of the outcome and the effect of therapy, one-variable-at-a-time comparisons are fundamentally limited because they compare groups that vary only on a single factor, usually resulting in the subgroups being more similar than different (21).
Second, artificial neural networks, unlike classical statistical tests, can manage complexity even with relatively small samples and the consequent unbalanced ratio between variables and records. In this connection, it is important to note that adaptive learning algorithms of inference based on the principle of functional estimation, like artificial neural networks, overcome the problem of dimensionality.
An important obstacle to approaching in a conventional manner the biological basis of a rare event like death from UGIB is the difficulty of finding a homogeneous sample population large enough to be analysed for a wide number of clinical variants. Our study identified a number of prognostic clinical factors independently associated with the risk of 30-day mortality after an acute UGI bleed. The relevant predictors were female gender, age, bleeding in hospital, obstructive lung disease, cancer, cancer site, liver cirrhosis, ticlopidine, clopidogrel, rofecoxib, NSAIDs, nitroderivates, black vomiting, red blood vomiting, systolic blood pressure, presence of blood in rectum, black blood.
The fact that most deaths occur in elderly patients with severe comorbidities should prompt us to shift our focus toward improving the management of this selected high-risk subgroup, which should be included in specifically designed trials. Furthermore, major preventive steps should be taken in order to reduce the risk of NSAID-related haemorrhage, particularly in the highest risk population.
Risk stratification in patients with UGIH is essential for optimal management of bleeders, both for triage of those at high risk to inpatient care and for identification of patients at low risk of adverse outcome who can be safely managed as outpatients(23,24,25).
Predictive models, like Blatchford risk score or Rockall risk score, have been developed with conventional statistical procedures to quantify mortality risk linked to UGIB.
The application of these methods at individual level has been hampered by the high interdependence and complex interaction among clinical variables involved.
However, as pointed out in a recent review on the topic by Das et al. (25), at present there is not an ideal risk score. For acute bleeders, urgent EGD should not be an essential part of the initial risk score; however, EGD should be an important component of the risk score at a subsequent point in patient care. Hence, the need to categorise the patient's risk profile with the aid of accurate and user-friendly clinical predictors that typically are available during initial patient triage.
There are several limitations to the present registry. The adopted study design is not an experimental one, and is therefore not as rigorous as that of a randomized controlled trial. The demonstrated associations should be seen as suggestive and require prospective independent validation, which is currently under way. Observational databases can be useful adjuncts to randomised controlled trials to determine whether efficacy under controlled conditions in specialist referral units can be translated into effective treatment in routine clinical practice. Methodological limitations can nonetheless threaten the internal validity of a registry: completeness of follow-up, ascertainment of outcomes, possible patient selection bias and inadequate adjustment for confounders when attempting to identify predictors of outcome. Like others(21), we attempted to address these possible shortcomings by establishing conservative and a priori definitions for all study variables including outcomes, by training all research staff in a standardized fashion, by enforcing strict data verification and validation protocols and by ensuring complete 30-day follow-up.
In conclusion, death after an acute non variceal UGIH occurs mostly among elderly patients with severe comorbidity or those with failure of endoscopic intention to treatment. These factors should be taken into account in a struggle toward a further reduction of overall 30-day mortality. Future studies should reconsider early resort to surgery, especially for young and surgically fit patients, to reduce the risk of death during the first 24 hours of the bleeding episode. Also, the use of preventive strategies to reduce the bleeding risk especially in patients with advanced age, neoplasia, renal failure and liver cirrhosis deserves appropriate investigation. Improving the ultimate outcome of patients with non-variceal UGIH will take an integrated approach by a team focused on treating the patient and not just the source of bleeding.
The results of this study illustrate that ANNs can be added to the list of computational methods that may provide answers to some questions about complex biological processes.
Acknowledgments
The PNED initiative was a collaborative effort supported by the Italian Society for Digestive Endoscopy (SIED) and the Italian Association of Hospital Gastroenterologists (AIGO). We acknowledge the great deal of work performed by medical and nursing staff in each of the participating units.
Appendix
The PNED investigators’ group includes Riccardo Marmo, Ospedale Curto, Polla; Livio Cipolletta, Gianluca Rotondano, and Maria A. Bianco, Ospedale Maresca, Torre del Greco; Lucio Capurso, Maurizio Koch and Angelo Dezi, ACO San Filippo Neri, Rome; Angelo Pera and Rodolfo Rocca, Ospedale Mauriziano Umberto I, Torino; Fausto Barberani and Sandro Boschetto, Ospedale San Camillo De Lellis, Rieti; Alfredo Pastorelli and Elena Sanz Torre, Ospedale Bel Colle, Viterbo; Sergio Brunati and Renato Fasoli, Ospedale Cantù, Abbiategrasso; Ivano Lorenzini and Ugo Germani, AO Umberto I, Ancona; Giorgio Minoli and Giorgio Imperiali, Ospedale Valduce, Como; Giovanni Gatto and Mariano Amuso, AO Villa Sofia, Palermo; Massimo Proietti and Anna Tanzilli, Ospedale Del Prete, Pontecorvo; Walter Piubello, Maria Tebaldi and Fabrizio Bonfante, Ospedale di Desenzano, Desenzano del Garda; Renzo Cestari and Domenico Della Casa, AO Ospedali Civili, University of Brescia, Brescia; Paolo Michetti and Paola Romagnoli, Ospedali Galliera, Genova; Omero Triossi and Andrea Buzzi, Ospedale Santa Maria delle Croci, Ravenna; Alessandro Casadei and Claudio Cortini, AO Morgagni, Forlì; Giorgio Chiozzini and Lisa Girardi, Ospedale Umberto I, Mestre; Luciano Allegretta and Salvatore Tronci, Ospedale Santa Caterina Novella, Galatina; Giovanni Aragona and Francesco Giangregorio, Ospedale Civile, Piacenza; Sergio Segato and Giuseppe Chianese, AO Ospedale Circolo e Fondazione Macchi, Varese; Andrea Nucci and Francesca Rogai, AO Ospedale Careggi, Firenze; Giampiero Bagnalasta and Claudio Leoci, Ospedale Civile, Manerbio; Giovanni Di Matteo and Paolo Giorgio, IRCCS Ospedale De Bellis, Castellana Grotte; Marco Martorano, Ospedale dell'Immacolata, Sapri; Mario Salvagnini, Ospedale San Bortolo, Vicenza.
Appendix: TWIST System
TWIST system is an ensemble of two algorithms: T&T and I.S.
