Abstract
Background
Computed tomography perfusion (CTP) is the mainstay for determining eligibility for endovascular thrombectomy (EVT), but there is still a need for alternative methods in patient triage.
Purpose
To study the ability of a computed tomography angiography (CTA)-based convolutional neural network (CNN) method to predict final infarct volume in patients with large vessel occlusion successfully treated with endovascular therapy.
Materials and Methods
The accuracy of the CTA source image-based CNN in final infarct volume prediction was evaluated against follow-up CT or MR imaging in 89 patients with anterior circulation ischemic stroke successfully treated with EVT as defined by Thrombolysis in Cerebral Infarction category 2b or 3 using Pearson correlation coefficients and intraclass correlation coefficients. Convolutional neural network performance was also compared to a commercially available CTP-based software (RAPID, iSchemaView).
Results
A correlation with final infarct volumes was found for both CNN and CTP-RAPID in patients presenting 6–24 h from symptom onset or last known well, with
Conclusion
A CTA-based CNN method had moderate correlation with final infarct volumes in the late time window in patients successfully treated with EVT.
Introduction
Endovascular thrombectomy (EVT) has been the standard of care for some years now for patients with ischemic stroke and large vessel occlusion (LVO) presenting within 6 h of symptom onset. 1 With large trials showing the efficacy and safety of endovascular therapy up to 24 h after the time last known well, 2 the need for advanced neuroimaging and interpretation of these studies has surged.
The main questions to answer for determining eligibility for thrombectomy are: (1) is there salvageable brain tissue, that is, is there a mismatch between the ischemic core and penumbra and (2) how large is the ischemic core. Currently, both computed tomography perfusion (CTP)-based methods and magnetic resonance imaging (MRI)-based perfusion-weighted imaging with diffusion-weighted imaging (DWI) are the mainstay for determining the core and penumbra in stroke diagnosis and treatment selection. In patients with stroke symptoms, computed tomography angiography (CTA) is routinely acquired for LVO detection and detecting significant carotid artery stenoses. 3
With the growing number of patients possibly eligible for thrombectomy, the need for patient triage has surged, and there is still a need for alternative triage methods, as current imaging methods do not always provide the information needed or they may not be available in hospitals outside of comprehensive stroke centers.
To this end, several studies have looked at CTA-based deep learning methods in ischemic stroke detection with promising results, 4–7 and our previous study suggested that a convolutional neural network (CNN) model could be useful in determining eligibility for thrombolytic therapy. 5 However, CNN performance in final infarct estimation might vary depending on treatment selection and could be different in patients receiving thrombolytic therapy versus patients treated with EVT.
In this study, we set out to investigate whether our CTA-based CNN model can predict the final infarct volume in patients with acute ischemic stroke (AIS) treated successfully with mechanical thrombectomy. The premise was that with successful recanalization, the baseline infarct core determined by the CNN should roughly match the final infarct volume on follow-up imaging. CNN performance was analyzed for patients presenting either in the early (0–6 h) or late (6–24 h) time window and compared to a commercially available CTP-based software (RAPID, iSchemaView) to investigate the ability of this method in predicting final infarct volume.
Materials and methods
The data that support the findings of this study are available from the corresponding author upon reasonable request. Helsinki University Hospital ethics committee approved this retrospective study and patients’ informed consent was waived.
Study population
Patient characteristics.
aNIHSS was reported for 86 patients.
bExact time from symptom onset was unknown in 32 patients.
SD: standard deviation; IQR: interquartile range; NIHSS: National Institutes of Health Stroke Scale; CT: computed tomography; CCA: common carotid artery; ICA: internal carotid artery; MCA: middle cerebral artery; CNN: convolutional neural network; CTP-RAPID: computed tomography perfusion RAPID; CBF: cerebral blood flow.
Image acquisition and preprocessing
A majority of patients (
All follow-up studies were evaluated for final infarct volume. A senior neuroradiologist (MK) and a radiologist in training (LH), with over 20 and over 5 years of experience, respectively, segmented the infarcted regions on follow-up CT and diffusion-weighted MRI scans in consensus using the 3D Slicer image processing and visualization platform. 9 No blinding was used regarding image assessment. Image data preprocessing and 3D convolutional neural network implementation were conducted by a physicist (TM).
The CNN had been previously trained and validated and has been used in a recent publication. 5 A total of 150 patients with a suspected AIS of the middle cerebral artery territory were retrospectively selected for CNN development. Of these, 75 were diagnosed with stroke and 75 were stroke mimics based on acute neurological symptoms and imaging findings. The CNN was trained on 20 non-stroke and 20 stroke patient CTA source image (CTA-SI) volumes with manually delineated lesion targets. This was equal to 1400 axial images, of which 20% included an ischemic lesion. An additional five non-stroke and five stroke volumes were used as validation data. Two test data sets were used, both consisting of 25 stroke and 25 stroke mimic cases. The volumes in both training and inference were resampled to an isotropic 0.5×0.5×0.5 mm³ resolution. Cases used in CNN training were not part of the study population for this study. The CNN consisted of a two-channel input, 40 3D convolutional layers with 3×3×3 kernel size, 16 filters each and valid padding, followed by a fully connected layer with 50 neurons and two output neurons (lesion/background) with softmax activation, producing voxel-by-voxel lesion presence confidences. Skip connections passing single layers were used to encourage gradient propagation. The network was fed with 147×147×147-voxel sub-volumes with an equal number of stroke lesion positive and negative sub-volumes in each batch. The second input channel was the corresponding (left-right-mirrored) sub-volume from the contralateral hemisphere. The model was trained using a batch size of eight and the Adam optimizer for 30 epochs, after which the validation loss, calculated on the separate set of 10 CTA-SI volumes, stopped improving. The network was implemented using Keras library version 2.2.4 10 and TensorFlow version 1.12.0. 11
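The effect of valid padding on the sub-volume size can be checked with simple arithmetic. The sketch below is illustrative only and assumes stride 1 with no pooling or dilation, which the text does not state; the function names are hypothetical and not taken from the authors' implementation.

```python
def valid_conv_output_size(input_size: int, n_layers: int, kernel_size: int = 3) -> int:
    """Edge length left after n_layers unpadded ('valid') convolutions.

    Assumes stride 1 and no pooling or dilation (an assumption, not stated
    in the paper). Each layer trims (kernel_size - 1) voxels per axis.
    """
    output_size = input_size - n_layers * (kernel_size - 1)
    if output_size < 1:
        raise ValueError("input is too small for this many valid convolutions")
    return output_size


def receptive_field(n_layers: int, kernel_size: int = 3) -> int:
    """Edge length (in voxels) of the input region seen by one output voxel."""
    return 1 + n_layers * (kernel_size - 1)


# Figures from the text: 147-voxel sub-volumes, 40 layers, 3x3x3 kernels,
# 0.5 mm isotropic voxels.
out_edge = valid_conv_output_size(147, 40)
rf_voxels = receptive_field(40)
print(out_edge)           # edge length of the predicted block, in voxels
print(rf_voxels * 0.5)    # receptive field edge length, in mm
```

Under these assumptions, each 147-voxel input sub-volume yields predictions for a smaller central block, and every predicted voxel draws on roughly 4 cm of surrounding context per axis.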
Study design
Lesion volumes from CNN outputs and manual segmentations of final infarcts were calculated from all lesions in the affected cerebral hemisphere. Computed tomography perfusion-RAPID ischemic core estimations were reported as calculated by the software.
Only lesions in the affected cerebral hemisphere detected by the CNN were selected for the analysis with a volume threshold of >0.1 mL and a probability threshold of 0.5 for lesion inclusion. False positive lesions in the contralateral hemisphere or cerebellum were discarded from the analysis. This approach was chosen because in LVO, the site of arterial occlusion, and thus the affected hemisphere is readily identifiable from CTA.
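The selection rule above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes lesion candidates have already been separated into regions labeled by side, that volumes are computed on the 0.5 mm isotropic grid, and that a voxel counts as lesion when its confidence is at least 0.5 (whether the comparison is strict is not stated). All names are hypothetical.

```python
VOXEL_EDGE_MM = 0.5  # isotropic resampling given in the text


def voxels_to_ml(n_voxels: int, voxel_edge_mm: float = VOXEL_EDGE_MM) -> float:
    """Convert a voxel count to millilitres (1 mL = 1000 mm^3)."""
    return n_voxels * voxel_edge_mm ** 3 / 1000.0


def count_lesion_voxels(confidences, threshold: float = 0.5) -> int:
    """Count voxels whose lesion confidence reaches the threshold."""
    return sum(1 for p in confidences if p >= threshold)


def predicted_core_ml(candidates, affected_side: str, min_volume_ml: float = 0.1) -> float:
    """Sum the volumes of candidate regions that pass the selection rule.

    candidates: iterable of (side, n_voxels) pairs, where n_voxels was
    counted after thresholding. Regions outside the affected hemisphere
    (contralateral, cerebellum) and regions of 0.1 mL or less are dropped.
    """
    total = 0.0
    for side, n_voxels in candidates:
        volume = voxels_to_ml(n_voxels)
        if side == affected_side and volume > min_volume_ml:
            total += volume
    return total


# Hypothetical candidates: one large left lesion, one tiny left speck,
# one contralateral false positive.
example = [("left", 80000), ("left", 400), ("right", 16000)]
print(predicted_core_ml(example, affected_side="left"))
```

With 0.5 mm voxels, the 0.1 mL threshold corresponds to 800 voxels, so isolated specks are removed while any sizeable region in the affected hemisphere is kept.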
Convolutional neural network performance was compared against a commercial software (RAPID, iSchemaView) in determining the infarct core volume derived from CTP as this is a validated and widely used method for treatment selection. The effect of two clinically relevant time windows (0–6 h and 6–24 h from symptom onset to start of CT protocol) on CNN and CTP-RAPID output accuracy in final infarct volume prediction was also tested. Patients whose time of symptom onset could not be accurately inferred from patient history (
ASPECTS anatomical regions were visually evaluated to determine the anatomical accuracy of the CNN against expert segmentation, that is, whether the locations of the CNN-predicted lesions matched the final infarct locations within the ASPECTS regions in the middle cerebral artery territory. Individual regions were labeled “positive” or “negative” for ischemic changes by a radiologist (LH), as determined by the CNN from acute phase CTA and by manual segmentations from follow-up CT. Manual segmentations were considered as ground truths. Accuracy, sensitivity, specificity, and Sørensen-Dice similarity coefficient were calculated from the regions’ true or false labeling.
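The region-level metrics follow directly from the confusion counts. The sketch below uses the standard definitions, with the Sørensen-Dice coefficient computed as 2TP / (2TP + FP + FN); the function name and the toy labels are hypothetical and do not reproduce the study data.

```python
def region_metrics(predicted, reference):
    """Accuracy, sensitivity, specificity, and Dice from paired boolean labels.

    predicted: CNN-derived region labels (True = ischemic change).
    reference: labels from manual segmentation, treated as ground truth.
    """
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("label lists must be non-empty and of equal length")
    tp = sum(1 for p, r in zip(predicted, reference) if p and r)
    tn = sum(1 for p, r in zip(predicted, reference) if not p and not r)
    fp = sum(1 for p, r in zip(predicted, reference) if p and not r)
    fn = sum(1 for p, r in zip(predicted, reference) if not p and r)

    def ratio(num, den):
        # Undefined ratios (empty denominators) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "dice": ratio(2 * tp, 2 * tp + fp + fn),
    }


# Toy example with ten regions (one patient's ASPECTS regions, made up).
cnn = [True, True, False, False, True, False, False, True, False, False]
manual = [True, False, False, True, True, False, False, False, False, True]
print(region_metrics(cnn, manual))
```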
Convolutional neural network was compared to CTP-RAPID in determining patient eligibility for EVT according to criteria from the DAWN trial (DWI or CTP Assessment with Clinical Mismatch in the Triage of Wake Up and Late Presenting Strokes Undergoing Neurointervention) in patients that presented in the 6–24 h time window. 2
The cutoff points were ischemic core volumes ≤20 mL, ≤30 mL, or ≤50 mL depending on patient age and National Institutes of Health Stroke Scale score. Volume outputs from the CNN and CTP-RAPID were compared according to Figure 1. The number of true positives, true negatives, false positives, and false negatives was then calculated and sensitivity, specificity, negative, and positive predictive value for the CNN prediction were derived.
Accuracy of the convolutional neural network (CNN) in triaging patients for endovascular thrombectomy (EVT) was assessed by defining CNN results as true positives, true negatives, false negatives, or false positives using criteria from the DAWN-study. Sensitivity, specificity, negative, and positive predictive value for the CNN prediction were then derived.
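The triage comparison can be sketched in a few lines. This is an illustrative reading, not the exact procedure of Figure 1: the text gives only the three volume cutoffs, so the age and NIHSS groupings below (age ≥80 with NIHSS ≥10; age <80 with NIHSS ≥10; age <80 with NIHSS ≥20) are taken from the published DAWN criteria and should be treated as an assumption here. A "positive" result means the core volume is at or below the cutoff, with CTP-RAPID as the reference. All names and patient tuples are hypothetical.

```python
from typing import Optional


def dawn_core_cutoff_ml(age: float, nihss: int) -> Optional[float]:
    """Largest eligible core volume (mL), or None if no group applies."""
    if age >= 80:
        return 20.0 if nihss >= 10 else None
    if nihss >= 20:
        return 50.0
    if nihss >= 10:
        return 30.0
    return None


def is_eligible(core_ml: float, age: float, nihss: int) -> bool:
    cutoff = dawn_core_cutoff_ml(age, nihss)
    return cutoff is not None and core_ml <= cutoff


def triage_counts(patients):
    """patients: iterable of (age, nihss, cnn_core_ml, ctp_core_ml)."""
    counts = {"tp": 0, "tn": 0, "fp": 0, "fn": 0}
    for age, nihss, cnn_ml, ctp_ml in patients:
        predicted = is_eligible(cnn_ml, age, nihss)   # CNN decision
        reference = is_eligible(ctp_ml, age, nihss)   # CTP-RAPID decision
        if predicted and reference:
            counts["tp"] += 1
        elif predicted:
            counts["fp"] += 1
        elif reference:
            counts["fn"] += 1
        else:
            counts["tn"] += 1
    return counts


def predictive_values(counts):
    """Positive and negative predictive value; NaN when undefined."""
    pos = counts["tp"] + counts["fp"]
    neg = counts["tn"] + counts["fn"]
    ppv = counts["tp"] / pos if pos else float("nan")
    npv = counts["tn"] / neg if neg else float("nan")
    return ppv, npv


# Made-up patients: (age, NIHSS, CNN core mL, CTP-RAPID core mL).
cohort = [(85, 12, 15, 18), (70, 15, 35, 25), (70, 22, 45, 60), (60, 11, 40, 33)]
counts = triage_counts(cohort)
print(counts, predictive_values(counts))
```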
Statistical analysis
A linear model was fitted between the CNN derived volume outputs, manually segmented final infarct volumes, and CTP-RAPID ischemic core volumes (defined by cerebral blood flow (CBF) < 30%). Pearson correlation coefficients (
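The Pearson correlation coefficient and the least-squares line used in this kind of volume comparison can be written out directly. The sketch below uses the textbook formulas with made-up volumes; it is not the authors' analysis code, and it omits the intraclass correlation coefficient because the ICC form used is not specified in the text.

```python
import math


def _centered_sums(x, y):
    """Return means and centered sums of squares and cross-products."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return mean_x, mean_y, sxx, syy, sxy


def pearson_r(x, y) -> float:
    """Pearson correlation coefficient between two volume series."""
    _, _, sxx, syy, sxy = _centered_sums(x, y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


def linear_fit(x, y):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    mean_x, mean_y, sxx, _, sxy = _centered_sums(x, y)
    if sxx == 0:
        raise ValueError("slope is undefined for constant x")
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical volumes in mL: predicted core versus final infarct.
predicted = [10.0, 20.0, 30.0, 40.0]
final = [15.0, 25.0, 35.0, 45.0]
print(pearson_r(predicted, final), linear_fit(predicted, final))
```

Note that a high Pearson coefficient shows only a linear association; a systematic offset such as the one in this toy example leaves it unchanged, which is one reason agreement measures are reported alongside it.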
Results
Infarct lesion volumes provided by the CNN, CTP-RAPID, and measurements from follow-up imaging in mL, mean (SD, range).
CNN: convolutional neural network, CTP-RAPID: Computed tomography perfusion RAPID.
Among all cases, irrespective of time from symptom onset, CTP-RAPID showed a tendency for more accurate estimates of final infarct volumes (
Lesion volume (mL) correlation between convolutional neural network (CNN) output and manual segmentation from follow-up imaging. (a) All cases (
Lesion volume (mL) correlation between CT perfusion RAPID (CTP-RAPID) ischemic core (CBF <30%) and manual segmentation from follow-up imaging. (a) All cases (

Correlations with final infarct volumes in the early 0–6 h time window were
In the late 6–24 h time window, a better correlation with final infarct volumes was found for both the CNN (
Reliability of the convolutional neural network (CNN) and CT perfusion RAPID (CTP-RAPID) in predicting final infarct volume, intraclass correlation coefficients and their 95% confidence intervals.
CNN: convolutional neural network, CTP-RAPID: Computed tomography perfusion RAPID.
A correlation of
Lesion volume (mL) correlation between the convolutional neural network (CNN) output and CT perfusion RAPID (CTP-RAPID) ischemic core (CBF <30%). (a) All cases (
Convolutional neural network accuracy in triaging patients for EVT in the late time window was compared to CTP-RAPID and assessed according to criteria from the DAWN trial as described in the statistical analysis section. Compared to CTP-RAPID, the CNN had a sensitivity of 0.38 with a specificity of 0.89, negative predictive value of 0.31, and positive predictive value of 0.92.
Among all patients, 11 had a chronic cerebral infarct on the same side as the LVO. In three of these cases, the CNN marked parts of the chronic infarct as an acute ischemic lesion. However, this changed the volume prediction only by 0.2–3 mL. A total of 890 ASPECTS regions were evaluated to determine the anatomical accuracy of the CNN. The accuracy, sensitivity, and specificity were 0.62, 0.70, and 0.57, respectively. The Sørensen-Dice similarity coefficient was 0.60.
Discussion
Both our CTA-based CNN outputs and CTP-RAPID core estimates correlated better with final infarct volumes in the late (6–24 h) than in the early (0–6 h) time window. This finding supports the notion of Goyal et al. that extensive use of perfusion imaging in the early time window might not be desirable, and that instead, clinical-imaging mismatch (using NCCT) should be considered for penumbra estimation while using CTA for LVO detection. 12 In the same vein, Lopez-Rivera et al. found an increased likelihood of undergoing EVT in centers with lower CTP utilization, which was not associated with worse clinical outcomes or increased hemorrhage, suggesting under-treatment bias with routine CTP. 13 Also, Boned and Martins have described the “ghost infarct core,” which refers to the tendency of CTP to overestimate infarct core size in an early time window of <3 h from symptom onset. 14,15
Correlation with realized infarcts was moderate with the CNN and good with CTP-RAPID in the late time window. Previously, we found a good correlation between CNN outputs and final infarct volumes in patients with anterior circulation AIS who were not treated with EVT. 5 In that study, 55% of patients were treated with supportive care and 45% received thrombolytic therapy, which may explain the differences in performance, as a large proportion of patients can be assumed to have suffered from infarct growth, and several studies have shown that CTA is more CBF than CBV weighted, including some penumbra. 16–18 On the other hand, Bal et al. found CTA-SI ASPECTS better than NCCT at predicting final infarct extent, especially in a very early 0–90 min time window. 19 In another study, Sallustio et al. found CTA-SI ASPECTS to be a better predictor of outcome than NCCT in patients with stroke treated with EVT. 20 In both of these studies, a fast image acquisition protocol was used for CTA. The conflicting results from these five studies could, at least in part, be explained by different ways of measuring infarct extent, as three of the studies used ASPECTS and two used manually segmented lesion volumes to compare performance. These studies also used variable outcome measures for determining CTA performance, and the inclusion criteria for time from symptom onset to presentation varied between studies from <3 h to <9 h. As such, findings from these studies may not reflect CTA performance in later time windows.
The observation that CTP-RAPID tended to underestimate core volumes in the late time window contradicts numerous previous studies, which have shown a trend for core volume overestimation with CTP. 21–24 These studies, however, used delays of <9 h from symptom onset as inclusion criteria, so their findings might not reflect CTP performance with delays of >9 h. Moreover, large studies have shown the benefit of EVT up to 24 h after symptom onset using CTP-RAPID to guide patient selection. 1,2,25 However, CTP has also been criticized in recent years, and it has been questioned whether it should even be used outside of properly powered clinical trials. 22 Possible problems in using CTP for treatment selection in individual patients include optimal thresholds varying between vendors and postprocessing platforms, with time after stroke, quality of collateral flow, ischemic preconditioning, and duration of perfusion scans. 12,22,26–29 CTP is also more susceptible to motion artifacts than CTA.
Our CNN method had a positive predictive value of 0.92, that is, if the CNN-predicted core volume was below the cutoff point, the CNN had a probability of 0.92 of correctly classifying the patient as eligible for thrombectomy. However, the negative predictive value was only 0.31, which means that a number of patients who could benefit from EVT would be left untreated if only the CNN method was used.
Our CNN was trained with lesions that were manually segmented from CTA-SI as the ground truth. More accurate final infarct volume prediction may be possible by training the CNN with a different ground truth, such as manually segmented final infarcts from follow-up imaging. This introduces its own challenges, though. For example, there is no single optimal delay for follow-up imaging: more than 30% relative growth in infarct volume can be seen in a significant portion of patients after 24 h, 30 and a 3–5 day delay may also overestimate infarct size due to ischemia-related edema. Follow-up ischemic lesion volume at 24 h has nevertheless been found to be a valuable secondary outcome measure. 30 The CNN underestimated final infarct volume in 28% of cases. In all of these cases, an underestimation was also observed with CTP-RAPID. This underestimation may be related to infarct progression and edema.
Sheth et al. used CTP-RAPID ischemic core estimates as ground truth for their DeepSymNet algorithm with good results but did not present comparisons to follow-up imaging. 6 The results from their study and our previous study, however, suggest that it may be possible to get reasonably accurate infarct core estimations for triaging purposes using a CTA-based deep learning method. Hilbert et al. have also used CTA-based deep learning models in predicting functional outcome and reperfusion results in AIS. 7 Their approach was quite different in that no lesion segmentations or volumetric data was used for neural network training, but instead, functional outcome and reperfusion measures were used as outcomes and visualization models were used afterward to assess which features or parts of the images the models used for decision making.
In this study, we selected consecutive patients who received EVT to simulate real-life performance. This resulted in some limitations to available data, as the exact time from symptom onset to recanalization was unknown for almost half of all patients, although they presented ≤24 h from the time they were last known to be well. This prevents us from analyzing whether the correlation between CNN outputs and final infarct volumes would have been better depending on the delay from symptom onset to recanalization. We used follow-up CT to determine final infarct extent, as a 24 h follow-up CT is the standard protocol in our institution, although it is not as sensitive as DWI. Other limitations of this study include different vendors for CTA imaging and variation in follow-up time and imaging modalities.
In conclusion, a CTA-based CNN is able to detect anterior circulation ischemic strokes with moderate correlation to final infarct volumes in the late time window (6–24 h) in patients successfully treated with EVT.
Footnotes
Acknowledgments
Eero Salli, PhD (Tec.) and Ulla Wilppu, M.Sc. (Tec.) are acknowledged for their technological assistance and stimulating discussions during the project.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This study received funding from Helsinki University Hospital (SS: TYH2019253, MK: Y7810046 and Y780021014, LH: Y780020120)
Ethical approval
Helsinki University Hospital ethical committee approved this retrospective study and patients’ informed consent was waived.
