Abstract
Cancer-associated thrombosis (CAT) and atrial fibrillation (AF)-related stroke are two subtypes of acute embolic stroke with distinct lesion patterns on diffusion-weighted imaging (DWI). This pilot study aimed to evaluate the feasibility and performance of DWI-based machine learning models for differentiating between CAT and AF-related stroke. Patients with CAT and AF-related stroke were enrolled. In this pilot study with a small sample size, DWI images were augmented by flipping and/or contrast shifting to build convolutional neural network (CNN) predictive models. DWI images from 29 patients, including 9 patients with CAT and 20 with AF-related stroke, were analyzed. Training and testing accuracies of the DWI-based CNN model were 87.1% and 78.6%, respectively. Training and testing accuracies were 95.2% and 85.7%, respectively, for the second CNN model that combined DWI images with demographic/clinical characteristics. There were no significant differences in sensitivity, specificity, accuracy, and AUC between the two CNN models (all P = n.s.).
The DWI-based CNN model using data augmentation may be useful for differentiating CAT from AF-related stroke.
Introduction
Acute stroke is a leading cause of mortality globally. 1 With an increasing incidence trend, over 750 000 new strokes and 140 000 stroke-related deaths occur annually in the United States. 2 Acute strokes are classified into two major categories depending on etiology: ischemic strokes caused by abrupt blockage of an artery, and hemorrhagic strokes caused by bleeding. 3 Notably, ischemic strokes account for about 87% of all strokes, and include two subtypes: embolic strokes and thrombotic strokes. 4 Compared to thrombotic stroke, embolic stroke is associated with greater stroke severity in spite of a much lower incidence rate. 5 Cardioembolic stroke, also known as atrial fibrillation (AF)-related stroke, and cancer-associated thrombosis (CAT) are two common subtypes of acute embolic stroke with distinct etiologies. 6 CAT is often associated with lung, gastric, pancreatic, and colorectal adenocarcinoma. 7 Through the evaluation of infarction patterns and stroke characteristics, emergent imaging may help to reveal stroke etiology and suggest treatment strategies. 8
Diffusion-weighted imaging (DWI) is a form of magnetic resonance imaging (MRI) that measures the random Brownian motion of water molecules within a tissue voxel. 9 The 2010 American Academy of Neurology (AAN) guideline suggested that “DWI should be performed for the most accurate diagnosis of acute ischemic stroke”. 10 This recommendation was supported by several lines of evidence indicating that DWI has higher sensitivity and specificity for detecting acute ischemic stroke than computed tomography (CT).11–14 CAT is characterized by acute multifocal simultaneous infarcts that can be diagnosed using DWI. 15 DWI is reported to effectively distinguish CAT from other causes of cerebral embolism.6,15–17 Nouh and colleagues16,18 found that DWI lesions located in the bilateral anterior and posterior circulation territories, defined as the Three Territory Sign (TTS), are radiological markers that were six times more commonly seen in CAT than in AF-related stroke. Kwon et al 19 observed that patients with bihemispheric infarctions of unknown etiology often had concealed cancer. Furthermore, Schwarzbach et al 15 suggested that after excluding other embolic stroke etiologies, scattered DWI lesions in multiple vascular territories strongly suggest cancer etiology. The difference in DWI lesion patterns between AF-related stroke and CAT may facilitate differential diagnosis and early treatment.20–22
Manual interpretation of DWI images imposes a high burden of human effort and monetary cost, and automatic interpretation via machine learning appears to be a promising solution. Machine learning uses algorithms to imitate the way that humans learn, and the performance of these algorithms improves upon exposure to more data over time. Machine learning–based prediction models have been developed to automatically extract and analyze digital image data from patients with stroke, facilitating early diagnosis and reducing labor costs. 23 Among machine learning algorithms, convolutional neural networks (CNNs) are mainly used for image recognition and processing because of their ability to recognize patterns in images, and have been modified in various ways to develop prediction models for the detection of ischemic stroke lesions from DWI images.24,25 The development of machine learning models for image-based diagnosis typically requires large training datasets. Such large-scale image data collections demand substantial labeling costs. In addition, due to the complexity and heterogeneity of brain disorders, brain imaging datasets often have small sample sizes. 26 A variety of data augmentation techniques, such as resizing, flipping, rotating, cropping, and padding, have been developed to expand the quantity of image data for training machine learning models 27 and have been used to build various CNN-based models for detecting stroke lesions.28–31 The present retrospective pilot study aimed to evaluate the feasibility and performance of machine learning–based CNN models using data augmentation to differentiate between CAT and AF-related stroke from DWI images.
Materials and Methods
Study Population
This pilot study retrospectively enrolled two distinct groups of inpatients: those with AF-related stroke and those with CAT, all of whom were treated at Chang Gung Memorial Hospital, Taiwan, during different time periods.
The clinical guidelines for the diagnosis and treatment of embolic stroke were followed.32,33 The AF-related stroke group initially enrolled consecutive patients who were diagnosed with acute embolic infarction and AF based on DWI, transthoracic echocardiography, 12-lead electrocardiogram, and 24-h Holter monitor between July 2016 and January 2018. Patients who had a malignancy history within 3 months before or after stroke, and patients with valvular heart disease, infective endocarditis, autoimmune disease, or significant stenosis of cerebral arteries were excluded. Patients without definite AF, acute infarction, or MRI images were also excluded. Brain magnetic resonance angiography was used to ensure the absence of significant stenosis of cerebral arteries in included patients.
Because the incidence of CAT ranges from 0.5% to 6%, depending on the ascertainment of CAT and study populations,34–36 the inclusion period for patients with CAT was set at 10 years in this study. The CAT group initially enrolled consecutive patients who were diagnosed with adenocarcinoma and cerebral embolism within a 3-month period from January 2007 to December 2016. Patients with arrhythmia, cancer history of more than 3 months, brain metastasis, significant stenosis of cerebral arteries, infective endocarditis, or substance abuse were excluded. Patients without MRI images, definitive pathology, or known origin also were excluded.
Ethical Considerations
The study protocol was approved by the Institutional Review Board of Chang Gung Memorial Hospital (IRB: 201701493B0, 201601546B0, and 201601546B0C601). Because all patient data were de-identified in this retrospective pilot study, patients’ informed consent was waived.
Study Variables
Data for eligible patients, including sex, age, D-dimer level, activated partial thromboplastin time (aPTT), prothrombin time (PT), cancer type, cancer staging, radiotherapy, chemotherapy, coronary thrombosis, novel oral anticoagulants, and intracranial stenosis, where applicable, were extracted from medical records for comparative analysis. Notably, D-dimer was not routinely measured in patients with AF-related stroke in clinical practice, so D-dimer values were often missing in the AF-related stroke group.
DWI Processing and Analysis
Patients’ DWI images were obtained using a GE Optima MR450w 1.5 T MRI scanner (GE Healthcare, Chicago, IL, USA). DWI parameters were as follows: slice thickness, 5 mm; in-plane resolution, 1.718 mm; matrix, 128 × 128; repetition time, 6200 s; diffusion-encoding gradient directions, 72.1; b-value, 3 s/mm2; number of averages, 1000; and acquisition time, 4 s. Among eligible patients, only the 29 who each had 28 DWI images were included in image processing and analysis. As shown in Figure 1, patients were separated patient-wise into a training set and a testing set at a ratio of 8:2, as previously described.37,38 After image data augmentation, the training set included 32 sets of CAT and 30 sets of AF, and the testing set included four sets of CAT and 10 sets of AF. After a CNN was developed based on DWI brain images, five-fold cross-validation was applied to the training set and the resulting CNN model. The CNN model was then applied to the testing set for disease classification.

Figure 1. Flow chart of patient selection and image processing and analysis. CAT, cancer-associated thrombosis; AF, atrial fibrillation; DWI, diffusion-weighted imaging.
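The patient-wise separation and the reported set sizes can be checked with a short sketch. The per-class patient counts used here (8 CAT and 15 AF patients for training, 1 CAT and 5 AF patients for testing) are not stated in the text; they are inferred from the reported numbers of augmented sets together with the fourfold (CAT) and twofold (AF) augmentation described for image preprocessing. All function and variable names are hypothetical.

```python
import random

# Augmentation factors described in the text: flipping doubles AF images;
# flipping plus contrast shifting quadruples CAT images.
AUG_FACTOR = {"CAT": 4, "AF": 2}


def patient_wise_split(patient_ids, n_test, seed=0):
    """Split patient IDs (not images) so no patient appears in both sets."""
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)
    return ids[n_test:], ids[:n_test]  # (train, test)


def assign_folds(train_ids, k=5):
    """Assign whole patients to k folds so augmented copies stay together."""
    return {pid: i % k for i, pid in enumerate(train_ids)}


cat_ids = [f"CAT{i:02d}" for i in range(9)]   # 9 patients with CAT
af_ids = [f"AF{i:02d}" for i in range(20)]    # 20 patients with AF-related stroke

# ASSUMPTION: test-set patient counts are inferred from the reported set sizes.
cat_train, cat_test = patient_wise_split(cat_ids, n_test=1)
af_train, af_test = patient_wise_split(af_ids, n_test=5)

set_sizes = {
    "train_CAT": len(cat_train) * AUG_FACTOR["CAT"],
    "train_AF": len(af_train) * AUG_FACTOR["AF"],
    "test_CAT": len(cat_test) * AUG_FACTOR["CAT"],
    "test_AF": len(af_test) * AUG_FACTOR["AF"],
}
folds = assign_folds(cat_train + af_train, k=5)
print(set_sizes)
```

Under this reading, 23 of 29 patients (about 79%) fall in the training set, which is consistent with the stated 8:2 separation. Splitting by patient before augmentation keeps every augmented copy of a patient's images on the same side of the split.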
In patients with 28 DWI images, each brain image was divided into three regions: left, right, and posterior (Figure 2A). Patients’ brain images were resized to 128 × 128 × 64 voxels for the posterior region and 128 × 64 × 64 voxels for the left and right regions. To reduce overfitting, all DWI images were augmented after resizing. For patients with AF-related stroke, the original DWI images were augmented by flipping, thereby doubling the number of DWI images for subsequent analysis (Figure 2B). Because far fewer patients had CAT than AF-related stroke, the original DWI images of patients with CAT were augmented by flipping first, and then both the original and flipped images were augmented by contrast shifting, resulting in a fourfold increase in the number of DWI images for subsequent analysis (Figure 2B). After preprocessing, all images were combined to construct a 3D image pending further analysis using the OpenCV 4.5.3 library (Python Software Foundation, Beaverton, OR, USA).

Figure 2. MRI image preprocessing. (A) Each brain image was divided into three regions: left, right, and posterior, as indicated by red boxes. (B) Image augmentation.
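The augmentation scheme described above (flipping for AF-related stroke; flipping followed by contrast shifting for CAT) might be sketched as follows with NumPy. This is an illustration under stated assumptions rather than the study's code: the flip is assumed to be a left-right mirror, the contrast shift is assumed to be a linear rescaling about the mean intensity with a hypothetical `gain`, and the input volume is synthetic.

```python
import numpy as np


def flip_left_right(volume):
    """Mirror a (height, width, slices) volume along the width axis.
    ASSUMPTION: the text does not state the flip direction."""
    return volume[:, ::-1, :].copy()


def shift_contrast(volume, gain=1.2):
    """Rescale intensities about the volume mean, clipped to the original range.
    ASSUMPTION: the text does not specify how contrast was shifted."""
    mean = volume.mean()
    shifted = (volume - mean) * gain + mean
    return np.clip(shifted, volume.min(), volume.max())


def augment(volume, label):
    """Return 2 volumes for AF (original + flip) and 4 for CAT
    (original, flip, and a contrast-shifted copy of each)."""
    out = [volume, flip_left_right(volume)]
    if label == "CAT":
        out += [shift_contrast(v) for v in out]
    return out


rng = np.random.default_rng(0)
demo = rng.random((128, 128, 28))  # synthetic stand-in for 28 DWI slices
af_set = augment(demo, "AF")
cat_set = augment(demo, "CAT")
print(len(af_set), len(cat_set))
```

Applying the heavier augmentation only to the minority class is what brings the two training classes to similar sizes (32 versus 30 sets).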
Convolutional Neural Network Model
The CNN model consisted of convolutional (Conv) layers, average pooling layers, maximum pooling layers, batch normalization layers, fully connected (FC) layers, dropout layers, a rectified linear unit (ReLU), and softmax (Figure 3). The CNN network of DWI had an input of 320 × 320 × 28, with a 64-bit kernel and bias value of 64. The first and second Conv layers had a 64-bit kernel with a size of 3 × 3 × 3 voxels, and were connected to the max pooling layer. The third and fourth Conv layers had a 128-bit kernel and a 256-bit kernel with a size of 3 × 3 × 3 voxels, respectively, and were connected to an average pooling layer. The CNN models of the three regions were then wrapped into one FC layer. The CNN network of clinical parameters had seven inputs and two FC layers. The parameters used in the CNN model were as follows: batch size, 2; learning rate, 0.001; and epochs, 50. The CNN model was run on a workstation equipped with the Windows 10 operating system and an NVIDIA 2080Ti GPU. The Keras library in Python 3.9.5 (Python Software Foundation, Beaverton, OR, USA) was used for training and testing the CNN model.

Figure 3. The hierarchical architecture of the 3D CNN model.
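To make the layer description more concrete, the sketch below counts the trainable parameters of the four convolutional layers in one region branch and the number of weight updates implied by the training settings. It assumes that "64-bit", "128-bit", and "256-bit kernel" denote 64, 128, and 256 filters, that the input has a single channel, that each filter carries one bias term, and that all 62 augmented training sets are used in every epoch. None of these assumptions is confirmed by the text, and the helper name is hypothetical.

```python
def conv3d_params(in_channels, out_channels, kernel=(3, 3, 3)):
    """Weights plus one bias per filter for a 3D convolutional layer."""
    kd, kh, kw = kernel
    return (kd * kh * kw * in_channels + 1) * out_channels


# ASSUMPTION: filter counts read from the "64/64/128/256-bit kernel" wording.
filters = [64, 64, 128, 256]

counts = []
in_channels = 1  # ASSUMPTION: single-channel DWI input
for out_channels in filters:
    counts.append(conv3d_params(in_channels, out_channels))
    in_channels = out_channels

# Training settings reported in the text: batch size 2, 50 epochs.
# ASSUMPTION: all 62 augmented training sets are seen in each epoch.
steps_per_epoch = 62 // 2
total_updates = steps_per_epoch * 50

print(counts)         # per-layer parameter counts for one branch
print(total_updates)  # weight updates over the whole training run
```

Under these assumptions a single branch holds roughly 1.2 million convolutional parameters, which is large relative to 62 training sets and helps explain why augmentation and dropout were used to limit overfitting.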
Statistical Analysis
Continuous variables are presented as the mean and standard deviation, and their between-group differences were examined by t-test. Categorical variables are expressed as frequencies and percentages, and their between-group differences were examined by chi-square test. The classification performance of the CNN model was evaluated using the area under the receiver operating characteristic curve (AUC) and confusion matrix. The significance level was set as two-sided p < 0.05. All statistical analyses were performed using IBM SPSS statistical software version 20 for Windows (IBM Corp., Armonk, NY, USA).
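The performance measures named above can be written out explicitly. The following is a minimal sketch, not the SPSS procedure used in the study: sensitivity, specificity, and accuracy are derived from the four cells of a confusion matrix, and AUC is computed as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The function names and all input values are hypothetical and are not study data.

```python
def confusion_metrics(tp, fn, fp, tn):
    """Sensitivity, specificity, and accuracy from confusion-matrix cells."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
    }


def auc_from_scores(pos_scores, neg_scores):
    """AUC as the fraction of positive/negative pairs ranked correctly
    (ties count as half)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical values for illustration only; not taken from the study.
metrics = confusion_metrics(tp=8, fn=2, fp=1, tn=9)
auc = auc_from_scores([0.9, 0.8, 0.6, 0.4], [0.7, 0.3, 0.2, 0.1])
print(metrics, auc)
```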
Results
Patient Selection and Research Design
The flowchart of patient selection and research design is presented in Figure 1. A total of 38 patients were eligible, including 12 patients diagnosed with CAT and 26 patients diagnosed with AF-related stroke. However, only 29 of the 38 patients had 28 DWI images (nine patients with CAT and 20 patients with AF-related stroke); data from these 29 patients were subject to the final image processing and analysis (Figure 1).
Clinical Features of Patients with CAT or AF-Related Stroke
The clinical features of patients diagnosed with CAT are summarized in Supplemental Table S1. In patients with CAT, the most common cancer type was lung cancer (n = 4), followed by colorectal/gastrointestinal cancer (n = 3) and hepatobiliary cancer (n = 3). The majority of cancer stages were III or IV (n = 9) (Supplemental Table S1). The clinical features of patients diagnosed with AF-related stroke are summarized in Supplemental Table S2.
Demographic and Clinical Characteristics of Patients with CAT or AF-Related Stroke
Comparisons of demographic and clinical characteristics between patients with CAT and AF-related stroke are shown in Table 1. Patients with AF-related stroke had a slightly higher mean age at diagnosis than patients with CAT, but without statistical significance. The mean level of D-dimer in patients with CAT was significantly higher than that of patients with AF-related stroke (p < 0.001) (Table 1).
Table 1. Demographic and Clinical Characteristics of Patients with CAT or AF-Related Stroke.
CAT, cancer-associated thrombosis; AF, atrial fibrillation; aPTT, activated partial thromboplastin time; PT, prothrombin time.
ᵃ t-test; ᵇ chi-square test.
*Significant difference, p < 0.05.
Three patients in the CAT group had missing values. Since D-dimer was not routinely measured in patients with AF-related stroke in clinical practice, 23 patients in the AF group had missing D-dimer values.
Three patients in the CAT group and four patients in the AF group had missing values.
Three patients in the CAT group and three patients in the AF group had missing values.
DWI Lesion Patterns in Patients with CAT or AF-Related Stroke
DWI infarcts involving bilateral anterior and posterior circulation (three territories) were observed in 10 of 12 CAT patients (Figure 4). The remaining two CAT patients had two-territory infarction, one with unilateral acute multiple brain infarction (AMBI) (Figure 5A) and the other with AMBI involving bilateral anterior circulation (Figure 5B). Comparison of the DWI lesion patterns between patients with CAT and AF-related stroke indicated that those with AF-related stroke had a higher percentage of one-vessel arterial territory (69.2% vs 0%, respectively), unilateral (57.6% vs 8.3%, respectively), anterior arterial circulation (61.5% vs 8.3%, respectively), and cortical arterial distribution (57.6% vs 8.3%, respectively) than did those diagnosed with CAT, although no significant between-group differences were observed (Table 2).

Figure 4. Axial diffusion-weighted images of brain MRI show multiple acute brain infarctions involving bilateral anterior and posterior circulation in ten patients with cancer-associated thrombosis.

Figure 5. Axial diffusion-weighted images of brain MRI in two patients with cancer-associated thrombosis show (A) multiple acute brain infarctions involving unilateral brain and (B) bilateral anterior circulation.
Table 2. DWI Lesion Patterns in Patients with CAT or AF-Related Stroke.
CAT, cancer-associated thrombosis; AF, atrial fibrillation; DWI, diffusion-weighted magnetic resonance imaging.
chi-square test.
Classification Performance of CNN Models
Two CNN predictive models were developed. The first model was built based on brain DWI images only. The second model included the brain DWI images in addition to demographic and clinical characteristics (age, sex, aPTT, and PT). D-dimer was not included because of the high rate of missing values. The classification performance of these two CNN models was evaluated on training sets and testing sets separately.
Training and testing accuracies of the DWI-based CNN model were plotted as a function of the epoch number (Figure 6A). The classification performance of the DWI-based CNN model was defined by a confusion matrix, which revealed that training and testing accuracies were 87.1% and 78.6%, respectively (Figure 6B and C). For the combined CNN model, the training and testing accuracy per epoch is shown in Figure 6D. The classification performance of the combined CNN model was defined by confusion matrix, which showed that training and testing accuracies were 95.2% and 85.7%, respectively (Figure 6E and F).

Figure 6. Classification performance of CNN models. (A) Accuracy of training group, (B) confusion matrix of training group, and (C) confusion matrix of testing group of the DWI-based CNN model. (D) Accuracy of training group, (E) confusion matrix of training group, and (F) confusion matrix of testing group of the combined CNN model.
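The reported accuracies can be related back to the set sizes given in the Methods (62 augmented training sets and 14 augmented testing sets). The sketch below searches for the number of correctly classified sets that reproduces each reported percentage. The resulting counts are inferred here and are not stated in the text; the assumption is that each accuracy was computed over all augmented sets in the corresponding group, and the names are hypothetical.

```python
# Set sizes from the Methods: 32 + 30 training sets, 4 + 10 testing sets.
TRAIN_N, TEST_N = 62, 14

# Accuracies (%) as reported for the two models.
reported = {
    ("DWI", "train"): (87.1, TRAIN_N),
    ("DWI", "test"): (78.6, TEST_N),
    ("combined", "train"): (95.2, TRAIN_N),
    ("combined", "test"): (85.7, TEST_N),
}


def matching_correct_counts(percent, n):
    """All counts k (0..n) whose accuracy k/n rounds to the reported percent."""
    return [k for k in range(n + 1) if round(100 * k / n, 1) == percent]


# ASSUMPTION: accuracy was computed over all augmented sets in each group.
inferred = {key: matching_correct_counts(pct, n) for key, (pct, n) in reported.items()}
points_per_test_set = 100 / TEST_N

print(inferred)
print(round(points_per_test_set, 1))
```

Under this assumption, the two models differ by a single correctly classified set in the testing group, and each testing set is worth about 7 percentage points of accuracy, which is in line with the absence of a significant between-model difference.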
In addition, the performance comparison between the DWI-based CNN model and the combined model on the training and testing datasets is summarized in Supplemental Table S3. No significant between-model differences in sensitivity, specificity, accuracy, and AUC were observed (all P = n.s.) (Supplemental Table S3).
Discussion
In this retrospective pilot study with a small sample size, the patients’ DWI images were augmented by flipping and/or contrast shifting to increase the number of DWI images for constructing CNN models to distinguish CAT from AF-related stroke. The preliminary results revealed that the training and testing accuracies of the DWI-based CNN model were 87.1% and 78.6%, respectively. The training and testing accuracies increased to 95.2% and 85.7%, respectively, if demographic and clinical characteristics were also considered for building a combined CNN model. Although patients diagnosed with AF-related stroke had higher percentages of one-vessel arterial territory, unilateral, anterior arterial circulation, and cortical arterial distribution than did those with CAT, no significant between-group differences in DWI lesion patterns were observed.
While conventional MRI sequences may not reveal an infarct for up to six hours, an increased DWI signal in ischemic brain tissue can be observed within a few minutes after arterial occlusion. 39 DWI is commonly used to evaluate multiple-territory acute lesion patterns in patients with AF-related stroke or malignant disease.20–22 DWI is particularly noted for its sensitivity (88%–100%) and specificity (86%–100%) in the early detection of small infarcts; it has thus become a valuable tool for investigating ischemic stroke20–22 and is commonly used in emergency department settings. 40
Cancer-related hypercoagulability is the underlying cause of CAT in 40% of affected patients. 41 In these patients, the DWI pattern typically includes multiple lesions and arterial territories, and D-dimer levels are elevated, which may be associated with neurological deterioration, stroke recurrence, and poor survival. D-dimer was suggested to be an essential factor for the diagnosis of cerebral embolism. 6 Patients with CAT were reported to have a significantly higher level of D-dimer, with a cut-off value of 3.0 μg/mL.22,42 Furthermore, a study seeking markers for differentiating CAT from other stroke types in Asian patients showed that high D-dimer levels and DWI pattern together could serve as a diagnostic index for differential diagnosis. 22 However, the large number of patients with missing D-dimer data in our cohort prohibited the inclusion of D-dimer levels in the construction of the CNN predictive model, which is a limitation of the study.
Although all included patients in the present study had undergone DWI, the numbers of DWI images varied between patients. For consistency, we decided to use 28 DWI images per patient for machine learning-based image processing and analysis. Image data augmentation has been widely used to increase sample sizes for the purpose of building machine learning models for medical imaging, particularly in pilot studies with limited resources. 27 Remarkably, via image data augmentation, DWI-based CNN models for stroke imaging have been developed and tested in several studies with small sample sizes.28–31 In the present pilot study, the number of DWI images of patients with AF-related stroke was doubled by flipping, and the number of DWI images of patients with CAT was quadrupled by flipping and contrast shifting. However, the sample size was still relatively small after data augmentation, and the possibility of overfitting and underfitting cannot be ruled out. This limitation may be at least in part responsible for the observed false positives and false negatives.
The current preliminary results indicate that the addition of demographic/clinical measurements increased the accuracy of the DWI-based CNN model for distinguishing CAT from AF-related stroke. An array of candidate blood biomarkers for the differential diagnosis of ischemic stroke has been examined.43–45 Atrial cardiopathy, atrial fibrillation, arterial disease, left ventricular disease, cardiac valvulopathy, and patent foramen ovale are suggested to be associated with embolic stroke of undetermined source. 46 Whether inclusion of the aforementioned blood biomarkers and cardiac conditions increases the predictive accuracy of DWI-based CNN models for differential diagnosis of acute embolic strokes remains to be investigated by large-scale multicenter studies.
Because TTS is six times more frequent in CAT than in AF-related stroke, TTS is suggested to be a specific marker for CAT. 18 Patients with CAT may have lesions in multiple arterial territories as identified by DWI; the number of territories involved may also yield valid prediction of occult systemic malignancy in cryptogenic stroke patients. 20 The hypercoagulable state of cancer patients increases the risk for stroke, and the association between stroke and occult malignancy, particularly among older adults, is becoming more widely recognized. 47 Although the representative DWI images in the present pilot study showed three-territory or two-territory infarctions, no significant differences in DWI lesion patterns between CAT and AF-related stroke were observed, which may be attributed to the small sample size. However, it is highly likely that the proposed CNN models using data augmentation techniques were able to recognize slight differences in DWI lesion patterns between CAT and AF-related stroke, thereby distinguishing CAT from AF-related stroke.
Limitations
The present pilot study has several limitations, including its single-center design, the lack of power analysis, and the small sample size, which limit the generalizability of this study. Although AF-related stroke and CAT are two common subtypes of acute embolic stroke, 6 their incidences are low relative to other subtypes of stroke. 5 Furthermore, compared to AF-related stroke, CAT is relatively rare. 34 The inclusion period for patients with CAT was 6.6 times longer than that of patients with AF-related stroke in this study; however, the number of eligible patients with CAT was half that of patients with AF-related stroke. In addition, many values across multiple variables were missing, which may skew the overall interpretation of the findings. Since D-dimer was not routinely measured in patients with AF-related stroke in clinical practice, 23 patients in the AF group had missing D-dimer values. Thus, large-scale multicenter studies are warranted to confirm the findings of this single-institution retrospective pilot study.
Conclusions
This DWI-based CNN model using data augmentation may be useful for differentiating CAT from AF-related stroke. The results of this pilot study with a small sample size suggest the feasibility of machine learning–based DWI image processing for differential diagnosis of embolic stroke.
Supplemental Material
Supplemental material, sj-docx-1-cat-10.1177_10760296231203663 for Differential Diagnostic Value of Machine Learning–Based Models for Embolic Stroke by HsunYu Kuo, Tsai-Wei Liu, Yo-Ping Huang, Shy-Chyi Chin, Long-Sun Ro and Hung-Chou Kuo in Clinical and Applied Thrombosis/Hemostasis
Footnotes
Acknowledgements
We thank all the patients whose clinical data were evaluated in this study.
Author Contributions
Hsun-Yu Kuo, Yo-Ping Huang, and Hung-Chou Kuo conceived and designed the experiments.
Hsun-Yu Kuo and Hung-Chou Kuo performed the experiments.
Hsun-Yu Kuo, Tsai-Wei Liu, Yo-Ping Huang, and Hung-Chou Kuo analyzed and interpreted the data.
All authors contributed reagents, materials, analysis tools or data and wrote the paper.
Data Availability Statement
The datasets generated and/or analyzed during the present study are available from the corresponding author on reasonable request.
Declaration of Conflict of Interest
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Ethics Approval
The study protocol was approved by the Institutional Review Board of Chang Gung Memorial Hospital (IRB: 201701493B0, 201601546B0, and 201601546B0C601).
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This study was supported as a joint project between the National Taipei University of Technology and the Chang Gung Memorial Hospital under Grant NTUT-CGMH-106-05 to Yo-Ping Huang and CORPG3G0061 to Hung-Chou Kuo.
Informed Consent
Because all patient data were de-identified in this retrospective pilot study, patients’ informed consent was waived.
Supplemental Material
Supplemental material for this article is available online.
References
