Abstract
Purpose:
To evaluate the performance of synthetic and real FLAIR for identifying early stroke in a multicenter cohort.
Methods:
This retrospective study was conducted using DWI and FLAIR extracted from the Endovascular Treatment in Ischemic Stroke image registry (2017–2021). The database was partitioned into subsets according to MRI field strength and manufacturer, and randomly divided into a training set (70%) used for model fine-tuning, a validation set (15%), and a test set (15%). In the test set, five readers, blinded to FLAIR sequence type, assessed DWI-FLAIR mismatch using real and synthetic FLAIR. Interobserver agreement for DWI-FLAIR rating and concordance between synthetic and real FLAIR were evaluated with kappa statistics. Sensitivity and specificity for identification of ⩽4.5 h AIS were compared in patients with known onset-to-MRI delay using McNemar’s test.
Results:
1454 complete MRI sets (1172 patients, median (IQR) age: 73 years (62–82); 762 women) acquired on 125 MRI units were analyzed. In the test set (207 MRI), interobserver reproducibility for DWI-FLAIR mismatch labeling was substantial for real and synthetic FLAIR (Fleiss κ = 0.79 (95%CI: 0.73–0.84) and 0.77 (95%CI: 0.71–0.82), respectively). After consensus, concordance between real and synthetic FLAIR was excellent (κ = 0.85 (95%CI: 0.78–0.92)). In 141 MRI sets with known onset-to-MRI delay, diagnostic performances for ⩽4.5 h AIS identification did not differ between real and synthetic FLAIR (sensitivity: 60/71 (85%) vs 59/71 (83%), p = 0.56; specificity: 65/70 (93%) vs 65/70 (93%), p > 0.99).
Conclusion:
A deep-learning-based synthetic FLAIR fine-tuned on multicenter data can provide performance comparable to real FLAIR for early AIS identification. This approach may help reduce MR protocol duration and motion artifacts.
Introduction
In acute ischemic stroke (AIS), the recanalization treatment decision, either by intravenous thrombolysis or endovascular therapy, is highly impacted by imaging data and time constraints.1–3 While the treatment window has expanded, notably to late stroke patients,4–6 imaging has emerged as a cornerstone to assess collaterals7,8 and identify potentially salvageable brain parenchyma beyond pre-defined timeframes.9 In this setting and given the variability of stroke growth, optimizing the imaging workflow remains crucial for providing the best-informed clinical decision support in the shortest possible time.10 Depending on the available modality, CT or MRI is typically performed before the treatment decision, leading to comparable treatment delays in known-onset stroke.11
In up to 27% of AIS,12 imaging is used to estimate the stroke onset time when it is unknown, such as in wake-up and unwitnessed strokes. In these clinical situations, MRI may estimate the symptom onset by assessing the DWI-FLAIR mismatch, that is, the presence of a diffusion restriction on the DWI without any significant signal change on the FLAIR sequence.13,14
A previous single-center study showed that a synthetic FLAIR sequence (hereafter referred to as synthFLAIR) could be computed from the DWI sequence and be as clinically relevant as the real FLAIR (realFLAIR) sequence to assess this mismatch pattern. Indeed, the T2-weighted acquisition embedded in the DWI sequence before application of diffusion gradients (b = 0 s/mm² (b0)) has been shown to contain information about stroke FLAIR visibility, but its analysis is limited in cortical regions where cerebrospinal fluid intensity is high.15 The synthFLAIR sequence, whose calculation relies on these signal changes, should in contrast keep a good diagnostic value near cortical areas. Furthermore, its use should reduce the MR protocol time by avoiding realFLAIR acquisition, and it could serve as an alternative to realFLAIR in case of motion artifacts in restless patients.16 However, the original synthFLAIR model was developed on a homogeneous single-center dataset from a single MRI unit, and its generalizability to images from different MRI vendors, magnetic field strengths, and variable sequence parameters is unknown.17 For large-scale application, the synthFLAIR model must be adapted and validated in a multicenter environment to overcome a potential domain shift.18 Multicenter studies are now part of AI recommendations driven by the FDA19 and guideline initiatives, such as the CLAIM (Checklist for Artificial Intelligence in Medical Imaging),20 because of potential training bias and confounding factors that may impede future broader utilization. This is even more sensitive for some unsupervised generative deep-learning models that can be influenced by the data distribution during the training phase, as provocatively demonstrated by some hallucinated images.21
Here, we used a fine-tuning procedure to adapt the synthFLAIR model for DWI data acquired with different manufacturers and field strengths, in order to make it compatible with DWI data issued from any manufacturers at either 1.5 or 3 T. The aim of the study was to compare the diagnostic value of the new fine-tuned synthFLAIR to the realFLAIR sequence for DWI-FLAIR mismatch assessment and identification of stroke patients within 4.5 h from symptom onset in a national multicenter cohort.
Materials and methods
Data source
This retrospective study included adult patients who underwent recanalization treatment for AIS enrolled in the prospective multicenter observational Endovascular Treatment in Ischemic Stroke (ETIS) registry (ClinicalTrials Identifier: NCT03776877, approved by ethics committee ID-RCB number: 2017-A03457-46) between 2017 and December 2021. Written informed consent was obtained, and data collection and analysis were approved by ETIS review board. ETIS is a clinical multicenter registry including patients who present an acute stroke due to a large vessel occlusion with indication for mechanical thrombectomy, aged 18 years or older, with or without intravenous thrombolysis treatment. 22 ETIS-image is an associated daughter image database collecting MR and CT imaging performed in these patients and in the subgroup of centers transmitting raw image data. Inclusion criteria in our study were: (1) availability of baseline MRI acquired before treatment and/or at early follow-up; (2) availability of paired FLAIR and DWI sequences with low and high b-values (hereafter referred to as blow and bhigh). Clinical data including age, sex, National Institutes of Health Stroke Scale (NIHSS) score at admission, recanalization treatment, and stroke onset-to-MRI delay were also collected.
Selection of data subsets and stratified data partition
FLAIR and DWI sequences were evaluated by one reader (resident, G.Ha.) on an ordinal scale ranging from 1 (low quality) to 3 (high quality), and MRI sets of inadequate quality due to major artifacts were excluded. MRI datasets were partitioned into subsets according to the MRI field strength (1.5 or 3 Tesla) and manufacturer (General Electric Healthcare, Siemens Healthineers, and Philips Healthcare, hereafter respectively referred to as manufacturers 1, 2, and 3). A specific subset was used for DWI sequences from one MRI unit from manufacturer 2, with DWI acquired with a blow value of 50 s/mm2 (rather than 0 s/mm2 in all other subsets; Supplemental Table 1). MRI sets were randomly divided into train (70%), validation (15%), and test (15%) sets with stratified randomization on each of the seven subsets, MRI time (baseline or follow-up), and FLAIR quality (1, 2, or 3). Each MRI was assigned to one of the 42 (7 × 2 × 3) stratification groups depending on these three variables, and stratified splitting was performed using a dedicated function from the scikit-learn library.23 In order to fulfill the independence assumption for statistical tests in the test set, follow-up MRIs were excluded from analysis if the same patient was included twice in the test set.
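The stratified 70/15/15 partition described above can be illustrated with a minimal standard-library sketch. It is not the scikit-learn function used in the study: the record fields (`subset`, `time`, `quality`), the subset letters, and the per-stratum rounding rule are assumptions made for illustration only.

```python
import random
from collections import defaultdict

def stratified_split(records, key, fractions=(0.70, 0.15, 0.15), seed=0):
    """Split records into train/validation/test, stratum by stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for rec in records:
        strata[key(rec)].append(rec)

    train, val, test = [], [], []
    for group in strata.values():
        rng.shuffle(group)  # random assignment within each stratum
        n_train = round(fractions[0] * len(group))
        n_val = round(fractions[1] * len(group))
        train.extend(group[:n_train])
        val.extend(group[n_train:n_train + n_val])
        test.extend(group[n_train + n_val:])  # remainder goes to the test set
    return train, val, test

# Hypothetical records: 7 subsets x 2 MRI times x 3 quality grades = 42 strata,
# each filled with 20 placeholder MRI sets.
records = [
    {"id": f"{s}-{t}-{q}-{i}", "subset": s, "time": t, "quality": q}
    for s in "ABCDEFG"
    for t in ("baseline", "follow-up")
    for q in (1, 2, 3)
    for i in range(20)
]
train, val, test = stratified_split(
    records, key=lambda r: (r["subset"], r["time"], r["quality"])
)
print(len(train), len(val), len(test))
```

With real data, strata are unequal in size and some may be very small, so the achieved proportions only approximate 70/15/15.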
Data preprocessing
RealFLAIR sequences were co-registered onto the corresponding DWI data with a 6-parameter rigid registration using Advanced Normalization Tools version 2.3.5 (https://stnava.github.io/ANTs). All MRI sets were either up-scaled or down-scaled to a standard 256 × 256 square matrix size after signal normalization. Additional preprocessing steps, including data augmentation, were performed during training only, according to a previously published pipeline16 (Supplemental Methods 1).
Deep-learning model update and domain adaptation
The original synthFLAIR model 16 (“Vanilla model”) was adapted to require only DWI source images as input data (i.e. without apparent diffusion coefficient (ADC) map; Supplemental Methods 2 and Supplemental Figure 1). In addition, the updated model was fine-tuned for each subset of the database. We favored this supervised domain adaptation strategy after an exploratory analysis using supervised 24 and unsupervised 25 methods evaluated on the validation set (Supplemental Methods 3). Source code and model weights are freely available on http://github.com/NeuroSainteAnne/synthFLAIR. After training, synthFLAIR were generated in the test set by applying each fine-tuned model on DWI source images.
Image analysis
For DWI-FLAIR mismatch assessment, DWI images and either realFLAIR or synthFLAIR were presented in a random order to four neuroradiologists (G.Hm., J.B., L.L., and C.O.), with respectively, 6, 4, 11, and 21 years of experience in stroke imaging, and one resident (G.Ha.). Readers were blinded to data subset, FLAIR sequence type, and onset-to-MRI delay. FLAIR lesion was categorized as not visible (i.e. presence of DWI-FLAIR mismatch), visible (i.e. absence of DWI-FLAIR mismatch), or not assessable (because of extensive white matter disease or artifacts), following the Wake-Up Stroke trial specifications. 26 One reader (G.Ha.) repeated the procedure for intraobserver reproducibility assessment after a 2-month washout period. Discrepancies between readers were resolved by consensus, either automatically when a majority agreement was reached (i.e. >3 readers among five assigned the same rating), or after agreement of two senior readers for dubious cases.
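The automatic part of the consensus procedure can be sketched as a vote count over the five ratings. The function name, the case labels, and the example ratings are hypothetical. The vote threshold is passed explicitly rather than hard-coded, since it determines which cases are referred to the senior readers.

```python
from collections import Counter

def consensus_rating(ratings, min_votes):
    """Return the rating shared by at least `min_votes` readers, else None."""
    label, votes = Counter(ratings).most_common(1)[0]
    return label if votes >= min_votes else None

# Hypothetical ratings from five readers for three MRI sets.
ratings_by_case = {
    "case-1": ["not visible"] * 5,
    "case-2": ["visible", "visible", "visible", "visible", "not visible"],
    "case-3": ["visible", "visible", "not visible", "not visible", "not assessable"],
}
for case, ratings in ratings_by_case.items():
    result = consensus_rating(ratings, min_votes=4)
    print(case, result if result is not None else "senior review")
```

A `None` result marks a dubious case that would go to the two senior readers.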
Besides the visual analysis, the ratio of signal intensity (rSI) corresponding to the relative signal intensity of the ischemic lesion to the contralateral signal intensity 16 computed on both realFLAIR and synthFLAIR was used to assess FLAIR status and detect ⩽4.5 h AIS.
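As a rough illustration of the rSI, the sketch below divides the mean signal in a lesion region by the mean signal in the mirrored contralateral region. Using the mean as the summary statistic, and the way the regions are drawn, are assumptions here; the reference pipeline16 defines the actual procedure. The intensity values are invented.

```python
from statistics import mean

def ratio_of_signal_intensity(lesion_values, contralateral_values):
    """Relative FLAIR signal of the lesion versus the contralateral region."""
    contra_mean = mean(contralateral_values)
    if contra_mean == 0:
        raise ValueError("contralateral signal is zero; rSI is undefined")
    return mean(lesion_values) / contra_mean

# Hypothetical voxel intensities (arbitrary units).
lesion = [110, 120, 130]
contralateral = [95, 100, 105]
rsi = ratio_of_signal_intensity(lesion, contralateral)
print(round(rsi, 2))
```

An rSI close to 1 corresponds to a lesion that is not brighter than normal tissue on FLAIR, which is the situation expected in DWI-FLAIR mismatch.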
Statistical analysis
Statistical analyses were performed with open-source software (R, version 4.0.1; R Foundation). Inter-observer agreement for DWI-FLAIR mismatch rating between realFLAIR and synthFLAIR of the five readers was assessed using the Fleiss Kappa (κ) coefficient. Intra-observer reproducibility for DWI-FLAIR mismatch assessment and concordance between realFLAIR and synthFLAIR were evaluated with the Cohen Kappa coefficient. Sensitivity, specificity, positive and negative predictive values of DWI-FLAIR mismatch for the identification of ⩽4.5 h AIS were compared between realFLAIR and synthFLAIR using McNemar’s test and the relative predictive value method. 27 The rSIs were compared between realFLAIR and synthFLAIR using Pearson correlation coefficients. Areas under the receiver operating characteristic curve (AUCs) for identifying ⩽4.5 h AIS were computed using rSI in stroke patients with known onset-to-MRI delay, and AUC comparison was performed using DeLong’s method. Subgroup analysis was additionally performed in the 2–9 h target window, and subgroup analyses were also performed at the subset level. Given that DWI obtained with blow = 50 s/mm2 might not contain complete T2-weighted information, we performed a subgroup analysis in all subsets with blow = 0 s/mm2. Additional post-hoc analysis for interobserver DWI-FLAIR mismatch assessment was performed in restless patients excluded from initial data partition for inadequate FLAIR quality due to major artifacts. Values are expressed with interquartile range (IQR) and/or 95%CIs. The statistical significance threshold was p < 0.05.
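For readers unfamiliar with the Fleiss κ used for the five-reader agreement, a minimal standard-library sketch follows. The study used R; this Python version only illustrates the formula, and the rating labels and toy data are hypothetical. It assumes every subject is rated by the same number of readers.

```python
from collections import Counter

def fleiss_kappa(ratings_per_subject, categories):
    """Fleiss kappa for a fixed number of raters per subject."""
    n_subjects = len(ratings_per_subject)
    n_raters = len(ratings_per_subject[0])
    counts = [Counter(r) for r in ratings_per_subject]

    # Observed agreement: proportion of agreeing rater pairs, per subject.
    p_i = [
        (sum(c[k] ** 2 for k in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_bar = sum(p_i) / n_subjects

    # Chance agreement from the overall category proportions.
    p_j = [sum(c[k] for c in counts) / (n_subjects * n_raters) for k in categories]
    p_e = sum(p * p for p in p_j)
    return (p_bar - p_e) / (1 - p_e)

# Hypothetical ratings: 4 MRI sets, 5 readers each.
toy = [
    ["mismatch"] * 5,
    ["no mismatch"] * 5,
    ["mismatch", "mismatch", "mismatch", "no mismatch", "no mismatch"],
    ["no mismatch", "no mismatch", "no mismatch", "no mismatch", "mismatch"],
]
kappa = fleiss_kappa(toy, categories=["mismatch", "no mismatch"])
print(round(kappa, 3))
```

The function is undefined when all ratings fall in a single category (chance agreement equals 1), which a production implementation would need to handle.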
Results
Patients and MRI set characteristics
In total, 1490 complete MRI sets were screened. After exclusion of 27 low-quality datasets (including 16 with inadequate FLAIR quality; Figure 1), 1463 MRI sets from 1172 patients (762 women; median age: 73 years (IQR, 62–82)) acquired from 125 different MRI units were included. After data splitting, nine follow-up MRIs issued from the same subject included twice in the test set were excluded, leading to a final number of 1454 analyzed MRIs, of which 1023 (70%) were used for training, 224 (15%) for validation and 207 (15%) for testing. Among these MRI sets, 1013 (70%) were acquired before treatment and 441 (30%) at early follow-up. Clinical data are summarized in Table 1. MRI units and DWI and FLAIR sequence parameters in the seven subsets are reported in Supplemental Table 1.

Flow chart for MRI set and patient inclusion.
Data sets and patient characteristics.
NIHSS: National Institutes of Health Stroke Scale.
Values are expressed as numbers of patients with percentages in parentheses, unless otherwise specified.
Data are expressed as median with interquartile range in parentheses. NIHSS data were missing in respectively 46 (5%), 16 (7%), and 9 (4%) patients in training, validation, and test sets.
Missing data in respectively 114, 43, and 34 patients.
Missing data in respectively 135, 47, and 37 patients.
Reproducibility of DWI-FLAIR mismatch assessment
Intraobserver reproducibility was not statistically different between realFLAIR and synthFLAIR (κ = 0.82 (95%CI: 0.74–0.90) and 0.75 (0.66–0.84), respectively, p = 0.27). Interobserver reproducibility was substantial for realFLAIR and synthFLAIR sequence and not significantly different (κ = 0.79 (95%CI: 0.73–0.84) and 0.77 (0.71–0.82), respectively, p = 0.58; Table 2).
Interobserver reproducibility for DWI-FLAIR mismatch assessment between the five readers.
FLAIR: fluid-attenuated inversion recovery.
Values are expressed as κ Fleiss coefficient with 95%CI in parentheses.
The lowest interobserver reproducibility for synthFLAIR was obtained in Subset C (MRI sets with blow = 50 s/mm2; κ = 0.80 (95%CI: 0.62–0.99) and 0.58 (95%CI: 0.31–0.85) respectively for realFLAIR and synthFLAIR, p = 0.22). In MRI sets with blow = 0 s/mm2 (n = 186), interobserver reproducibility was substantial for both realFLAIR and synthFLAIR sequence (κ = 0.78 (95%CI: 0.72–0.84) and κ = 0.77 (95%CI: 0.71–0.83), respectively, p = 0.89).
Concordance between realFLAIR and synthFLAIR for mismatch assessment
Depending on the reader, concordance between realFLAIR and synthFLAIR ranged from substantial to excellent (κ = 0.70–0.83; Table 3). After consensus, four MRI sets were considered non-assessable and were thus excluded from analysis. Dubious cases were resolved by consensus review in 14/203 (7%) realFLAIR and 19/203 (9%) synthFLAIR (p = 0.38).
Concordance of DWI-FLAIR mismatch assessment between realFLAIR and synthFLAIR.
Values are expressed as κ values with 95%CI in parentheses when applicable.
In subsets A, B, and D, respectively 2, 1, and 1 MRI sets were considered non-assessable and were thus excluded from analysis after consensus.
After consensus, concordance between realFLAIR and synthFLAIR on the 203 assessable MRI sets was excellent (κ = 0.85 (0.78–0.92)). Concordance was also excellent in the subgroup of 182 assessable MRI sets with blow = 0 s/mm2 (κ = 0.87 (95%CI: 0.80–0.94)). Illustrative examples are presented in Figure 2. Consensus assessment will be used in what follows.
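The Cohen κ behind this paired concordance can be sketched as follows. The labels and the ten paired ratings are hypothetical and serve only to show how observed and chance agreement combine.

```python
def cohen_kappa(ratings_a, ratings_b):
    """Cohen kappa for two paired lists of categorical ratings."""
    n = len(ratings_a)
    categories = set(ratings_a) | set(ratings_b)
    # Observed agreement: fraction of cases with identical labels.
    p_o = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement from each sequence's marginal label frequencies.
    p_e = sum(
        (ratings_a.count(k) / n) * (ratings_b.count(k) / n) for k in categories
    )
    return (p_o - p_e) / (1 - p_e)

# Hypothetical consensus labels for 10 MRI sets on each FLAIR type.
real = ["mismatch"] * 6 + ["no mismatch"] * 4
synth = ["mismatch"] * 5 + ["no mismatch"] * 5
kappa = cohen_kappa(real, synth)
print(round(kappa, 2))
```

In this toy case, 9 of 10 labels agree (observed agreement 0.9) against a chance agreement of 0.5.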

Diffusion-weighted imaging (DWI)–fluid-attenuated inversion recovery (FLAIR) mismatch assessment using acquired FLAIR sequence (realFLAIR) and synthetic FLAIR (synthFLAIR) in AIS. (a) DWI-FLAIR mismatch in a 69-year-old man (subset B). On 1.5 T DWI (bhigh = 1000 s/mm2) obtained 2 h and 15 min from symptom onset, diffusion restriction is seen in the left middle cerebral artery territory without signal change on the realFLAIR and synthFLAIR. (b) DWI-FLAIR mismatch in a 59-year-old man (subset G). On a 3 T DWI (bhigh = 1000 s/mm2) obtained 4 h from symptom onset, large diffusion restriction is seen in the right middle cerebral artery territory without significant signal change on the 3D realFLAIR and synthFLAIR. Note that the DWI based on EPI technique is prone to artifacts on the periphery, which results in less accurate frontal cortex delineation on synthFLAIR compared to realFLAIR. (c) Absence of DWI-FLAIR mismatch in a 54-year-old man (subset A). On 1.5 T DWI (bhigh = 1000 s/mm2) obtained 6 h and 10 min from symptom onset, diffusion restriction is seen in the left middle cerebral artery territory, also visible on realFLAIR and synthFLAIR.
Identification of ⩽4.5 h AIS with qualitative and quantitative analysis
Stroke onset-to-MRI delay was known in 141 of 203 assessable MRI sets from the test set. Early stroke (⩽4.5 h from stroke onset) was classified accurately by DWI-FLAIR mismatch with 125/141 (89%) realFLAIR and 124/141 (88%) synthFLAIR sequences (p > 0.99). The sensitivity and specificity of the DWI-FLAIR mismatch for the identification of ⩽4.5 h AIS were not significantly different between realFLAIR and synthFLAIR (sensitivity: 60/71 (85%) vs 59/71 (83%), p = 0.56; specificity: 65/70 (93%) vs 65/70 (93%), p > 0.99; Table 4).
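The percentages above follow directly from the reported fractions. The short sketch below recomputes them, taking DWI-FLAIR mismatch as the positive test for ⩽4.5 h AIS; the false-negative and false-positive counts are derived from the denominators (71 early and 70 late strokes), and the function name is illustrative.

```python
def diagnostic_summary(tp, fn, tn, fp):
    """Sensitivity, specificity, and accuracy from a 2 x 2 table."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }

# Counts derived from the reported fractions (60/71 and 65/70; 59/71 and 65/70).
real = diagnostic_summary(tp=60, fn=11, tn=65, fp=5)
synth = diagnostic_summary(tp=59, fn=12, tn=65, fp=5)

for name, summary in (("realFLAIR", real), ("synthFLAIR", synth)):
    print(name, {k: f"{100 * v:.0f}%" for k, v in summary.items()})
```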
Comparison of the diagnostic value of DWI-FLAIR mismatch after consensus review between realFLAIR and synthFLAIR to estimate stroke onset time within 4.5 h.
Diagnostic value was computed in the 141 MRI datasets when stroke onset-to-MRI delay was available. Data are expressed as number of MRI sets, with corresponding percentages in parentheses. Sensitivity and specificity were compared using the McNemar test. Predictive values were compared using the relative predictive value method.
rSIs measured on realFLAIR and synthFLAIR were highly correlated (Pearson r = 0.83 (95%CI: 0.78–0.87)). Pearson coefficient ranged from 0.77 (subset C, MRI sets with blow = 50 s/mm2) to 0.92 (subset D) and was equal to 0.83 (95%CI: 0.78–0.87) in the subgroup of all MRI sets with blow = 0 s/mm2.
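A confidence interval for a Pearson coefficient is commonly obtained with the Fisher z-transformation, sketched below. Whether the authors used this method is not stated, and the sample size n = 203 (the number of assessable MRI sets) is an assumption, since the text does not give the number of sets entering the rSI analysis.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def fisher_ci(r, n, z_crit=1.96):
    """Approximate 95% CI for r via the Fisher z-transformation."""
    z = math.atanh(r)
    se = 1 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)

# r from the text; n = 203 is an assumed sample size (see lead-in).
low, high = fisher_ci(0.83, n=203)
print(round(low, 2), round(high, 2))
```

Under these assumptions the interval rounds to 0.78–0.87, in line with the reported one.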
AUCs using rSI for identification of ⩽4.5 h AIS were not significantly different between realFLAIR and synthFLAIR (respectively 0.90 (95%CI: 0.85–0.96) and 0.86 (95%CI: 0.84–0.95), p = 0.85), nor were they significantly different in the subgroup of MRI sets with blow = 0 s/mm2 (respectively 0.89 (95%CI: 0.83–0.95) and 0.91 (95%CI: 0.86–0.97), p = 0.60).
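An AUC of this kind can be computed without building the ROC curve, as the probability that a randomly chosen early stroke has a lower rSI than a randomly chosen late stroke. The sketch below assumes that lower rSI indicates earlier stroke, and the rSI values are invented. It does not implement DeLong's comparison of two AUCs.

```python
def auc_lower_is_positive(positive_scores, negative_scores):
    """AUC as P(positive score < negative score), counting ties as 0.5."""
    wins = 0.0
    for p in positive_scores:
        for q in negative_scores:
            if p < q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))

# Hypothetical rSI values: early (<= 4.5 h) versus late (> 4.5 h) strokes.
early_rsi = [1.00, 1.05, 1.10, 1.20]
late_rsi = [1.15, 1.25, 1.30]
auc = auc_lower_is_positive(early_rsi, late_rsi)
print(round(auc, 3))
```

This pairwise form is equivalent to the Mann-Whitney U statistic divided by the number of early-late pairs.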
Subgroups of onset-to-MRI delays in the 2–9 h target window
Among 141 MRI sets where onset-to-MRI delay was known, 40 (28%) were performed in the 2–9 h target window. Interobserver reproducibility was moderate for both realFLAIR and synthFLAIR sequences (Fleiss κ = 0.69 (95%CI: 0.53–0.84) and 0.65 (0.49–0.81), respectively, p = 0.73). After consensus, concordance between realFLAIR and synthFLAIR was substantial (κ = 0.75 (0.53–0.98)). DWI-FLAIR mismatch was present in 31/40 (78%) realFLAIR and 27/40 (68%) synthFLAIR sequences (p = 0.13). Both sequences had identical accuracy for classifying stroke delay (⩽4.5 or >4.5 h from stroke onset; 31/40, 78%). The sensitivity and specificity of the DWI-FLAIR mismatch for the identification of ⩽4.5 h AIS were not significantly different between realFLAIR and synthFLAIR (sensitivity: 27/32 (84%) vs 25/32 (78%), p = 0.16; specificity: 4/8 (50%) vs 6/8 (75%), p = 0.16). The 4/40 (10%) discordant labelings corresponded to subjects labeled as DWI-FLAIR mismatch using realFLAIR and no mismatch using synthFLAIR; two subjects had early stroke (2 and 3.25 h) and two subjects had late stroke (4.7 and 6.5 h).
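The discordant pairs in this subgroup can be read from the text, which allows a small worked example of McNemar's test. Among the 32 early strokes, two were classified correctly with realFLAIR only and none with synthFLAIR only; among the 8 late strokes, the reverse holds. The variant of the test used by the authors is not stated. The sketch assumes the chi-square form with 1 degree of freedom and no continuity correction.

```python
import math

def mcnemar_uncorrected(b, c):
    """Two-sided McNemar p-value (chi-square, 1 df, no continuity correction).

    b: pairs classified correctly by the first method only.
    c: pairs classified correctly by the second method only.
    """
    if b + c == 0:
        return 1.0  # no discordant pairs, hence no evidence of a difference
    chi2 = (b - c) ** 2 / (b + c)
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2))

# Early strokes: 2 correct with realFLAIR only, 0 with synthFLAIR only.
p_sensitivity = mcnemar_uncorrected(b=2, c=0)
# Late strokes: 0 correct with realFLAIR only, 2 with synthFLAIR only.
p_specificity = mcnemar_uncorrected(b=0, c=2)
print(round(p_sensitivity, 2), round(p_specificity, 2))
```

Under this assumption both p-values round to 0.16, matching the reported values. An exact binomial version of the test would give a larger p-value with so few discordant pairs.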
Post-hoc analysis in restless patients
In 16 MRI sets excluded from data partition and main analysis (inadequate FLAIR quality because of major artifacts), interobserver reproducibility for mismatch assessment on synthFLAIR sequence between the five readers was substantial (κ = 0.79 (95%CI: 0.59–0.98)). Illustrative cases are shown in Figure 3.

Diffusion-weighted imaging (DWI)–fluid-attenuated inversion recovery (FLAIR) mismatch assessment in restless patients. (a) DWI-FLAIR mismatch assessable on the synthFLAIR generated from 1.5 T DWI data in a 73-year-old woman 2 h after symptom onset (subset D). On DWI, a slight diffusion restriction is seen in the right middle cerebral artery territory without signal change on the synthFLAIR. The realFLAIR sequence was excluded from main analysis due to artifacts whereas the synthFLAIR was of diagnostic value. (b) AIS with hemorrhagic transformation in a 59-year-old woman (subset A). The realFLAIR sequence acquired with a 1.5 T MRI presented with severe motion artifacts and was excluded from main analysis. Note the absence of these artifacts on the synthFLAIR sequence.
Discussion
We have demonstrated that the original synthFLAIR model trained on homogeneous single-center data could be adapted to compute clinically relevant synthFLAIR sequences on a multicenter cohort using a fine-tuning procedure.
To our knowledge, our study is the first to evaluate the adaptation of synthFLAIR in a large multicentric cohort with various manufacturers, and to propose a technical approach for this adaptation. Supervised deep-learning models’ generalizability is indeed a controversial topic, as evidenced by the recent literature, which has raised concerns about the reliability of models when faced with new heterogeneous target domains in medical imaging. 28 Our work suggests the feasibility of adapting a pre-trained model using a specific supervised domain adaptation method to overcome field strength and manufacturer shift from multicenter MRI data. To achieve such a change of scale, we first adjusted its architecture and discarded the ADC map as input. Indeed, the ADC map computed from the native DWI source image introduced signal variability by adding noise that likely affected multi-site translation without providing information relevant to the model’s purpose (see Supplemental Methods 2).
More than 100 comprehensive and primary stroke centers, using both 1.5 and 3 T MRI units from three different manufacturers participated in recruiting stroke patients and acquiring MRI data in this study. Such a broad aggregate leads to data heterogeneity beyond manufacturer and field strength, including MRI model subtypes and variability in imaging protocol and acquisition parameters either on DWI or FLAIR sequences. These variations faithfully reflect the daily clinical workflow and ensure real-world training conditions, thus allowing a widespread applicability without the need to re-train models against other MRI units at a later stage, thanks to the variety of MRI units initially included within each subset. Moreover, we purposely kept all diffusion data, including DWI with ⩾3 gradient-encoding directions and b-value variations, without considering those variations as specific domains (except for the blow variations). This data heterogeneity and these model development strategies facilitate clinical portability across any 1.5 or 3 T MRI units from three main manufacturers. Each fine-tuned model will be made available as open-source software on http://github.com/NeuroSainteAnne/synthFLAIR, in order to facilitate external validation of our technique by individual teams with different MRI units.
Within the subgroup of DWI data with blow = 50 s/mm2, the fine-tuned model presented the lowest interobserver reproducibility for DWI-FLAIR mismatch assessment as well as the lowest Pearson coefficient of the rSI compared to other subsets. This finding reinforces the underlying hypothesis that the synthFLAIR is mainly driven by the T2 contrast yielded by blow = 0 s/mm2 images 15 and explains lower performances with increasing blow values.
Subgroup analysis in the 2–9 h target window did not show any differences in interobserver reproducibility and diagnostic accuracy between realFLAIR and synthFLAIR. Despite the lack of statistical significance, synthFLAIR tended to be more “conservative” than realFLAIR in this subgroup analysis, since the four labeling discrepancies in this subgroup could have led to withholding thrombolysis using the synthFLAIR mismatch definition. It should however be noted that two of these four discrepancies were justified (no DWI-FLAIR mismatch using synthFLAIR for subjects with >4.5-h onset-to-MRI delay). As a consequence, if this “conservative” feature is confirmed, synthFLAIR may be “safe” to use (reducing the risk of performing thrombolysis in late stroke and thus reducing iatrogenic hemorrhagic risk) at the expense of the number of treated patients. Until further research confirms or refutes this result, it thus seems acceptable to use synthFLAIR only in situations where realFLAIR is deemed uninformative (restless patients).
Our initial domain definitions may be questionable as the subsets we selected, based on the MR field strengths and manufacturers, gathered very heterogeneous data, which may have led to a greater variety in the distribution of data than one would expect from a single domain as defined by the framework of the computer vision model. Data partition based on sequence parameters would have been useful to increase data homogeneity in each subset but using smaller groups would have increased the risk of overfitting, 29 especially using the fine-tuning strategy.
One of the potential strengths of synthFLAIR is its ability to overcome motion artifacts in restless patients, minimizing artifacts given the short acquisition time of DWI. This is supported by the substantial interobserver reproducibility for DWI-synthFLAIR mismatch evaluation in the post-hoc analysis of restless patients and could have a major impact on clinical practice. However, DWI, and by extension synthFLAIR, can be prone to other artifacts, including geometric distortions and susceptibility artifacts associated with EPI techniques particularly near skull base and temporal lobes.
Our study has several limitations. First, the number of MRI sets for each manufacturer and field strength was unbalanced across the seven subsets. Its impact on each model’s performance after application of the domain adaptation technique may be difficult to assess. However, the model for the smallest group (subset E), trained on only 62 subjects, reached relatively good performance as compared to other groups (realFLAIR–synthFLAIR concordance κ = 0.83). The number of MRI sets required for fine-tuning the synthFLAIR model may thus be limited. A preliminary ablation study suggests that at least 50 MRIs may be required for this fine-tuning (see Supplemental Methods 3d).
Second, we chose to include both early and follow-up imaging in our study, even if clinical challenges and time constraints are very different in these two situations. This was done in line with the reference study,16 in order to increase data diversity for model training and hence improve model generalizability.30 Moreover, although reducing acquisition time is not crucial for follow-up imaging, synthFLAIR could still be a supplementary tool in restless patients presenting with motion artifacts on realFLAIR sequences. This approach however raises the question of patients with two MRIs in the dataset. From a learning standpoint, considering early and late imaging as independent seems acceptable. Indeed, given the important differences in MR signal, acquisition plane, and image orientation between early and late acquisitions, it seems unlikely that the model could learn patient-specific brain morphology to generate the synthFLAIR signal, particularly since the model is trained on a slice-wise basis. From a statistical standpoint, we removed follow-up imaging from the test set when the patient also had early imaging, in order to satisfy statistical independence assumptions. In this study, the analysis based on the rSI showed that the AUC for the identification of ⩽4.5 h AIS on synthFLAIR tended to be lower than with the realFLAIR, without reaching statistical significance. The AUC difference between realFLAIR and synthFLAIR was however smaller in the validation dataset (with AUCs respectively equal to 0.85 and 0.84, see Supplemental Table 2, as compared to 0.90 and 0.86 in the test set), suggesting either some overfitting on the validation dataset or a variation due to data sampling. Moreover, quantitative evaluation of the rSI on the FLAIR sequence may represent an additional tool for treatment decisions, but cutoff values vary among studies14,31–34 and this parameter may not yet replace visual rating for DWI-FLAIR mismatch status in clinical practice.35,36
As the duration of realFLAIR acquisition was not available for all exams, the impact of accelerating the diagnostic process with synthFLAIR cannot be assessed as clearly as in a single-center study.16 Due to the retrospective design, the impact of synthFLAIR on patient outcomes and management strategies could not be fully assessed beyond the expected benefit of reduced acquisition time. Our results cannot be extrapolated to stroke mimics,37 as we only included AIS patients. Further studies are needed to extend this synthFLAIR sequence to other pathologies in the setting of suspected AIS, although time management may be less decisive in those clinical situations.
Research perspectives could also include the development of a multi-task model that, beyond generating a synthFLAIR sequence from the DWI, would also predict DWI-FLAIR mismatch status 38 to enhance decision-making.
In conclusion, a single-center generative pre-trained model, fine-tuned on DWI data from different MRI manufacturers and field strengths, can generate clinically relevant synthFLAIR that can compete with realFLAIR to assess DWI-FLAIR mismatch and identify early AIS at a multicenter scale. Beyond reducing the duration of the stroke MR protocol by removing the need for a real FLAIR acquisition, synthFLAIR may be a promising alternative to overcome motion artifacts in restless patients at the acute phase of stroke.
Supplemental Material
Supplemental material, sj-docx-1-eso-10.1177_23969873241263418 for Multicenter validation of synthetic FLAIR as a substitute for FLAIR sequence in acute ischemic stroke by Guillaume Hamon, Laurence Legrand, Ghazi Hmeydia, Guillaume Turc, Wagih Ben Hassen, Sylvain Charron, Clement Debacker, Olivier Naggara, Bertrand Thirion, Bailiang Chen, Bertrand Lapergue, Catherine Oppenheim and Joseph Benzakoun in European Stroke Journal
Footnotes
Appendix
Abbreviations
AIS = acute ischemic stroke
AUC = Area under the receiver operating characteristic curve
DWI = Diffusion-weighted imaging
FLAIR = Fluid-attenuated inversion recovery
realFLAIR = real FLAIR
synthFLAIR = synthetic FLAIR
rSI = ratio of signal intensity
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by a government grant managed by the French National Research Agency (ANR) as part of the future investment program integrated into France 2030, under grant agreement No. ANR-18-RHUS-0001.
Ethical approval
This study was performed in line with the principles of the Declaration of Helsinki. The study was conducted under the Reference Methodology MR-004 for data protection relating to the processing of retrospective and prospective personal data implemented in the framework of research not involving the human person and approved by our clinical research committee.
Informed consent
Written informed consent was obtained, and data collection and analysis were approved by ETIS review board.
Guarantor
CO and JB.
Contributorship
All authors reviewed and edited the manuscript and approved the final version of the manuscript.
Data sharing statement
Data analyzed during the study were provided by a third party. Requests for data should be directed to the provider indicated in the Acknowledgments.
Supplemental material
Supplemental material for this article is available online.
