Abstract
Objective
This study aims to develop and validate OneGout, a federated learning (FL)-based framework for early and accurate gout diagnosis to address the limitations of current diagnostic methods, specifically the invasiveness of joint aspiration and the accessibility, cost, and radiation exposure associated with advanced imaging techniques like dual-energy computed tomography (DECT).
Methods
We introduce OneGout, which pioneers a deep learning-based method for generating virtual DECT images. This approach offers a low-cost and low-radiation alternative for gout diagnosis. Furthermore, OneGout integrates federated learning (OneGout-FL) to enable collaborative model training across multiple medical institutions while ensuring patient data privacy is preserved.
Results
Experiments demonstrate that our method successfully generates high-quality virtual DECT images. The framework based on U-Net achieves a PSNR of 22.44 dB and an SSIM of 0.92 for the generation of 140 kV from 80 kV images. It also shows strong diagnostic performance, with an IoU of 46.66 and a Dice score of 63.20, indicating promising accuracy comparable to diagnoses made with real DECT scans.
Conclusion
OneGout presents an efficient, scalable, and privacy-preserving diagnostic solution for gout, particularly beneficial for resource-limited medical institutions. This framework has the potential to significantly enhance global gout management by providing a more accessible and safer diagnostic alternative.
Introduction
As of 2020, the global prevalence of gout had reached 55.8 million, representing a 150.6% increase compared to 1990, with an age-standardized prevalence rate of 659.3 per 100,000 people. 1 With the intensification of population aging, rising obesity rates, and dietary changes, the prevalence of gout is expected to continue increasing. 2
Gout is a common inflammatory arthritis caused by purine metabolism disorders and/or impaired uric acid excretion (see Figure 1), leading to elevated blood uric acid levels. 3 This results in the deposition of monosodium urate (MSU) crystals in joints and surrounding tissues, triggering acute or chronic inflammation. 3 Prolonged MSU crystal deposition can ultimately cause joint damage and deformities, significantly impacting patients’ quality of life. Combined with the rising prevalence, this disease burden highlights the urgent need for improved gout diagnosis and treatment, and makes early and accurate diagnosis crucial for the management of gout.

Figure 1. Pathogenesis of gout. Disrupted purine metabolism and/or impaired uric acid excretion lead to elevated blood uric acid levels, resulting in MSU crystal deposition in joints and surrounding tissues.
Currently, the gold standard for gout diagnosis is the identification of MSU crystals in joint aspiration samples using polarized light microscopy. 4 However, joint aspiration is an invasive procedure and may not be feasible for patients with a low synovial fluid volume. Additionally, the procedure’s success depends on the clinician’s experience, posing a risk of false-negative results. 5 In recent years, imaging examinations have demonstrated significant advantages as non-invasive diagnostic tools for gout. Among them, advanced imaging modalities such as dual-energy computed tomography (DECT) and ultrasound (US) have been widely used for gout diagnosis.6,7 US is a cost-effective, radiation-free imaging technique that detects MSU crystal deposition through characteristic features, such as the “double contour sign.” 8 However, its diagnostic accuracy is highly dependent on the operator’s experience and is limited in evaluating deep-seated joints or obese patients. 9
DECT acquires X-ray attenuation data at two different energy levels (e.g. 80 kV and 140 kV) almost simultaneously. 10 Compared with conventional CT, DECT utilizes the attenuation differences of X-ray photons at varying energy levels to distinguish MSU crystals from other types of crystal deposits. Yan et al. 11 compared the efficacy of DECT and ultrasound in detecting MSU crystals, highlighting the significant advantage of DECT in assessing intra-articular MSU deposits. Li et al. 12 investigated the performance of different DECT techniques in detecting MSU crystals, with particular focus on the novel second-generation dual-layer spectral detector CT (dlDECT) for gouty arthritis. Their findings demonstrated that dlDECT offers higher spatial resolution and improved diagnostic accuracy in detecting MSU crystals. While DECT achieves good diagnostic accuracy, its real-world application is held back by two fundamental challenges that form the motivation for our work.
First, there is a critical clinical access gap. Although DECT has demonstrated high sensitivity and specificity in gout diagnosis, 13 its widespread adoption is hindered by high equipment and technical costs, as well as a strong reliance on specialized expertise, limiting its accessibility in resource-constrained healthcare settings. 14 Currently, the low availability of DECT in primary healthcare institutions prevents many patients from receiving timely, high-precision diagnostic services. Additionally, DECT’s radiation dose may be higher than that of single-energy CT (SECT).15,16 These limitations underscore the need for an alternative approach that reduces equipment dependence while maintaining diagnostic accuracy. With the rapid advancement of deep learning, medical imaging has experienced significant improvements in diagnostic accuracy and efficiency, 17 paving the way for deep learning-based gout diagnosis.
Second, the integration of deep learning into medical applications also presents critical challenges related to data security and sharing. Deep learning models require large, diverse datasets from multiple institutions to achieve the robustness and generalizability needed for clinical deployment. However, strict patient privacy regulations (e.g. GDPR) prohibit the direct sharing of sensitive medical data. Therefore, the deep learning model must be built on a framework that enables collaborative model training without compromising patient confidentiality.
This leads us to the central research question: how can we develop a deep learning-based framework that accurately simulates DECT imaging from SECT data to improve gout diagnosis accessibility, while enabling collaborative model training across institutions without compromising patient privacy?
To overcome these dual challenges, we introduce a comprehensive framework that addresses both hardware accessibility and data privacy. The core of our contribution is OneGout, a deep learning model that virtualizes DECT by generating 140 kV CT images from a single, low-cost, low-radiation 80 kV SECT scan. This directly tackles the clinical access gap by simulating DECT’s diagnostic capabilities on standard equipment. To solve the data privacy barrier, we present OneGout-FL, an implementation of OneGout based on federated learning (FL). 18 This privacy-preserving paradigm allows for the collaborative training of a OneGout model across multiple institutions without exchanging any raw patient data.
The contributions of this study are mainly reflected in the following aspects:
Related work
Pathogenesis and detection of gout
Gout is the most common cause of inflammatory arthritis in adults.19-22 Its formation mechanism is primarily related to purine metabolism disorders and/or reduced uric acid excretion. Under normal conditions, purine substances in the body are broken down into uric acid. When purine metabolism is disrupted, leading to excessive uric acid production or reduced excretion, blood uric acid levels increase, resulting in hyperuricemia. Hyperuricemia is the most important biochemical basis for gout, though not all individuals with hyperuricemia will develop gout. The most typical form of gout is characterized by recurrent, self-limiting acute inflammatory attacks, known as gout flare-ups. 23 The disease’s complexity extends to systemic complications like renal impairment, for which machine learning has been used to identify key biomarkers. 24
Gouty tophi are formed by the aggregation of MSU crystals around an inflammatory corona structure 25 and are commonly seen in patients with inadequate treatment or severe disease. Tophi most frequently occur in the ear’s helix, the first metatarsophalangeal joint of the toes, fingers, wrists, elbows, and knees. In rare cases, they may also appear in the nasal cartilage, tongue, vocal cords, eyelids, aorta, heart valves, and myocardium. Gouty tophi can exert pressure on surrounding structures, 26 particularly in confined spaces such as the spine 27 or carpal tunnel. In severe cases, tophi may lead to chronic arthritis, often affecting multiple joints.
Traditional gout diagnostic methods include synovial fluid analysis, which is considered reliable for identifying crystals under polarized light microscopy. 22 This method provides an immediate diagnosis, even between acute flare-ups, guiding treatment planning and potentially avoiding unnecessary further testing. Since the discovery of MSU and calcium pyrophosphate (CPP) crystals in the synovial fluid of gout and CPP crystal arthritis patients, their identification through compensated polarized microscopy has become the gold standard for diagnosing crystal-induced arthritis. 28 Despite its diagnostic importance, synovial fluid analysis has several limitations in clinical practice. First, joint aspiration is an invasive procedure that may cause pain and discomfort and carry risks of complications such as infection. 4 Second, the quality and storage conditions of synovial fluid samples significantly impact the accuracy of the test results. 29 Moreover, variations in the experience and expertise of different observers may lead to inconsistencies in diagnosis. 30
To overcome these drawbacks, non-invasive imaging has become essential. 31 DECT has revolutionized gout diagnosis. DECT identifies MSU crystals by leveraging the photon energy-dependent attenuation properties of different materials. It scans the target using two different X-ray energy levels (e.g. 80 kV and 140 kV) to obtain attenuation data across different energy spectra. Based on atomic number and material density characteristics, DECT can differentiate between various tissue components. During post-processing, DECT applies color coding to distinguish urate crystals/tophi from other calcifications.32,33 This material decomposition capability is also effective in analogous applications, such as identifying urinary stones. 34 In contrast, SECT provides imaging results at only one energy level, lacking the ability to differentiate between these materials. Studies consistently confirm that DECT offers superior sensitivity and specificity compared to other methods,35-38 and its performance can be further enhanced with AI-based reconstruction techniques. 39
Privacy challenges in learning-based medical imaging
Advancements in artificial intelligence (AI) have significantly transformed medical imaging, enhancing disease diagnosis, image processing, and clinical decision-making.17,40,41 Recent works have demonstrated the potential of AI-driven models, such as convolutional neural networks, to synthesize high-fidelity medical images, facilitating multimodal diagnosis and improving clinical workflows.42,43
Despite these advancements, the increasing digitization of healthcare data introduces significant privacy and security challenges. Medical institutions generate vast amounts of sensitive patient information, which is subject to strict regulatory protections, such as GDPR. 44 Centralized data storage and processing models face heightened risks of privacy breaches, as cyberattacks on centralized repositories can lead to large-scale patient data leaks. 45 Additionally, data fragmentation across different healthcare institutions exacerbates the issue of data silos, hindering the development of comprehensive diagnostic models. 46
To address these challenges, FL has emerged as a paradigm-shifting approach. 47 FL allows multiple institutions to collaboratively train a shared model without exchanging raw patient data, mitigating privacy risks while enhancing model performance. This decentralized method has been successfully applied in various medical domains, including skin cancer prediction, 48 and its versatile framework can be adapted to different data distribution scenarios.49,50 Recent innovations have further tailored FL for CT imaging, incorporating physics-driven personalization and even leveraging large language models to secure and enhance complex U-shaped networks.51-53 These studies prove that FL can achieve high accuracy while fostering the secure, cross-institutional collaboration needed for modern medical AI.54-56
Motivation
In gout diagnosis, DECT has consistently been recognized as a highly useful non-invasive diagnostic tool. 57 However, the high cost of DECT equipment limits its widespread adoption in hospitals. How to maximize the benefits of this technology while minimizing costs and potential drawbacks has become a critical issue for our research. This study explores an innovative solution by attempting, for the first time, to generate dual-energy CT images solely from SECT images, thereby reducing dependence on DECT equipment.
To achieve this goal, we designed a comprehensive FL model centered around a deep learning network based on U-Net-like architectures. This model fully leverages U-Net’s encoder-decoder structure and skip connections to achieve high-precision mapping from SECT to DECT. By generating high-quality synthetic dual-energy CT images, this study not only achieves the detection capability for early gout lesions but, more importantly, provides an efficient gout diagnostic tool to more medical institutions without increasing equipment costs.
Method
Study design and data acquisition
This was a multicenter, retrospective diagnostic accuracy study conducted between January 2021 and June 2024 using data from the Department of Medical Imaging at Guangzhou First People’s Hospital, Guangzhou, China. The study was approved by the Institutional Review Board (IRB) of Guangzhou First People’s Hospital, which granted a waiver of informed consent due to the retrospective nature of the research and the use of fully anonymized data.
The inclusion criteria encompassed cases of gout diagnosed based on clinical symptoms. All DECT scans were conducted using second-generation dual-source CT (DSCT) equipment from Siemens. To obtain high-quality image data, the scanning parameters were optimized and adjusted for each specific anatomical site, balancing radiation dose, image quality, and detection sensitivity for urate deposition. All scans were performed in dual-energy mode, with the voltage parameters for the low-energy and high-energy channels set to 80 kV and 140 kV (tin filtration), respectively. These settings were combined with the automatic exposure control (AEC) system to further optimize the radiation dose. During data acquisition, a standard reconstruction kernel was utilized for soft tissue analysis, and additional high-resolution reconstruction was performed to enhance the resolution of fine structures.
This study retrospectively collected imaging data from 250 patients across three branches of the hospital who had a history of gout or suspected gout and underwent DECT examinations. During data screening, to ensure the reliability of the data and the rigor of the study, cases with a uric acid deposition volume of less than 0.05 cm³ were excluded to avoid potential misjudgments caused by minimal deposition. Additionally, images exhibiting significant artifacts due to factors such as metal implants were removed to maintain the quality of the input data during model training. Samples that could not undergo complete gout post-processing analysis due to partial image loss were also excluded.
After a rigorous screening process, 139 cases of foot and ankle CT data were ultimately included in the study. Urate deposition was identified in these cases and was utilized for model training and validation. Of these, data from 129 patients were allocated to the training set (comprising 124 males and 5 females, with an average age of 44.2
Table 1. Data selection and grouping summary.
While human tissue composition is inherently consistent, the Hounsfield Unit (HU) distribution probability density (excluding air regions) and lesion volumes in CT images can vary significantly across individual patients due to anatomical differences, disease severity, and scanning conditions. Figure 2 illustrates this inherent data richness within our cohort. This natural inter-patient heterogeneity serves as a robust foundation for evaluating our model, as our FL framework is specifically designed to leverage such diverse data.

Figure 2. HU distribution across four different patients. Lesion volumes are denoted at the upper right.
Overview of the OneGout framework
This study proposes a new deep learning framework named OneGout. Its aim is to use a deep learning model to generate 140 kV monoenergetic CT images from 80 kV monoenergetic CT images. It further predicts gout lesions, simulating the effects of DECT while reducing dependence on dual-energy CT equipment. Figure 3 compares our approach with traditional gout diagnosis methods. The conventional methods include arthrocentesis, which is invasive, lacks universality, and has a high false-negative rate, and DECT, which relies on expensive equipment, involves high radiation exposure, and presents procedural difficulties.

Figure 3. Comparison of our proposed OneGout framework with traditional gout diagnosis methods. Conventional approaches such as arthrocentesis are invasive and have a high false-negative rate, while DECT requires expensive equipment and involves high radiation exposure. OneGout leverages deep learning to generate 140 kV monoenergetic CT images from 80 kV images, enabling gout lesion prediction with reduced reliance on DECT. Crystal images under microscopy are adapted from “A glance into the future of gout” by Sivera F, Andres M, Dalbeth N.
In contrast, OneGout utilizes cost-effective equipment with a single radiation exposure to achieve the functionality of DECT and predict gout lesions. This approach enhances accessibility while minimizing both risks and costs. The following sections will detail its network architecture and FL algorithms.
U-Shaped networks for CT image generation (OneGout)
The OneGout framework addresses the challenges in gout diagnosis by facilitating image conversion from SECT to DECT and predicting gout lesions. This provides a cost-effective and efficient solution for medical institutions lacking DECT equipment. As illustrated in Figure 4, OneGout employs a flexible deep learning architecture, where the backbone can be a U-shaped neural network.

Figure 4. OneGout employs a flexible deep learning architecture (Unet, R2Unet, AttUnet, TransUnet, and SwinUnet are demonstrated in this figure), allowing any neural network as the backbone.
In this study, U-Net is adopted as one of the candidate backbones for generating 140 kV monoenergetic CT images from 80 kV monoenergetic CT images. Additionally, several U-Net variants are incorporated as the alternative backbones, including R2U-Net, 58 which introduces recurrent residual blocks to enhance feature refinement; AttU-Net, 59 which integrates attention mechanisms to selectively emphasize critical regions; and TransUNet, which embeds vision transformer (ViT) modules into the encoder to model long-range dependencies through self-attention mechanisms. Furthermore, SwinUNet leverages the Swin Transformer’s hierarchical representation learning to enhance global context modeling.
Conventional L2 loss treats all pixels equally, which fails to account for the varying clinical importance of different tissues. This can lead to suboptimal quality, especially for structures that require higher precision, such as bones and soft tissues. To address this limitation, we propose a weighted L2 loss that prioritizes important anatomical structures by assigning different weights to predefined HU ranges, so that different tissue types contribute differently to the total loss. Given a predicted CT image and its ground-truth counterpart, the squared error at each pixel is scaled by the weight of the HU range to which the pixel belongs.
This weighted loss allows the model to focus on preserving the structural integrity of important tissues. The weight values are set based on the clinical importance and density of the tissues. The following HU ranges and weights are used:
Air (-1000 to -900 HU, Weight = 1.0): Low weight because it has minimal clinical relevance.
Fat (-100 to -50 HU, Weight = 15): Moderate weight to ensure proper visualization of fat distribution.
Soft Tissue (0–80 HU, Weight = 10): Higher weight due to its importance in organ and muscle structures.
Cancellous Bone (200–400 HU, Weight = 20): Increased weight to enhance the fine details of trabecular bone.
Cortical Bone (600–1000 HU, Weight = 30): Highest weight to preserve critical bony structures.
This weighting scheme ensures that more important structures are reconstructed with greater accuracy.
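The weighting scheme can be illustrated with a minimal sketch in plain Python. The HU ranges and weights are those listed above; everything else is an assumption made for illustration: the function names, the choice to look up the weight from the ground-truth HU value, the default weight of 1 for pixels outside every listed range, and the normalization by pixel count. A real training pipeline would express the same idea with tensor operations.

```python
# Hypothetical sketch of an HU-range-weighted L2 loss (plain Python, no framework).
# The (low, high, weight) triples are the ranges and weights listed in the text.
HU_WEIGHTS = [
    (-1000.0, -900.0, 1.0),   # air
    (-100.0, -50.0, 15.0),    # fat
    (0.0, 80.0, 10.0),        # soft tissue
    (200.0, 400.0, 20.0),     # cancellous bone
    (600.0, 1000.0, 30.0),    # cortical bone
]
DEFAULT_WEIGHT = 1.0  # assumption: pixels outside every listed range get weight 1


def pixel_weight(hu):
    """Return the weight of the first HU range that contains `hu`."""
    for low, high, weight in HU_WEIGHTS:
        if low <= hu <= high:
            return weight
    return DEFAULT_WEIGHT


def weighted_l2_loss(pred, target):
    """Mean of weight * squared error over two flat lists of HU values.

    Assumption: the weight is looked up from the ground-truth HU value.
    """
    if len(pred) != len(target) or not pred:
        raise ValueError("pred and target must be non-empty and equal in length")
    total = 0.0
    for p, t in zip(pred, target):
        total += pixel_weight(t) * (p - t) ** 2
    return total / len(pred)
```

With this sketch, an error of 10 HU on a cortical bone pixel contributes 30 times as much to the loss as the same error on an air pixel.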
To improve perceptual image quality and suppress artifacts, a PSNR-based loss is added. Since PSNR decreases logarithmically as the MSE increases, minimizing a loss that falls as PSNR rises also drives the MSE down. The overall loss combines the weighted L2 loss with this PSNR-based term.
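One common way to build such a PSNR-based term is to use the negative PSNR as the loss. The sketch below illustrates that form and a weighted sum with the L2 term. It is an assumption, not necessarily the formulation used in this study: the peak value of 1.0, the small `eps` stabilizer, the balancing factor `lam`, and all function names are hypothetical.

```python
import math


def mse(pred, target):
    """Mean squared error between two equal-length flat lists."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def psnr_loss(pred, target, peak=1.0, eps=1e-12):
    """Negative PSNR in dB, so that a lower value means a better image.

    `peak` (signal maximum) and `eps` (avoids division by zero) are assumptions.
    """
    return -10.0 * math.log10(peak ** 2 / (mse(pred, target) + eps))


def total_loss(weighted_l2, psnr_term, lam=0.1):
    """Weighted sum of the two terms; `lam` is a hypothetical balancing factor."""
    return weighted_l2 + lam * psnr_term
```

Because the negative PSNR is a monotonically increasing function of the MSE, reducing one reduces the other; the logarithm mainly changes how strongly small residual errors are penalized.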
Gout coverage image generation
After successfully generating 140 kV images, the OneGout model calculates gout lesions based on the input 80 kV images and the 140 kV images generated. Specifically, the CT value of the gout coverage image
OneGout-FL based on federated-learning
The OneGout-FL framework adopts the horizontal federated learning (HFL) paradigm, where each participating medical institution stores patient data locally and shares only model parameters, not raw data. The architecture, shown in Figure 5, ensures maximum privacy protection while balancing data security and model collaboration. The end-to-end training process for OneGout-FL is detailed in Algorithm 1.

Figure 5. Federated learning for OneGout-FL. Icons are from iconpark (iconpark.oceanengine.com), under Apache License 2.0.
The process is orchestrated by a central server and involves multiple iterative communication rounds (
A crucial aspect of our framework is the aggregation strategy. Once local training is complete, clients do not transmit their raw data; instead, they send only the calculated model updates (gradients) back to the server. The server then employs the FedNova aggregation algorithm. Unlike simpler averaging methods, 60 FedNova 61 normalizes the contributions from each client based on their local computational effort, which effectively counteracts issues arising from heterogeneous data distributions (non-IID data) and variable local training steps across clients. This leads to more stable and faster convergence. The server aggregates these normalized updates to refine the global model, which is then broadcast in the next communication round.
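The normalization step can be sketched as follows. This is a minimal illustration of FedNova-style aggregation under simplifying assumptions: clients run plain SGD locally (so each client's update is normalized by its number of local steps), client weights are proportional to local sample counts, and model parameters are represented as flat lists of floats. The function and argument names are hypothetical.

```python
def fednova_aggregate(global_weights, client_weights, local_steps, num_samples):
    """Return new global weights after one FedNova-style aggregation round.

    global_weights: flat list of floats broadcast at the start of the round.
    client_weights: one flat list per client, holding its locally trained weights.
    local_steps:    number of local SGD steps each client performed.
    num_samples:    local dataset size of each client.
    """
    total = float(sum(num_samples))
    p = [n / total for n in num_samples]  # relative data share per client
    # Effective number of steps applied to the global model this round.
    tau_eff = sum(pi * tau for pi, tau in zip(p, local_steps))
    new_weights = []
    for k, w_global in enumerate(global_weights):
        # Average of per-step (normalized) client updates for parameter k.
        direction = sum(
            pi * (w_client[k] - w_global) / tau
            for pi, w_client, tau in zip(p, client_weights, local_steps)
        )
        new_weights.append(w_global + tau_eff * direction)
    return new_weights
```

When every client performs the same number of local steps, this reduces to sample-weighted averaging of the client models. When step counts differ, clients that ran more steps no longer dominate the aggregate merely because they moved further from the global model.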
In our FL setup, we assume a non-IID (non-independent and identically distributed) data distribution among participating clients. This reflects the real-world heterogeneity of clinical data across institutions, where factors such as imaging protocols, patient populations, scanner types, and disease prevalence can vary significantly. Each client possesses locally collected patient data, which remains private and is not shared. Instead, only the aforementioned model updates are communicated to the central server for aggregation. This non-IID setting poses greater challenges for model convergence and generalization but also makes the federated training scenario more representative and clinically relevant.

Overall, the OneGout-FL framework offers a scalable and privacy-preserving solution that overcomes the traditional barriers of medical data sharing, paving the way for more robust and intelligent medical image analysis.
Experiments and results
Implementation details
All experiments were conducted on two Nvidia RTX 3090 GPUs, each with 24 GB of memory, using PyTorch 2.1. We resized the input images to 512 × 512 and set the batch size to 16. To prevent overfitting, we applied data augmentation techniques, including random horizontal flipping and random rotation. For optimization, we used the AdamW 62 optimizer with a learning rate of 1e
Quantitative evaluations
Image quality was quantitatively assessed by comparing the generated images with the ground-truth monoenergetic CT images using the peak signal-to-noise ratio (PSNR) and the structural similarity index measure (SSIM). Note that all CT images were normalized by dividing by 4000 HU before PSNR and SSIM calculation, which brings the PSNR values into a more conventional and interpretable range for medical imaging tasks. Higher values for both PSNR and SSIM denote greater fidelity and structural correspondence to the real images. The SSIM value ranges from 0 (no similarity) to 1 (perfect identity). To evaluate the spatial accuracy of generated structures and regions of interest, the Intersection over Union (IoU) was calculated, with scores approaching 100% indicating a near-perfect overlap. Furthermore, the Dice coefficient was employed to specifically measure the segmentation accuracy of gout lesions, where higher values signify superior performance. OneGout is capable of bidirectional image generation: creating 140 kV images from 80 kV scans and vice versa. We therefore report experiments for both directions below.
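For reference, the PSNR, IoU, and Dice computations can be sketched in plain Python as below. The division by 4000 HU follows the text; the peak value of 1.0 after normalization, the treatment of two empty masks as perfect agreement, and the function names are assumptions for illustration. SSIM is omitted because it requires windowed local statistics.

```python
import math

HU_SCALE = 4000.0  # normalization constant stated in the text


def psnr_hu(pred_hu, target_hu, peak=1.0):
    """PSNR in dB between two flat lists of HU values after dividing by 4000.

    Assumption: the peak signal value after normalization is 1.0.
    """
    pred = [v / HU_SCALE for v in pred_hu]
    target = [v / HU_SCALE for v in target_hu]
    mse = sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)
    if mse == 0.0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


def iou_and_dice(pred_mask, target_mask):
    """Return (IoU, Dice) for two flat binary masks of equal length."""
    inter = sum(1 for p, t in zip(pred_mask, target_mask) if p and t)
    pred_area = sum(1 for p in pred_mask if p)
    target_area = sum(1 for t in target_mask if t)
    union = pred_area + target_area - inter
    if union == 0:
        return 1.0, 1.0  # assumption: two empty masks count as perfect agreement
    return inter / union, 2.0 * inter / (pred_area + target_area)
```

For a single pair of masks the two overlap scores are linked by Dice = 2·IoU / (1 + IoU), so Dice is never lower than IoU. Averages over many cases need not satisfy this relation exactly.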
Table 2 presents a performance comparison of different deep learning models in generating 140 kV monoenergetic CT images from 80 kV images. Among the models, UNet demonstrates competitive performance with a mean PSNR of 22.44 and SSIM of 0.92, reflecting its balanced capability in image reconstruction fidelity and structural similarity. It also outperforms the other models in gout segmentation accuracy, with the highest mean IoU (46.66) and Dice score (63.20).
Table 2. Performance comparison of OneGout in generating 140 kV monoenergetic CT images from 80 kV images with various backbones across different metrics (PSNR, SSIM, IoU, and Dice).
In contrast, R2Unet demonstrates the lowest performance across all metrics, with a mean IoU of 10.85 and a Dice score of 19.36, suggesting that it struggles with both image translation and segmentation. AttUnet, TransUnet, and SwinUnet show intermediate results, with AttUnet slightly outperforming the others in PSNR (28.26) but lagging in segmentation accuracy compared to Unet. SwinUnet’s segmentation performance was the second-weakest, after that of R2Unet. Overall, Unet emerges as the most effective model for generating high-quality 140 kV images while maintaining strong segmentation capabilities.
Table 3 presents a performance comparison of different backbones in generating 80 kV monoenergetic CT images from 140 kV images. Unet achieves the highest overall performance, with a mean PSNR of 23.74, SSIM of 0.86, and the best segmentation accuracy (mean IoU: 39.14, Dice: 55.36). AttUnet and TransUnet show comparable results, with TransUnet slightly outperforming AttUnet in IoU (35.42 vs. 33.22) and Dice (51.60 vs. 48.88), although both lag behind Unet. R2Unet exhibits weaker performance across all metrics, with a notably lower mean IoU (17.61) and Dice score (29.25), indicating its limited ability in image reconstruction and segmentation. SwinUnet shows the lowest overall performance, with the lowest PSNR (18.60), SSIM (0.81), IoU (22.48), and Dice (36.03), making it the least effective model in this task. Overall, Unet demonstrates superior reconstruction and segmentation capabilities in converting 140 kV images to 80 kV images.
Table 3. Performance comparison of OneGout in generating 80 kV monoenergetic CT images from 140 kV images with various backbones across different metrics (PSNR, SSIM, IoU, and Dice).
Comparing the two tasks, converting 80 kV to 140 kV appears to be the easier direction for the best-performing backbone. Unet performs best in both cases, and its SSIM, IoU, and Dice are all higher when predicting 140 kV from 80 kV than in the reverse direction, although its PSNR is slightly lower (PSNR: 22.44 vs. 23.74, SSIM: 0.92 vs. 0.86, IoU: 46.66 vs. 39.14, Dice: 63.20 vs. 55.36). The pattern is not uniform across backbones: R2Unet, for example, segments better in the 140 kV to 80 kV direction (IoU: 17.61 vs. 10.85, Dice: 29.25 vs. 19.36). The Unet results suggest that predicting 140 kV from 80 kV is a more straightforward task, likely because higher-energy images retain richer attenuation information, while reconstructing lost details in lower-energy images is inherently more difficult. We adopt the 80 kV to 140 kV task as the default task in subsequent experiments.
Qualitative evaluations
Figure 6 presents a comparison between virtual DECT images generated using the OneGout framework and real DECT images for gout patients. It can be observed that the virtual DECT images produced by our framework exhibit outstanding visual quality.

Figure 6. Comparison of the original 140 kV image with the predicted 140 kV image.
In terms of details, the joint structure boundaries are sharp and well-defined, and the bone texture appears fine and highly realistic, accurately capturing subtle bone features. The soft tissue layers are clearly delineated, with distinct differentiation between different tissues. Additionally, the morphology, size, and distribution of urate crystal deposits are accurately displayed in Figure 7.

Figure 7. Comparison between the calculated 140 kV gout image and the predicted 140 kV gout image.
These generated images closely resemble real DECT images, making them difficult to distinguish from their real counterparts in both overall composition and fine details. This further demonstrates the high practicality and effectiveness of the OneGout framework in clinical applications such as gout diagnosis, providing reliable diagnostic support for physicians.
Federated learning approach
In this experiment, we employ an FL approach using FedNova 61 to train OneGout-FL with Unet as the backbone for generating monoenergetic CT images. The dataset is randomly split into three subsets, each assigned to one of the three clients. Each client trains its model independently on its local dataset without sharing raw data, ensuring privacy preservation. The central server aggregates the client models to enhance generalization. After training, we evaluate the performance of both the client models and the aggregated server model using PSNR, SSIM, IoU, and Dice.
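A random three-way partition of this kind can be sketched as follows. The function name, the fixed seed, and the choice to split at the patient level (so that all images from one patient stay with one client) are assumptions for illustration; the text does not specify these details.

```python
import random


def split_across_clients(patient_ids, num_clients=3, seed=0):
    """Shuffle patient IDs and deal them out to clients in round-robin order.

    Assumption: splitting is done per patient, not per image slice.
    """
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)  # seeded for reproducibility
    return [ids[i::num_clients] for i in range(num_clients)]


# Example: 129 training patients (the training-set size reported in the study).
client_subsets = split_across_clients(range(129))
```

Each patient appears in exactly one subset, and the subset sizes differ by at most one.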
The results, presented in Table 4, indicate that the server model outperforms the individual clients on nearly all metrics, supporting the effectiveness of FL. The server achieves the highest SSIM (0.91), IoU (24.06), and Dice (37.26), demonstrating improved image reconstruction and segmentation accuracy. Among the clients, Client-2 performs best, with a mean PSNR of 24.85, SSIM of 0.87, IoU of 17.03, and Dice of 26.80, suggesting its data subset might be more representative. Clients 1 and 3 show slightly lower performance, likely due to variations in data distribution. While there is room for further improvement, these data indicate that training the OneGout framework with FL improves its ability to identify and segment gout lesions compared with training on any single client's data. This, in turn, provides more robust support for the clinical diagnosis of gout. Overall, FL shows clear advantages in enhancing model performance and demonstrates great potential in the field of medical image analysis.
Table 4. Performance of models from clients and server across different metrics (PSNR, SSIM, IoU, and Dice).
Discussion
Traditional methods are limited by the high cost and radiation of DECT scanners15,16 or the invasive nature of joint aspiration.4,5 Our findings suggest that the OneGout framework can capture the features necessary for diagnosis, indicating its potential to provide diagnostic information that distinguishes between healthy and pathological tissue in gout patients. The OneGout system might hence serve as an alternative screening tool for gout diagnosis, given that it can effectively replicate DECT functionality based on more accessible single-energy CT scans.
Additionally, a comparison between the centrally aggregated server model and the individual client models showed that our federated learning model (OneGout-FL) was effective, with the server model outperforming the clients on nearly all metrics. One possible explanation for this result is that the federated approach 60 allows the model to learn from a more diverse dataset without violating patient privacy, thereby improving its generalizability and robustness.
In addition to image generation, the framework was used to classify and segment gout lesions, revealing that the model could accurately identify and mark suspicious urate crystal areas (Figure 7). Our results suggest that virtual DECT generation can facilitate rapid and accurate screening of gout lesions and could also support the monitoring of treatment progress.
Our findings indicate that the data generation process is highly effective in a controlled setting. The successful deployment in a simulated FL environment suggests a pathway to overcome the limitations of single-center data. The OneGout-FL architecture, which uses the FedNova aggregation strategy, 61 is specifically designed to handle the data and device heterogeneity expected in a real-world multi-institutional collaboration. This addresses the key data governance challenges that often hinder the development of AI tools in medicine.
Conclusion
This study presents OneGout, an innovative deep learning framework that bridges the gap between advanced imaging capabilities and clinical accessibility in gout diagnosis. By transforming routine single-energy CT scans into diagnostically equivalent dual-energy images, the system overcomes the cost and radiation barriers of conventional DECT while maintaining comparable accuracy in detecting urate crystal deposits. The incorporation of FL enables multi-institutional collaboration without compromising patient privacy, addressing critical data-sharing challenges in healthcare AI. With its adaptable architecture combining U-Net and Transformer models, the solution demonstrates particular promise for underserved medical facilities lacking specialized equipment. The technical approach, featuring tissue-specific loss functions and robust validation metrics, establishes a new paradigm for implementing AI-powered diagnostic tools in real-world clinical environments. These advancements not only enhance gout management but also provide a blueprint for applying similar methodologies to other medical imaging challenges where cost and accessibility limit optimal care delivery.
Footnotes
Acknowledgments
The urate crystal image presented in Figure 3 is reproduced from the following publication: Sivera F, Andres M, Dalbeth N. (2022). A glance into the future of gout.
The icons used are from iconpark (iconpark.oceanengine.com), licensed under Apache License 2.0.
Ethical approval
This study was reviewed and approved by the Ethics Committee of Guangzhou First People’s Hospital (Approval No.[k-2024-130-01]).
Contributorship
Yufang Dong did writing—review and editing, writing—original draft, methodology, investigation, formal analysis, visualization, data curation. Min Liu did writing—review and editing, writing—original draft, methodology, formal analysis, investigation, project administration. Jiajun Feng did writing—review and editing, writing—original draft, methodology, formal analysis, investigation, resources. Yuezhe Yang did writing—review and editing, writing—original draft, methodology, data curation, formal analysis. Yong Dai did writing—review and editing, conceptualization, supervision. Zhe Jin did writing—review and editing, conceptualization, supervision, funding acquisition.
Funding
The authors received no financial support for the research, authorship, and/or publication of this article.
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
