Abstract
Objective
Diabetic peripheral neuropathy (DPN) is a common complication of diabetes, posing a significant risk for foot ulcers and amputation. Corneal confocal microscopy (CCM) is a rapid, noninvasive method to assess DPN by analysing corneal nerve fibre morphology. However, selecting high-quality representative images remains a critical challenge.
Methods
In this study, we propose a fully automated CCM image-selection algorithm based on deep learning feature extraction using ResNet-18 and unsupervised clustering. The algorithm consistently identifies representative images by balancing non-redundancy and representativeness, ensuring objectivity and reproducibility.
Results
When validated against manual selection by researchers with varying levels of expertise, the algorithm distinguished DPN more accurately and reduced inter-observer variability. It processed 100 images in approximately 1 s, substantially improving diagnostic efficiency. Compared with traditional manual selection, the proposed method achieved higher diagnostic accuracy for key morphological parameters, including corneal nerve fibre density, length, and branch density.
Conclusion
The algorithm is open source and compatible with standard CCM workflows, offering researchers and clinicians a robust and efficient tool for DPN diagnosis. Further multicentre studies are needed to validate these findings in diverse populations.
Introduction
The global prevalence of diabetes has been steadily increasing. According to the 2021 Diabetes Atlas published by the International Diabetes Federation (IDF), approximately 10.5% of adults worldwide have diabetes, a figure projected to rise to 12.2% by 2045. 1 Diabetic peripheral neuropathy (DPN) is among the most common chronic complications of diabetes, with distal symmetric polyneuropathy (DSPN) being the most prevalent subtype. Approximately 50% of patients with diabetes develop DSPN. 2 DPN is a leading cause of foot ulcers, Charcot joints, and lower limb amputations in patients with diabetes. Therefore, early identification and treatment of DPN are crucial for improving patient outcomes. 3
Recent studies have increasingly focused on the relationship between changes in corneal nerve fibres (CNFs) and DPN.4,5 Research has demonstrated that morphological alterations in CNF are closely associated with DPN severity and may appear even in the early stages of the condition.6–8 The advent of corneal confocal microscopy (CCM) and advanced image analysis software has facilitated the investigation of CNF morphological changes. 9 CCM offers several advantages as a diagnostic tool for DPN, including being rapid, noninvasive, quantitative, highly reproducible, and sensitive. These features make CCM an ideal method for assessing DPN, with significant potential for clinical application. 10
During CCM examinations, more than 100 consecutive images are typically captured per patient, but only a small subset is selected for research and diagnostic purposes. 11 To date, no standardised consensus exists regarding image-selection methods. 12 A multicentre study that established age-adjusted normative values for CNF parameters globally selected three non-overlapping high-quality images from each eye to quantify various morphological parameters. 13 However, the manual selection of images has significant drawbacks, including high subjectivity, poor reproducibility, and labour-intensive procedures. 14 For an emerging diagnostic method like CCM-based DPN assessment, uncertainty in image selection introduces subjectivity into an otherwise objective approach.
In this study, convolutional neural networks (CNNs) were utilised for quality filtering and feature vector extraction from CCM images. An unsupervised clustering algorithm was subsequently applied to develop an objective and reproducible image-selection protocol. Its effectiveness was validated by comparing the results with those obtained through a manual selection conducted by researchers with varying expertise levels. In summary, our research aimed to complement the existing CCM-based DPN diagnostic workflows, contributing to improved standardisation and reproducibility.
Methods
Ethical approval and informed consent
This study was approved by the Ethics Committee of Qilu Hospital of Shandong University (Institutional Review Board [IRB] approval number: KYLL-2017–231). Written informed consent was obtained from all participants before the study commenced and imaging procedures were conducted. The study adhered to the principles of the Declaration of Helsinki.
Study design and population
The overall research design is shown in Figure 1. This study was methodological in nature, and diagnostic tests were included to verify the effectiveness of the method.

The overall design process and graphic summary of this study.
The study participants were individuals with diabetes admitted to the Department of Endocrinology and Metabolic Diseases at Qilu Hospital of Shandong University, between November 2023 and June 2024. Inclusion criteria were (a) age between 18 and 80 years; (b) diagnosis of diabetes based on the 2024 standards of the American Diabetes Association (ADA) 15 ; and (c) voluntary provision of written informed consent. Exclusion criteria were (a) acute complications or stress states of diabetes, (b) diabetic foot or non-healing ulcers, (c) neuropathy due to central nervous system disorders, chronic alcoholism, tumours, infectious diseases, connective tissue diseases, malnutrition, or deficiencies in folic acid or vitamin B12, (d) chronic liver disease (AST/ALT ≥ 2 times the upper limit of normal), (e) chronic kidney disease (eGFR ≤ 60 ml/min/1.73 m²), (f) active eye diseases, moderate to severe dry eye, history of eye surgery, glaucoma, corneal diseases, systemic conditions affecting the cornea, or prolonged use of contact lenses, and (g) inability to fixate, cooperate with, or tolerate CCM examination. A total of 111 individuals with diabetes participated in this study.
In addition, two external datasets were also utilised: (a)
Data acquisition
A corneal confocal microscope (HRT-II or HRT-III; Heidelberg Rostock Cornea Module, Heidelberg Engineering Inc., Germany) was used for confocal laser scanning. The main component was a Rostock corneal microscope objective lens equipped with a helium-neon laser source operating at a wavelength of 670 nm. The horizontal and vertical resolution of the microscope was 1 µm, with a magnification of 800× and an observation field of 400 µm × 400 µm. Images were acquired and stored in a fully digitalised system. Carbomer eye gel (Berlin, Germany) was used as a coupling agent between the microscope objective and the corneal cap. The eyes were topically anaesthetised with 0.4% proparacaine hydrochloride. During the CCM examination, the participant's head was fixed in position, and their gaze was directed at a central point. The objective lens was moved forward until it contacted the central cornea. The focus was fine-tuned to obtain images of the corneal endothelial cells. The lens was then advanced to capture images of the sub-basal nerve plexus layer, and multiple images were acquired and stored.
Each participant underwent a CCM examination of both eyes. Trained professionals performed electrophysiological tests on patients with diabetes to measure the motor nerve conduction velocity of both the median and tibial nerves. The Neuropathy Disability Score (NDS) was also used to assess DPN symptoms. The modified NDS evaluated signs of neuropathy, including bilateral ankle reflexes, vibration perception of the great toe (using a 128 Hz tuning fork), pinprick sensation on the dorsal foot, and temperature sensation. DPN was diagnosed according to the Toronto Consensus criteria, which require the presence of symptoms (abnormal Neuropathy Symptom Profile, NSP) or signs of neuropathy (NDS >2) and abnormal nerve conduction velocity (median nerve <50 m/s in the upper limbs or tibial nerve <40 m/s in the lower limbs). 18
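The diagnostic rule stated above can be written as a short predicate. The following is a minimal sketch that encodes only the thresholds given in the text; the class and field names are hypothetical and are not taken from the study's code.

```python
from dataclasses import dataclass


@dataclass
class NeuroExam:
    """Hypothetical container for one participant's neurological findings."""
    abnormal_nsp: bool      # symptoms: abnormal Neuropathy Symptom Profile
    nds: int                # Neuropathy Disability Score (signs)
    median_ncv_m_s: float   # median motor nerve conduction velocity, m/s
    tibial_ncv_m_s: float   # tibial motor nerve conduction velocity, m/s


def has_dpn(exam: NeuroExam) -> bool:
    """Apply the rule as described in the text:
    (symptoms OR signs) AND abnormal nerve conduction velocity."""
    symptoms_or_signs = exam.abnormal_nsp or exam.nds > 2
    abnormal_ncv = exam.median_ncv_m_s < 50.0 or exam.tibial_ncv_m_s < 40.0
    return symptoms_or_signs and abnormal_ncv


# Illustrative values only, not study data.
print(has_dpn(NeuroExam(False, 4, 48.0, 42.0)))  # signs plus slow median nerve
print(has_dpn(NeuroExam(True, 0, 55.0, 45.0)))   # symptoms but normal conduction
```

Under this rule, symptoms or signs alone are not sufficient; an abnormal conduction velocity in at least one of the two nerves is also required.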
A total of 35,129 CCM images were collected from 111 participants. Among these, 49 patients with diabetes were diagnosed with DPN, while 62 were classified as having NDPN. The dataset composed of these images is referred to as the ‘
Dataset selection and annotation
This study utilised various selection and annotation processes based on the aforementioned

Example of CCM image quality classification. Image A is considered a high-quality image, while the remaining images, B, C, and D, are regarded as low-quality. (A) The nerve fibres are clear, with a distinct contrast against the background. (B) The centre of the image shows an indentation caused by excessive contact of the lens with the cornea, leading to discontinuity of the nerve fibres. (C) Rapid movement of the eyeball distorts the image, preventing an accurate representation of the nerve fibre morphology. (D) The majority of the image is focused on the corneal endothelium, failing to display the sub-basal nerve plexus. CCM: corneal confocal microscopy.
Algorithm design
The complete image-selection algorithm comprises three main steps:
First, a binary classification model was developed to categorise raw CCM images into high- and low-quality groups. ResNet-50 was used as the backbone network, with the output channels of the final fully connected layer set to two. The model was initialised with pre-trained weights from ImageNet. A total of 220 high- and 220 low-quality CCM images were randomly selected from
Second, a model was designed to extract and integrate features from the images. This model comprised two parallel ResNet-18 networks, each modified by removing the fully connected layer and initialised with pre-trained weights from ImageNet. The model received two input types: the original CCM image and the binary nerve fibre image. The U2Net model, trained on
Finally, the 512-dimensional feature vectors obtained in the second step were subjected to k-means clustering and grouped into predefined clusters. For each cluster, the distance between the cluster centre and each feature vector was computed. The image(s) closest to the cluster centre were selected as representative images.
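The third step (cluster the feature vectors, then keep the image nearest each cluster centre) can be sketched as follows. This is an illustrative pure-Python version, not the study's implementation: the function names are hypothetical, the deterministic initialisation from the first k vectors is an assumption made so that repeated runs give identical output, and toy 2-dimensional vectors stand in for the 512-dimensional features.

```python
import math


def kmeans(vectors, k, max_iter=100):
    """Lloyd's k-means; centres are initialised from the first k vectors
    (an assumption chosen here for determinism)."""
    centres = [list(v) for v in vectors[:k]]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each vector goes to its nearest centre.
        new_labels = [
            min(range(k), key=lambda c: math.dist(v, centres[c]))
            for v in vectors
        ]
        # Update step: each centre becomes the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, new_labels) if lab == c]
            if members:
                centres[c] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
    return centres, new_labels


def select_representatives(vectors, k):
    """Return, for each cluster, the index of the vector closest to its centre."""
    centres, labels = kmeans(vectors, k)
    chosen = []
    for c, centre in enumerate(centres):
        members = [i for i, lab in enumerate(labels) if lab == c]
        if members:
            chosen.append(min(members, key=lambda i: math.dist(vectors[i], centre)))
    return sorted(chosen)


# Toy feature vectors forming two well-separated groups (illustrative only).
features = [
    (0.0, 0.0), (1.0, 0.0), (0.4, 0.1),
    (10.0, 10.0), (11.0, 10.0), (10.4, 10.2),
]
print(select_representatives(features, k=2))  # [2, 5]
```

Choosing one image per cluster addresses non-redundancy, and choosing the image nearest the centre addresses representativeness. In practice a library implementation such as scikit-learn's `KMeans` with a fixed `random_state` would be a more usual choice than this hand-written loop.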
An image-selection operation was performed on the
Statistical analysis
The morphological parameters of CNFs in the CCM images were calculated using two automated tools: ACCMetrics (ACCM), a well-established automated calculation tool, 19 and AiCCMetrics (AiCCM), a recently developed deep learning-based tool. 17 The parameters included corneal nerve fibre length (CNFL), corneal nerve fibre density (CNFD), and corneal nerve branch density (CNBD). Six representative images were selected for each participant, either manually or automatically; the parameters were calculated for each image, and their averages were used as the final values for diagnosis and evaluation.
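The per-participant averaging described above amounts to a mean over the selected images for each parameter. A minimal sketch follows; the function name and the numeric values are hypothetical and are not study data.

```python
from statistics import mean

PARAMETERS = ("CNFD", "CNFL", "CNBD")


def participant_parameters(per_image):
    """Average each morphological parameter over one participant's selected images."""
    return {name: mean(image[name] for image in per_image) for name in PARAMETERS}


# Six selected images for one hypothetical participant.
images = [
    {"CNFD": 18.0, "CNFL": 15.0, "CNBD": 30.0},
    {"CNFD": 20.0, "CNFL": 16.0, "CNBD": 35.0},
    {"CNFD": 22.0, "CNFL": 17.0, "CNBD": 40.0},
    {"CNFD": 24.0, "CNFL": 18.0, "CNBD": 45.0},
    {"CNFD": 26.0, "CNFL": 19.0, "CNBD": 50.0},
    {"CNFD": 28.0, "CNFL": 20.0, "CNBD": 55.0},
]
print(participant_parameters(images))
# {'CNFD': 23.0, 'CNFL': 17.5, 'CNBD': 42.5}
```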
All statistical tests for difference analyses were two-tailed, with the significance level set at p < 0.05. Normality was assessed using the Shapiro–Wilk test. For continuous variables with normal distributions, t-tests were conducted, whereas the Mann–Whitney U test was applied for non-normally distributed variables.
The diagnostic and classification performances of each parameter for the outcome were evaluated using receiver operating characteristic (ROC) curves. The area under the curve (AUC) was calculated. For scenarios involving multiple parameters, a logistic regression model was used to construct a combined prediction model. The overall classification and diagnostic performance of the model were analysed using ROC curves. The truth label used to calculate the ROC curve, which is the gold standard for diagnosing DPN, was determined using the Toronto Consensus criteria.
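For a single parameter, the AUC equals the probability that a randomly chosen DPN participant receives a higher score than a randomly chosen non-DPN participant, with ties counted as one half. The sketch below computes it directly from that definition; it is illustrative rather than the study's code, the values are invented, and negating the parameter so that lower nerve fibre density yields a higher score is an assumption about score direction.

```python
def auc(scores, labels):
    """AUC via pairwise comparison of positives (label 1) and negatives (label 0)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Hypothetical per-participant CNFD values; label 1 = DPN, 0 = no DPN.
cnfd = [18.0, 20.5, 23.0, 22.0, 25.0, 27.5]
dpn = [1, 1, 1, 0, 0, 0]

# Negate so that a lower CNFD gives a higher "risk" score.
risk_scores = [-value for value in cnfd]
print(round(auc(risk_scores, dpn), 3))  # 0.889
```

For the combined model, the predicted probabilities from the logistic regression would take the place of `risk_scores`.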
The minimum sample size required for the diagnostic test was calculated using the PASS 11 software, with type I and type II error probabilities set at 0.05 (one-sided) and 0.1 (i.e., 90% statistical power), respectively.
All analyses, statistics, and code implementation were performed in Python 3.11.5, and executed on hardware equipped with an Intel Core i5-11300H processor (11th generation) and an NVIDIA GeForce RTX 3050 laptop GPU.
Results
Sample size and statistical efficiency calculation
Based on a review of previously published studies, 20 the expected AUC was set at 0.75. With 21 participants in each of the disease and control groups, the statistical power was 0.908. With 40 participants in each group, the statistical power increased to 0.994. These results confirm that the sample size in this study was sufficient.
Best feature fusion weight
A quantitative analysis was performed on the subset datasets generated using 11 different feature fusion weight hyperparameters with AiCCM. ROC curve analysis was used to determine AUC values and evaluate the ability to distinguish DPN. The results showed that as the hyperparameter increased from 0.0 to 1.0, the overall classification performance of the three metrics—CNFL, CNFD, and CNBD—for DPN improved.
The highest AUC for CNFD (0.7351) was observed at a hyperparameter of 0.9. For CNFL and CNBD, the highest AUCs were 0.6797 and 0.6289, respectively, both achieved at a hyperparameter of 1.0 (Figure 3).

ROC curve analysis of diagnostic performance for DPN using different feature fusion weights. DPN: diabetic peripheral neuropathy; ROC: receiver operating characteristic.
Image quantification results
A subset with a feature fusion weight of 1.0, where the features were entirely derived from the original images, was compared with the manual selection results (
For CNFD using AiCCM, the automated algorithm yielded average values of 20.70 (DPN group) and 24.20 (NDPN group), closely matching the average values from the four researchers (20.95 and 24.33). However, the researchers’ measurements showed considerable variability, ranging from 19.40 to 21.91 in the DPN group and 23.66 to 25.73 in the NDPN group.
For CNFL using AiCCM, the automated algorithm measured nerve fibre lengths as 17.12 (DPN) and 18.44 (NDPN). These values were comparable to the researchers’ averages of 16.72 and 17.52. The researchers’ measurements showed wider variability, ranging from 15.82 to 21.91 in the DPN group and 17.07 to 18.74 in the NDPN group.
For CNBD, using AiCCM, the automated algorithm measured nerve fibre branch density as 38.64 (DPN) and 40.84 (NDPN). In comparison, the researchers’ averages were 37.71 and 37.31, with ranges of 35.48 to 43.07 in the DPN group and 34.30 to 41.57 in the NDPN group.
The comparative results obtained using ACCM were similar to those from AiCCM, although the average values for CNFL and CNBD were lower. Overall, the automated algorithm produced results close to the researchers’ averages, significantly reducing variability and individual differences (Table 1 and Figures 4 and 5).

Diagnostic performance of automated selection using AiCCM for DPN. DPN: diabetic peripheral neuropathy; AiCCM: AiCCMetrics.

Diagnostic performance of automated selection using ACCM for DPN. DPN: diabetic peripheral neuropathy; ACCM: ACCMetrics.
The mean values of CNFD, CNFL and CNBD of different sampled datasets were calculated using two automatic quantification tools for morphological parameters of nerve fibres.
DPN: diabetic peripheral neuropathy; CNFD: corneal nerve fibre density; CNBD: corneal nerve branch density; CNFL: corneal nerve fibre length; ACCM: ACCMetrics; AiCCM: AiCCMetrics.
Classification performance comparison
Using AiCCM, the automated algorithm demonstrated comparable performance to Researcher 1 in distinguishing DPN for CNFD and outperformed the other researchers. For CNFL and CNBD, the automated algorithm outperformed all other methods in its ability to distinguish DPN.
Similarly, when using ACCM, the automated algorithm outperformed all researchers in distinguishing DPN for CNFD and CNFL. However, for CNBD, all the results showed relatively poor performance, possibly due to inherent limitations in the parameter quantification tool.
Moreover, when logistic regression was applied to combine the predictions from the three parameters, the performance was as follows. With AiCCM, the automated algorithm performed comparably to Researchers 1 and 2 and outperformed the rest. With ACCM, the automated algorithm performed slightly worse than Researcher 2 but still outperformed the other researchers (Figure 6).

Comparison of classification performance between automated and manual selection for DPN diagnosis. DPN: diabetic peripheral neuropathy.
Algorithm efficiency evaluation
The selection tasks were executed on an NVIDIA GeForce RTX 3050 laptop GPU and involved 35,129 quality-classification computations, 5,747 feature-extraction computations, and the k-means clustering computations, with a total runtime of 377.33 s. Given that the time complexity of the algorithm is O(n), the average time per task was calculated. Each selection task required an average of 3.39 s, and the average time per 100 images was 1.07 s.
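The reported averages can be reproduced by simple division. The sketch below assumes one selection task per participant (111 tasks), which the text does not state explicitly; the runtime and image count are taken from the paragraph above.

```python
total_runtime_s = 377.33   # total runtime reported in the text
n_images = 35_129          # raw CCM images processed
n_tasks = 111              # assumption: one selection task per participant

seconds_per_task = total_runtime_s / n_tasks
seconds_per_100_images = total_runtime_s / n_images * 100

print(f"{seconds_per_task:.3f} s per selection task")
print(f"{seconds_per_100_images:.3f} s per 100 images")
```

Under this assumption the per-task quotient is about 3.399 s, which matches the reported 3.39 s if the value was truncated rather than rounded, and the per-100-image figure is about 1.074 s, matching the reported 1.07 s.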
Discussion
This study introduces a CCM image-selection algorithm that utilises deep learning for feature extraction and unsupervised clustering. The proposed algorithm outperformed manual selection by four researchers with varying expertise levels. It produced identical results for identical inputs, ensuring reproducibility in CCM image selection. Additionally, the algorithm operates at high speed on standard personal computers, facilitating its integration into existing CCM-based DPN diagnostic workflows.
Diagnosing DPN with CCM involves two steps: selecting representative images from raw data and quantifying nerve fibre morphological parameters either manually or automatically. The parameters are then averaged and compared with population standard values. While most current research focuses on the latter step, with the validity of automated quantitative parameter analysis well established,21,22 studies on the image-selection process remain limited. Kalteniece et al. (sample size: n = 35) proposed a standardised manual selection protocol, demonstrating inter-observer consistency and reproducibility. 10 Similarly, Schaldemose et al. (sample size: n = 23 and n = 62) introduced a simple automated selection method based on systematic selection along nerve fibre orientation.12,23 However, the first study did not evaluate diagnostic performance differences between automated and manual selection, while the second reported slightly lower DPN classification performance with automated methods compared to manual selection. This discrepancy may stem from the limited consideration of image features in automated approaches.
Interestingly, previous studies have shown that CNNs mimic human visual processing mechanisms through pooling, residual connections, and activation functions. 24 In this study, the feature extraction layers of ResNet-18 were used to abstract image features into high-dimensional vectors. ImageNet pre-trained weights enabled the model to capture multilevel features, ranging from low-level details (e.g., edges and textures) to high-level patterns (e.g., shapes and structures). These features are versatile and applicable across various domains. An unsupervised clustering algorithm grouped similar features, identifying cluster centres as representative features and embodying the principles of ‘non-redundancy’ and ‘representativeness’. Being unsupervised and free from pretraining biases or randomness, the algorithm ensures objective and reproducible results. Our study, based on a larger sample size (n = 111), demonstrated that the proposed automated selection algorithm achieved superior diagnostic performance for DPN compared to manual selection by multiple researchers. This significantly reduces the workload of researchers and enhances diagnostic efficiency for DPN. On average, the algorithm processed 100 CCM images in approximately 1 s.
Notably, when the feature fusion weight was set to 0, relying solely on binarised nerve fibre image features, the diagnostic performance for DPN matched that of manual selection. In this scenario, the features of interest (nerve fibre skeletons) aligned with those selected manually. As the feature fusion weight increased, incorporating more features from the original images steadily improved the diagnostic performance of the three morphological parameters. This suggests that CNNs can identify subtle features beyond those captured in manual selection, thereby enhancing diagnostic efficacy.
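The role of the feature fusion weight can be illustrated with a simple weighted combination of the two feature vectors. The text specifies only the end points (0 uses the binarised nerve fibre image features alone, 1.0 uses the original image features alone) and that 11 weights between 0.0 and 1.0 were evaluated; the linear interpolation between them shown here is an assumption, and the function name is hypothetical.

```python
def fuse_features(original_vec, binary_vec, weight):
    """Blend two equal-length feature vectors.

    weight = 1.0 keeps only the original-image features;
    weight = 0.0 keeps only the binarised nerve fibre image features.
    The linear blend in between is an assumed form, not confirmed by the text.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    if len(original_vec) != len(binary_vec):
        raise ValueError("feature vectors must have the same length")
    return [weight * o + (1.0 - weight) * b
            for o, b in zip(original_vec, binary_vec)]


# The 11 weights evaluated: 0.0, 0.1, ..., 1.0.
weights = [i / 10 for i in range(11)]

# Toy 2-dimensional vectors standing in for the 512-dimensional features.
original = [1.0, 2.0]
binary = [3.0, 4.0]
print(fuse_features(original, binary, 1.0))  # [1.0, 2.0]
print(fuse_features(original, binary, 0.0))  # [3.0, 4.0]
print(fuse_features(original, binary, 0.5))  # [2.0, 3.0]
```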
An emerging alternative involves stitching multiple local images into a panoramic view, either manually or automatically. Preliminary studies have developed automated stitching algorithms,25,26 but these methods require high-quality original images and lack the accuracy required for clinical applications. Moreover, current tools for quantifying nerve fibre parameters and reference normal values are incompatible with stitched images, hindering their integration into existing CCM workflows for DPN diagnosis. While this approach holds significant potential, a suitable selection algorithm is currently more practical and urgently needed.
This study has some limitations. First, it was conducted at a single centre, and its findings need validation through multicentre studies with larger and more diverse populations. Additionally, all participants were Chinese, which may limit the generalisability of the findings to other populations.
In conclusion, this study is the first to propose a fully automated CCM image-selection algorithm using deep learning, demonstrating superior performance in DPN diagnosis compared to manual selection.
Conclusion
This study proposes the first fully automated CCM image-selection algorithm using deep learning, which demonstrates superior diagnostic performance for DPN compared to manual selection. This algorithm ensures reproducibility, reduces workload, and integrates seamlessly into existing workflows, making it a practical tool for clinical and research applications. However, further multicentre studies with diverse populations are required to validate the generalisability.
Footnotes
Acknowledgements
This study was supported by the Jinan Clinical Research Centre for Endocrine and Metabolic Diseases. This study was funded by the National Key Research and Development Program of China (2023YFA1801100, 2023YFA1801104, and 2022YFA1004800) and the Taishan Scholars Program of Shandong Province (Grant No. tstp20231250). The funders played no role in the study design, data collection, data analysis and interpretation, or the writing of the manuscript.
Authors’ contributions
X.H. and F.L. were responsible for the overall project conceptualisation, supervision, and funding and reviewed the manuscript. Q.Q. was responsible for the research, algorithm design, statistical analysis, figure preparation, and manuscript writing and was also one of the researchers involved in the manual selection mentioned in the manuscript. J.L. provided suggestions for algorithm design. W.X. was responsible for data collection and was one of the researchers involved in the manual selection mentioned in the manuscript. W.Z. and Y.Y. were involved in the manual selection mentioned in the manuscript. L.C. provided the computational resources. All the authors have read and approved the final version of the manuscript.
Code availability
Data availability
Data supporting the analyses in this study are available from the corresponding author upon request.
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Ethics approval and consent to participate
This study was approved by the Ethics Committee of Qilu Hospital of Shandong University (Institutional Review Board [IRB] approval number: KYLL-2017–231), and all participants provided informed consent prior to imaging. This study was conducted in accordance with the principles of the Declaration of Helsinki.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the National Key Research and Development Program of China and the Taishan Scholars Program of Shandong Province (grant numbers 2022YFA1004800, 2023YFA1801100, 2023YFA1801104, and tstp20231250).
