Abstract
It is well established that the retina provides insights beyond the eye. Studies of retinal microvascular changes have shown that the retina contains information related to cardiovascular disease. Despite tremendous efforts to reduce their effects, cardiovascular diseases remain a global challenge and a significant public health concern. Conventionally, predicting the risk of cardiovascular disease involves assessing preclinical features, risk factors, or biomarkers. However, these assessments are costly, and the tests needed to acquire the predictive parameters can be invasive. Artificial intelligence systems, particularly deep learning (DL) methods applied to fundus images, have been generating significant interest as an adjunct assessment tool with the potential to enhance efforts to prevent cardiovascular disease mortality. Risk factors such as age, gender, smoking status, hypertension, and diabetes can be predicted from fundus images using DL applications, with performance comparable to that of human beings. Before DL analysis of fundus images is adopted in clinical practice as an alternative to more expensive and invasive procedures, prospective clinical trials may be needed to address the associated ethical challenges and medicolegal implications. This review presents current evidence regarding the use of DL applications on fundus images to predict cardiovascular disease.
Introduction
The retina is considered a window to the body that provides insights beyond the eye when evaluating the risk of systemic diseases.1,2 A thorough assessment of fundus features can lead to the early detection of different systemic conditions like cardiovascular diseases (CVDs), and neurological disorders, suggesting that retinal microvascular abnormalities are markers or predictive indicators of CVD. 3 For instance, arteriovenous nicking is associated with CVD mortality 4 and stroke risk. 5 Retinal hemorrhages, cotton wool spots, and microaneurysms are also associated with incident risk of stroke. 5 Tortuosity is associated with the risk of death from ischemic heart disease. 6 For a long time, researchers have explained these relationships largely depending on observable and classifiable features of the retina using fundus images. Manual assessment of fundus images to make systemic disease risk determinations often requires adequate expertise and is a tedious exercise to complete. On the other hand, automated fundus image analysis using deep learning (DL) systems may help to reduce practitioners’ need to manually look for retinal features to predict risk for systemic diseases.1,7
DL is a branch of artificial intelligence (AI) in which computational systems learn to recognize predictive features from digitally acquired data. A DL model uses a multilayered neural network architecture to learn the relevant features by itself, without the need for outside manipulation such as manually specifying which features to use. During training, the network repeatedly adjusts its parameters to improve its performance; as a result, DL algorithms can achieve discriminative abilities comparable to those of human beings and have the potential to improve the predictive and analytic accuracy of systemic disease risk assessment. For retinal imaging tasks in ophthalmology, convolutional neural networks (CNNs) are the most commonly used DL networks. 8 Over the last decade, different studies have applied DL systems to fundus images and successfully screened for or detected ocular conditions such as diabetic retinopathy, age-related macular degeneration, and glaucoma. 9 The wide availability of fundus cameras that are easy to use and that make it simple to store and transfer data has provided an opportunity to create datasets or fundus image repositories (e.g. UK Biobank) to train, validate, and test DL systems, raising optimism for easy clinical integration. Recently, some researchers have extended this concept to CVDs, noting that a fundus image contains information about CVD risk factors and major adverse cardiac events (MACE) that can be leveraged for DL-based fundus image analysis. Although DL systems using fundus images cannot yet be applied confidently in clinical settings to predict or detect systemic diseases, they show promise as an innovative approach to disease management that warrants more research. This review summarizes recently published studies and highlights ways in which DL applications on fundus images can be approached in managing CVD.
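As a rough illustration of the building block behind a CNN, the sketch below applies one small filter to a toy image patch and passes the result through a ReLU non-linearity. It is a minimal example under stated assumptions: the patch, the kernel weights, and the function names `conv2d_valid` and `relu` are hypothetical and are not drawn from any of the cited studies. In a trained network, many such filters are stacked in layers and their weights are learned from data rather than written by hand.

```python
from typing import List

Matrix = List[List[float]]


def conv2d_valid(image: Matrix, kernel: Matrix) -> Matrix:
    """Slide `kernel` over `image` (no padding, stride 1) and sum the products.

    This is the cross-correlation that DL libraries usually call "convolution".
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    if out_h < 1 or out_w < 1:
        raise ValueError("kernel must not be larger than the image")
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(feature_map: Matrix) -> Matrix:
    """Keep positive responses and set negative ones to zero."""
    return [[max(0.0, v) for v in row] for row in feature_map]


# Hypothetical 4x4 patch: a bright vertical stripe (a crude "vessel")
# on a dark background. Not real fundus data.
patch = [
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
]

# Hand-written kernel that responds to a dark-to-bright step from left to
# right. In a trained CNN these weights are learned, not chosen by hand.
edge_kernel = [
    [-1.0, 1.0],
    [-1.0, 1.0],
]

features = relu(conv2d_valid(patch, edge_kernel))
for row in features:
    print(row)  # each row is [2.0, 0.0, 0.0]
```

The filter responds only at the left edge of the stripe, which is the sense in which a convolutional layer detects local patterns wherever they occur in the image.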
Review criteria
For this review, a literature search was conducted mainly in the PubMed, Web of Science, and Google Scholar databases. We searched for studies using the following search terms: ‘artificial intelligence’, ‘deep learning’, ‘fundus images’, ‘color fundus photography’, ‘eye’, and ‘cardiovascular diseases’. We also retrieved and reviewed related articles referenced in the identified articles. We included only studies published in English but applied no filter for the type of study or year of publication. However, for studies applying DL to predict, assess, or diagnose systemic conditions, risk factors, or biomarkers, particular interest was placed on those published in the last 5 years through 2022. A total of 19 studies published in the last 5 years were considered the main references for this review; the details are summarized in Table 1.
Table 1. Summary of included DL CVD-related studies.
AHES, Australian Heart Eye Study; AUC, area under the receiver operating characteristic curve; BES, Beijing Eye Study; CAA, carotid artery atherosclerosis; CAC, coronary artery calcium; CC-FII, China Consortium of Fundus Image Investigation; CKD, chronic kidney disease; CNN, convolutional neural network; CRAE, central retinal artery equivalent; CUHK-STDR, Chinese University of Hong Kong Sight-Threatening Diabetic Retinopathy; CVD, cardiovascular disease; DRIVE, Digital Retinal Images for Vessel Extraction; GUSTO, Growing Up in Singapore Toward Healthy Outcomes; HKCES, Hong Kong Children Eye Study; HPC-SNUH, Health Promotion Center of Seoul National University Hospital; HTN, hypertension; IRED, retinal imaging study in renal patients; KSH, Kangbuk Samsung Health; MACE, major adverse cardiac events; MAE, mean absolute error; MEH, Moorfields Eye Hospital; RICP, retinal imaging in chest pain; SBP/DBP, systolic/diastolic blood pressure; SEED, Singapore Epidemiology of Eye Diseases; SiDRP, Singapore Integrated Diabetic Retinopathy Program; SIVA-DLS, Singapore I vessel assessment – deep learning system; SP2, Singapore Prospective Study program; T2DM, type 2 diabetes mellitus; UoA-DR, University of Auckland Diabetic Retinopathy; VGG, Visual Geometry Group.
The rationale for new directions in CVD risk assessment
CVDs are responsible for more than 30% of all deaths globally. 29 Efforts to prevent CVD events involve early identification of people at increased risk so that appropriate interventions can be initiated. 30 To achieve this, healthcare providers have been using prediction models or calculators such as the Pooled Cohort Equations, 31 Systematic Coronary Risk Evaluation (SCORE),32,33 and Framingham 34 for CVD risk stratification. The parameters for calculation are mainly risk factors derived from the patient’s history (e.g. gender, age, smoking status, body mass index, presence of diabetes and hypertension) 35 and from the patient’s blood or urine samples. However, CVD risk estimates vary among the models,31,36,37 possibly due to differences in calibration (e.g. selection of clinical endpoint and estimation period), databases used, and discriminative ability. As a result, the models may predict different CVD outcomes.38–40 Furthermore, the assessment of predictive parameters may be affected by other factors, such as inadequate resources, high equipment costs, the invasiveness of some sampling procedures, the risk of infection during sampling, and the biohazardous nature of some sampling materials. 41 Therefore, there is a need for new and innovative ways to complement the current CVD risk assessment tools, and DL analysis of fundus images is one of them.
DL and fundus image analysis approaches for CVD
This section shows evidence of DL applications using color fundus photography in predicting or detecting CVD-related factors. The study findings show the validity of DL models, and the potential benefits that DL can offer to clinicians regarding screening and monitoring vascular-related diseases.
Predicting risk factors from fundus images
Assessment of risk factors is a crucial part of CVD risk stratification because some are used by risk calculators to predict CVD risk. 42 CVD risk factors can be modifiable or non-modifiable. Modifiable risk factors are associated with lifestyle habits such as smoking, lack of physical exercise, hypertension, diabetes, obesity, and dyslipidemia. In contrast, non-modifiable risk factors include age, gender, and familial history. The practical and effective use of risk calculators is more complex for some health workers than it may be for cardiologists, or any person directly involved in managing CVD. Generally, some obstacles may hinder the routine application of CVD risk prediction tools, such as a lack of adequate knowledge of risk estimates and the inability to demonstrate with absolute certainty that the tool accurately identifies individuals at risk, which may cause concern to healthcare providers. 43 In addition, the whole process of risk assessment using charts, tablets, and risk calculator-based applications is time-consuming for healthcare providers, even if only a few risk factors are being assessed. 44
Several studies have demonstrated that DL systems can predict CVD risk factors from fundus images. The first widely visible study was by Poplin et al. 10 The DL model in their study predicted CVD risk factors from fundus images collected from the UK Biobank and EyePACS datasets of more than 280,000 patients. The model predicted risk factors such as age and systolic blood pressure directly from the images, achieving mean absolute errors (MAEs) of 3.26 years and 11.23 mmHg, respectively. The model also achieved areas under the receiver operating characteristic curve (AUCs) of 0.97 and 0.71 for predicting gender and smoking status, respectively. When the images were correlated directly with CVD events, the model predicted the risk of MACE with an AUC of 0.70, comparable to the European SCORE risk calculator’s AUC of 0.72.
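For readers less familiar with the two metrics reported throughout this review, the following sketch shows how MAE (for continuous targets such as age) and AUC (for binary targets such as smoking status) can be computed. It is an illustrative example only: the function names and the toy values are hypothetical and do not come from any cited study. The AUC is computed here from its pairwise definition, that is, the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as half.

```python
from typing import Sequence


def mean_absolute_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average absolute difference between measured and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of positive/negative pairs ranked correctly.

    Ties count as half a correct pair (the Mann-Whitney U formulation).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not taken from any cited study.
true_age = [52.0, 61.0, 47.0, 70.0]
pred_age = [55.0, 58.0, 49.0, 66.0]
print(mean_absolute_error(true_age, pred_age))  # 3.0 (years)

smoker = [1, 0, 1, 0, 0]
smoker_score = [0.9, 0.2, 0.4, 0.6, 0.1]
print(roc_auc(smoker, smoker_score))  # 5 of 6 pairs ranked correctly
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is why values such as 0.97 for gender and 0.71 for smoking status indicate very different levels of discrimination.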
In a recent study by Zhu et al., 11 the model predicted age excellently and achieved a strong correlation between predicted retinal age and chronological age (0.81, p < 0.001), with an MAE of 3.55 years, slightly higher than that of Poplin et al. Other studies have also reported good model performance in predicting risk factors from fundus images. A model by Yun et al. 27 predicted age, sex, and HbA1c status using a DL algorithm, achieving an impressive performance within the validation set, with AUC of 0.931, 0.933, and 0.734, respectively. Gerrits et al. 12 predicted age and smoking status with an MAE of 2.78 years and an AUC of 0.96, respectively. Vaghefi et al. 13 achieved an AUC of 0.86 for smoking status. Betzler et al. 14 and Korot et al. 15 achieved an AUC of 0.94 and 0.93 for gender, respectively. A model by Rim et al. 16 predicted 10 out of the 47 biomarkers studied with an outstanding performance observed for age and sex. Similarly, a model by Zhang et al. 17 achieved AUCs of 0.880, 0.766, and 0.703 for predicting hyperglycemia, hypertension, and dyslipidemia, respectively. In addition, the model predicted other blood test erythrocyte parameters and a group of CVD risk factors (age, drinking status, smoking status, and body mass index) with AUCs > 0.70.
This evidence suggests that DL applications can quantify and predict risk factors with reasonable precision, particularly age, sex, and smoking status, which are risk factors commonly used across CVD risk assessment tools. 45 It indicates that DL applications on fundus images may offer a simple and quick way of identifying CVD risks, thereby improving risk stratification.
Estimating CVD biomarkers from fundus images
A novel question may be whether DL systems using fundus images can replace already existing diagnostic or prognostic biomarkers of CVD. Carotid intima-media thickness (CIMT) is one of the validated predictive biomarkers for the incidence of major CVD events. It is used for predicting early atherosclerosis among patients in the intermediate-risk category. 46 However, measuring CIMT is a complicated procedure that requires specialized equipment and a well-trained sonographer. Chang et al. 18 used a DL model to predict carotid artery atherosclerosis from fundus images. The model achieved an AUC of 0.713, and the model’s calculated Deep Learning Funduscopic Atherosclerosis score was predictive of an increased hazard for CVD mortality. Interestingly, their results suggest that DL and fundus images can predict atherosclerosis as well as CIMT, especially in patients that are in the preclinical or asymptomatic stage. This also offers an added benefit to CVD risk stratification, as conventional risk estimates may underestimate the risk. 47
Coronary artery calcium (CAC) score is another validated independent CVD biomarker associated with developing MACE, all-cause mortality, and cardiac mortality.48,49 The CAC score is used for diagnosing atherosclerosis. 50 It plays a crucial role in identifying asymptomatic patients at an intermediate-risk stage of CVD.51,52 It also provides additional prognostic information relative to other CVD biomarkers and helps to improve the accuracy of prediction by risk calculators.49,53 Measuring the CAC score requires a CT scan, which exposes the patient to radiation. In addition, a positive CAC scan may cause the patient psychological stress and may prompt a series of other cardiac-related tests that may not be necessary for a clinically stable individual. 53 The predictive factors for a high CAC score include age, smoking status, and cholesterol levels. The retina also contains such predictive information.
A study by Rim et al. 19 trained a DL algorithm on fundus images, first to predict the presence of CAC (using a retinal coronary artery calcium score, RetiCAC) and second to evaluate the RetiCAC score’s ability to predict cardiovascular events. The model predicted the presence of CAC with good performance (AUC of 0.742), similar to that of other predictive factors such as age and smoking status. The RetiCAC score was also predictive of CVD events, with performance comparable to that of the traditional CT scan-measured CAC score. A similar study by Son et al. 20 also reported that DL systems could predict a high CAC score from fundus images. Together, these findings suggest that a CAC score estimated from color fundus images could predict CVD events, either replacing the CT scan-measured CAC score or serving as a less invasive, radiation-free adjunct to it.
Retinal blood vessel measurements like retinal vessel caliber and arteriovenous ratio are also crucial for CVD risk assessment. However, their analysis using semiautomated systems is affected by the complexity of the quantification tools and established protocol guidelines that require substantial expertise for objective manipulation. Therefore, as reported by Cheung et al. 21 and Fukutsu et al., 22 the combined use of DL applications and fundus images may be a reliable method of obtaining vascular measurements that are predictive of CVD on a large sample of subjects. Their DL algorithms measured retinal vessel caliber and vessel alterations, respectively, and achieved comparable performance to human beings.
Predicting CVD-related diseases from fundus images
Despite the significant progress made in the therapeutic management of CVD, many patients remain asymptomatic, and the risk of mortality remains high because some patients simultaneously suffer from other associated conditions, such as chronic kidney disease (CKD), diabetes mellitus, hypertension, and anemia, which complicate disease management.
Chronic kidney disease
CKD affects approximately 8–16% of the world’s population and is associated with an increased risk of CVD and death. 54 In addition to the significant burden of stroke, myocardial infarction and congestive heart failure, CVDs are responsible for more than 50% of all deaths in patients with CKD. 55 Early detection or screening relies on evaluating a wide range of tests. Estimated glomerular filtration rate (eGFR) from serum or urine samples is the most widely used diagnostic test, yet not necessarily feasible in other settings. 23 Many patients with CKD tend to have diabetes and hypertension, which are considered high risks for CVD mortality. 24
As part of eyecare, screening for retinopathy is necessary for all patients with hypertension and diabetes. Different studies have reported the association of structural changes in the retinal microvasculature or retinopathy changes with the presence and progression of CKD.56–62 Likewise, albumin to creatinine ratio, blood uric acid, blood creatinine, blood albumin, and eGFR have been shown to have a significant correlation with the progression of retinopathy. 63 A meta-analysis study assessing the association of CKD and retinal occlusive diseases reported a higher prevalence of CKD in patients diagnosed with retinal vein occlusion. 64 This suggests that a fundus assessment may facilitate early detection of CKD in high-risk populations.
A few studies using DL algorithms and fundus images have explored the relationships between retinal characteristics and renal function. The first reported study of CKD detection using DL and fundus images was by Sabanayagam et al. 23 The model predicted CKD with good performance in all datasets (AUC more than 0.80), including in subgroups of patients with diabetes and hypertension. The researchers also reported no difference in model performance between using fundus images alone and using fundus images together with clinical metadata. This finding suggests that screening for CKD using fundus images alone is possible. In another study, by Zhang et al., 24 a DL model using fundus images only detected CKD with good performance (AUC of 0.861). However, the model performed better when tested using a combination of patient metadata and retinal images (AUC of 0.930). Kang et al. 25 used DL to detect early renal functional impairment, achieving an AUC > 0.81 in an overall study population of diabetic patients. The model performed better in those with elevated serum HbA1c, partly because of retinal microvascular damage secondary to diabetes. Furthermore, reduced eGFR is associated with changes in the retinal vasculature. 56 Zhang et al. 24 tested whether eGFR can be estimated using a DL-based analysis of fundus images alone. There was a strong linear correlation between the DL-predicted glomerular filtration rate and measured eGFR, suggesting that AI models can identify subtle information about renal function contained in fundus images.
Hypertension
Hypertension is yet another complex condition associated with multiple CVD events. As noted above, DL approaches have shown good performance in predicting hypertension. Although the model of Dai et al. 26 predicted hypertension with only modest performance (AUC of 0.606 and precision of 58.97%), the authors showed that microvascular morphological changes at the branching sites of blood vessels are key regions that DL algorithms may use to detect hypertension, possibly because the effects of hypertension detectable by AI are concentrated there.
Hypertension manifests on the retina as hypertensive retinopathy and choroidopathy. Independent of other traditional risk factors, hypertensive retinopathy reflects vascular changes happening throughout the body. 65 So far, no DL algorithms that detect or predict hypertensive retinopathy have been reported in the literature. Nevertheless, creating such algorithms may help in the early detection and intervention of hypertension, especially in asymptomatic patients, thereby preventing any associated CVD-related mortality.
Diabetes mellitus
CVD is also a common cause of death in diabetic patients. 66 The retina’s manifestations are progressive microvascular and retinal tissue damage, defined as diabetic retinopathy. Fundus photography is the most commonly used imaging modality in detecting diabetic retinopathy. 67 Fundus image analysis not only helps to detect retinopathy but can also be used to detect predictive features or biomarkers associated with diabetes, such as age, body mass index, and hypertension. 16 Different studies have reported the association of diabetic retinopathy with a significant risk of developing conditions such as stroke, coronary heart disease, myocardial infarction, and the risk of CVD-related mortality.68–74 These findings show the importance of early identification and management of diabetic patients with diabetic retinopathy.
Diabetic retinopathy is the number one condition widely explored or studied using DL systems in ophthalmology. Several studies have reported good algorithm performance (AUCs > 0.90) and comparably higher specificity and sensitivity in detecting diabetic retinopathy.75–81 Based on reported accuracies of DL-related fundus image analysis, one can conclude that the creation of more detection algorithms for diabetes would be a significant milestone in managing CVD-related conditions. One of the challenges with this concept is that the onset of diabetic retinopathy differs between the two types of diabetes. Type 2 diabetics may show signs 10 years after the onset of diabetes. 82 Therefore, developing models that focus on diabetic retinopathy lesions alone may be limited in detecting many patients with type 2 diabetes. It is also essential to include other risk factors that must be considered to improve detection accuracy.
In Yun et al., 27 the DL model performed modestly well in predicting type 2 diabetes using fundus images only (AUC of 0.731), and performance improved to an AUC of 0.810 when non-invasive traditional risk factors were added. Zhang et al. 24 reported AUCs > 0.828 for detecting type 2 diabetes and AUCs > 0.80 for predicting the development of type 2 diabetes in healthy individuals after 5 years.
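One simple way to combine an image-based prediction with traditional risk factors is late fusion, in which the image model’s output probability is converted to a logit and added to a weighted sum of the other factors. The sketch below illustrates this idea only; it is not the method used by the cited studies, and the function name `fused_probability`, the feature names, and all weights are hypothetical values chosen for illustration rather than fitted to data. In practice, the weights would be estimated on a training set, for example by logistic regression.

```python
import math
from typing import Mapping


def sigmoid(z: float) -> float:
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def logit(p: float) -> float:
    """Inverse of the sigmoid; requires 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("probability must lie strictly between 0 and 1")
    return math.log(p / (1.0 - p))


def fused_probability(
    image_prob: float,
    risk_factors: Mapping[str, float],
    weights: Mapping[str, float],
    image_weight: float = 1.0,
    intercept: float = 0.0,
) -> float:
    """Late fusion: add weighted risk factors to the image model's logit."""
    z = intercept + image_weight * logit(image_prob)
    z += sum(weights[name] * value for name, value in risk_factors.items())
    return sigmoid(z)


# Hypothetical inputs: an image-only probability of 0.6 and two
# non-invasive risk factors. The weights are illustrative, not fitted.
image_only = 0.6
factors = {"age_decades_over_40": 2.0, "bmi_units_over_25": 1.0}
illustrative_weights = {"age_decades_over_40": 0.3, "bmi_units_over_25": 0.2}

combined = fused_probability(
    image_only, factors, illustrative_weights, image_weight=1.0, intercept=-0.5
)
print(round(combined, 3))
```

With no additional risk factors, a unit image weight, and a zero intercept, the function returns the image-only probability unchanged, which makes the contribution of each added factor easy to inspect.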
Anemia
Anemia is a common complication of diseases like CKD 83 and type 2 diabetes. 84 In recent years, anemia has increasingly been considered an independent and significant risk factor for CVD. 85 In addition, several studies have indicated an increased risk of mortality and morbidity from heart failure and acute coronary syndrome in patients with anemia.86–88 The most reliable indicator of anemia is hemoglobin (Hb) concentration. Hb plays a crucial role in CVD development, especially in patients with CKD. Horwich et al. 89 observed that Hb was a prognostic biomarker, and its lower levels predicted increased mortality. Hb is mainly measured using blood samples, a procedure considered invasive. It requires a specialized laboratory infrastructure, sophisticated equipment (hematology analyzer, biochemical reagents), and expert technicians to perform phlebotomy for efficient analysis. 90 Studies have also shown that anemia detection is possible through simple observation of the eye’s conjunctiva color changes.91,92
Recently, other studies have explored estimating Hb levels through automated means, using digital photographs of the conjunctiva captured by a smartphone which has shown positive prospects of making anemia detection from the eyes easier.93–95 It is postulated that anemia contributes to the development of diabetic retinopathy in patients with diabetes,96,97 and that anemia manifests on the retina in the form of Roth spots, hard exudates, cotton wool spots, hemorrhages, and optic disc pallor. 98 This suggests that predicting or detecting anemia from the retina is possible, more so using DL applications. Mitani et al. 28 reported that a DL algorithm trained using fundus images could estimate Hb levels. The model achieved an AUC of 0.88 for detecting anemia. In a similar study, Zhao et al. 99 used ultrawide field fundus images to train a DL model that predicted Hb concentration and screened anemia with a good performance, MAE of 0.83 g/dl and AUC of 0.93, respectively.
Implications for clinical practice and integration
It is undoubtedly true that CVDs pose a significant public health burden. Furthermore, some of the procedures currently being used to assess CVD risk are invasive, expensive, and not feasible for underdeveloped health facilities. As a result, many people often remain undetected, which may increase CVD-related deaths. As the population is aging and the prevalence of diabetes and other CVD-related conditions increases, the need for more accessible and user-friendly risk assessment tools is more urgent. 44 DL algorithms may offer a quick, less invasive, and cost-effective way for early detection of CVD-related conditions. Ophthalmology is one of the ideal areas for integrating DL systems to predict CVD; as it has been observed, many people are concerned about eye or visual health, such that they would visit eyecare providers for regular eye checkups or participate in vision screening more than they would for CVD screening.38,100 Retinal evaluation results shared by eye care providers with other health workers may help identify individuals requiring comprehensive checkups and interventions to prevent complications.
Integrating DL into clinical practice may involve various modalities such as teleophthalmology, office-based use, and in-built DL-based fundus camera settings. Teleophthalmology is particularly ideal for health systems with telemedicine/retina platforms already in existence such that an algorithm can easily be incorporated into the platform to analyze fundus images and predict CVD. 101 Office-based modalities may involve the use of tablets, laptops, or desktop computers that have an installed application to analyze fundus images. However, this modality may be a little involving, requiring the transfer of fundus images from one device to another. Alternatively, cloud storage can improve AI systems’ data transfer, storage, and adaptability. This involves automatically transferring electronic data to a centralized data storage center. For instance, the Japan Ocular imaging registry, created in 2017 by the Japanese Ophthalmological Society, stores medical records from more than 20 collaborating institutions, promoting AI development and clinical research by utilizing vast amounts of data. 102 Another strategy is having DL algorithms built into portable devices and handheld smartphone-based fundus cameras.103,104
Other than being used in hospital settings, DL algorithms can also enhance our understanding of the association of systemic diseases with ocular diseases through clinical research without the need for invasive and high-cost screening activities. DL systems can be applied retrospectively to clinical data contained in datasets to study or analyze disease associations.
Challenges to clinical integration
As DL is fast gaining interest and technology advances in the field of medical imaging, one of the imminent challenges to its full clinical adoption and integration is the medical, legal and regulatory approvals aspect especially because there are variations in regulations across the world. These ethical and regulatory concerns generally dwell on privacy and fairness of the data, transparency, accountability, and product liability.105–107 An algorithm’s performance largely depends on the size and quality of the data that comes from people, raising concerns about data protection and privacy. Different institutions through ethics committees require that researchers obtain the informed consent of the individuals before data collection. However, there are no adequate regulations in many countries on how data should be used or shared among researchers beyond the institutional review board’s recommendation. Such data must have data sourcing, protection, and privacy requirements or regulations that determine authorized data uses. 106 Medical data is sensitive, and bias could affect its intended purpose when used for any algorithm development. Likewise, a lack of data or restrictions over the use of personal health data may also affect algorithm development.
AI is not just about big data management; the safety risks of using algorithms to manage diseases are also a concern. For instance, the commercialization of AI products may result in built-in biases to suit the interests of the developers, thereby indirectly putting the population that will use the applications at potential risk of harm. In healthcare, the safety and transparency of any AI application is necessary. A lack of understanding of how the algorithm works or derives its conclusions or predictions is also a significant concern to patients and clinicians because of the danger of absolute total dependence on AI models, which may lead to fatal outcomes if other pertinent clinical findings are not available and are not considered. Therefore, proper information sharing, training for healthcare workers and other educational approaches to create awareness and understanding of DL systems’ abilities and limitations can help to mitigate this problem. In addition, the question of malicious liability and product liability for DL applications must be addressed before full clinical adoption; it must be clear who takes responsibility when the applications get things wrong, that is, when misclassifying and misdiagnosing cases during automated CVD risk assessment, will the user (physician) be held responsible for such failures or will the failure be treated as a product liability (i.e. a technical fault/malfunctioning device), a manufacturer’s responsibility? Full clinical integration may be easier once proper standards to manage or mitigate such challenges are developed and effected.
Limitations and future directions
In this review, we have examined recent work showing the potential of DL applications on fundus images for CVD risk assessment. However, the evidence should be interpreted with several limitations in mind. The first limitation is the complex nature of CVDs. CVDs are associated with numerous risk factors and conditions that affect different groups of people, such as those with diabetes, CKD, anemia, hypertension, and dyslipidemia. Each of these conditions has its own risk factors, even though in some cases the conditions occur simultaneously. Although ophthalmoscopic markers in the retina can provide new insights into systemic diseases and may prove helpful, the current state of evidence is probably not sufficient to define specific risk profiles for CVD outside of the entire clinical picture; this means that identifying an ideal setting and population in which to apply the DL algorithms may be challenging. Furthermore, it may be argued that patients with more complicated presentations may be difficult to evaluate using DL algorithms. Therefore, using fundus images alone may not be sufficient to make definitive diagnoses or high-stakes decisions to manage CVD. Consistent with this observation, integrating clinical metadata in the analysis produced better algorithm performance in some of the above-cited studies, which suggests that other CVD-related examination results or factors are still an essential consideration in the final decision-making at this stage of AI development. However, the studies did not fully cover the effect of all the associated risk factors. Future studies should extensively evaluate the effect of other risk factors on the performance of DL algorithms in addition to age, sex, diabetes, and hypertension. Studies should also explore the effect of using a multimodal approach that combines fundus images with reports from other imaging modalities to predict or detect CVD-related conditions. To the best of our knowledge, only one study has reported excellent algorithm performance in CVD detection after integrating fundus images and dual-energy X-ray absorptiometry scans. 108
Second, the studies cited above showed good predictive performance for CVD risk factors, biomarkers, and other CVD-related conditions. Nevertheless, the studies did not adequately evaluate the added diagnostic value of the DL models by directly comparing the algorithms with existing CVD risk assessment tools or stratification methods; this may affect their perceived applicability and their acceptance by patients, regulatory agencies, and other healthcare providers involved in managing CVD. Future studies should also evaluate whether the algorithms are superior to, or at least on par with, the widely accepted CVD risk estimators, to help resolve doubts about the ability of DL systems to reliably predict or detect systemic diseases from fundus images.
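As an illustration of the head-to-head comparison suggested above, the following minimal sketch computes the area under the ROC curve (AUC) for an established risk estimator and for a DL-derived risk score on the same patients. All values are hypothetical toy data, and the names `baseline_risk` and `dl_risk` are assumptions introduced here; they do not come from any of the reviewed studies. A real evaluation would also need confidence intervals, calibration analysis, and a formal test of the difference.

```python
def auc(scores, labels):
    """AUC as the probability that a randomly chosen case with an event
    receives a higher score than a randomly chosen case without one
    (ties count as one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy cohort: 1 = cardiovascular event observed, 0 = no event.
labels = [0, 0, 0, 1, 0, 1, 1, 1]
# Hypothetical scores from an established risk estimator (assumption).
baseline_risk = [0.1, 0.2, 0.6, 0.3, 0.4, 0.7, 0.5, 0.9]
# Hypothetical scores from a fundus-image DL model (assumption).
dl_risk = [0.1, 0.2, 0.65, 0.6, 0.4, 0.7, 0.8, 0.9]

auc_baseline = auc(baseline_risk, labels)
auc_dl = auc(dl_risk, labels)
print(f"baseline AUC = {auc_baseline:.4f}")
print(f"DL model AUC = {auc_dl:.4f}")
print(f"difference   = {auc_dl - auc_baseline:.4f}")
```

Evaluating both scores on the same patients is what makes the difference interpretable as added value; AUCs taken from separate studies with different cohorts cannot be compared in this way.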
Third, the results lack interpretability. Thus far, the exact predictive mechanism of DL algorithms is poorly understood. Making this knowledge available to end users is essential for clinical integration, because an algorithm’s performance alone is insufficient to justify high-risk decisions about a patient; the recommended outcomes must also be convincing. In addition, comparing the effectiveness of one algorithm with another is difficult because the datasets used in the studies differ. During algorithm development and validation, there is always a risk of using biased datasets. For instance, patient characteristics (such as age, gender, and ethnicity) may not be evenly distributed. Image quality may also vary, either because the images were collected using multiple cameras with different settings or because of patient-related factors such as ocular media opacities, high refractive errors, or poor mydriasis; such variation may lead the algorithm to report false-positive and false-negative results, thereby affecting its accuracy. Researchers should therefore avoid poor data sources, or thoroughly evaluate the potential negative impact of their data before using it, to avoid errors in their results. More importantly, clinical trials in real clinical settings are needed to validate the accuracy of the algorithms. 109
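One simple way to check for the dataset-related effects described above is to report false-positive and false-negative rates separately for each subgroup (for example, by camera type or demographic group) rather than as a single pooled figure. The sketch below assumes hypothetical records of the form (subgroup, true label, predicted label); the subgroup names and values are illustrative only and are not drawn from any reviewed study.

```python
from collections import defaultdict


def subgroup_error_rates(records):
    """Return the sample size, false-positive rate and false-negative rate
    for each subgroup. Each record is (subgroup, true_label, predicted_label)
    with labels coded 1 = disease present, 0 = disease absent."""
    counts = defaultdict(lambda: {"fp": 0, "fn": 0, "pos": 0, "neg": 0})
    for group, truth, pred in records:
        c = counts[group]
        if truth == 1:
            c["pos"] += 1
            if pred == 0:
                c["fn"] += 1
        else:
            c["neg"] += 1
            if pred == 1:
                c["fp"] += 1
    rates = {}
    for group, c in counts.items():
        rates[group] = {
            "n": c["pos"] + c["neg"],
            # A rate is undefined (None) when the subgroup has no cases
            # of the relevant class.
            "fpr": c["fp"] / c["neg"] if c["neg"] else None,
            "fnr": c["fn"] / c["pos"] if c["pos"] else None,
        }
    return rates


# Hypothetical toy predictions grouped by acquisition camera (assumption).
records = [
    ("camera_A", 1, 1), ("camera_A", 1, 1), ("camera_A", 1, 1),
    ("camera_A", 0, 0), ("camera_A", 0, 0), ("camera_A", 0, 1),
    ("camera_B", 1, 0), ("camera_B", 1, 1),
    ("camera_B", 0, 1), ("camera_B", 0, 0),
]

rates = subgroup_error_rates(records)
for group, r in sorted(rates.items()):
    print(group, r)
```

Large gaps between subgroups, or very small subgroup sizes, would suggest that pooled accuracy figures may not generalize to the under-represented group.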
Finally, the review does not cover other machine learning (ML) approaches currently available in the literature. ML techniques have long been used in the diagnosis or assessment of diseases. However, to the best of our knowledge, our literature search did not find any studies on CVD and fundus image analysis that fit the scope of this review and used ML approaches other than DL. It is the authors’ opinion that this could be attributed to DL’s better performance in medical image analysis compared to other ML types. Traditionally, ML techniques require data with discriminative features that are manually and accurately annotated or designed by physicians with adequate expertise, and useful features are identified only after thorough training, validation, and testing of the model. This is an extremely time-consuming task that involves a repetitive cycle of identifying and developing new features, rebuilding the model, and measuring results until a satisfactory outcome is achieved. 110 Traditional ML models are also challenged by high-dimensional datasets and large amounts of data, which affects their accuracy.110–112 This implies that the complexity of retinal images (and other medical images), in which the most predictive features are not obvious, together with the need for large amounts of annotated data, may hinder the wide use of traditional ML techniques in the field of oculomics. On the other hand, DL methods, as mentioned in the introduction, can be presented with raw data and extract the required features without human input; unlike traditional ML algorithms, they do not need manually engineered or structured features to learn from. They can identify subtle features and changes in fundus or other medical images that may not be easily recognizable by other ML types. 111 DL algorithms also scale well to large amounts of data.111–113 Thus, DL algorithms can improve their accuracy as they are trained on more data, making them highly effective for medical image analysis tasks. These and other reasons may explain why more studies focus on DL techniques than on other ML techniques. We also acknowledge that, although the relevant literature was reviewed in detail, some of the most recent DL studies or approaches may not have been fully covered. We have also treated CVD as a single entity, but there are many different CVDs, each of which might require a different data-driven analytical strategy.
Conclusion
The application of DL to fundus images is a promising technological advancement with the potential to revolutionize the delivery of healthcare services. The studies reviewed show that DL systems may enhance the current risk stratification methods for CVD and other systemic diseases, especially in resource-constrained settings. More research is needed to develop algorithms with higher accuracy and predictive ability using larger datasets containing sufficient data about cardiovascular events. With adequate clinical validation, DL systems applied to fundus images may replace some of the already established markers or, better, act as an adjunct assessment tool for systemic disease detection. Changing clinical practice to adopt DL analysis of fundus images as an equally good alternative to more expensive and invasive procedures may require prospective clinical trials to address the possible ethical challenges and medicolegal implications.
