Abstract
Identifying associations among diseases is essential for advancing our understanding of disease mechanisms, enhancing diagnostics, facilitating drug repurposing, and guiding new therapeutic development. Despite substantial progress in decoding disease biology, the molecular underpinnings, therapeutic targets, and phenotypic traits of many diseases remain poorly understood. Previous studies have typically relied on either a single similarity metric or a weighted combination of multiple metrics, approaches that often lack objectivity and standardization. In this work, we systematically evaluate and compare state-of-the-art similarity metrics across three distinct categories (semantic, functional, and network-based) to identify the most effective representative from each. Our analysis identifies SemSim as the best-performing semantic metric, FunSim as the best-performing functional metric, and NetSim as the best-performing network-based metric. Leveraging these findings, we propose DiGeS-FN (Disease-Gene associations using Semantic, Functional, and Network metrics), an integrated framework for comprehensive disease similarity assessment. Experimental results demonstrate that DiGeS-FN achieves an AUC of 0.81, with a high true positive rate and a low false positive rate. The framework recovers well-established disease associations, including atherosclerosis–myocardial infarction, asthma–bronchitis, and asthma–chronic obstructive airway disease, which supports its reliability. Notably, it also uncovers a novel association between polycystic ovary syndrome and endometriosis, supported by shared gene ontologies and pathways. These findings demonstrate the dual potential of DiGeS-FN to both validate known disease relationships and uncover novel genetically associated disease pairs.
