Abstract
Background
This study applies machine learning algorithms to large-scale medical data to identify an optimal gene signature for pulmonary arterial hypertension (PAH).
Methods
Datasets comprising 32 normal controls and 37 PAH samples were obtained from the public Gene Expression Omnibus (GEO) database. Differentially expressed genes were selected and then subjected to enrichment analysis. Two machine learning methods, the least absolute shrinkage and selection operator (LASSO) and support vector machine (SVM), were used to identify candidate genes. An external validation dataset was used to further test the expression levels and diagnostic value of the candidate diagnostic genes. Diagnostic performance was evaluated using receiver operating characteristic (ROC) curves. The deconvolution tool CIBERSORT was used to estimate the composition of immune cell subtypes and to perform correlation analysis based on the combined training dataset.
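The way LASSO narrows a gene list can be illustrated with a minimal sketch. It assumes a squared-error loss solved by cyclic coordinate descent, which is a simplification chosen for brevity and not necessarily the formulation or software used in this study. The function names, the penalty value, and the toy data are hypothetical.

```python
def soft_threshold(value, penalty):
    """Shrink value toward zero; anything within [-penalty, penalty] becomes 0."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso_coordinate_descent(X, y, penalty, n_sweeps=100):
    """Minimise (1/2n)*||y - X b||^2 + penalty*||b||_1 by cyclic coordinate descent.

    X is a list of samples (rows) of centred feature values (e.g. gene expression);
    y is a list of centred responses. Both are assumed to be pre-centred.
    """
    n, p = len(X), len(X[0])
    coef = [0.0] * p
    residual = list(y)  # residual = y - X b, and b starts at zero
    for _ in range(n_sweeps):
        for j in range(p):
            column = [row[j] for row in X]
            scale = sum(v * v for v in column) / n
            if scale == 0.0:
                continue  # a constant (all-zero after centring) feature carries no signal
            # Covariance of feature j with the residual computed without feature j.
            rho = sum(v * (r + v * coef[j]) for v, r in zip(column, residual)) / n
            new_coef = soft_threshold(rho, penalty) / scale
            delta = new_coef - coef[j]
            if delta != 0.0:
                residual = [r - v * delta for v, r in zip(column, residual)]
                coef[j] = new_coef
    return coef


# Hypothetical toy data (not from the study): 4 samples x 3 centred features.
X = [[1, 1, 1], [-1, 1, -1], [1, -1, -1], [-1, -1, 1]]
y = [0.5, -0.4, 0.4, -0.5]

coef = lasso_coordinate_descent(X, y, penalty=0.1)
selected = [j for j, b in enumerate(coef) if b != 0.0]  # features that survive the penalty
print(selected)
```

Features whose coefficients are shrunk exactly to zero drop out of the signature, which is why the L1 penalty doubles as a feature selector. In practice the penalty strength is usually tuned by cross-validation.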
Results
A total of 564 differentially expressed genes (DEGs) were identified between normal control and PAH samples. Enrichment analysis showed that these DEGs were closely related to cardiovascular diseases, inflammatory diseases, and immune-related pathways. Using 5-fold cross-validation, the LASSO and SVM algorithms identified 9 and 7 characteristic genes, respectively. Both algorithms selected Caldesmon 1 (CALD1) and Solute Carrier Family 7 Member 11 (SLC7A11) as genes highly correlated with PAH. The area under the ROC curve (AUC) was 0.924 for CALD1 and 0.962 for SLC7A11, indicating that both genes have high diagnostic value.
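For readers less familiar with the AUC values reported above, the following minimal sketch computes the AUC of a single gene used as a classifier score. It relies on the standard rank interpretation: the AUC is the probability that a randomly chosen disease sample has a higher value than a randomly chosen control, with ties counted as one half. The function name and the expression values are hypothetical and are not data from this study.

```python
def roc_auc(scores, labels):
    """Return the AUC: P(score of a positive > score of a negative), ties count 0.5."""
    positives = [s for s, lab in zip(scores, labels) if lab == 1]
    negatives = [s for s, lab in zip(scores, labels) if lab == 0]
    if not positives or not negatives:
        raise ValueError("need at least one sample of each class")
    wins = 0.0
    for pos in positives:
        for neg in negatives:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical expression values for one gene (1 = PAH, 0 = control).
expression = [2.1, 3.4, 3.6, 4.0, 3.9, 2.5]
is_pah = [0, 1, 0, 1, 1, 0]

auc = roc_auc(expression, is_pah)
print(round(auc, 3))
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation. For a gene that is downregulated in disease, the score would be negated before computing the AUC.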
Conclusion
CALD1 and SLC7A11 can serve as diagnostic markers of PAH and provide new insights for further study of the immune mechanisms involved in PAH.
