Abstract
Objective:
Hepatocellular carcinoma (HCC) is the most common primary liver cancer and is mainly caused by hepatitis virus infection. Early-stage diagnosis remains challenging because the disease is often asymptomatic, so there is an urgent need for effective biomarkers. This study aimed to identify effective diagnostic biomarkers or therapeutic targets for HCC.
Method:
Label-free quantitative mass spectrometry was performed to analyze protein expression in HCC and control tissues. Protein-protein interaction (PPI) analysis was performed using the STRING database, and hub proteins were identified with Cytohubba. Survival analysis and expression profiling of the hub proteins were performed using GEPIA. Functional and pathway enrichment analyses were carried out using Gene Ontology (GO) and the Kyoto Encyclopedia of Genes and Genomes (KEGG).
Results:
A total of 1539 proteins were identified, of which 116 were differentially expressed proteins (DEPs). PPI network analysis revealed 10 hub proteins: EGFR, GAPDH, HSP90AA1, MMP9, PTPRC, CD44, ANXA5, PECAM1, MMP2, and CDK1. Among these, GAPDH, MMP9, ANXA5, HSP90AA1, and CDK1 were significantly associated with a low survival rate (
Conclusion:
The hub proteins GAPDH, HSP90AA1, MMP9, ANXA5, and CDK1 demonstrated significant prognostic potential and could serve as promising theragnostic biomarkers for HCC.
Keywords
Introduction
Cancer is caused by the uncontrolled growth of abnormal cells that can spread through the blood and lymphatic systems, potentially damaging normal tissues. It is the second leading cause of mortality worldwide. 1 Liver cancer is the fifth leading cause of cancer-related death in men and the seventh in women worldwide. 2 In 2020, Asia accounted for 72.5% of the world’s cancer cases. 3 Hepatocellular carcinoma (HCC) accounts for the majority (up to 75%) of primary liver cancer cases, while most of the remaining cases are cholangiocarcinoma. HCC commonly develops from liver cirrhosis; other risk factors include hepatitis B virus (HBV) and hepatitis C virus (HCV) infection, nonalcoholic fatty liver disease, diabetes mellitus, obesity, genetic diseases such as hemochromatosis and acute intermittent porphyria, and ingestion of hepatotoxins such as aflatoxin B1. 4 Pakistan has a particularly high number of HCC-related deaths due to the endemic prevalence of hepatitis. 5
Early diagnosis is critical for effective treatment of HCC. Abdominal ultrasonography via deep learning 6 and serum alpha-fetoprotein (AFP) are commonly used to detect early tumors, but their sensitivity is limited. With the advancement of molecular diagnostic technologies, circulating biomarkers including cell-free DNA and RNA have emerged as promising alternatives. 7 Several protein biomarkers such as glypican 3 (GPC3), osteopontin (OPN), Golgi protein-73 (GP73), and Dickkopf-1 (DKK1) have been used in clinical practice, but their sensitivity and specificity remain debatable and require further validation. 8 Therefore, comprehensive studies are needed to identify more effective biomarkers for HCC diagnosis or treatment.
Mass spectrometry has become a robust technique for identifying potential biomarkers for various cancers. Both labeling-based and label-free mass spectrometry protein profiling facilitate protein identification and quantification in complex biological samples. 9 Although labeling methods can provide reproducible and accurate results, they are costly and time-consuming, whereas label-free methods provide a cost-effective and convenient alternative for protein quantification and are widely used in proteomic studies. 10 Advancements in mass spectrometry have expanded global proteomic datasets and enabled the discovery of effective prognostic markers and therapeutic targets for various cancers. 11
In this study, we used label-free quantitative mass spectrometry (LFQ-MS) to perform proteomic profiling of HCC samples, identifying differentially expressed proteins that may provide new insight into the diagnosis or treatment of HCC.
Materials and Method
Study Design
This is a case-control study.
Sample Collection
Approval for this study was obtained from the Institutional Bioethics Committee, University of Karachi (IBC-UoK # 352-2023). The study was carried out on patients diagnosed with HCC through histopathological examination and imaging techniques. According to the inclusion criteria, only patients aged 40 years or older were included. Patients diagnosed with any other cancer along with HCC and liver cirrhosis were excluded from this study. Selected patients who gave consent were interviewed for demographic data (Supplemental Table 2), disease history, and treatment. HCC tissues along with adjacent non-tumor tissues (
Label Free Quantitative Mass Spectrometry
Sample Preparation
For proteomic analysis, tissues were lysed using 6 M guanidinium chloride (GuHCl), 100 mM Tris (pH 8.5) containing 1 mg/mL chloroacetonitrile (CAA) and 1.5 mg/mL tris(2-carboxyethyl)phosphine (TCEP). Samples were heated at 95°C for 5 min. Double digestion was performed, first with lysyl endopeptidase (Wako, Japan) at a ratio of 50:200; samples were incubated for 2 hours on a thermomixer at 37°C. Samples were then incubated with 1 µg of MS-grade trypsin (Thermo Scientific, USA) at 37°C overnight. The next day, stage tips were prepared using a C18 membrane (Empore, USA). 12 The membrane was activated with 20 µL of methanol and the stage tips were centrifuged at 500 g for 5 min. Stage tips were washed with 20 µL of 0.1% trifluoroacetic acid (TFA) and centrifuged at 500 g for 2 min. Digestion was stopped by adding 5 µL of 10% TFA to the samples, which were then centrifuged at 17,000 rpm for 5 min. The clear supernatant, approximately 300 µL, was applied to the stage tips and centrifuged at 500 g for 5 min for binding. Stage tips were then washed with 20 µL of 0.1% TFA for 2 min at 300 g. The samples were eluted from the stage tips using 20 µL of 50% acetonitrile (ACN) and 0.05% TFA solution by centrifugation at 300 g for 2 min. The eluted peptides were vacuum dried in a Vacufuge and reconstituted in 15 µL of 0.1% TFA. The peptide concentration was measured using a NanoDrop Spectrophotometer.
Mass Spectrometry Analysis (MS)
Peptides were analyzed on an Orbitrap Fusion Lumos Tribrid mass spectrometer connected to an Ultra3000 chromatography system (nano-LC) (Thermo Scientific, Germany) with an autosampler for label-free quantitation (LFQ). Tryptic peptides (1 µg) were loaded on a homemade column (250 mm length, 75 μm inside diameter) packed with 1.8 μm uChrom (nanoLCMS Solutions) and separated using a 70-min reverse-phase gradient of increasing acetonitrile (from 3% to 40%) at a flow rate of 400 nL/min. The mass spectrometer was operated in positive ion mode with a capillary temperature of 220 °C and a potential of 2000 V applied to the column. The sample sequence was first randomized and then imported into Xcalibur software. The MS data were acquired using a data-independent acquisition (DIA) method with an MS resolution of 240 k, a 1 s cycle time, and MS/MS HCD fragmentation in the ion trap. Mass spectra were analyzed using the Xcalibur software. All samples were analyzed in replicates.
MS Data Analysis
Each raw file with LFQ values was analyzed separately and searched against the UniProt database (Homo sapiens proteins). Carbamidomethyl (C) was set as a fixed modification, while oxidation (M) and acetylation of protein N-termini were set as variable modifications. Other parameters were a false discovery rate (FDR) of 0.01 and a mass accuracy of 4.5 ppm (peptide/protein). For statistical analysis, quintuplicates of each normal and cancer tissue group were used, and statistical significance was determined using the limma R package.
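As a rough illustration of the group comparison described above, the following minimal Python sketch computes a log2 fold change and a Welch t statistic for one protein measured in five replicates per group. It is not the limma procedure used in the study (limma fits linear models with moderated statistics); the function names and the intensity values are hypothetical and serve only to show the arithmetic.

```python
import math
import statistics


def log2_fold_change(tumor, control):
    """log2 of the ratio of mean tumor intensity to mean control intensity."""
    return math.log2(statistics.mean(tumor) / statistics.mean(control))


def welch_t(tumor, control):
    """Welch's t statistic on log2-transformed intensities.

    This is a plain two-sample statistic, not limma's moderated t.
    """
    a = [math.log2(x) for x in tumor]
    b = [math.log2(x) for x in control]
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / se


# Hypothetical LFQ intensities for a single protein (five replicates per group).
tumor = [8.0e6, 9.0e6, 7.5e6, 8.5e6, 9.5e6]
control = [4.0e6, 4.5e6, 3.8e6, 4.2e6, 4.6e6]

lfc = log2_fold_change(tumor, control)
t_stat = welch_t(tumor, control)
print(f"log2FC = {lfc:.2f}, Welch t = {t_stat:.2f}")
```

A positive log2 fold change indicates higher abundance in the tumor group; in practice a p-value and multiple-testing correction would follow, which this sketch leaves out.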
Protein-protein Interaction (PPI) Analysis
Based on the candidate proteins obtained from MS analysis, an interaction network was generated by importing the candidate proteins into the Search Tool for the Retrieval of Interacting Genes (STRING, http://string-db.org/) database with the highest confidence score (0.9). 13 Nodes (proteins) with no edges (interactions) were excluded, and the resulting protein interaction network was imported into Cytoscape (version 3.10.1) to evaluate the topological features of the network. The Cytohubba plugin was applied to rank and screen the hub genes within the PPI network that may be involved in HCC.
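The filtering and ranking steps can be sketched in a few lines of Python. This is a simplified stand-in for the STRING and Cytoscape workflow, not a reproduction of it: the edge list, its scores, and the function names are hypothetical, and only the 0.9 confidence cutoff and the removal of nodes without edges are taken from the text.

```python
from collections import defaultdict


def build_network(edges, min_score=0.9):
    """Keep edges at or above the confidence cutoff.

    Proteins left without any retained edge never enter the adjacency map,
    which mirrors the exclusion of isolated nodes.
    """
    adj = defaultdict(set)
    for a, b, score in edges:
        if score >= min_score and a != b:
            adj[a].add(b)
            adj[b].add(a)
    return dict(adj)


def top_by_degree(adj, n=10):
    """Rank proteins by number of interaction partners; ties are broken by name."""
    return sorted(adj, key=lambda p: (-len(adj[p]), p))[:n]


# Hypothetical edge list: (protein A, protein B, combined score). Not study data.
edges = [
    ("EGFR", "CD44", 0.95),
    ("EGFR", "MMP9", 0.93),
    ("EGFR", "HSP90AA1", 0.99),
    ("MMP9", "MMP2", 0.97),
    ("MMP9", "CD44", 0.92),
    ("GAPDH", "ANXA5", 0.60),  # below the cutoff, so both proteins are dropped
]

network = build_network(edges)
hubs = top_by_degree(network, n=3)
print(hubs)
```

Degree is only one of several topological scores a hub-ranking plugin can report, so a ranking from this sketch need not match one produced with a different metric.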
Validation of Hub Genes
The possible prognostic association of hub genes with HCC was analyzed using Gene Expression Profiling Interactive Analysis (GEPIA, http://gepia.cancer-pku.cn/). 14 It is an open-source cancer database that analyzes differential expression between cancer and normal tissues from The Cancer Genome Atlas (TCGA) and the Genotype-Tissue Expression (GTEx) portal. All hub genes were analyzed using cutoff criteria of log2FC=1 and
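To illustrate the kind of survival comparison that such a tool reports, the following minimal Python sketch splits samples into high and low expression groups and computes a Kaplan-Meier survival estimate for one group. It assumes a median expression split, which may differ from the grouping used in the study, and it does not include a significance test such as the log-rank test. All values and function names are hypothetical.

```python
import statistics


def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each time where a death occurred.

    `events` holds 1 for an observed death and 0 for a censored sample.
    """
    curve, surv = [], 1.0
    for t in sorted(set(times)):
        at_risk = sum(1 for x in times if x >= t)
        deaths = sum(1 for x, e in zip(times, events) if x == t and e)
        if deaths:
            surv *= 1 - deaths / at_risk
            curve.append((t, surv))
    return curve


def split_by_median(expression):
    """Label each sample 'high' or 'low' relative to the median expression."""
    cutoff = statistics.median(expression)
    return ["high" if x > cutoff else "low" for x in expression]


# Hypothetical follow-up data: months, event flag, and expression of one gene.
months = [5, 8, 12, 20, 30, 40]
died = [1, 1, 0, 1, 0, 0]
expr = [9.1, 8.7, 8.9, 3.2, 2.8, 3.0]

groups = split_by_median(expr)
high = [i for i, g in enumerate(groups) if g == "high"]
km_high = kaplan_meier([months[i] for i in high], [died[i] for i in high])
print(km_high)
```

Comparing the curve of the high-expression group with that of the low-expression group is what underlies statements such as "high expression is associated with low survival".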
Protein Functional Enrichment Analysis
Gene Ontology (GO) and pathway enrichment using the Kyoto Encyclopedia of Genes and Genomes (KEGG) database were analyzed with an online graphical gene set enrichment tool, ShinyGO v0.741 (http://bioinformatics.sdstate.edu/go/). 15 The gene list of significantly regulated proteins was pasted into the tool and compared against the background of all quantified proteins to obtain enriched GO terms and KEGG pathway diagrams with our genes highlighted. GO analysis describes genomic information in terms of biological processes, cellular components, and molecular functions. KEGG pathway analysis provides information on gene function annotation and the involvement of genes in different pathways.
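Over-representation of a term in a gene list relative to a background is commonly assessed with a hypergeometric test. The sketch below shows that calculation in plain Python under the assumption that this is the kind of test applied; it is not the tool's own code. The background size (1539) and list size (116) are the counts reported in this study, whereas the term size and overlap are hypothetical numbers chosen for illustration.

```python
from math import comb


def enrichment_p(background, in_term, selected, overlap):
    """Upper-tail hypergeometric probability P(X >= overlap).

    background: all quantified proteins
    in_term:    background proteins annotated to the term
    selected:   proteins in the list of interest
    overlap:    selected proteins annotated to the term
    """
    upper = min(in_term, selected)
    total = comb(background, selected)
    hits = sum(
        comb(in_term, k) * comb(background - in_term, selected - k)
        for k in range(overlap, upper + 1)
    )
    return hits / total


# Term size (60) and overlap (15) are hypothetical illustration values.
p_value = enrichment_p(background=1539, in_term=60, selected=116, overlap=15)
print(f"p = {p_value:.2e}")
```

With these numbers about 4.5 annotated proteins would be expected by chance, so an overlap of 15 gives a small p-value; in a full analysis the p-values of all tested terms would then be corrected for multiple testing.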
Results
Identification and Quantification of Proteins From HCC and Healthy Control Tissues
The protein profiles of HCC and control tissues were analyzed with DIA-NN and compared. Label-free MS of HCC and control tissues identified 1539 proteins, of which 116 were differentially expressed (*

(a) Clustered Heat map of DEPs in HCC diseased and Control samples and (b) Differential expression of proteins in HCC tissues as compared to normal control tissues.
PPI Network Analysis and Selection of Hub Proteins
The 116 DEPs were used to establish an initial PPI network in the STRING database that included 110 nodes (proteins) and 430 edges (interactions); 7 isolated target proteins were removed. The network was then imported into Cytoscape 3.10.1, and the Cytohubba plug-in was used to extract hub proteins. From this network, the top 10 hub proteins were selected and ranked by Maximal Centrality and Degree: epidermal growth factor receptor (EGFR), glyceraldehyde-3-phosphate dehydrogenase (GAPDH), heat shock protein HSP 90-alpha (HSP90AA1), matrix metalloproteinase-9 (MMP9), receptor-type tyrosine-protein phosphatase C (PTPRC), CD44 antigen (CD44), annexin A5 (ANXA5), platelet endothelial cell adhesion molecule (PECAM1), 72 kDa type IV collagenase (MMP2), and cyclin-dependent kinase 1 (CDK1) (in the center of the network), as shown in Figure 2a.
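The clique-based ranking can be illustrated on a toy graph. The sketch below assumes that "Maximal Centrality" refers to Cytohubba's Maximal Clique Centrality (MCC), commonly described as the sum of (|C| - 1)! over all maximal cliques C that contain a node. That reading, the toy graph, and the function names are illustrative assumptions and are not taken from the study.

```python
from math import factorial


def maximal_cliques(adj):
    """Enumerate maximal cliques with the basic Bron-Kerbosch recursion."""
    found = []

    def expand(clique, candidates, excluded):
        if not candidates and not excluded:
            found.append(clique)
            return
        for v in list(candidates):
            expand(clique | {v}, candidates & adj[v], excluded & adj[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand(frozenset(), set(adj), set())
    return found


def mcc_scores(adj):
    """Score each node by summing (clique size - 1)! over its maximal cliques."""
    scores = {v: 0 for v in adj}
    for clique in maximal_cliques(adj):
        for v in clique:
            scores[v] += factorial(len(clique) - 1)
    return scores


# Toy graph: a triangle A-B-C with one extra node D attached to A.
adj = {
    "A": {"B", "C", "D"},
    "B": {"A", "C"},
    "C": {"A", "B"},
    "D": {"A"},
}

scores = mcc_scores(adj)
print(scores)
```

Node A belongs to both the triangle and the A-D edge, so it receives the highest score, which is how densely connected proteins rise to the top of such a ranking.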

(a) and (b) Protein interaction analysis of DEPs along with hub proteins.
Survival Analysis of Hub Genes Using GEPIA
The correlation between hub genes and the prognosis of HCC was analyzed by using Gene Expression Profiling Interactive Analysis (GEPIA). We found that GAPDH, MMP9, ANXA5, HSP90AA1 and CDK1 were significantly associated with low survival rates (

Survival analysis of hub genes to validate survival biomarkers using the GEPIA database. p⩽0.05 was considered significant.

Expression of MMP9 and CDK1 in HCC tissues as compared to normal control.
Functional Enrichment Analysis
Enrichment analysis was performed using the web-based tool ShinyGO v0.741 with

Functional enrichment analysis of DEPs in HCC tissues using ShinyGO. The Gene Ontology functional database was considered for the (a) Biological process, (b) Cellular component (c) Molecular Function categories. The pathway enrichment analysis using (d) KEGG database.
Pathway enrichment analysis by KEGG.
Discussion
HCC has been one of the most prevalent types of cancer in recent years. 16 Early diagnosis and timely treatment are crucial for improving patient outcomes and reducing the economic burden associated with the disease. 17 Despite the availability of potentially curative treatment options such as surgical resection and liver transplantation, HCC remains characterized by a high recurrence rate and poor prognosis. 18 Hence, novel diagnostic biomarkers and effective therapeutic strategies are needed. Therefore, the main aim of this study was to identify valuable biomarkers for HCC detection using a proteomic approach. Label-free quantitative mass spectrometric (LFQ-MS) analysis of HCC tissues yielded 116 significantly differentially expressed proteins (DEPs) (Figure 1a and b). Heatmap analysis showed an increased abundance of upstream proteins in disease samples compared with controls, suggesting their oncogenic potential and involvement in the development and progression of cancer. To explore protein-protein interactions, STRING analysis of the DEPs generated a network showing strong connectivity among proteins (Figure 2a). From this network, 10 hub proteins (EGFR, GAPDH, HSP90AA1, MMP9, PTPRC, CD44, ANXA5, PECAM1, MMP2, and CDK1) were identified (Figure 2b). All of these proteins were upregulated (Figure 1b). More importantly, GEPIA database analysis validated a significant association between five hub proteins and HCC prognosis, as shown in Figure 3. Numerous studies have demonstrated their role in tumor progression. GAPDH, commonly used as an internal control in PCR analysis, was found to be upregulated in liver cancer tissues, with up to a sevenfold increase in advanced stages of HCC. 19 EGFR (epidermal growth factor receptor) is involved in the regulation of cell proliferation, migration, and metabolism, and its overexpression, reported in 68% of human HCC cases, is associated with poor patient survival. 20 However, GEPIA analysis of TCGA data did not show a significant correlation between EGFR expression and overall survival in HCC, which contradicts previous findings (Figure 3).
Another hub protein, MMP9 (matrix metalloproteinase-9), a well-known inducer of metastasis, has been implicated in cancer progression. Studies have reported that low expression of MMP9 correlates with better survival outcomes in HCC patients, 21 which aligns with our results, as we also observed a significant association between high MMP9 expression and poor survival in HCC (Figure 4). Furthermore, ANXA5 (annexin A5), a calcium-regulated phospholipid-binding protein, is involved in organization and transport, although its role is still not fully understood. ANXA5 expression correlates with poor survival in HCC patients, 22 which is consistent with our findings (Figure 3). Similarly, HSP90AA1, a major cytosolic chaperone involved in intracellular signaling, is frequently upregulated in various human malignancies, including lung, breast, and gastric cancers. 23 Our study also showed a significant correlation between high expression of HSP90AA1 and low survival rates in HCC. Additionally, cyclin-dependent kinase 1 (CDK1), which plays an important role in cell cycle regulation, was significantly overexpressed in HCC tissues compared with normal tissues (Figure 4), and its high expression was associated with low survival, in agreement with a previous study. 24
To further understand gene function, Gene Ontology (GO) analysis was performed, which showed that the DEPs were highly enriched in stress response pathways, which are known to play a role in many human diseases such as neurodegeneration, diabetes, and cancers including HCC. 25 In the cellular component category, the DEPs were significantly enriched in vesicles. Extracellular vesicles are significantly more abundant in HCC patients than in cirrhosis cases, which makes them a potential biomarker for early detection of HCC. 26 In the molecular function category, the identified DEPs were associated with protein binding and enzymatic activities.
Owing to the complexity of HCC, many dysregulated signaling pathways contribute to its progression. 27 KEGG pathway analysis of the DEPs identified thyroid hormone synthesis as one of the most significantly enriched pathways in our study, as shown in Figure 6. The liver is the major site of thyroid hormone metabolism, and dysregulation of these hormones can contribute to liver disorders including HCC. Studies show that thyroid hormone receptors regulate various signaling pathways involved in HCC development and progression. 28,29 We also identified thyroglobulin (TG) and glutathione reductase (GSR), which are major regulators of the thyroid hormone synthesis pathway and are linked with tumor development and drug resistance. TG is a major protein in this pathway and was upregulated in our HCC tissue samples, consistent with other studies that report high expression of TG in HCC. 30 According to Gionfra et al., 30 the liver is the prime target for thyroid hormones, and these hormones play a key role in liver-associated diseases such as cirrhosis and HCC. 27 GSR, another downstream protein of this pathway, was also highly expressed in our data. Studies have reported high expression of GSR in HCC, which has been linked with sorafenib resistance and poor prognosis, making it a potential therapeutic target for sorafenib-resistant HCC. 31

Target genes involved in the KEGG pathway of thyroid hormone synthesis (with copyright permission # 250503).
Conclusion
In conclusion, this study employed a bioinformatics approach to analyze 116 DEPs identified through LFQ-MS, providing a comprehensive understanding of protein dysregulation in the progression of HCC. Among these proteins, 10 hub proteins were identified, five of which are significantly related to HCC prognosis. The present study suggests that these proteins could be candidate diagnostic or prognostic biomarkers. Further functional studies are necessary to explore the molecular mechanisms and biological effects of these genes and proteins in HCC, paving the way for novel therapeutic strategies.
Supplemental Material
sj-docx-1-cix-10.1177_11769351251336923 – Supplemental material for Identification of Potential Hub Proteins as Theragnostic Targets in Hepatocellular Carcinoma through Comprehensive Quantitative Tissue Proteomics Analysis by Quratul Abedin, Kulsoom Bibi, Alex von Kriegsheim, Zehra Hashim and Amber Ilyas in Cancer Informatics
sj-docx-2-cix-10.1177_11769351251336923 – Supplemental material for Identification of Potential Hub Proteins as Theragnostic Targets in Hepatocellular Carcinoma through Comprehensive Quantitative Tissue Proteomics Analysis by Quratul Abedin, Kulsoom Bibi, Alex von Kriegsheim, Zehra Hashim and Amber Ilyas in Cancer Informatics
Footnotes
Acknowledgements
The authors would like to thank all the patients for their participation and the Higher Education Commission, Islamabad, Pakistan, for providing financial assistance.
List of Abbreviations
HCC: hepatocellular carcinoma; PPI: protein-protein interaction; DEP: differentially expressed protein; GEPIA: Gene Expression Profiling Interactive Analysis; GO: Gene Ontology; KEGG: Kyoto Encyclopedia of Genes and Genomes; TFA: trifluoroacetic acid; LFQ: label-free quantitation; TCGA: The Cancer Genome Atlas; GuHCl: guanidinium chloride.
Author Contributions
Quratul Abedin: sample collection, preparation, manuscript writing
Kulsoom Bibi: sample collection
Alex von Kriegsheim: Supervision and data analysis
Zehra Hashim: Sample preparation, manuscript writing
Amber Ilyas (Corresponding Author): Conceived the idea, acquired funding, data analysis
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This study was supported by a research grant (NRPU#20-17465) from the Higher Education Commission, Islamabad, Pakistan (a nonprofit organization).
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Supplemental material
Supplemental material for this article is available online.
References
