Abstract
Objectives:
Breast cancer is a heterogeneous disease driven by dysregulated cellular processes, including altered metabolic pathways. The oncogenic microRNA miR-526b influences several cancer hallmark phenotypes and holds promise as a plasma biomarker. Given miR-526b’s role in metabolic regulation, we evaluated 4 miR-526b-regulated metabolic genes, LDHA, PDHA1, ATP5A1, and TIGAR, as candidate additional biomarkers for breast cancer detection.
Methods:
We analyzed mRNA expression of these 4 metabolic markers in breast cancer tissue biopsies and plasma samples from patients and disease-free controls, using publicly available datasets and RT-qPCR validation. Diagnostic performance was evaluated using univariate and multivariate logistic regression and LASSO regression modeling. The potential of combining ATP5A1 with pri-miR-526b expression to improve plasma biomarker accuracy was also assessed.
Results:
Individually, none of the metabolic markers demonstrated sufficient sensitivity or specificity as plasma biomarkers. However, combining markers via logistic and LASSO regression improved classification performance. ATP5A1 showed strong biomarker potential in biopsy tissue samples but limited utility in blood plasma. Combining ATP5A1 with pri-miR-526b improved plasma-based classification over ATP5A1 alone (AUC 0.688 vs 0.574) but did not exceed the previously reported performance of pri-miR-526b alone (AUC 0.715).
Conclusions:
Our study validates the potential of miR-526b-regulated metabolic genes as complementary breast cancer biomarkers. While ATP5A1 shows promise in tissue, plasma-based screening benefits from combining multiple markers, including pri-miR-526b. Further research is needed to refine plasma biomarker panels for effective early detection of breast cancer.
Introduction
Cancer remains a leading cause of death worldwide, primarily due to complications from late-stage disease. 1 Early detection is critical to improving patient outcomes, 2 particularly in breast cancer, where survival rates decline sharply as the disease progresses. While the 5-year survival rate for Stage I breast cancer approaches 100%, it drops to 92% at Stage II, 74% at Stage III, and just 23% at Stage IV. 3 Although routine screening programs exist, current tools like mammography have limitations, including age restrictions, accessibility barriers, and procedural invasiveness. 4-6 A sensitive, specific, and minimally invasive blood-based test could significantly expand access to screening, especially for younger women and underserved populations.
Metabolic reprograming is a hallmark of cancer, including breast cancer. 7 A well-known example is the Warburg effect, where cancer cells preferentially rely on glycolysis over oxidative phosphorylation (ox-phos) for energy production, even in the presence of oxygen. 8 Although glycolysis yields less ATP per glucose molecule than ox-phos, it generates ATP more rapidly and provides metabolic intermediates required for biosynthesis, thus supporting rapid cell proliferation. 9 This metabolic shift is accompanied by characteristic molecular changes, such as the upregulation of glycolytic markers like lactate dehydrogenase A (LDHA) and downregulation of ox-phos markers like pyruvate dehydrogenase A1 (PDHA1). 10 Additional proteins such as ATP5A1, a mitochondrial ATP synthase subunit, and TIGAR, a regulator of glycolysis and oxidative stress, also contribute to these metabolic adaptations. 11-13
We previously demonstrated that the oncogenic microRNA miR-526b is upregulated in aggressive breast cancer. 14 miR-526b has been linked to key oncogenic processes including increased angiogenesis, oxidative stress, and cellular proliferation. 15-17 More recently, it has also been shown to enhance hypoxic signaling and modulate the expression of key metabolic markers including LDHA, PDHA1, ATP5A1, and TIGAR. 18,19 These findings suggest that miR-526b may play a regulatory role in the metabolic reprograming observed in breast cancer.
This pilot study focuses on evaluating these 4 miR-526b-associated metabolic markers for their potential as blood-based biomarkers. While miR-526b alone has demonstrated moderate diagnostic value in plasma samples, we hypothesize that combining it with additional metabolic markers could enhance screening performance. 20 Importantly, breast cancer metabolism varies by subtype and hormone receptor status.19,21 By including multiple subtypes in our analysis, we aim to better assess the consistency and utility of these metabolic biomarkers across the heterogeneity of breast cancer.
This study aims to evaluate a miR-526b-driven panel of metabolic biomarkers, LDHA, PDHA1, ATP5A1, and TIGAR, for their potential role in breast cancer detection. By comparing expression levels in breast cancer tissues and plasma samples to those in normal controls, we will assess their individual and combined diagnostic value. Statistical modeling and predictive analysis will be used to rigorously evaluate the performance of this biomarker panel and its suitability for future use in a minimally invasive blood-based screening test.
Methods
This study followed the REMARK (REporting Recommendations for Tumor MARKer prognostic studies) guideline. 22
Data Mining and Gene Expression Analysis
RNA microarray data (accession number GSE45827) was obtained from the Gene Expression Omnibus (GEO) database provided by the National Center for Biotechnology Information (NCBI; https://www.ncbi.nlm.nih.gov/geo/).23,24 This dataset included 130 tumor samples (41 TNBC, 30 HER2+, 29 Luminal A, 30 Luminal B) as well as 11 Normal tissue samples. Gene Set Enrichment Analysis (GSEA; version 4.3.2) was performed to evaluate Hallmark gene sets (Class H) from the Molecular Signatures Database (MSigDB), focusing on Glycolysis and Oxidative phosphorylation (https://www.gsea-msigdb.org/gsea/msigdb). 25 Enrichment scores were visualized using GraphPad PRISM (version 10.3.1) with statistical significance defined as FDR < 0.25.
Immunohistochemistry (IHC) Analysis
IHC images of normal and cancerous breast tissues were obtained from the Human Protein Atlas (https://www.proteinatlas.org/). Marker expression was quantified using ImageJ threshold analysis to calculate the percentage of positively stained cell area. Statistical comparisons of staining intensity between normal and cancerous tissues were performed. 26
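The threshold-based quantification described above can be illustrated with a minimal sketch. It is not the ImageJ procedure itself: the function name, the two intensity cutoffs, and the toy image are all hypothetical, and it assumes an 8-bit grayscale image in which stained pixels are dark and blank background is near white.

```python
def percent_stained_area(gray_image, stain_max=100, tissue_max=220):
    """Return positively stained area as a percentage of tissue area.

    gray_image: 2D list of 8-bit grayscale intensities (0 = black, 255 = white).
    stain_max:  hypothetical cutoff; pixels at or below it count as stained.
    tissue_max: hypothetical cutoff; pixels above it count as blank background.
    """
    stained = 0
    tissue = 0
    for row in gray_image:
        for pixel in row:
            if pixel <= tissue_max:      # pixel belongs to tissue, not background
                tissue += 1
                if pixel <= stain_max:   # dark enough to count as positive staining
                    stained += 1
    if tissue == 0:
        raise ValueError("no tissue pixels found below the background cutoff")
    return 100.0 * stained / tissue


# Toy 3 x 4 image: 250 = background, 180 = unstained tissue, 60 = stained tissue.
toy_image = [
    [250, 250, 180, 180],
    [250, 60, 60, 180],
    [250, 60, 180, 180],
]
toy_percent = percent_stained_area(toy_image)
print(f"{toy_percent:.1f}% of tissue area is stained")
```

Normalizing by tissue pixels rather than by the whole image keeps the percentage comparable between fields that contain different amounts of empty background.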
Gene Expression and Survival Analysis Using TCGA Data
Gene expression and survival data for breast cancer patients were obtained from the TCGA-BRCA PanCancer Atlas using the Bioconductor R package.27,28 The TCGA-BRCA dataset consisted of 1117 breast tumor tissue samples and 113 solid tissue normal samples. The breast tumor samples had a subtype distribution of 192 (17.2%) Basal, 82 (7.3%) HER2+, 562 (50.3%) Luminal A, 209 (18.7%) Luminal B, 40 (3.6%) Normal-like and 32 (2.9%) were not assigned a subtype. Differential gene expression data was analyzed using t-tests (P < .05) and visualized using GraphPad PRISM (version 10.3.1). Survival analysis was conducted using the Human Protein Atlas (HPA), stratifying patients into “High” and “Low” expression groups based on optimal cutoffs (lowest p-value). Kaplan-Meier curves and log-rank tests (P < .05) were used to assess survival differences, with visualizations created in GraphPad Prism.
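As an illustration of the survival curves mentioned above, the following is a minimal sketch of the Kaplan-Meier (product-limit) estimator for a single group. It does not reproduce the Human Protein Atlas workflow: the optimal-cutoff search and the log-rank test are omitted, and the follow-up times and event indicators are hypothetical toy values.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each observed event time.

    times:  follow-up time for each patient.
    events: 1 if death was observed at that time, 0 if the patient was censored.
    """
    order = sorted(range(len(times)), key=lambda i: times[i])
    at_risk = len(times)
    survival = 1.0
    curve = []
    i = 0
    while i < len(order):
        t = times[order[i]]
        deaths = 0
        leaving = 0
        # Group all patients who share the same follow-up time.
        while i < len(order) and times[order[i]] == t:
            deaths += events[order[i]]
            leaving += 1
            i += 1
        if deaths > 0:
            # Product-limit step: multiply by the fraction surviving this time.
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set after time t.
        at_risk -= leaving
    return curve


# Hypothetical follow-up data (arbitrary time units) for one expression group.
toy_times = [2, 3, 3, 5, 8]
toy_events = [1, 1, 0, 1, 0]
toy_curve = kaplan_meier(toy_times, toy_events)
for time, prob in toy_curve:
    print(f"t = {time}: S(t) = {prob:.2f}")
```

In a "High" versus "Low" comparison, the same function would be applied to each group separately and the two curves compared with a log-rank test.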
Biomarker Analysis Using Plasma RNA
Exosomal RNA expression data from healthy individuals and breast cancer patients were obtained from ExoRBase 2.0 (http://exorbase.org:8080/). 29 Differential expression of selected markers between groups was analyzed using t-tests (P < .05). Diagnostic potential was evaluated via Receiver Operating Characteristic (ROC) curves and Area Under the Curve (AUC) analysis, with AUC > 0.80 indicating strong biomarker potential. Data visualization and statistical analysis were conducted using GraphPad PRISM (version 10.3.1).
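To make the AUC criterion concrete, here is a minimal sketch that computes the area under the empirical ROC curve as the probability that a randomly chosen patient sample scores higher than a randomly chosen control sample. The analysis in this study was run in GraphPad PRISM; the function name and the expression values below are hypothetical.

```python
def roc_auc(control_scores, case_scores):
    """AUC as the probability that a random case scores above a random control.

    Ties count as half, which matches the area under the empirical ROC curve.
    """
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


STRONG_AUC = 0.80  # benchmark for strong biomarker potential used in this study

# Hypothetical expression values (arbitrary units), not data from ExoRBase.
healthy = [1.0, 2.0, 3.0, 4.0]
cancer = [2.5, 3.5, 4.5, 5.0]
auc = roc_auc(healthy, cancer)
print(f"AUC = {auc:.4f}, strong = {auc > STRONG_AUC}")
```

An AUC of 0.5 corresponds to chance-level separation and 1.0 to perfect separation, which is why values in the 0.6 range are read as modest.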
Modeling Approaches
Plasma gene expression data for metabolic markers were obtained from the ExoRBase 2.0 database and processed for downstream analysis. Logistic regression models were employed to assess the association between plasma expression levels of selected metabolic genes and breast cancer status.
A comprehensive multivariate logistic regression model was initially fit, including main effects and interaction terms for all candidate markers. To improve model interpretability and reduce overfitting, a simplified logistic regression model was generated using backward stepwise selection based on the Akaike Information Criterion (AIC), retaining key predictors: LDHA, TIGAR, and the interaction term between TIGAR and ATP5A1 (current gene symbol ATP5F1A; the two names are used interchangeably here). Model coefficients, odds ratios (OR), 95% confidence intervals (CI), and p-values were reported for both models.
In parallel, the Least Absolute Shrinkage and Selection Operator (LASSO) regression was applied to the same dataset to perform variable selection and regularization simultaneously. Optimal penalty parameter (λ) was chosen via 10-fold cross-validation using the minimum mean cross-validated error (lambda.min). The final LASSO model coefficients were extracted at this λ value.
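The way an L1 penalty performs shrinkage and variable selection at the same time can be sketched as follows. This is an illustration only: it uses proximal gradient descent rather than the coordinate descent used by glmnet, it does not include the 10-fold cross-validation used to choose λ, and the function names, toy data, and penalty values are hypothetical.

```python
import math


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; values inside the band become 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_l1_logistic(X, y, lam, step=0.1, n_iter=2000):
    """Fit L1-penalized logistic regression by proximal gradient descent.

    Minimizes mean negative log-likelihood + lam * sum(|beta_j|).
    The intercept is not penalized. Returns (intercept, beta).
    """
    n, p = len(X), len(X[0])
    intercept, beta = 0.0, [0.0] * p
    for _ in range(n_iter):
        grad0, grad = 0.0, [0.0] * p
        for row, label in zip(X, y):
            z = intercept + sum(b * x for b, x in zip(beta, row))
            err = 1.0 / (1.0 + math.exp(-z)) - label  # predicted prob - outcome
            grad0 += err / n
            for j in range(p):
                grad[j] += err * row[j] / n
        intercept -= step * grad0
        # Gradient step on the likelihood, then soft-threshold for the L1 term.
        beta = [soft_threshold(b - step * g, step * lam) for b, g in zip(beta, grad)]
    return intercept, beta


# Hypothetical standardized data: feature 0 tracks the label, feature 1 is noise.
toy_X = [[-1.0, 0.3], [-0.8, -0.2], [-0.5, 0.1], [-0.2, -0.4],
         [0.2, 0.4], [0.5, -0.1], [0.8, 0.2], [1.0, -0.3]]
toy_y = [0, 0, 0, 0, 1, 1, 1, 1]

_, weak_beta = fit_l1_logistic(toy_X, toy_y, lam=0.05)
_, strong_beta = fit_l1_logistic(toy_X, toy_y, lam=1.0)
print("lam = 0.05:", weak_beta)
print("lam = 1.00:", strong_beta)
```

A large penalty drives every coefficient to exactly zero, while a small penalty keeps the informative feature; cross-validation over a grid of λ values is what selects the operating point between these extremes.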
Model performance was evaluated using receiver operating characteristic (ROC) curves with corresponding area under the curve (AUC) values. Logistic regression was performed in GraphPad PRISM Version 10.3.1, and LASSO regression was performed in RStudio using the glmnet package. 30
RNA Extraction and Analysis From Tissue and Plasma Samples
Breast tissue samples (normal and cancerous) were obtained from the Ontario Institute for Cancer Research (OICR), and plasma samples from healthy individuals and breast cancer patients were provided by the London Tumor Biobank (LTB). Flash-frozen tissues were transported to Brandon University and stored at −80°C until processed for RNA extraction. RNA was extracted using the Qiagen RNeasy Mini Kit (Qiagen, MD, USA), and complementary DNA (cDNA) was synthesized with the Applied Biosystems High-Capacity cDNA Reverse Transcription Kit (Applied Biosystems, MA, USA) and stored at −20°C. ATP5A1 expression (Hs00900735_m1) was quantified by RT-qPCR, with beta-actin (Hs01060665_g1) serving as the endogenous control.
Differential expression was analyzed using ΔCT values, stratified by cancer status, hormone receptor status and tumor stage. Biomarker potential in plasma samples was evaluated through ROC-curve analysis, with a strong performance defined as AUC > 0.80. 31 Statistical analysis and visualizations were performed in GraphPad PRISM (version 10.3.1) with statistical significance defined as P < .05.
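Because ΔCT values run opposite to expression (a lower ΔCT means more target transcript), a short sketch of the arithmetic may help. The CT values below are hypothetical, and the fold-change function uses the standard 2^(−ΔΔCT) approximation, which assumes roughly 100% amplification efficiency; this study reports ΔCT values rather than fold changes.

```python
def delta_ct(ct_target, ct_reference):
    """ΔCT: target-gene CT minus endogenous-control CT (lower = more expression)."""
    return ct_target - ct_reference


def fold_change(delta_ct_case, delta_ct_control):
    """Relative expression of case vs control by the 2^(-ΔΔCT) method."""
    return 2.0 ** -(delta_ct_case - delta_ct_control)


# Hypothetical CT values for a target gene and a beta-actin-like control.
control_dct = delta_ct(ct_target=24.0, ct_reference=18.0)  # 6.0
tumor_dct = delta_ct(ct_target=22.0, ct_reference=18.0)    # 4.0
relative_expression = fold_change(tumor_dct, control_dct)
print(f"ΔCT control = {control_dct}, ΔCT tumor = {tumor_dct}, "
      f"fold change = {relative_expression}")
```

Each one-cycle drop in ΔCT corresponds to about a doubling of transcript abundance under this assumption.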
Use of Artificial Intelligence
Large language models (eg, Grammarly, ChatGPT) were used sparingly to assist with grammar and clarity in the manuscript. No AI tools were used for data generation, data analysis, interpretation of results, or the creation of scientific content.
Results
Multiple Metabolic Pathways Are Enriched in Breast Cancer Tissue
Gene expression analysis from breast cancer and non-cancerous breast tissue microarray data was curated from NCBI dataset GSE45827. 23 GSEA, using the available microarray data and the MSigDB Hallmark gene set collection, uncovered that gene sets related to glucose metabolism (glycolysis and ox-phos) are significantly enriched (FDR < 25%) in breast cancer tissue (Figure 1A). When comparing aggressive breast cancer subtypes (Luminal B, HER2+ and TNBC) to Luminal A, glycolysis and ox-phos gene sets are enriched in aggressive subtypes, but to a lesser degree (Figure 1B). This enrichment analysis identifies a pattern of significant differences in gene expression between breast cancer and normal tissues, for glycolysis (Figure 1C) and ox-phos (Figure 1D) gene sets. This preliminary analysis confirms that glucose metabolism is altered in the cancerous state with varying degrees based on the cancer subtype.

Glucose metabolism is enriched in breast cancer. Gene Set Enrichment Analysis (GSEA) was used to identify key metabolic differences in breast cancer: (A) GSEA comparing breast cancer tissues to normal tissues highlights enrichment of glucose metabolism pathways including glycolysis and ox-phos and (B) subtype-specific GSEA shows enrichment of metabolic gene sets in Luminal B, HER2-enriched, and triple-negative breast cancers (TNBC) relative to Luminal A. Gene expression heatmaps of representative genes involved in (C) glycolysis and (D) oxidative phosphorylation (OXPHOS) illustrate distinct expression patterns between tumor and normal tissues.
Protein Expression of Metabolic Markers is Elevated in Breast Cancer Tissue
Immunohistochemistry images from the Human Protein Atlas were analyzed and the positively stained area was expressed as a percentage of the total cell area. ATP5A1 showed significantly increased expression in the BRCA tissue (mean = 37.9%) compared to the healthy breast tissue (mean = 19.3%), as observed by the increased presence of dark brown staining (Figure 2A). LDHA and TIGAR also showed increased protein expression in BRCA tissue (27.8% and 36.9%, respectively) compared to the healthy tissue (7.9% and 12.7%, respectively; Figure 2B and C). PDHA1 showed some increase in protein expression; however, PDHA1 was not detected in 52% (22/42) of the stained tissues, making this result inconclusive (Figure 2D).

Immunohistochemical staining of metabolic markers in healthy and breast cancer tissues. Immunohistochemistry (IHC) data were obtained from the Human Protein Atlas for (A) ATP5A1, (B) LDHA, (C) PDHA1, and (D) TIGAR. For each marker, (i) shows the quantification of IHC staining as the percentage of positively stained cells, while representative images of (ii) healthy and (iii) breast cancer tissues illustrate staining intensity and distribution. All markers demonstrated increased staining in tumor tissues compared to healthy controls, although PDHA1 showed more variability across samples.
Metabolic Marker mRNA Expression Can be Used to Predict Survival Outcomes in Breast Cancer
Gene expression quantification data for breast cancer and control tissues were extracted from TCGA-BRCA, PanCancer Atlas. 27 ATP5A1 did not show a significant difference between Normal and BRCA tissue (Figure 3A). LDHA and TIGAR showed higher expression in the breast cancer tissues, while PDHA1 showed lower expression in the BRCA tissue than in normal breast tissue (Figure 3B-D).

Tissue expression of metabolic markers and their prognostic value in breast cancer. Gene expression data from TCGA Pan-Cancer Atlas was used to compare metabolic marker levels in healthy breast tissue (n = 13) and breast cancer tissue (n = 117-1172) for (A) ATP5A1, (B) LDHA, (C) PDHA1, and (D) TIGAR. Expression was higher in cancerous tissue for LDHA and TIGAR and lower for PDHA1, with no significant difference for ATP5A1. Using gene expression and clinical outcome data accessed via the Human Protein Atlas, Kaplan–Meier survival curves were generated to compare overall survival in patients with high versus low expression of (E) ATP5A1, (F) LDHA, (G) PDHA1, and (H) TIGAR. High expression of all four markers was associated with significantly poorer survival outcomes. P-values for survival curves are based on the log-rank test.
TCGA gene expression quantification data were paired with follow-up survival information to generate Kaplan-Meier survival curves. For every marker, high expression was associated with poorer patient survival: ATP5A1 (P = .021), LDHA (P = .0184), PDHA1 (P = .0002), and TIGAR (P = .0249; Figure 3E-H). This indicates that the markers have some prognostic value.
Detection of Metabolic Marker mRNA in Blood Plasma
Using the data available in ExoRBase, we investigated whether the 4 potential biomarkers could be detected in blood plasma and could differentiate plasma from healthy individuals and breast cancer patients. All 4 markers showed higher expression in breast cancer patients than in healthy individuals (Figure 4A-D). Except for TIGAR, the markers were readily detected in blood samples. In many samples, from both healthy individuals (n = 118) and breast cancer patients (n = 140), TIGAR mRNA was not detected and was assigned TPM = 0. The extracted gene expression data were then used to evaluate each marker as a diagnostic tool. Because we are testing the ability of gene expression to act as a classifier that separates “Healthy” from “BRCA” samples, we used ROC curves to evaluate each classification model (Figure 4E-H). A practical classification model will have an area under the curve (AUC) of 0.8 or greater. 31 TIGAR had the highest AUC (0.659); however, given the number of samples in which TIGAR was not detected, it would not be reliable in practice. ATP5A1 and LDHA each had an AUC of 0.638, showing some ability to classify sample groups correctly, but not with high sensitivity or specificity. PDHA1 was the least effective, with an AUC of 0.621. ATP5A1 was selected for further investigation and validation in our own tissue and plasma samples to test its plausibility and real-world performance.

Plasma expression of metabolic markers and their diagnostic performance in breast cancer. Gene expression levels of (A) ATP5A1, (B) LDHA, (C) PDHA1, and (D) TIGAR were extracted from the ExoRBase database, comparing plasma samples from healthy individuals and breast cancer patients. Panels (E-H) show the corresponding receiver operating characteristic (ROC) curves for each gene, evaluating their classification performance. All four markers demonstrated modest diagnostic value (AUC range: 0.62-0.66).
Modeling Approaches to Assess Biomarker Potential
To further evaluate the biomarker potential of the metabolic genes being investigated, logistic regression models were constructed using the plasma gene expression data extracted from ExoRBase. A full model incorporating the main effects and interaction terms for LDHA, PDHA1, TIGAR, and ATP5A1 yielded an AUC of 0.7180 (95% CI: 0.6562-0.7798, P < .0001) and AIC value of 338.7 (detailed values in Supplemental Table 1). Within this model, TIGAR expression was significantly lower in cancer samples (β = −.88, OR = 0.42, 95% CI: 0.16-0.96) and the interaction between TIGAR and ATP5A1 was also significant (β = −.88, OR = 0.42, 95% CI: 0.16-0.96) indicating that the combined expression level of these genes may provide added value. Other terms, including LDHA and PDHA1 did not reach statistical significance.
Using backward stepwise selection, a simplified logistic regression that focuses on LDHA, TIGAR and the interaction between TIGAR and ATP5A1 achieved a comparable AUC of 0.7126 (95% CI: 0.6502-0.7750, P < .0001) with improved parsimony reflected by a reduced AIC of 329.9 (Figure 5A). In this model, TIGAR expression was positively associated with breast cancer status (β = 1.085, OR = 2.96, 95% CI: 1.61-5.95), while LDHA showed a small but statistically significant negative association (β = −.0113, OR = 0.99, 95% CI: 0.9788-0.9982; Figure 5B). The TIGAR: ATP5A1 interaction remained significant (β = −.01396, OR = 0.9861, 95% CI: 0.9781-0.9930), reinforcing the potential relevance of metabolic crosstalk in breast cancer (Figure 5B).
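The odds ratios quoted above follow directly from the coefficients, since OR = exp(β) for a one-unit increase in the predictor. The short check below uses the three coefficients of the simplified model as reported in the text; the helper name is hypothetical, and the confidence intervals are not recomputed because standard errors are not given here.

```python
import math


def odds_ratio(beta):
    """Odds ratio for a one-unit increase in a logistic regression predictor."""
    return math.exp(beta)


# Coefficients of the simplified model as reported in the text.
reported_betas = {
    "TIGAR": 1.085,
    "LDHA": -0.0113,
    "TIGAR:ATP5A1": -0.01396,
}

for term, beta in reported_betas.items():
    print(f"{term}: beta = {beta}, OR = {odds_ratio(beta):.4f}")
```

A coefficient near zero gives an odds ratio near 1 per unit of expression, so the LDHA and interaction terms can be statistically significant while each unit change shifts the odds only slightly.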

Evaluation of plasma-based gene expression models for breast cancer classification. (A) receiver operating characteristic (ROC) curve for the simplified logistic regression model including LDHA, TIGAR, and the interaction term between TIGAR and ATP5F1A, showing an area under the curve (AUC) of 0.71, (B) coefficient estimates (β) from the simplified logistic regression model, highlighting TIGAR, LDHA, and the interaction between TIGAR and ATP5F1A as key contributors, (C) ROC curve for the LASSO logistic regression model using all 4 markers (LDHA, PDHA1, TIGAR, ATP5F1A), achieving an AUC of 0.66, and (D) coefficients from the LASSO model at λ = 0.0087, indicating all 4 markers were retained with small positive weights.
To further assess feature stability and reduce multicollinearity, a LASSO-regularized logistic regression model was fit to the same dataset (full results in Supplemental Table 2). At the optimal penalty term (λ = 0.0087), all 4 candidate genes were retained with modest coefficients (LDHA = 0.0084, PDHA1 = 0.0068, TIGAR = 0.0886, ATP5F1A = 0.0128), yielding an overall AUC of 0.66 (Figure 5C and D). Although predictive performance was slightly lower, the retention of all 4 genes supports their combined relevance in the context of breast cancer detection, even if individual effects are subtle.
Taken together, these models suggest that genes previously implicated in miR-526b-regulated metabolic pathways, particularly TIGAR and its interaction with ATP5F1A, show potential as plasma-based biomarkers for breast cancer.
Validation of ATP5A1 Expression in Breast Cancer Biopsy Tissue Samples
Using normal and cancerous breast tissue acquired from OICR, we validated the expression of ATP5A1 (sample demographics in Supplemental Table 3). In our sample set, ATP5A1 showed higher mRNA expression (ie, a lower ΔCT) in breast cancer tissue (ΔCT = 1.43 [95% CI: 0.42-2.43]) than in normal breast tissue (ΔCT = 5.80 [95% CI: 4.19-7.41]; Figure 6A). The observed difference in ATP5A1 expression was not subtype-specific or dependent on hormone receptor status (Figure 6B and Supplemental Table 5). The difference also does not appear to be tumor stage-specific, although few stage I and stage IV tissues were available (Figure 6C); expression was high at every tumor stage compared to control. ROC analysis produced an AUC of 0.826 (95% CI: 0.738-0.914), above the 0.80 benchmark (Figure 6D). This shows that ATP5A1 mRNA expression effectively distinguishes cancerous from healthy breast tissue, supporting its potential as a tissue-based tumor marker.

ATP5A1 mRNA expression as a tissue biomarker in breast cancer. (A) ATP5A1 mRNA expression was analyzed in healthy and breast cancer tissue biopsies obtained from the Ontario Institute for Cancer Research (OICR), showing a significant increase in tumor tissues, (B) expression was stratified by hormone receptor status, (C) ATP5A1 expression was also assessed across tumor stages, showing stage-associated differences, and (D) a receiver operating characteristic (ROC) curve illustrates the classification strength of ATP5A1 mRNA expression in distinguishing tumor from healthy tissue.
Validation of ATP5A1 Expression in Breast Cancer Plasma Samples
Using normal and breast cancer blood plasma samples acquired from the London Tumor Biobank, we validated the detection and expression of ATP5A1 (sample demographics in Supplemental Table 4). In our sample set, ATP5A1 showed marginally higher mRNA expression in breast cancer plasma (ΔCT = 1.08 [95% CI: 0.50-1.66]) than in normal plasma (ΔCT = 1.87 [95% CI: 1.21-2.52]; Figure 7A). The observed difference in ATP5A1 expression was not subtype-specific, nor did it vary by hormone receptor status (Figure 7B and Supplemental Table 6). The difference also does not appear to be related to tumor stage: Stage II showed higher expression than Stage I, but Stage III showed lower expression than Stage II (Figure 7C). Because only a few stage III plasma samples were available, a more comprehensive sample set would be required to investigate this relationship. ROC analysis produced an AUC of 0.574 (95% CI: 0.445-0.702; Figure 7D). Therefore, according to our investigation, ATP5A1 alone is not a suitable blood biomarker for breast cancer detection or for monitoring disease progression. ATP5A1 was previously detected in the secretions of MCF7-miR526b and MCF7-miR655 cell lines, and pri-miR-526b was shown to have breast cancer biomarker potential with high sensitivity (AUC = 0.715). 19,20,32 We therefore assessed the combined biomarker potential of pri-miR-526b and ATP5A1 using logistic regression. The predicted probabilities were then used in ROC analysis, producing an AUC of 0.688 (95% CI: 0.558-0.817), an improvement over ATP5A1 alone (Figure 7E). However, this is lower than the AUC of pri-miR-526b alone. 20 Therefore, ATP5A1 would not be clinically relevant as a blood biomarker, either alone or in combination. Combining miRNA expression with marker RNA expression to build a strong biomarker needs further investigation. The inclusion of multiple disease-associated miRNAs may prove more effective than what we have shown and should be evaluated using more advanced statistical analyses.

ATP5A1 mRNA expression as a blood-based biomarker for breast cancer. (A) ATP5A1 mRNA expression was analyzed in blood plasma from healthy donors and breast cancer patients, obtained from the London Tumor Biobank. Expression levels were marginally higher in patients than in healthy controls, (B) a stratified analysis by hormone receptor status, (C) ATP5A1 expression was further examined across tumor stages, showing no consistent stage-dependent trend, (D) a receiver operating characteristic (ROC) curve illustrates the classification performance of ATP5A1 mRNA expression in plasma, and (E) classification strength improved relative to ATP5A1 alone when ATP5A1 was combined with miR-526b expression using a multiple logistic regression model, as demonstrated by ROC analysis of the combined model.
Discussion
Early detection remains the most effective strategy to reduce breast cancer mortality, yet age restrictions, accessibility challenges, and patient compliance limit current screening methods like mammography. A reliable blood-based screening tool could overcome these barriers, offering a minimally invasive and broadly applicable alternative. In this study, we investigated the diagnostic utility of 4 metabolic markers, LDHA, PDHA1, ATP5A1, and TIGAR, selected based on their regulation by the oncogenic miR-526b and their relevance to the metabolic reprograming characteristic of breast cancer.19,33,34 Gene set enrichment analysis revealed that glycolysis and ox-phos pathways are significantly upregulated in breast cancer tissue compared to normal tissue, indicating the presence of metabolic reprograming. Importantly, this reprograming appears to vary across breast cancer subtypes, consistent with findings by Farhadi et al, who demonstrated subtype-specific differences in metabolic pathway activity, with the most aggressive alterations observed in TNBC. 21 This supports previous evidence of an enhanced Warburg effect in aggressive subtypes such as TNBC and underscores the link between metabolic reprograming and the proliferative and invasive capacity of cancer cells.35,36
The expression patterns of the markers LDHA, PDHA1, ATP5A1, and TIGAR in breast cancer tissues were examined using immunohistochemistry (IHC) data from the Human Protein Atlas for protein expression and mRNA expression data from the TCGA PanCancer Atlas BRCA dataset. LDHA and TIGAR showed consistent upregulation at both the mRNA and protein levels in breast cancer tissue compared to normal breast tissue. PDHA1 was downregulated at the mRNA level and showed variable protein expression, although many samples had undetectable protein levels based on IHC analysis. ATP5A1 exhibited a slight, non-significant decrease in mRNA expression in breast cancer tissues, yet a marked increase in protein expression, demonstrated by significantly stronger IHC staining compared to normal controls. This discrepancy may be attributed to post-transcriptional regulation of ATP5A1 or to increased protein stability, likely due to its role in the mitochondrial ATP synthase complex and the elevated mitochondrial biogenesis frequently observed in breast cancer cells. 37
The prognostic value of all markers was evaluated using TCGA expression and survival data. In each case, high expression was associated with significantly poorer overall survival. This finding is consistent with the established role of metabolic reprograming in cancer, particularly the increased reliance on glucose metabolism to support rapid cell division, survival and metastatic potential. 10 Elevated LDHA and TIGAR expression, for example, supports increased glycolytic flux and regulation of oxidative stress, respectively, both processes known to contribute to cancer cell fitness. The association of high ATP5A1 and PDHA1 expression with a poor prognosis may define a complex phenotype in certain breast cancer subtypes, where mitochondrial respiration and ATP production are maintained or elevated to meet energetic demands. This observation is consistent with our earlier findings, which showed distinct metabolic responses based on the native metabolic profiles of different breast cancer subtypes, highlighting the metabolic flexibility cancer cells may exploit to sustain growth. 19
The long-term goal of developing a blood-based biomarker prompted us to investigate the mRNA expression of the selected markers in the plasma of breast cancer patients. Using exosomal RNA data from ExoRBase, we found that all 4 markers exhibited increased expression in breast cancer patient plasma compared to healthy controls. Among these, ATP5A1 showed the most statistically significant increase, while TIGAR showed the least significant difference. Notably, TIGAR expression was generally very low across plasma samples, with many samples showing undetectable levels, potentially limiting its utility as a circulating blood biomarker. Building on these findings, we performed ROC analysis, which showed that all 4 markers had AUC values ranging from 0.621 to 0.659. While this indicates modest classification potential, the values fall below the commonly accepted benchmark of AUC = 0.80 for a clinically robust diagnostic model. 31
In an effort to improve the biomarker potential of our findings, we employed multiple modeling approaches to assess whether the combined use of markers could enhance classification performance. Both multiple logistic regression and LASSO regression supported the utility of these markers as a diagnostic tool. Using backward stepwise selection, the most robust logistic regression model based on ExoRBase plasma expression data included main effects for LDHA and TIGAR, as well as an interaction term between ATP5A1 and TIGAR. This suggests that LDHA and TIGAR individually can predict whether a plasma sample originates from a breast cancer patient, while the interaction between ATP5A1 and TIGAR further refines the model’s discriminatory ability. Additionally, the LASSO model retained all 4 markers, with non-zero coefficients (β > 0), further supporting their collective relevance in distinguishing cancer from non-cancer samples. Notably, the observed relationship between TIGAR and ATP5A1 warrants further investigation, as previous studies have reported non-enzymatic functions of TIGAR, including a direct interaction with ATP5A1 that may confer mitochondrial protection. 33
The RNA expression and biomarker potential of ATP5A1 were validated using in-house tissue and plasma samples from both breast cancer patients and healthy controls. Although the AUC values derived from ExoRBase plasma RNA data were modest, suggesting limited standalone diagnostic power, we proceeded with RT-qPCR validation of ATP5A1 due to its biological relevance in the context of miR-526b. In our previous experiments, ATP5A1 protein levels were elevated in the extracellular media of breast cancer cells overexpressing miR-526b, suggesting that the gene is responsive to this miRNA and may be actively secreted under pathological conditions. 32 While our current focus is on RNA-based biomarker development, this prior protein-level evidence supports ATP5A1 as a downstream target of miR-526b and a plausible candidate for circulating RNA detection in breast cancer diagnostics. ATP5A1 showed significantly higher expression in breast cancer biopsy tissue than in normal control tissue. No stage- or subtype-specific trends were observed, and the analysis produced a clinically relevant AUC of 0.826. In blood plasma, however, ATP5A1 showed a non-significant upregulation in breast cancer samples compared to healthy controls. Again, no stage- or subtype-specific trends were observed, and the AUC was an unremarkable 0.574. Combining ATP5A1 with miR-526b expression raised the AUC to 0.688, a modest improvement over ATP5A1 alone; however, this remains below the AUC of 0.715 produced by miR-526b alone. 20 In this case, therefore, combining miRNA and mRNA expression did not improve on the best single marker. Nevertheless, this concept should be investigated further, as multiple disease-related miRNAs and their downstream targets may yield a more powerful classification model. In the future, identifying such candidate markers should be a critical focus before testing. Using established methods, such as regularized Cox proportional hazards models, may help identify markers with a greater likelihood of success in biomarker testing. 38
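To make the AUC comparison above concrete, the following sketch computes the AUC as the proportion of case-control pairs in which the case scores higher, counting ties as one half (the Mann-Whitney formulation). All values are invented for illustration and are not study data; the names `marker_a` and `marker_b` are hypothetical, and the unweighted sum used as the combined score is a simplifying assumption, whereas a fitted logistic model would estimate its own weights.

```python
def auc(case_scores, control_scores):
    """AUC = fraction of case/control pairs where the case scores higher (ties = 0.5)."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Invented expression values for two markers (hypothetical, not study data).
marker_a_cases, marker_a_controls = [2, 4, 6, 8], [3, 5, 9, 1]      # weak marker
marker_b_cases, marker_b_controls = [5, 6, 7, 3], [4, 2, 1, 3.5]    # stronger marker

# Combined score: a plain unweighted sum, chosen only to keep the sketch short.
combined_cases = [a + b for a, b in zip(marker_a_cases, marker_b_cases)]
combined_controls = [a + b for a, b in zip(marker_a_controls, marker_b_controls)]

auc_a = auc(marker_a_cases, marker_a_controls)
auc_b = auc(marker_b_cases, marker_b_controls)
auc_combined = auc(combined_cases, combined_controls)
print(auc_a, auc_b, auc_combined)
```

In this toy example the combined score outperforms the weaker marker but not the stronger one, which shows that adding a noisy marker to a panel can dilute rather than strengthen the signal.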
Limitations
While our results offer valuable insight into the biomarker potential of miR-526b–associated metabolic genes, several limitations affect the interpretation and generalizability of our findings. The use of publicly available datasets, such as TCGA and ExoRBase, enables broad accessibility and facilitates hypothesis generation. However, these datasets often lack detailed clinical metadata. In our case, information on tumor stage and subtype was unavailable, limiting our ability to explore stage- or subtype-specific expression patterns. Similarly, in our in-house validation of ATP5A1 using tissue and plasma samples, limited sample availability, particularly the underrepresentation of TNBC, stage 0, stage I and stage III cases, restricted our ability to assess subtype-specific diagnostic potential.
Additionally, ROC analysis is most robust when performed on balanced datasets with similar numbers of cancer and healthy samples. Although our dataset was imbalanced, the ROC curve remains a relatively robust evaluation metric. Given these considerations and the exploratory nature of this study, ROC analysis was deemed valuable for providing a preliminary assessment of diagnostic performance. We also noted that TIGAR expression was undetectable in a number of plasma samples from ExoRBase. While this may enhance its specificity in some modeling contexts, it poses potential challenges for clinical translation. More broadly, many markers, including ATP5A1, showed more pronounced expression differences in biopsy tissue than in plasma. Although the models may be statistically significant, real-world clinical use of these diagnostic markers requires much higher sensitivity and specificity. This study is also limited by the small number of plasma samples available for each tumor subtype and for benign or control groups. Future efforts will include adding more specimens to increase the sample size and improving plasma RNA detection sensitivity. Prioritizing markers with high levels of RNA secretion may enhance the development of effective blood-based screening tools. Given the pilot nature of this study and the modest results, external validation of the logistic regression models was not considered necessary at this stage.
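The point about class imbalance can be illustrated with a small sketch. Sensitivity and specificity, from which the ROC curve is built, are each computed within a single class, so their point estimates do not change when one class is over-represented, whereas accuracy does. The scores and the 0.5 threshold below are invented for illustration and are not study data. The sketch does not address precision: estimates based on a small class remain more uncertain.

```python
def rates(case_scores, control_scores, threshold):
    """Sensitivity, specificity, and accuracy when score >= threshold is called 'cancer'."""
    true_pos = sum(1 for s in case_scores if s >= threshold)
    true_neg = sum(1 for s in control_scores if s < threshold)
    total = len(case_scores) + len(control_scores)
    return {
        "sensitivity": true_pos / len(case_scores),
        "specificity": true_neg / len(control_scores),
        "accuracy": (true_pos + true_neg) / total,
    }


# Invented model scores (hypothetical, not study data).
case_scores = [0.9, 0.8, 0.6, 0.4]
control_scores = [0.7, 0.6, 0.2, 0.1]

balanced = rates(case_scores, control_scores, threshold=0.5)
# Simulate imbalance by replicating the control group five times.
imbalanced = rates(case_scores, control_scores * 5, threshold=0.5)

print("balanced:  ", balanced)
print("imbalanced:", imbalanced)
```

Here sensitivity and specificity are identical in both settings, while accuracy shifts with the class ratio, which is one reason ROC-based metrics are often preferred for exploratory analyses of imbalanced data.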
Conclusion
This study highlights the diagnostic potential of miR-526b-associated metabolic markers in breast cancer, particularly LDHA, ATP5A1, PDHA1, and TIGAR. While these markers showed prognostic value in tumor tissue and modest classification potential in plasma, modeling approaches improved diagnostic performance. Tissue-level validation showed strong ATP5A1 upregulation and tissue biomarker performance. These findings support the continued investigation of metabolic reprogramming as a source of blood-based biomarkers and emphasize the need for improved RNA detection strategies to enhance screening sensitivity.
Supplemental Material
Supplemental material, sj-docx-1-cix-10.1177_11769351251408670 for Prospective Breast Cancer Biomarkers Identified Using miR-526b-Driven Metabolic Alterations by Braydon Nault and Mousumi Majumder in Cancer Informatics
Footnotes
Acknowledgements
We thank Prof. Peeyush K. Lala from Western University, ON, Canada and Prof. Muriel Brackstone from London Health Sciences Center-London Regional Cancer Program, ON, Canada, for providing breast tissue and blood plasma from the Ontario Institute for Cancer Research (OICR) in Ontario and the London Tumour Biobank (LTB) in London, Ontario.
Abbreviations
AIC Akaike information criterion
AUC area under the curve
ATP adenosine triphosphate
ATP5A1 ATP synthase F1 subunit alpha
BRCA breast cancer
cDNA complementary DNA
CI confidence interval
CT cycle threshold
FDR false discovery rate
GSEA gene set enrichment analysis
HER2+ human epidermal growth factor receptor 2–positive
IHC immunohistochemistry
LASSO least absolute shrinkage and selection operator
LDHA lactate dehydrogenase A
LTB London Tumour Biobank
MCF7 Michigan Cancer Foundation-7 (breast cancer cell line)
miR microRNA
MSigDB Molecular Signatures Database
NCBI National Center for Biotechnology Information
OICR Ontario Institute for Cancer Research
PDHA1 pyruvate dehydrogenase E1 alpha 1
RNA ribonucleic acid
ROC receiver operating characteristic
RT-qPCR reverse transcription quantitative polymerase chain reaction
TCGA The Cancer Genome Atlas
TIGAR TP53-induced glycolysis and apoptosis regulator
TNBC triple-negative breast cancer
TPM transcripts per million
Ethical Considerations
This study was approved by the Brandon University Ethical Approval Committee (approval # 23056) and the Brandon University Biosafety Committee (approval # 2020-BIO-02).
Consent to Participate
The Ontario Institute for Cancer Research and London Tumour Biobank obtained informed consent from each subject before collecting samples from participants. Our study adheres to the principles outlined in the Declaration of Helsinki.
Author Contributions
BN conducted all experiments, generated data, extracted data from databases, performed data analysis, made figures, and wrote the first draft. MM contributed to funding acquisition, supervision, and manuscript writing and editing.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work is funded by Breast Cancer Canada, the Lotte and John Hecht Memorial Foundation, and the Natural Sciences and Engineering Research Council of Canada (NSERC) through grants to M.M. This study is also supported by the Canada Research Chair Program (CRCP), the Canada Foundation for Innovation (CFI), and Research Manitoba matching funds to M.M. B.N. received an NSERC Undergraduate Student Research Award, a Research Manitoba Master’s Studentship Award, and a Canadian Institutes of Health Research (CIHR) Frederick Banting and Charles Best Canada Graduate Scholarship-Master’s (CGS-M). The authors thank these funding agencies for their support.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Data Availability Statement
All data generated in this study are available within the article, its Supplemental information, or from the corresponding author upon reasonable request.
Supplemental Material
Supplemental material for this article is available online.
References
