Abstract
Background:
Lung adenocarcinoma is a highly heterogeneous group of diseases with distinct molecular genetic features, pathological characteristics, metabolic profiles, and clinical behaviors. However, the clinical relevance of metabolic characteristics of lung adenocarcinoma remains unclear. This study aimed to describe the molecular characteristics of lung adenocarcinoma.
Methods:
The gene expression profiles of 1037 lung adenocarcinoma samples were downloaded from The Cancer Genome Atlas and Gene Expression Omnibus databases. This study is based on sample data from 2006 to 2020. The long time span and large sample size support the robustness of the research findings. Using unsupervised transcriptome analysis, we identified three distinct subtypes (C1, C2, and C3). We then compared the prognostic traits, transcriptome characteristics, metabolic signatures, immune infiltration, clinical features, and drug sensitivity of the lung adenocarcinoma subclasses. A classifier was generated to determine lung adenocarcinoma classification, and we verified the clinical value of this classifier in other tumors.
Results:
Our results indicated that C1 possessed the most abundant metabolic pathways. Compared with C2 and C3, C1 possessed 35 metabolic pathways that exhibited significant differences. The immune score, matrix score, and immune infiltration for subtype C1 were significantly lower than those for subtypes C2 and C3, suggesting that C1 is a metabolically active subtype. Five metabolic pathways were observed in C2. Subtype C2 was associated with the best prognosis and exhibited the lowest tumor mutation burden and copy number variation. Subtype C3 comprised five metabolic pathways. Immune checkpoint analysis revealed that patients with subtype C3 may benefit from immunotherapy.
Conclusions:
Our study deepens the understanding of the metabolic characteristics of lung adenocarcinoma and may provide valuable information for immunotherapy.
Introduction
Lung cancer continues to be the most dominant cause of cancer mortality worldwide, with more than 1 million deaths each year, and lung adenocarcinoma (LUAD) is the most common histological subtype of this disease.1 –3 LUAD is a highly heterogeneous group of diseases with distinct molecular genetic features, pathological characteristics, and clinical behaviors. 4 Despite advances in early diagnosis and new therapeutic strategies such as small molecule targeted therapy and immunotherapy that have provided new hope for patients with LUAD, the prognosis of LUAD patients remains far from satisfactory. Thus, there is a need to identify the molecular mechanisms that contribute to LUAD to develop new and effective prevention and treatment strategies.
With the development of RNA-sequencing technology and microarrays, gene expression profiling has emerged as a useful tool for classifying tumors.5–8 For example, Hu et al. 9 used
Dysregulated metabolism is indispensable for cancer cell proliferation.10 –12 Abnormal cancer metabolism leads to unique metabolic dependencies that can be targeted for therapeutic effect.13,14 On this basis, we reasoned that insights into the metabolic differences among LUAD subtypes may lead to the discovery of new treatment modalities. Recently, a study divided colorectal cancer samples into three subclasses according to metabolic genes: a metabolically active subtype (C1), a metabolically exhausted subtype (C2), and an intermediate metabolic activity subtype (C3). 15 The molecular, immune, and clinical characteristics of each subtype differ. However, a metabolism-based molecular classification of LUAD has not yet been reported. In this study, we classified LUAD from a metabolic perspective. We compared the prognostic characteristics, transcriptome characteristics, metabolic signatures, immune infiltration, clinical features, and drug sensitivities of the LUAD subclasses. A classifier was generated to determine LUAD classification, and we verified the clinical value of this classifier in other tumors.
Materials and methods
Data preprocessing
Clinical and molecular data of LUAD were collected from The Cancer Genome Atlas (TCGA) 16 (https://cancergenome.nih.gov/) and Gene Expression Omnibus (GEO) databases 17 (https://www.ncbi.nlm.nih.gov/geoprofiles/), and only tumor samples were retained. The TCGA-LUAD datasets were downloaded using the TCGAbiolinks package 18 as Fragments Per Kilobase of transcript per Million mapped reads (FPKM) values. FPKM values were then transformed into Transcripts Per Million (TPM) according to the GENCODE version 27 annotation file. After data processing, 487 patients with LUAD from the TCGA-LUAD project were included in the training set. For validation, expression data of human LUAD mRNA were downloaded from the GEO database (https://www.ncbi.nlm.nih.gov/geo/). Four datasets, GSE30219 19 (85 LUAD samples), GSE31210 20,21 (226 LUAD samples), GSE37745 22–24 (106 LUAD samples), and GSE42127 25,26 (133 LUAD samples), were selected as the testing sets. Clinical information was extracted from the TCGA pan-cancer clinical data resource. The clinical characteristics of the 1037 patients are presented in Table 1. Gene copy number data were obtained using Firehose.
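The FPKM-to-TPM step can be illustrated with a minimal sketch. It assumes the standard per-sample rescaling, in which each FPKM value is divided by the sample's FPKM total and multiplied by one million; the annotation-based gene mapping mentioned above is not shown, and the gene names and values are hypothetical.

```python
def fpkm_to_tpm(fpkm_by_gene):
    """Rescale one sample's FPKM values (gene -> FPKM) so that they sum to 1e6."""
    total = sum(fpkm_by_gene.values())
    if total <= 0:
        raise ValueError("sample has no expressed genes")
    return {gene: value / total * 1e6 for gene, value in fpkm_by_gene.items()}


# Hypothetical FPKM values for a single sample (illustrative only).
sample_fpkm = {"GENE_A": 10.0, "GENE_B": 30.0, "GENE_C": 60.0}
sample_tpm = fpkm_to_tpm(sample_fpkm)
```

Because the rescaling is applied per sample, TPM values always sum to the same total in every sample, which makes them easier to compare across samples than FPKM values.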
Clinical characteristics of patients with LUAD in TCGA and GEO sets.
Identification of LUAD subtypes
Nonnegative matrix factorization (NMF) is an effective matrix decomposition method that decomposes a large nonnegative matrix into several small matrices to achieve clustering and typing. 27 In our study, we prepared 2752 metabolism-relevant genes for subsequent NMF clustering. To reasonably classify LUAD samples, we first used the ComBat algorithm to eliminate batch effects across different cohorts. Candidate genes with an expression value of zero and a low median absolute deviation value ⩽ 0.5 across the patients with LUAD were excluded from the study. A Cox proportional hazards model was also created using the “survival” R package to screen for meaningful genes for overall survival (
To evaluate the stability of LUAD subtypes across different datasets and explore their relationship with existing molecular classifications, we utilized subclass Mapping (SubMap) analysis (Gene Pattern). We established a reference model using the gene expression data of LUAD subtypes from the training set and matched the expression matrix of the test dataset to this reference model to assess the consistency between different subtypes. Furthermore, we applied SubMap analysis to predict the potential response of different subtypes to immune checkpoint inhibitor therapy (e.g., PD-1 inhibitors) to explore the association between LUAD subtypes and immunotherapy sensitivity.
Additionally, we used t-distributed stochastic neighbor embedding (t-SNE) analysis to validate subtype assignments based on the mRNA expression data, ensuring the scientific rigor of the classification.
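The variability filter applied before NMF clustering in this subsection can be sketched as follows. This is an illustration only: it assumes an unscaled median absolute deviation (MAD) and reads "an expression value of zero" as zero expression in every sample; the function names, gene names, and values are hypothetical.

```python
from statistics import median


def median_absolute_deviation(values):
    """Unscaled MAD: the median of absolute deviations from the median."""
    center = median(values)
    return median([abs(v - center) for v in values])


def filter_genes(expression, mad_cutoff=0.5):
    """Keep genes that are expressed somewhere and have MAD above the cutoff.

    `expression` maps gene -> list of expression values across samples.
    """
    kept = {}
    for gene, values in expression.items():
        if all(v == 0 for v in values):
            continue  # zero expression in all samples
        if median_absolute_deviation(values) <= mad_cutoff:
            continue  # too little variation across samples
        kept[gene] = values
    return kept


# Hypothetical expression values across five samples.
expression = {
    "GENE_A": [0.0, 0.0, 0.0, 0.0, 0.0],
    "GENE_B": [5.0, 5.1, 5.2, 5.1, 5.0],
    "GENE_C": [1.0, 3.0, 5.0, 7.0, 9.0],
}
filtered = filter_genes(expression)
```

Here only GENE_C survives: GENE_A is never expressed, and GENE_B varies too little across samples to help separate subtypes.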
Gene set variation analysis
Gene set variation analysis (GSVA) is a nonparametric and unsupervised gene set enrichment method that can calculate pathway or signature scores based on transcriptomic data. 30 Data of the 115 metabolism-associated gene signatures and 11 cancer pathways were obtained from previously published studies.31,32 The GSVA R package was used to identify differences among different gene sets. Subsequently, differential expression analysis was conducted using linear models for microarray analysis (LIMMA) software (R, Bioconductor), 33 and differentially expressed signatures were defined as those with an absolute log2 fold change (FC) > 0.2 (adjusted
Estimation of immune infiltration
The absolute abundance of eight immune and two nonimmune stromal cell populations was estimated using microenvironment cell populations-counter (MCP-counter), an independent bioinformatics tool to assess immune cell enrichment. 34 Furthermore, the single-sample GSEA algorithm was another approach used to estimate immune infiltration in this study. 35 An additional six immune cell populations were estimated using the Bioconductor R package GSVA. Additionally, the ESTIMATE 36 algorithm was used to calculate immune and stromal scores, which can represent enrichment of stromal and immune cell gene signatures.
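As an illustration of marker-based abundance scoring, the sketch below scores each cell population as the mean log2 expression of its marker genes, which is our understanding of the general principle behind MCP-counter. It does not reproduce the tool's actual signatures or implementation; the marker names and expression values are placeholders.

```python
def marker_mean_scores(log2_expression, marker_sets):
    """Score each cell population as the mean log2 expression of its markers.

    Markers missing from the expression profile are skipped.
    """
    scores = {}
    for population, markers in marker_sets.items():
        present = [log2_expression[g] for g in markers if g in log2_expression]
        if present:
            scores[population] = sum(present) / len(present)
    return scores


# Placeholder marker sets; these are not the published MCP-counter signatures.
illustrative_markers = {
    "T cells": ["MARKER_T1", "MARKER_T2"],
    "Fibroblasts": ["MARKER_F1", "MARKER_F2", "MARKER_F3"],
}
# Hypothetical log2 expression profile of one sample.
sample_log2_expression = {
    "MARKER_T1": 6.0,
    "MARKER_T2": 8.0,
    "MARKER_F1": 3.0,
    "MARKER_F2": 4.0,
}
sample_scores = marker_mean_scores(sample_log2_expression, illustrative_markers)
```

Scores of this kind are comparable across samples for a given cell population, but not across populations within a sample.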
Characterization of LUAD subtypes
The LIMMA package was used for differentially expressed gene (DEG) analysis according to normalized count data. DEGs were defined as absolute log2 FC > 1 (corrected
Generation of the classifier and performance validation
We defined statistically significant differential genes as adjusted
Prediction of the benefit of each subclass from immunotherapy
Based on the 90-gene classifier, a consistency check was performed using the NTP algorithm to predict the metabolism-related classes for each sample. The data from patients with LUAD who received immunotherapy were used to indirectly predict the immunotherapy efficacy of our subtypes by measuring the similarity of gene expression profiles between our subtypes and patients with LUAD based on SubMap analyses.
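The idea behind template-based subtype prediction can be sketched in simplified form. The example assumes standardized (z-scored) expression values and binary subtype templates compared by cosine similarity; it omits the permutation-based significance testing of the full NTP procedure, and all marker names and values are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def nearest_template(sample_z, marker_subtype, subtypes=("C1", "C2", "C3")):
    """Assign a sample to the subtype whose template is most similar.

    `sample_z` maps gene -> z-scored expression; `marker_subtype` maps each
    classifier gene -> the subtype it marks. Each template is 1 for that
    subtype's markers and 0 for all other classifier genes.
    """
    genes = sorted(marker_subtype)
    profile = [sample_z[g] for g in genes]
    best_subtype, best_similarity = None, -2.0
    for subtype in subtypes:
        template = [1.0 if marker_subtype[g] == subtype else 0.0 for g in genes]
        similarity = cosine_similarity(profile, template)
        if similarity > best_similarity:
            best_subtype, best_similarity = subtype, similarity
    return best_subtype


# Hypothetical six-gene classifier and one sample's z-scores.
marker_subtype = {"M1": "C1", "M2": "C1", "M3": "C2", "M4": "C2", "M5": "C3", "M6": "C3"}
sample_z = {"M1": -0.5, "M2": -0.8, "M3": 0.1, "M4": -0.2, "M5": 1.6, "M6": 1.2}
predicted = nearest_template(sample_z, marker_subtype)
```

In this toy sample the C3 markers are the only ones strongly overexpressed, so the sample is assigned to C3.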
Prediction of drug sensitivity in each subclass
To identify effective antitumor drugs, we downloaded drug sensitivity data, including IC50 values, from the Genomics of Drug Sensitivity in Cancer (GDSC) database. The Kruskal–Wallis test was used to compare the sensitivity of 100 drugs in the GDSC database. Cell lines were ranked from low to high IC50; the lowest third were defined as drug-sensitive, and the highest third were defined as drug-resistant.
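The two steps described above, labeling cell lines by IC50 tertile and comparing groups with the Kruskal–Wallis test, can be sketched as follows. This is a minimal illustration: the cell line names and IC50 values are hypothetical, and the closed-form p-value holds only for three groups (two degrees of freedom) under the chi-square approximation.

```python
import math


def label_by_ic50_tertile(ic50_by_cell_line):
    """Label the lowest-IC50 third as sensitive and the highest third as resistant."""
    ranked = sorted(ic50_by_cell_line, key=ic50_by_cell_line.get)
    third = len(ranked) // 3
    labels = {}
    for position, name in enumerate(ranked):
        if position < third:
            labels[name] = "sensitive"
        elif position >= len(ranked) - third:
            labels[name] = "resistant"
        else:
            labels[name] = "intermediate"
    return labels


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic with mid-ranks and tie correction."""
    pooled = sorted((value, gi) for gi, group in enumerate(groups) for value in group)
    n = len(pooled)
    rank_sums = [0.0] * len(groups)
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        mid_rank = (i + 1 + j) / 2.0  # average of ranks i+1 .. j
        for k in range(i, j):
            rank_sums[pooled[k][1]] += mid_rank
        tie_term += (j - i) ** 3 - (j - i)
        i = j
    h = 12.0 / (n * (n + 1)) * sum(
        rs * rs / len(g) for rs, g in zip(rank_sums, groups)
    ) - 3.0 * (n + 1)
    correction = 1.0 - tie_term / float(n ** 3 - n)
    return h / correction if correction > 0 else 0.0


def p_value_three_groups(h):
    """Chi-square upper tail with 2 degrees of freedom: exp(-h / 2)."""
    return math.exp(-h / 2.0)


# Hypothetical IC50 values.
ic50 = {"LINE_A": 0.2, "LINE_B": 5.0, "LINE_C": 1.1,
        "LINE_D": 9.3, "LINE_E": 0.7, "LINE_F": 3.4}
labels = label_by_ic50_tertile(ic50)
h_stat = kruskal_wallis_h([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]])
p_val = p_value_three_groups(h_stat)
```

A rank-based test is a natural choice here because IC50 values are typically skewed and need not follow a normal distribution.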
GO and KEGG enrichment analysis
ClusterProfiler software was used for enrichment analyses that included GO and KEGG enrichment analyses of DEGs among the three subtypes (adj.
Pan-cancer analysis of 90-gene classifier
The gene expression profiling interactive analysis (GEPIA) database was used to study the 90-gene classifier. A heatmap was used to display the expression levels in the tumor tissues of individual cancers in TCGA. 38
Prediction of transcription factors
Considering that transcription factors exert various functions in the context of gene regulation, we analyzed transcription factors acquired from the transcriptional regulatory relationships unraveled by sentence-based text-mining (TRRUST) database, which are most likely to regulate the 90 genes included in the classifier. 39
Statistical analysis
All data processing and analyses were performed using R (version 4.0.2) and Excel (Microsoft). In this study, we used several statistical methods to analyze the differences among LUAD subtypes (C1, C2, and C3). Prior to intergroup difference analysis, we first calculated the variance for each group to assess the degree of data dispersion. Differences between two groups were analyzed using Student’s
Results
NMF identifies three subtypes in LUAD
The clinical characteristics of patients from different cohorts are listed in Table 1. After removing batch effects, we created a principal component analysis (PCA) plot (Figure 1(a)). In the results of PCA, the first principal component (PC1) accounted for 8.8% of the variance, indicating a substantial contribution of this component to the data distribution. Based on 2752 previously reported metabolism-related genes, univariable Cox regression was used to filter genes related to overall survival (OS) time (

Identification of lung adenocarcinoma (LUAD) subclasses using nonnegative matrix factorization (NMF) consensus clustering in the metadata set. (a) Principal component analysis plot of the combined expression profile of cohort data. (b) NMF clustering using 816 metabolism-associated genes. (c) t-distributed stochastic neighbor embedding (t-SNE) analysis supported the stratification into three subclasses. (d–h) OS of three subclasses in training set and testing sets.
Correlation between the LUAD subtypes and metabolism-related signatures
Considering that the classification was based on metabolism-related genes, we further explored the unique metabolic features of each subtype. First, we used the R package GSVA to calculate enrichment scores for gene signatures associated with metabolism and carcinogenesis. To define the subtype-specific differential metabolic pathways, we applied thresholds of |log2 FC| > 0.2 and adjusted p < 0.05 and plotted a heat map (Figure 2(a)). The results revealed that most of the differential metabolic pathways were enriched in C1. Compared to C2 and C3, C1 possessed 35 significantly different metabolic pathways, including amino acid, lipid, and other metabolism-related signatures. This indicated that C1 was the most metabolically active subtype. Additionally, C2 had five enriched metabolic pathways, primarily related to lipid metabolism, while C3 also had five metabolic pathways, but with lower enrichment levels than those in C1.
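The selection of differential signatures can be sketched as a two-step filter. The example assumes that the adjusted p values are Benjamini–Hochberg adjusted (the default in LIMMA) and reads the thresholds as |log2 FC| > 0.2 and adjusted p < 0.05; the pathway names and statistics are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


def select_signatures(names, log2_fc, p_values, fc_cutoff=0.2, alpha=0.05):
    """Keep signatures passing both the fold-change and adjusted-p thresholds."""
    adjusted = benjamini_hochberg(p_values)
    return [
        name
        for name, fc, padj in zip(names, log2_fc, adjusted)
        if abs(fc) > fc_cutoff and padj < alpha
    ]


# Hypothetical signature-level statistics.
names = ["PATHWAY_A", "PATHWAY_B", "PATHWAY_C", "PATHWAY_D"]
log2_fc = [0.5, -0.3, 0.1, 0.9]
p_values = [0.001, 0.02, 0.03, 0.5]
selected = select_signatures(names, log2_fc, p_values)
```

The fold-change cutoff is small because GSVA enrichment scores have a much narrower range than gene-level expression values.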

Association between lung adenocarcinoma (LUAD) subtypes and metabolism-related signatures. (a) Heatmap of GSVA enrichment scores for metabolic pathways, showing the expression levels of metabolic gene sets in the C1, C2, and C3 subtypes. (b) Boxplot of GSVA results for 11 cancer-related signaling pathways. Boxplot of immune scores (c) and stromal scores (d) from ESTIMATE of subtypes.
To further understand the characteristics of these subtypes, 11 carcinogenesis pathways were evaluated and quantified using the GSVA algorithm (Figure 2(b)). The results revealed that the NOTCH signature of C1 was significantly higher than that of C2 and C3. Additionally, C3 displayed higher expression of PI3K and cell cycle pathways, while C2 was significantly enriched in HIPPO, TGF-β, RTK/RAS, TP53, WNT, and angiogenesis pathways. The results presented above indicate that this classification may be strongly associated with cancer.
To evaluate the heterogeneity among the three subtypes, we used the ESTIMATE algorithm to calculate the stromal and immune scores and construct violin and box plots (Figure 2(c) and (d)). The results indicated that the immune scores of C2 and C3 were significantly higher than that of C1 (
The correlation between LUAD subtypes and immune infiltration
The results presented above indicate that there was a significant difference in immune scores among subtypes. To further investigate the immune microenvironment characteristics of LUAD subtypes, we analyzed the infiltration of immune cells and the expression levels of immune checkpoint genes to describe the immunologic landscape in the TCGA-LUAD database. We estimated the abundance of 16 immune cell types using the MCP-counter and single-sample GSEA algorithms and presented them in a heat map (Figure 3(a)). We then drew a box plot to reflect the differences among the three groups. The overall levels of immune cell infiltration were significantly higher in C2 and C3 compared to C1. Specifically, CD8+ T cells, B cells, and activated NK cells were more abundant in C2 and C3. In contrast, C1 exhibited lower levels of immune cell infiltration, suggesting that C1 may possess an immune-cold characteristic. Figure 3(b) further validated the enrichment of different immune cell subsets across subtypes. CD8+ T cells and NK cells were most highly enriched in C3 (

Immune characteristics of three subtypes in The Cancer Genome Atlas (TCGA)-lung adenocarcinoma (LUAD) set. (a) Heatmap describing the abundance of immune and stromal cell populations in C1, C2, and C3. Immune cell scores were calculated based on MCP-counter and single-sample GSEA (ssGSEA). (b) Boxplot of the abundance of immune and stromal cell populations distinguished by different subtypes. (c) Expression levels of 13 immune checkpoint genes. Statistical comparisons were performed using the Kruskal–Wallis test, with significance levels indicated in the figures.
To further validate the immune microenvironment characteristics, we analyzed the expression of 13 immune checkpoint genes across different subtypes (Figure 3(c)). Those genes were selected based on drug inhibitors currently approved for specific cancer types. The analysis revealed that the expression levels of PD-1 (PDCD1), PD-L1 (CD274), CTLA4, and LAG3 were significantly higher in C3 compared to C1 and C2 (
In summary, the analysis of immune cell infiltration and immune checkpoint gene expression uncovered distinct immune profiles among different LUAD subtypes. C3 exhibited an “immune-hot” tumor phenotype, characterized by high levels of immune cell infiltration and immune checkpoint gene expression, and may represent a subgroup that could potentially benefit from immunotherapy. C1 displayed an “immune-cold” phenotype, which may require combination therapeutic strategies to enhance sensitivity to immunotherapy, while C2 had an intermediate immune profile between C1 and C3.
Correlation between LUAD subgroups and clinical characteristics in TCGA and GEO datasets
To investigate the distribution of LUAD subtypes across various clinical characteristics, we analyzed patient information from the TCGA-LUAD dataset as well as the GSE30219, GSE31210, GSE37745, and GSE42127 datasets. Figure 4(a) presents the stratified statistics of the TCGA-LUAD cohort, including patient age, smoking status, pathological stage (pStage), mTOR pathway activation, and DNA methylation profiles. The overall trend indicates that C1 is overrepresented in patients with high smoking exposure (>30 pack-years), mTOR pathway activation, and specific methylation patterns, suggesting that C1 may be characterized by unique metabolic and signaling network reprogramming. In contrast, C2 is more prevalent among patients with lower smoking exposure and is associated with lower levels of mTOR pathway activation and distinct DNA methylation profiles. C3 exhibits intermediate distribution across these clinical features.

Clinical characteristics of lung adenocarcinoma (LUAD) subtypes in the TCGA cohort and GEO testing cohort. (a) Distribution of subtypes (C1, C2, and C3) in the TCGA-LUAD dataset across demographic features, smoking status, pathway activation (mTOR), DNA methylation patterns, and pathological staging. (b–e) Analysis of clinical features of LUAD subtypes in GEO datasets: GSE30219 (b), GSE31210 (c), GSE37745 (d), and GSE42127 (e).
To further validate whether the clinical characteristics of different LUAD subtypes are consistent across independent GEO datasets, we repeated the classification analysis in the GSE30219, GSE31210, GSE37745, and GSE42127 datasets (Figure 4(b)–(e)). The results from the GEO cohorts were largely consistent with those from the TCGA-LUAD dataset, indicating that C1 is more common in patients with advanced pathological stages (Stages III and IV) and heavy smoking history, while C2 is relatively more prevalent in patients with early-stage disease (Stages I and II). Additionally, the enrichment of the mTOR signaling pathway showed a similar trend across the GEO datasets, with C1 exhibiting higher levels of mTOR pathway activation and C2 showing the lowest levels of mTOR activation.
These findings demonstrate good cross-dataset consistency in the distribution of LUAD subtypes across clinical and molecular features, suggesting that C1 may be driven by metabolic factors, while C2 may be associated with more primary features of LUAD. This subtype-specific molecular pattern may hold significant implications for LUAD diagnosis, stratification, and the development of personalized therapeutic strategies.
Associations of LUAD subtypes with mutations, neoantigens, and copy number aberrations
To investigate the genetic mutations and genomic characteristics of LUAD metabolic subtypes, we analyzed the gene mutations, tumor mutation burden (TMB), predicted neoantigen load, and copy number variation (CNV) patterns across different subtypes (C1, C2, and C3) in the TCGA-LUAD dataset.
Figure 5(a) presents the mutation spectra of different LUAD subtypes, focusing on key driver genes with the highest mutation frequencies, such as

Association between lung adenocarcinoma (LUAD) subtypes and mutations, neoantigens, and copy number aberrations. (a) Waterfall plot of oncogenic driver mutations in C1, C2, and C3. (b) Analysis of tumor mutation burden (TMB). (c) Analysis of neoantigen load. (d–e) Analysis of copy number variation (CNV), including gene amplification levels (d) and gene deletion levels (e).
Figure 5(b) and (c) further examine the differences in TMB and neoantigen load. Statistical analyses revealed that C3 had significantly higher TMB and neoantigen load compared to C1 and C2 (
Figure 5(d) and (e) evaluate the CNVs across the three subtypes, including gene amplifications and deletions. C3 exhibited the highest level of CNVs (
Overall, the mutation analysis indicates that C3 is characterized by high mutation rates, neoantigen load, and genomic instability, and may benefit the most from ICI therapy. C1 exhibited intermediate levels of genomic variation, while C2 had the lowest overall mutation burden and genomic changes, consistent with its better clinical prognosis. These findings may guide personalized treatment decisions for LUAD patients.
To further explore the CNVs associated with the three metabolic subtypes of LUAD, we employed GISTIC2.0 to analyze the genomic CNV patterns across the three groups of samples. Figure 6(a)–(c) illustrates the genomic copy number alteration profiles of C1, C2, and C3. C3 exhibited the most pronounced CNV alterations, including multiple amplifications and deletions across various loci, indicative of greater genomic instability. C1 showed significant amplifications at loci 1q, 8q, and 19q, and frequent deletions at loci 9p and 18q. In contrast, C2 had the fewest CNVs, typically limited to localized amplifications/deletions of a few driver genes, consistent with its lower TMB and better clinical prognosis.

GISTIC score analysis of the lung adenocarcinoma (LUAD) subtypes in The Cancer Genome Atlas (TCGA) cohort. Cytoband map of copy number alterations across subtypes C1 (a), C2 (b), and C3 (c). Red indicates regions of amplification, and blue indicates regions of deletion. (d) Box plots of predicted IC50 values based on the Genomics of Drug Sensitivity in Cancer (GDSC) database.
Prediction of drug susceptibility
To identify potential antineoplastic drugs associated with the ICI group, we downloaded drug response data for more than 100 agents from the GDSC database. The IC50 values of the selected compounds across the different LUAD subtypes were compared using the Kruskal–Wallis test. We then listed the top 12 drugs with the most significant differences according to the
The 90-gene classifier and its performance verification
To establish a classifier for clinical use, subtype-specific genes were selected to develop a prediction model. Differential expression analysis of “subtype n versus other subtypes” was performed using the LIMMA package. Using the MOVICS package, 40 the top 30 genes with the highest log2 FC values in each subtype were selected for the development of the subtype classifier, and a correlation heat map was created (Figure 7(a)). Based on this result, we obtained a 90-gene classifier. The subtypes of TCGA and the test sets were then predicted using the NTP method. Heat maps were created to represent the degree of matching between the true and predicted subtypes (Figure 7(b) and (c)). The results revealed good consistency between the two separate methods (NMF and NTP), indicating that the 90-gene signature can reproducibly determine LUAD classification.
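The marker selection step can be sketched as follows. The example takes the 30 genes with the highest log2 FC in each "subtype versus other subtypes" comparison, giving 90 genes for three subtypes. Skipping a gene already claimed by an earlier subtype is an assumption of this sketch, and the gene names and fold changes are synthetic.

```python
def select_classifier_genes(log2fc_by_subtype, genes_per_subtype=30):
    """Pick the top up-regulated genes per subtype; returns gene -> subtype."""
    classifier = {}
    for subtype, log2fc in log2fc_by_subtype.items():
        ranked = sorted(log2fc, key=log2fc.get, reverse=True)
        # Assumption: a gene already assigned to an earlier subtype is skipped.
        candidates = [gene for gene in ranked if gene not in classifier]
        for gene in candidates[:genes_per_subtype]:
            classifier[gene] = subtype
    return classifier


# Synthetic log2 FC values: 40 candidate genes per subtype.
log2fc_by_subtype = {
    subtype: {f"{subtype}_GENE_{i}": float(i) for i in range(40)}
    for subtype in ("C1", "C2", "C3")
}
classifier = select_classifier_genes(log2fc_by_subtype)
```

Using the same number of markers per subtype keeps the templates balanced, so that no subtype dominates a template-based prediction simply because it has more markers.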

Identification of predictive classifier and putative targeted therapeutic and immunotherapeutic responses. (a) Heatmap of the expression levels of the 90-gene classifier. Concordance of The Cancer Genome Atlas (TCGA)-lung adenocarcinoma (LUAD) (b) and testing cohort (c) prediction between the 90-gene classifier and the original prediction based on nonnegative matrix factorization. (d) The predicted molecular targeted therapy and immunotherapy response of the classifier.
The observation that different subtypes exhibit different patterns of immune cell infiltration and expression levels of immune checkpoint genes indicates that further research is needed to predict immunotherapy responses. Subclass mapping was used to compare the degree of similarity in expression profiles between the three subtypes and the dataset containing 47 patients with lung cancer who received immunotherapy. The results suggested that patients in group C3 were most likely to respond to immunotherapy (Figure 7(d)).
Functional enrichment analysis of gene classifiers
GO and KEGG functional enrichment analyses of the DEGs were conducted using the clusterProfiler R package. Visualization was performed using the R package ggplot2. The significance of enrichment for the top ten DEGs is presented in Figure S1 and Table S1.
GO enrichment analysis of the DEGs implicated numerous biological processes and pathways, including metabolic processes, immune-related responses, protein binding, major histocompatibility class II receptor activity, and multiple metabolic enzyme activities. The KEGG enrichment results indicated that genes were primarily enriched for cell cycle, antigen processing and presentation, protein digestion and absorption, phagosome, and cancer-related signaling pathways (Figure S2 and Table S2).
Differences in expression of the 90-gene classifier in pan-cancer
The expression levels of the 90 genes in tumor tissues of individual cancers in the TCGA database were validated using GEPIA. The results are presented as a heat map representing DEGs among different cancer types (Figure S3(a)).
Prediction of transcription factors that can regulate the 90 genes
The transcription factors of the 90 genes were acquired from the TRRUST database. We identified 34 transcription factors that potentially regulate the 90 genes (Table S3), including multiple tumor-associated genes, proto-oncogenes, and interferon regulatory factors. The transcription factors and their targets described above were analyzed using the Metascape platform and are shown in an enrichment bar chart (Figure S3(b) and (c)). A molecular interaction network was then constructed for the 90 genes, with colors representing the strength of significance (Figure S3(d)).
Discussion
Dysregulation of cellular metabolism is indispensable for the indefinite proliferation of cancer cells and represents a hallmark of cancer.10 –12 The rewiring of cellular metabolism results in a unique set of metabolic phenotypes that (1) allow for earlier cancer diagnosis, (2) better predict cancer risk, (3) guide therapy selection, and (4) facilitate the development of methods to monitor therapeutic effectiveness. Changes in the metabolism of cancer cells can lead to unique metabolic dependencies that provide an excellent opportunity for targeted therapy.13,14 To identify the LUAD subsets associated with metabolic processes and good prognosis, we comprehensively classified the metabolic spectrum of LUAD samples. In this study, LUAD was divided into three different metabolism-relevant subtypes, and the repeatability of this subtyping was verified in several test sets. Differences in metabolic features, prognostic traits, transcriptome characteristics, immune infiltration, clinical characteristics, and drug sensitivities among the three subtypes were compared. Our results revealed that C1 possessed the most abundant metabolic pathways. Compared to C2 and C3, C1 possessed 35 different metabolic pathways, the majority of which were upregulated. Therefore, we defined C1 as the metabolically active subtype. There were five different metabolic pathways in C2, all of which were related to lipid metabolism. Therefore, we defined C2 as the lipid metabolism-related subtype. Subtype C3 possessed five metabolic pathways. Analysis of the clinical features of the groups revealed that the majority of samples in C1 were in advanced clinical and pathological stages. Moreover, subtype C1 exhibited lower enrichment in LUAD suppressor signatures (such as HIPPO and WNT) than did the other two subtypes.
Tumor microenvironment-related estimations revealed that the immune score, matrix score, and immune infiltration of C1 were significantly lower than those of the C2 and C3 subtypes. Studies have demonstrated that lower immune and matrix scores are associated with later-stage tumors and worse overall survival outcomes. 41 These results are consistent with those of our analysis, indicating that subtype C1 exhibited the worst prognosis in the training and testing sets. Compared to C2, which is involved in lipid metabolism, C1 was involved in a variety of metabolic processes, including amino acid, glucose, and lipid metabolism. Abundant metabolic signatures indicate that patients with C1 may benefit from metabolic therapies. In this era of increasing drug resistance, metabolic therapies for specific metabolic processes offer an alternative regimen for LUAD treatment.
Subtype C3 exhibited higher mutation rates in
The immune checkpoint inhibitors nivolumab and pembrolizumab can improve survival outcomes in patients with lung cancer, and they have been approved by the Food and Drug Administration for the treatment of patients with advanced and recurrent lung cancer.54,55 It is important to identify objective molecular markers that can predict the effects of immune therapy. Based on ongoing clinical trials and inhibitors approved for specific cancer types, we selected 13 potential immune checkpoint genes. Our results revealed that the majority of the immune checkpoint genes were highly expressed in subtype C3, suggesting that patients with subtype C3 may respond better to anti-PD-1 therapy. Moreover, we retrospectively analyzed prognostic and predictive markers of efficacy reported by other investigators. In recent years, neoantigens derived from oncogenic driver gene mutations have become a major focus in immunotherapy efficacy studies.56 –58 Typically, a high neoantigen load in tumors has been linked to an enhanced response to ICI.59,60 Our study revealed that the neoantigen loads of subtypes C3 and C1 were significantly higher than that of C2, suggesting that subtypes C1 and C3 may benefit from immunotherapy, and patients with subtype C2 are less likely to experience effective immunotherapy outcomes. Several studies have reported that tumors with CD8+ T cell infiltration and high PD-L1 expression can benefit from ICI therapy.61,62 Consistent with the studies by Chen and Peng, the degree of infiltration of activated natural killer cells and CD8+ T cells in subtype C3 was higher than that in the other two groups,63,64 suggesting that patients with subtype C3 are most likely to benefit from immunotherapy. TMB is also a key factor that affects the efficacy of immunotherapy.65 –67 Our data revealed significant differences in the TMB among the subtypes. The TMB of subtype C3 was significantly higher than those of the other two subtypes.
Taken together, these analyses suggest, from several different perspectives, that patients with subtype C3 are most likely to benefit from immunotherapy.
We further examined the IC50 values of 100 broadly employed drugs for LUAD therapy (primarily molecular-targeted drugs) in the three subgroups. The top 12 drugs with the largest gaps are presented in Figure 6(d). The data revealed that the IC50 of docetaxel in subtype C3 was significantly lower than that in C1 and C2, suggesting that different chemotherapy regimens could be considered based on metabolic subtypes. Preclinical studies have demonstrated that drugs targeting PKM2 are effective against certain types of cancer and have been established as safe for patients in early phase clinical trials.68 –70 We observed that shikonin, a PKM2 inhibitor, exhibited significantly different IC50 values among the subtypes examined in this study. Notably, the IC50 of subtype C1 was significantly higher than those of the other two subtypes, suggesting that C1 may be resistant to this drug. These findings have important implications for future clinical research and practice.
This study developed a metabolic signature that predicts the prognosis of patients with LUAD, thereby providing support for personalized treatment. The signature consisted of 90 metabolic genes. In both the training and testing sets, the prognosis of patients in the high-risk group was significantly worse than that of the other patients. This metabolic signature can assist in identifying patients who may benefit from specific targeted therapies or immunotherapies and may offer new insights for the early screening of LUAD, particularly for subtypes with weak immune responses or high metabolic activity. Future public health strategies could potentially incorporate early interventions targeting these subtypes. We also evaluated the application potential of the 90-gene classifier in other cancer types. The results revealed differences in expression levels among the different types of cancer, with marked differences observed in lung squamous cell carcinoma, pancreatic cancer, cholangiocarcinoma, and gastric adenocarcinoma. This finding holds potential value for broader public health strategies addressing cancer. Overall, our work provides an in-depth analysis of LUAD subtypes based on large-scale datasets (TCGA and GEO), improves the understanding of the metabolic hallmarks of LUAD, and provides meaningful reference information for individualized treatment and prognosis prediction.
However, this study has some limitations. First, the study relies primarily on publicly available databases, and although data processing was rigorous, the results may still be affected by data acquisition standards and sample heterogeneity. More clinical and demographic characteristics of patients with LUAD should be included in the analysis to comprehensively and systematically reflect the factors influencing LUAD metabolic profiles. Second, our results must be validated using larger sample sizes and cohorts with more statistical power. Third, the identification of LUAD subtypes was based on bioinformatics analysis, which may limit practical clinical application. Therefore, validation in clinical samples and biological experiments is necessary to understand the differences in mechanisms among the three metabolism-relevant subtypes of LUAD. Fourth, subtype C2 has a favorable prognosis, characterized by low TMB and copy number variation, and is speculated to respond well to conventional treatments such as surgery combined with adjuvant chemotherapy. However, current clinical trial data for this subtype are insufficient to determine whether additional targeted therapies or immunotherapies are needed. Future research should validate the treatment requirements and potential benefits for subtype C2 through prospective clinical trials. Moreover, future studies should validate our findings in other histological types of lung cancer to determine the universality of the metabolic subtypes. Finally, owing to the retrospective nature of the cohort, a degree of selection bias was inevitable.
Conclusions
In summary, this study classified LUAD from the perspective of metabolism and proposed three subtypes. C1 was closely related to metabolic processes and was in accordance with the characteristics of established LUAD of the proximal proliferative subtype. C2 exhibited a good prognosis similar to that of the terminal respiratory unit subtype. Subtype C3 possessed a higher level of TMB,
Supplemental Material
Supplemental material for Identification of metabolism-associated molecular classification for effect and prognosis in lung adenocarcinoma based on multidatabases including the cancer genome atlas and gene expression omnibus by Lilin Que, Zhibing Liu, Yinghui Wu, Lan Luo and Leifeng Liang in SAGE Open Medicine:
sj-docx-1-smo-10.1177_20503121251341114
sj-docx-2-smo-10.1177_20503121251341114
sj-docx-3-smo-10.1177_20503121251341114
sj-tif-4-smo-10.1177_20503121251341114
sj-tif-5-smo-10.1177_20503121251341114
sj-tif-6-smo-10.1177_20503121251341114
Footnotes
Acknowledgements
We would like to thank Editage for language editing.
Ethical considerations
This study did not include any studies with human participants or animals performed by any of the authors.
Author contributions
LQ and ZL: conceptualization. YW and LFL: methodology. LQ and YW: software. LL, ZL, and YW: investigation. LFL: resources. LFL, ZL, and YW: data curation. LQ and ZL: original draft preparation. LQ, ZL, YW, LFL, and LL: writing, review and editing. LFL: supervision. LFL: funding acquisition. All the authors have read and agreed to the published version of the manuscript.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by grants from the National Natural Science Foundation of China (No. 82260627) and the Guangxi Science and Technology Major Project (GuikeAA22096030).
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Data availability statement
Publicly available datasets were analyzed in this study. These data can be found in The Cancer Genome Atlas (https://portal.gdc.cancer.gov/) and the Gene Expression Omnibus.
Supplemental material
Supplemental material for this article is available online.
References
