Abstract

Background:

The cellular composition of the tumor immune microenvironment (TIME) is a key contributor to the response of the tumor to immunotherapy. Transforming growth factor-beta (TGF-β) signaling is known to promote immune exclusion, where CD8⁺ T cells are in the surrounding stromal tissue but not within the tumor itself. To better identify patients with an immune-excluded phenotype, we developed two machine learning (ML) models to quantify CD8⁺ cell positivity and classify the immunophenotype of a histological cancer specimen.

Methods:

Immunohistochemistry against CD8 was performed on nonsmall cell lung cancer (NSCLC) samples (N = 200) and digitized whole slide images (WSIs) were then generated. ML models, trained on these WSIs, identified relevant tissue regions (cancer epithelium, stroma) and cell types (CD8⁺ lymphocytes). Features related to CD8, including overall CD8⁺ count proportion, CD8⁺ count proportion in cancer epithelium, and CD8⁺ count proportion in cancer-associated stroma, were extracted for the ML-based approaches to predict immunophenotypes. In the cutoff model, data-driven cutoffs were applied to model-generated human interpretable features of CD8⁺ count proportion within cancer epithelium and cancer-associated stroma, whereas in the spatial model, all tissue and cell model predictions within the TIME were used to train a graph neural network to classify immunophenotypes.

Results:

An inverse correlation was observed between TGF-β signaling and manually determined CD8⁺ cell levels. CD8 quantification model predictions showed high concordance with pathologist consensus annotations for all model classes. Concordance of model-derived immunophenotype predictions with ground truth pathologist-derived immunophenotype consensus labels was comparable with concordance of an average pathologist with the same ground truth consensus. Significant associations were seen between immunophenotypes (derived from ground truth and model predictions) and relative abundance of T cell populations and PD-L1 activity gene signature scores, while a trend was observed between immunophenotypes and a gene signature indicative of TGF-β signaling.

Conclusions:

We developed a digital pathology approach that can characterize and classify the cancer immunophenotypes in a reproducible and scalable manner, paving the road for the application of such a method to identify patients who may benefit from immunotherapy and/or TGF-β blockade in NSCLC.

Introduction

Lung cancer remains the leading cause of cancer-related death in the United States and worldwide, and nonsmall cell lung cancer (NSCLC) is the most common histological subtype.¹ Many therapeutic strategies have been implemented to target actionable genomic alterations in lung cancer. However, these approaches have only provided incrementally improved prognosis for patients, as resistance to targeted therapeutics inevitably occurs in most patients.^2,3

Another common treatment strategy for NSCLC is immunotherapy, especially immune checkpoint blockade (ICB). ICB strategies involve the suppression of T cell activation inhibitors, such as PD-1, PD-L1, and CTLA4.⁴ The first ICB agent approved for NSCLC was nivolumab, a PD-1 inhibitor, based on results from the CheckMate 017 and 057 clinical trials.^5–7 Subsequently, approvals for pembrolizumab (a PD-1 inhibitor) and atezolizumab (a PD-L1 inhibitor) were granted based on observed responses in ICB-treated patients.^8–13

Long-term follow-up results are now available for many of the seminal ICB clinical trials in NSCLC. Four-year overall survival rates for atezolizumab were nearly doubled compared with those for docetaxel,¹⁴ whereas 5-year overall survival rates for nivolumab and pembrolizumab showed similar results.^15,16 More recently, the PD-1 inhibitor cemiplimab, either as monotherapy or in combination with chemotherapy, was assessed in the EMPOWER-1 and EMPOWER-3 trials, respectively. In these trials, cemiplimab was shown to significantly reduce the risk of death and extend overall and progression-free survival compared with chemotherapy alone in the first-line setting, leading to FDA approval for this drug for patients with specific PD-L1 levels.^17–19

However, although some NSCLC patients treated with ICB do show prolonged responses, including in excess of 5 years, the median overall survival remains under 2 years.^14–16 The ability to accurately predict which patients are likely to respond would greatly aid in the design and execution of future ICB clinical trials in NSCLC. The low median overall survival that has been observed for ICB agents may be due to mechanisms that promote tumor immune evasion. It has been reported that a large number of immune cells present in the tumor stroma, including cytotoxic CD8⁺ T cells (CTLs), could impact tumor progression.

Tumor immune microenvironment (TIME) classifications depend on the relative location of these CTLs. Classifications include immune inflamed (immune cells infiltrated into tumor core), immune excluded (CTLs present but not infiltrated into tumor), and immune desert (CTLs largely absent from tumor and stroma).^20–23 TIMEs classified as immune excluded and immune desert are typically not immunogenic. In contrast, inflamed TIMEs are considered immunogenic and are characterized by PD-1(+) CTLs. Tumors with immune-inflamed TIMEs are, typically, more likely to respond to ICB.

In cancer settings, transforming growth factor-beta (TGF-β) demonstrates a profound influence on the cancer cells themselves, the TIME, and the antitumor immune response.^24,25 Targeting the TGF-β pathway in patients who have relapsed or shown resistance to checkpoint inhibitors is an active area of research. The TGF-β pathway plays a critical role in immune suppression and tumor progression by inhibiting the immune response and promoting tumor growth and metastasis.²⁶ In addition, TGF-β signaling in the tumor microenvironment can contribute to immune cell exclusion, a phenomenon wherein immune cells are unable to effectively infiltrate the tumor and mount an antitumor immune response.²⁷

By inhibiting TGF-β signaling, the immunosuppressive effects of the tumor microenvironment can be reduced, thereby promoting immune cell infiltration, potentially leading to enhanced antitumor immune responses. Since TGF-β signaling in the tumor microenvironment can contribute to the immune cell exclusion phenomenon, a major barrier to successful immunotherapy, patients with an immune-excluded phenotype might benefit from TGF-β blockade, alone or in combination with ICB. Nevertheless, the translation of TGF-β inhibitors into the clinical setting has been hampered by the lack of effective patient-selective approaches. As such, improved predictive biomarkers are necessary to maximize the potential of this therapeutic strategy.²⁸

Given the potential importance of the immune milieu of a tumor for response to both ICB and TGF-β blockade, we hypothesize that immune phenotypes have the potential to act as a biomarker in this regard. Gold standard methods for immune phenotyping using CD8 immunohistochemistry (IHC) involve the manual scoring of CD8⁺ T cells within the tumor microenvironment to assess their distribution, density, functional state, and spatial organization. Despite the importance of CD8⁺ T cell localization as a potential biomarker in cancer, it is extremely difficult for pathologists to make these classifications during standard manual review of clinical specimens.

Unlike for PD-L1, for which there are multiple FDA-approved IHC assays,^29,30 there are currently no standardized assays for clinical CD8 assessment. Furthermore, the cellular complexity of the TIME, as well as pathologist variability and subjectivity, makes the manual determination of TIME status incredibly challenging.^24,30–34 In this regard, approaches such as digital pathology—computational analyses of digitized images of histological specimens—may prove useful. Indeed, computational models are proving to perform at least as well as manual assessment in biomarker analyses of oncology clinical trial specimens.³⁵

Given the potential for digital pathology approaches in oncology, we sought to develop an approach that uses computer vision models to predict TIME features in NSCLC. Herein, we describe two distinct machine learning (ML) models that predict immune phenotypes from CD8-stained NSCLC tissue, and we validate their performance through concordance with manual pathologist assessment and correlation with relevant gene expression signatures.

Methods

Data set description

Formalin-fixed paraffin embedded NSCLC tumor blocks (N = 200) were acquired from commercial sources (BioIVT and Cureline) and subsequently sectioned onto slides, stained, imaged, and manually scored at a central laboratory (Labcorp., Indianapolis, IN). Staining with hematoxylin and eosin (H&E) and IHC to detect CD8 (SP57; Ventana Benchmark XT) was performed on these slides. H&E and IHC slides were scanned at a resolution of 0.25 μm per pixel (40 × objective) using the Aperio AT2 scanner (Leica Biosystems, Vista, CA) to generate whole slide images (WSIs).

Images were manually assessed to establish a consensus score by two board-certified pathologists, who provided the percentage of CD8⁺ T cells present within the whole tumor region and expressed in increments of 5% for those areas containing >10% positive cells. Samples with infiltrations greater than the median level (10%) were classified as CD8-high, and samples with infiltrations less than or equal to the median level were classified as CD8-low.
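The median-based dichotomization described above can be sketched as follows. This is a minimal illustration only: the function name `classify_cd8` and the example scores are hypothetical and are not taken from the study data.

```python
from statistics import median

def classify_cd8(scores):
    """Split samples at the cohort median: above it is CD8-high, at or below is CD8-low."""
    cutoff = median(scores)
    labels = ["CD8-high" if s > cutoff else "CD8-low" for s in scores]
    return cutoff, labels

# Hypothetical consensus scores (percentage of CD8+ cells in the tumor region).
scores = [0, 5, 10, 10, 15, 40]
cutoff, labels = classify_cd8(scores)
```

Samples that fall exactly on the median are assigned to CD8-low, matching the "less than or equal to" rule stated in the text.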

Development and validation of an ML model to quantify CD8 positivity

ML models were developed to quantify CD8 expression in digitized WSI of NSCLC samples. Three deep convolutional neural networks were trained to classify tissue regions and cell types. An artifact model was trained to distinguish regions of artifact (e.g., blur and tissue folds) from slide background and evaluable regions of tissue free of artifact. Within regions of evaluable tissue, a tissue model was trained to classify cancer epithelium, cancer-associated stroma, necrosis, and normal (nontumor) tissue. Within the tumor region, a cell model was trained to detect and classify CD8⁺ lymphocytes, CD8^– lymphocytes, and other nucleated cells. No other CD8 detection software was used for these studies.

Annotations used to train models (e.g., artifact, cancer epithelium, CD8⁺ cells) were provided by board-certified pathologists, and quality control review of annotations was completed before training. Training and validation of all models occurred with CD8-stained WSIs from 199 unique cases. Corresponding negative control slides stained with an isotype control reagent (DA1E; Cell Signaling Technology, Danvers, MA)³⁶ and H&E samples were utilized as reference during annotation collection and model validation.

Three additional cases were excluded from model development because they did not contain enough evaluable tissue, due to a high percentage area of blur and tissue folds. The cases used for model development were randomly split into training (60%), validation (20%), and held-out test (20%) slides while ensuring that splits were balanced across relevant metadata, including (1) histology (adenocarcinoma, squamous, or other [solid + neuroendocrine]), (2) vendor, and (3) individual cases. The final split was 125/36/38 cases.
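One way to obtain a split that is balanced across metadata is to shuffle and divide the cases within each metadata stratum. The sketch below assumes this stratified approach; the function `stratified_split`, the record fields, and the synthetic cohort are hypothetical, and the exact balancing procedure used in the study is not specified in the text.

```python
import random
from collections import defaultdict

def stratified_split(cases, key, fractions=(0.6, 0.2, 0.2), seed=0):
    """Split case records into train/validation/test within each stratum given by key(case)."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for case in cases:
        strata[key(case)].append(case)
    splits = {"train": [], "validation": [], "test": []}
    for members in strata.values():
        rng.shuffle(members)
        n_train = round(fractions[0] * len(members))
        n_val = round(fractions[1] * len(members))
        splits["train"] += members[:n_train]
        splits["validation"] += members[n_train:n_train + n_val]
        splits["test"] += members[n_train + n_val:]  # remainder goes to the held-out set
    return splits

# Hypothetical cohort: 100 cases, 25 in each histology/vendor stratum.
strata = [(h, v) for h in ("adenocarcinoma", "squamous") for v in ("vendor_a", "vendor_b")]
cases = [{"id": i, "histology": h, "vendor": v} for i, (h, v) in enumerate(strata * 25)]
splits = stratified_split(cases, key=lambda c: (c["histology"], c["vendor"]))
```

Because each case is treated as a single unit, all slides from one case land in the same split, which is one reading of balancing across "individual cases".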

The 38 test cases were completely held out from all development until model training was complete, at which point the finalized locked models were deployed and assessed on these cases. Evaluation of the cell model performance occurred through comparison of model predictions with a consensus of five pathologist annotations on eighty-two 75 × 75 μm regions of interest, or “frames,” which were sampled throughout the tumor area of the test slides to ensure adequate representation.³⁷

Correlation analysis was used to compare the median number of cells of each cell type (e.g., CD8⁺ lymphocyte, CD8^– lymphocytes, other cells) identified by pathologists and the cell model. Adequate performance was defined as the Pearson correlation between model and consensus counts being within the confidence intervals for average annotator versus consensus correlation.
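The comparison can be illustrated with a short sketch that forms a per-frame consensus as the median count across annotators and then computes Pearson correlations for the model and for each annotator. The counts below are invented for illustration, the helper names are hypothetical, and the confidence interval step of the acceptance criterion is not implemented here.

```python
from math import sqrt
from statistics import mean, median

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def consensus_counts(annotator_counts):
    """Per-frame median across annotators; each inner list holds one annotator's counts."""
    return [median(frame) for frame in zip(*annotator_counts)]

# Hypothetical CD8+ lymphocyte counts for four frames from five annotators and the model.
annotators = [
    [3, 10, 0, 7],
    [4, 12, 1, 6],
    [3, 11, 0, 8],
    [5, 9, 0, 7],
    [4, 10, 2, 7],
]
model = [4, 11, 1, 7]

consensus = consensus_counts(annotators)
model_r = pearson(model, consensus)
annotator_r = mean(pearson(a, consensus) for a in annotators)
```

In practice the same calculation would be repeated for each cell class, and `model_r` would be compared against an interval around `annotator_r`.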

Development and validation of models to predict immunophenotypes

For immune phenotyping, ML-generated tissue region and cell type predictions were used together with pathologist consensus ground truth immunophenotype labels. For each sample, five pathologists estimated intratumoral CD8⁺ cell density within cancer and cancer stroma regions, which was used to derive majority consensus labels of desert, excluded, and inflamed phenotypes. The data set was divided into training, validation, and test sets, and two methods were compared (Fig. 1A).
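Deriving a majority consensus label from five reads can be sketched as below. The handling of ties (for example, a 2/2/1 vote) is not described in the text, so returning `None` for a tie is an assumption made purely for illustration, as is the function name `majority_label`.

```python
from collections import Counter

def majority_label(votes):
    """Return the most frequent label, or None when the top two labels are tied (assumed behavior)."""
    ranked = Counter(votes).most_common()
    top_label, top_count = ranked[0]
    if len(ranked) > 1 and ranked[1][1] == top_count:
        return None
    return top_label

# Hypothetical reads from five pathologists for one sample.
votes = ["excluded", "excluded", "inflamed", "desert", "excluded"]
label = majority_label(votes)
```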

Fig. 1.

Digital pathology approaches to immune phenotyping. (A) Two ML models were developed using digitized WSIs to classify tissue regions (e.g., cancer epithelium, stroma, and necrosis) and cell types (e.g., CD8⁺, CD8^–, and nonlymphocytes). In the first approach (Method 1), data-driven cutoffs were applied to model-generated HIFs of CD8⁺ lymphocyte count proportion within the cancer epithelium and stroma to classify samples as desert, excluded, or inflamed. In the second approach (Method 2), all tissue and cell model predictions were used to train a GNN to classify samples as desert, excluded, or inflamed. (B) For the spatial model, an unsupervised GNN was applied to CD8 IHC WSI. This GNN was trained to discover tissue patterns defined by the spatial arrangement of CD8⁺ cells and other cell types relative to cancer epithelium and stroma. GNN, graph neural network; HIFs, human interpretable features; IHC, immunohistochemistry; ML, machine learning; WSIs, whole slide images.

In the cutoff method, selected cutoffs were applied to model-generated features to assign immunophenotypes. For this method, a two-parameter classifier was trained based on two model-generated features that capture immunophenotype patterns in the tumor: count proportion of CD8⁺ cells in the stroma and count proportion of CD8⁺ cells in the cancer epithelium. These two features form a pair of coordinates for each patient WSI. For each WSI and pair of coordinates in the training and validation sets, we performed a cutoff search across both coordinates and measured the weighted F1 between labels derived from the cutoffs and labels from pathologist ground truth.

This metric was generated after computation of the agreement (F1) for each label (desert, excluded, and inflamed) and was weighted by the number of data points for each label, accounting for any label imbalance. A bootstrapped grid search was executed over cutoff pairs. The 25th percentile of the score was measured for each cutoff pair across bootstrap replicates, and the cutoff pair maximizing this statistic was selected, as this criterion combines a search for efficacy with some weight toward lower variance across data resampling.
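The cutoff search can be sketched as a grid search in which each cutoff pair is scored by the 25th percentile of the weighted F1 across bootstrap resamples. The decision rule in `assign_phenotype` (epithelial proportion first, then stromal proportion) is an assumption, since the text does not spell out how the two cutoffs map to the three labels; the grid, the number of replicates, and the toy data are likewise hypothetical.

```python
import random

LABELS = ("desert", "excluded", "inflamed")

def assign_phenotype(epi, stroma, epi_cut, stroma_cut):
    """Assumed rule: epithelial CD8+ proportion decides inflamed, then stroma decides excluded."""
    if epi >= epi_cut:
        return "inflamed"
    if stroma >= stroma_cut:
        return "excluded"
    return "desert"

def weighted_f1(truth, pred):
    """Per-label F1 averaged with weights equal to each label's count in truth."""
    total = 0.0
    for label in LABELS:
        tp = sum(t == label and p == label for t, p in zip(truth, pred))
        fp = sum(t != label and p == label for t, p in zip(truth, pred))
        fn = sum(t == label and p != label for t, p in zip(truth, pred))
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        total += (tp + fn) * f1
    return total / len(truth)

def percentile(values, q):
    """Percentile with linear interpolation between order statistics (q in [0, 1])."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)

def search_cutoffs(features, truth, grid, n_boot=200, seed=0):
    """Return (statistic, epi_cut, stroma_cut) maximizing the 25th-percentile bootstrap F1."""
    rng = random.Random(seed)
    n = len(truth)
    # Draw the resamples once so every cutoff pair is scored on the same replicates.
    resamples = [[rng.randrange(n) for _ in range(n)] for _ in range(n_boot)]
    best = None
    for epi_cut in grid:
        for stroma_cut in grid:
            pred = [assign_phenotype(e, s, epi_cut, stroma_cut) for e, s in features]
            scores = [
                weighted_f1([truth[i] for i in idx], [pred[i] for i in idx])
                for idx in resamples
            ]
            stat = percentile(scores, 0.25)
            if best is None or stat > best[0]:
                best = (stat, epi_cut, stroma_cut)
    return best

# Toy data: (epithelial proportion, stromal proportion) per WSI with ground truth labels.
features = [(0.15, 0.15)] * 3 + [(0.15, 0.25)] * 3 + [(0.25, 0.25)] * 3
truth = ["desert"] * 3 + ["excluded"] * 3 + ["inflamed"] * 3
best_stat, best_epi, best_stroma = search_cutoffs(features, truth, grid=[0.1, 0.2, 0.3])
```

Scoring by a lower percentile rather than the mean favors cutoff pairs whose performance is stable under resampling, which is the rationale given in the text for this statistic.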

In a parallel approach, model-generated tissue and cell overlays were used to train a spatial model, a graph neural network (GNN)^38–40 (Fig. 1B), to predict immunophenotypes. Immunophenotyping classification using a 20% cancer CD8 positivity threshold and 20% cancer-associated stroma CD8 positivity threshold is presented. GNN predictions for each WSI were clustered into superpixels to construct the nodes of the graph. To improve computational efficiency, pixels were randomly sampled from each WSI and clustered based on their spatial coordinates using the Birch clustering method.

Each GNN pixel prediction was assigned to the cluster of its nearest neighbor in the clustered subset of ∼5000 pixels. This process reduced the hundreds of thousands of pixel-level predictions to thousands of superpixel clusters. WSI regions predicted as background and normal tissue were excluded from the clustering process. A directed graph with edges between each node and its five nearest neighbors was constructed using the K nearest neighbor algorithm, and self-loops were also incorporated. Each graph node was represented by three classes of features generated from previously trained GNN predictions, which were predefined to be biological classes of known clinical relevance.
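The graph construction step can be sketched as follows, assuming that each node is summarized by a single (x, y) centroid and that neighbors are ranked by Euclidean distance. The function name `knn_edges` and the example coordinates are hypothetical.

```python
from math import dist

def knn_edges(nodes, k=5):
    """Directed edges from each node to itself and to its k nearest neighbors."""
    edges = []
    for i, point in enumerate(nodes):
        others = [j for j in range(len(nodes)) if j != i]
        nearest = sorted(others, key=lambda j: dist(point, nodes[j]))[:k]
        edges.append((i, i))  # self-loop
        edges.extend((i, j) for j in nearest)
    return edges

# Hypothetical superpixel centroids laid out along a line.
nodes = [(float(x), 0.0) for x in range(7)]
edges = knn_edges(nodes)
```

The edges are directed because nearest-neighbor relations are not symmetric: node 0 points to node 5 here, but node 5 need not point back to node 0 when it has closer neighbors.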

The spatial features included the mean and standard deviation of the (x, y) coordinates of the node. The topological features included the area, perimeter, and convexity of the cluster. The logit-related features included the mean and standard deviation of the logits for each of the classes of GNN-generated overlays (Fig. 1B).
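As an illustration of the spatial and logit-related features, the sketch below summarizes the pixels assigned to one node. The input layout, the feature names, and the use of the population standard deviation are assumptions; the topological features (area, perimeter, convexity) are omitted because they require a geometry routine that is not described in the text.

```python
from statistics import mean, pstdev

def node_features(pixels):
    """Summarize one cluster; pixels is a list of dicts with 'x', 'y', and per-class 'logits'."""
    xs = [p["x"] for p in pixels]
    ys = [p["y"] for p in pixels]
    features = {
        "x_mean": mean(xs), "x_std": pstdev(xs),
        "y_mean": mean(ys), "y_std": pstdev(ys),
    }
    n_classes = len(pixels[0]["logits"])
    for c in range(n_classes):
        values = [p["logits"][c] for p in pixels]
        features[f"logit{c}_mean"] = mean(values)
        features[f"logit{c}_std"] = pstdev(values)
    return features

# Hypothetical cluster of four pixels with two-class logits.
pixels = [
    {"x": 0, "y": 0, "logits": (1.0, 0.0)},
    {"x": 2, "y": 0, "logits": (3.0, 0.0)},
    {"x": 0, "y": 2, "logits": (1.0, 0.0)},
    {"x": 2, "y": 2, "logits": (3.0, 0.0)},
]
features = node_features(pixels)
```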

RNA-seq analysis

RNA-seq data were processed as follows: sequencing reads were mapped to the reference genome GRCh38 using the Spliced Transcripts Alignment to a Reference (STAR) aligner.⁴¹ Gene expression was initially measured in FPKM (fragments per kilobase of transcript per million mapped reads) by Cufflinks,⁴² and the gene-level FPKM was converted into TPM (transcripts per million). The TPM values were log₂ transformed and quantile normalized for downstream analysis, including differential gene expression analysis. A computational method, gene set variation analysis, was utilized to calculate gene signature scores such as the TGF-β pathway activation scores.^43–45 In addition, the abundances of immune and stromal cells were estimated by a deconvolution method, MCP-counter.⁴⁶
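The conversion and normalization steps can be sketched as below. Within a sample, TPM is obtained by rescaling FPKM values so that they sum to one million. The pseudocount of 1 in the log₂ transform and the arbitrary ordering of tied values during quantile normalization are assumptions made for this illustration and are not stated in the text; the function names and values are hypothetical.

```python
from math import log2

def fpkm_to_tpm(fpkm):
    """Rescale one sample's FPKM values so that they sum to one million."""
    total = sum(fpkm)
    return [value / total * 1e6 for value in fpkm]

def log2_transform(values, pseudocount=1.0):
    """log2(x + pseudocount); the pseudocount is an assumed choice to avoid log2(0)."""
    return [log2(value + pseudocount) for value in values]

def quantile_normalize(samples):
    """Give every sample the same distribution: the mean of the sorted values at each rank."""
    n_genes = len(samples[0])
    sorted_samples = [sorted(sample) for sample in samples]
    rank_means = [
        sum(s[rank] for s in sorted_samples) / len(samples) for rank in range(n_genes)
    ]
    normalized = []
    for sample in samples:
        order = sorted(range(n_genes), key=lambda gene: sample[gene])
        result = [0.0] * n_genes
        for rank, gene in enumerate(order):
            result[gene] = rank_means[rank]
        normalized.append(result)
    return normalized

# Hypothetical values: one FPKM vector, and two samples with three genes each.
tpm = fpkm_to_tpm([1.0, 1.0, 2.0])
logged = log2_transform(tpm)
normalized = quantile_normalize([[1.0, 2.0, 3.0], [6.0, 4.0, 5.0]])
```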

Statistical analyses

To evaluate the performance of the cutoff and spatial models, weighted F1 scores and unweighted Cohen's kappa scores were calculated on both the held-out test set and the combined training and validation set. These metrics were qualitatively compared with individual pathologist concordance and with the consensus. Descriptive statistics were computed for biomarker characteristics. For comparisons of gene expression signatures, one-way analysis of variance (ANOVA) was used to compare means across predefined subgroups. p values from the ANOVA were considered significant if <0.05. Plots were generated using R software version 4.0.4 (R: A Language and Environment for Statistical Computing).
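Two of the statistics named above can be written out directly. The sketch below computes unweighted Cohen's kappa between two label sequences and the one-way ANOVA F statistic across groups; obtaining a p value additionally requires the F distribution, which is omitted here. The function names and example values are hypothetical.

```python
from collections import Counter

def cohens_kappa(a, b):
    """Unweighted Cohen's kappa: agreement between two raters corrected for chance."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    counts_a, counts_b = Counter(a), Counter(b)
    expected = sum(counts_a[label] * counts_b[label] for label in set(a) | set(b)) / n ** 2
    return (observed - expected) / (1 - expected)

def anova_f(groups):
    """One-way ANOVA F statistic: between-group over within-group mean square."""
    values = [v for group in groups for v in group]
    grand_mean = sum(values) / len(values)
    means = [sum(group) / len(group) for group in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

# Hypothetical consensus versus model labels, and signature scores for three phenotypes.
consensus = ["desert", "desert", "excluded", "excluded", "inflamed", "inflamed"]
model = ["desert", "desert", "excluded", "inflamed", "inflamed", "inflamed"]
kappa = cohens_kappa(consensus, model)
f_stat = anova_f([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]])
```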

Results

Association between TGF-β gene expression with CD8 levels in NSCLC

Given the known association between TGF-β cytokine levels and tumor immune exclusion,^28,47 we sought to examine the relationship between TGF-β and CD8 levels in a cohort of NSCLC samples. CD8⁺ cell levels were assessed by manual scoring after CD8 IHC (Labcorp), and the activated TGF-β pathway signature score levels⁴³ in the tumor sample were assessed by RNA-seq in matched tissues from these cases. Infiltration of CD8⁺ cells was classified as low (≤10% infiltration) or high (>10% infiltration). As shown in Figure 2, higher levels of the TGF-β pathway activation score were observed in cases with lower CD8⁺ cell infiltration.

Fig. 2.

Association between TGF-β gene expression signature and CD8 infiltration. Levels of CD8 were assessed by manual scoring of CD8 IHC and compared with a TGF-β transcriptomic signature from patient-matched tissue (For CD8 IHC low, N = 96, median = 0.06, mean = 0.07, SD = 0.23; for CD8 IHC high, N = 71, median = −0.04, mean = −0.02, SD = 0.19). An inverse relationship between TGF-β gene signature scores and CD8 levels was observed. Median gene signature levels ±1.5 times the interquartile range are shown. SD, standard deviation; TGF-β, transforming growth factor-beta.

ML-based immunophenotyping in NSCLC

Although our observed inverse association between TGF-β gene signature levels and manually annotated CD8 levels in NSCLC was intriguing, manual analysis of the TIME, including CD8⁺ scoring, is challenging due to inter-reader variability and lack of established standardized manual assays.^24,30–34 Therefore, we sought to develop a more quantitative approach to assess CD8⁺ cell enumeration and infiltration. As such, we developed a digital pathology approach for quantifying CD8 infiltration and predicting immune phenotypes in NSCLC (Fig. 1).

ML models were developed using digitized WSIs to identify relevant tissue regions (cancer epithelium, stroma, necrosis; “tissue model”) and cell types (CD8⁺ lymphocytes, CD8^– lymphocytes, and other cells; “cell model”). From these models, human interpretable features (HIFs) relating to CD8 were extracted, including overall CD8⁺ cell count proportion, CD8⁺ count proportion in cancer epithelium, and CD8⁺ count proportion in cancer-associated stroma.

To assess the performance of our cell model, model predictions were compared with a consensus of annotations from five expert pathologists using Pearson correlation. The average correlation of a single manual pathologist annotator with consensus was also calculated. Model predictions showed high concordance with consensus for all cell model classes, with Pearson values comparable with those of the average annotator versus consensus (Fig. 3).

Fig. 3.

Performance of our ML model in predicting CD8⁺ lymphocytes. Model predictions were compared with a consensus of annotations from five expert pathologists using the Pearson correlation. The average correlation of a single manual pathologist annotator with consensus was also calculated. Model predictions showed high concordance with consensus for all cell model classes; this concordance was comparable with that of the average annotator versus consensus.

Two ML model-derived approaches were taken to predict immunophenotypes based on tissue and cell model outputs (Fig. 1). In the first approach, data-driven cutoffs were applied to model-generated HIFs of CD8⁺ count proportion within cancer epithelium and cancer-associated stroma to classify samples as immune desert, excluded, or inflamed. In the second approach, all tissue and cell model predictions within the TIME were used to train a GNN to classify samples as immune desert, excluded, or inflamed. Results from both approaches showed moderate concordance with a 5-way pathologist consensus in a held-out test set, comparable with the concordance of an average pathologist with consensus (Fig. 4).

Fig. 4.

Comparison of digital pathology-based immunophenotyping methods. (A) Pathologist labels based on a 20% cancer CD8 positivity threshold and 20% cancer stroma positivity threshold were used for classification. (B, C) Immunophenotyping results using both cutoff and spatial (GNN) methods showed moderate concordance with five-way pathologist consensus in a held-out test set, comparable with concordance of an average pathologist with consensus.

Functional validation of immunophenotyping model

Given the strong performance of our model in predicting both CD8 levels and immunophenotypes in our NSCLC cohort compared with pathologists, we sought to functionally validate our models by comparing model-predicted immunophenotypes with relevant gene expression signatures. Samples determined to be immune inflamed (either by pathologist-labeled ground truth, cutoff model prediction, or spatial model prediction) showed elevated relative abundance of T cells, CD8⁺ T cells, cytotoxic lymphocytes, and overall CD8⁺ cell positivity (Fig. 5). Interestingly, relative abundance of other immune cell populations was more variable.

Fig. 5.

Abundance of T cell populations in model-predicted immunophenotypes. Transcriptomic analyses were performed to compare abundance of T cell populations estimated by associated gene expression signatures with immunophenotypes predicted by the (A) cutoff or (B) spatial model in patient-matched tissues. Model-predicted immunophenotypes (dashed outlines) were compared with ground truth (solid outlines). Median gene signature levels ±1.5 times the interquartile range are shown. (C) p Values calculated by ANOVA were used to identify significant differences between immunophenotypes for each gene expression signature. ANOVA, analysis of variance.

Samples determined to be immune inflamed (either by pathologist-labeled ground truth, cutoff model prediction, or spatial model prediction) showed elevated expression of gene signatures associated with monocytes (Fig. 6). Samples labeled as immune inflamed by pathologists showed elevated relative abundance of B cell and NK cell populations; although not significant, a similar trend was observed for cutoff-predicted and spatial-predicted inflamed samples (Fig. 6). No significant associations were observed between immunophenotype and relative abundance of myeloid-dendritic cells, neutrophils, fibroblasts, or endothelial cells (Fig. 6 and Supplementary Fig. S1).

Fig. 6.

Estimated abundance of non-T cell immune populations in model-predicted immunophenotypes. Transcriptomic analyses were performed to compare estimated abundance of non-T cell immune cell populations estimated by gene expression signatures with immunophenotypes predicted by the (A) cutoff or (B) spatial model in patient-matched tissues. Model-predicted immunophenotypes (dashed outlines) were compared with ground truth (solid outlines). Median gene signature levels ±1.5 times the interquartile range are shown. (C) p Values calculated by ANOVA were used to identify significant differences between immunophenotypes for each gene expression signature.

In all instances, no significant differences were observed between our model predictions and ground truth, supporting our models' accuracy in predicting immunophenotypes. These results support the specificity of our models in predicting the CD8⁺ landscape within CD8 IHC-stained cancer specimens.

Having confirmed that tumors predicted to be inflamed by our model harbor functional evidence of increased immune infiltration at the T cell level, we sought to assess the relationship between our model predictions and proposed gene signatures indicative of molecular pathway activity. The “T cell inflamed signature” has been proposed as a biomarker of response to anti-PD-1 treatment.⁴⁸ Tumors predicted to be inflamed by our spatial and cutoff models showed significantly elevated scores of this gene signature, further suggesting that our model predictions may have the capability to function as a surrogate biomarker for response to PD-1 blockade.

In addition, given the association between T cell infiltration and TGF-β, we assessed how our model predictions compared with a gene signature indicative of TGF-β signaling.⁴³ Although we did observe a trend toward lower TGF-β signature score in immune-inflamed samples, no significant associations were observed. Still, given the performance of our model more broadly, we hypothesize that this digital pathology approach could be used as a biomarker to predict response to immuno-oncology therapeutics.

Discussion

In the context of cancer immunology, different immune profiles have been described in the tumor microenvironment, represented as “immune inflamed,” “immune excluded,” and “immune desert.”^21–24 A recent study has confirmed the role of TGF-β signaling in promoting immune exclusion, a state in which immune cells, especially CD8⁺ T cells, are found in the stromal tissue surrounding the tumor, but not infiltrated into the tumor itself.⁴⁴ It has been theorized that inhibiting TGF-β signaling might promote the infiltration of immune cells into the tumor and enhance the overall antitumor immune response, alone or in combination with immunotherapy.^27,43

In this study, we have also shown an inverse correlation between TGF-β pathway activation and CD8⁺ cell infiltration (Fig. 2). We believe that this relationship between TGF-β and immune infiltration has the potential to inform the identification of patients with an immune-excluded phenotype who may benefit from TGF-β blockade as part of their therapeutic regimen.

Despite this potential, the clinical development of TGF-β inhibitors has been hampered by the lack of effective predictive biomarkers. Largely, prior studies have focused on either transcript or protein expression of the TGF-β superfamily of ligands, TGF-β receptors, and downstream transcription factors,⁴⁹ as well as associated independent intracellular signaling pathways.^25,50 However, TGF-β signaling is notoriously modular and complex: multiple ligands, receptor–dimer combinations, receptor-associated SMAD effectors (R-SMADs), coactivators, corepressors, and transcription factor complex partners make a “one-size-fits-all” biomarker for TGF-β activity difficult to define.²⁷

In addition, R-SMAD activation occurs through phosphorylation, and phosphorylated proteins have limited clinical utility as biomarkers.⁵¹ Furthermore, TGF-β activation often leads to activation of non-SMAD effector pathways, further limiting the clinical utility of SMAD activation as a biomarker.⁵² To circumvent these issues, gene expression signatures associated with TGF-β activation have been proposed as surrogate markers of pathway activation.^53,54 However, the incorporation of these transcriptomic signatures into the clinical workflow has been hampered by the fact that performing these assays is time consuming (especially relating to data analysis), costly, and requires considerable high-quality tissue.^55,56

Conversely, although immunophenotyping using IHC against CD8 and other immune cell markers has shown promise for clinical application, this strategy has been hindered by low concordance among pathologists in providing manual slide-level immunophenotypes.^24,30–34 To overcome these challenges, we developed ML models to predict immunophenotypes, which were shown to be concordant with pathologist ground truth labels of immune inflamed, immune excluded, and immune desert (Figs. 3 and 4). These results support the growing body of evidence that digital pathology approaches have utility for patient selection strategies in oncology.^35,57–59

Furthermore, we observed that the distribution of relevant immune cell gene expression signatures, as well as a gene signature associated with PD-L1, correlated with immunophenotypes predicted by our models (Figs. 5–7), providing key functional support to our model's performance. Specifically, we observed a significant relationship between our model predictions (derived from both the cutoff model and spatial model) and gene signatures associated with T lymphocytes (Fig. 5).

Fig. 7.

T cell inflamed and TGF-β gene signature scores in model-predicted immunophenotypes. Transcriptomic analyses were performed to compare T cell inflamed and TGF-β gene expression signature⁴³ scores with immunophenotypes predicted by the (A) cutoff or (B) spatial model in patient-matched tissues. Model-predicted immunophenotypes (dashed outlines) were compared with ground truth (solid outlines). Median gene signature levels ±1.5 times the interquartile range are shown. (C) p Values calculated by ANOVA were used to identify significant differences between immunophenotypes for each gene expression signature.

However, with the exception of monocytes, gene signatures associated with other non-T immune cell types did not show significant associations with our model predictions (Fig. 6), supporting the specificity of our models in quantifying CD8⁺ cells. In addition, although the association between model-predicted immunophenotypes and TGF-β gene signature was not significant, we did observe a trend toward reduced TGF-β signature levels in immune-inflamed samples using ground truth predictions, as well as predictions from both our cutoff and spatial models (Fig. 7).

In all instances, no significant differences were observed between our model predictions and ground truth, further supporting our models' accuracy in predicting immunophenotypes. The observed differences in the association between our model predictions and signatures associated with PD-L1 and TGF-β can likely be explained, in part, by the pleiotropic nature of TGF-β signaling. Collectively, out of our full cohort (N = 163), the number of samples determined to be inflamed was quite low for ground truth (N = 21), cutoff model (N = 17), and spatial model (N = 20). Therefore, our sample size may have been insufficient to detect significant differences in TGF-β signaling within these populations. Future analyses will involve deployment of these models on larger cohorts to further understand their potential clinical utility.

Conclusion

In conclusion, these findings establish an ML-based digital pathology approach that can quantify CD8⁺ cell positivity and classify cancer immunophenotypes in a reproducible and scalable manner, paving the way for the application of such a method to clinical studies. Our results support the potential of using ML-predicted cancer immunophenotypes to identify patients who may benefit from immunotherapy and/or TGF-β blockade in NSCLC.

Footnotes

Acknowledgments

The authors thank Jingjing He (Sanofi) for supporting data reconciliation and Charles Biddle-Snead (PathAI) for aiding in conceptualization, methodology, and investigation in immune phenotyping. They also extend gratitude to the software engineering and ML teams at PathAI for developing the systems and pipelines used in model development and feature extraction. In addition, the authors thank Bioscience Communications for its help with figure design (funded by Sanofi), Jacqueline Brosnan-Cashman (PathAI) for assistance with article writing and figure design (funded by Sanofi), and Kaushik Kuche (Sanofi) for aid in editing and formatting the article.

Authors' Contributions

Conceptualization of the study was carried out by R.J.P., R.W., S.B., and C.H. Methodology was carried out by R.W., J.S.L., C.H., A.S.-M., A.K., M.G., H.W., Q.T., and R.T. Investigation was carried out by R.J.P., R.W., H.W., J.S.L., C.H., A.S.-M., A.K., M.G., and S.B.

Visualization was carried out by R.W., H.W., M.G., and A.S.-M. Funding acquisition: N/A. Project administration was by S.B. Supervision was by R.W. and S.B. Writing of the original draft was carried out by R.J.P., R.W., S.B., A.S.-M., A.K., M.G., H.W., J.S.L., and S.M.B.

Review and editing of the article was carried out by R.W., H.W., J.S.L., S.B., C.H., A.S.-M., A.K., M.G., B.D., and S.M.B. All authors have reviewed and approved the article for submission.

Authors' Information

R.J.P., H.W., S.M.B., Q.T., R.T., J.S.L., B.D., and R.W. are employed by Sanofi and may hold stock and/or stock options. A.K. reports holding stock and/or stock options at PathAI. A.S.-M. reports receiving support for attending meetings and/or travel support from PathAI, holding stock or stock options at PathAI, and receiving equipment, materials, drugs, medical writing, gifts, or other services from PathAI. M.G. reports holding stock or stock options at PathAI and having financial and nonfinancial interests at PathAI. C.H. reports consultation fees from HistoGeneX LLC and PathAI, along with financial interests in HistoGeneX LLC and PathAI. S.B. reports holding stock or stock options at PathAI.

Data and Materials Availability

Model parameters for the CD8 quantification and immunophenotyping models are not disclosed. To safeguard PathAI's intellectual property, requests for access to them will not be considered. All source code for reproducing analyses and predictions will be deposited to GitHub before publication, and the link will be provided at that time. Access requests can be made to publications@pathai.com.

Author Disclosure Statement

No competing financial interests exist.

Funding Information

This study was funded by Sanofi.

Supplementary Material

References

1. Schabath, Cote. Cancer progress and priorities: Lung cancer. Cancer Epidemiol Biomarkers Prev, 2019;28(10):1563–1579; doi: 10.1158/1055-9965.EPI-19-0221

2. Bivona, Doebele. A framework for understanding and targeting residual disease in oncogene-driven solid cancers. Nat Med, 2016;22(5):472–478; doi: 10.1038/nm.4091

3. Thai, Solomon, Sequist, et al. Lung cancer. Lancet, 2021;398(10299):535–554; doi: 10.1016/S0140-6736(21)00312-3

4. Postow, Callahan, Wolchok. Immune checkpoint blockade in cancer therapy. J Clin Oncol, 2015;33(17):1974–1982; doi: 10.1200/JCO.2014.59.4358

5. Kazandjian, Suzman, Blumenthal, et al. FDA approval summary: Nivolumab for the treatment of metastatic non-small cell lung cancer with progression on or after platinum-based chemotherapy. Oncologist, 2016;21(5):634–642; doi: 10.1634/theoncologist.2015-0507

6. Brahmer, Reckamp, Baas, et al. Nivolumab versus docetaxel in advanced squamous-cell non-small-cell lung cancer. N Engl J Med, 2015;373(2):123–135; doi: 10.1056/NEJMoa1504627

7. Borghaei, Paz-Ares, Horn, et al. Nivolumab versus docetaxel in advanced nonsquamous non-small-cell lung cancer. N Engl J Med, 2015;373(17):1627–1639; doi: 10.1056/NEJMoa1507643

8. Sul, Blumenthal, Jiang, et al. FDA approval summary: Pembrolizumab for the treatment of patients with metastatic non-small cell lung cancer whose tumors express programmed death-ligand 1. Oncologist, 2016;21(5):643–650; doi: 10.1634/theoncologist.2015-0498

9. Garon, Rizvi, Hui, et al. Pembrolizumab for the treatment of non-small-cell lung cancer. N Engl J Med, 2015;372(21):2018–2028; doi: 10.1056/NEJMoa1501824

10. Herbst, Baas, Kim D-W, et al. Pembrolizumab versus docetaxel for previously treated, PD-L1-positive, advanced non-small-cell lung cancer (KEYNOTE-010): A randomised controlled trial. Lancet, 2016;387(10027):1540–1550; doi: 10.1016/S0140-6736(15)01281-7

11. Weinstock, Khozin, Suzman, et al. U.S. Food and Drug Administration Approval Summary: Atezolizumab for metastatic non-small cell lung cancer. Clin Cancer Res, 2017;23(16):4534–4539; doi: 10.1158/1078-0432.CCR-17-0540

12. Fehrenbacher, Spira, Ballinger, et al. Atezolizumab versus docetaxel for patients with previously treated non-small-cell lung cancer (POPLAR): A multicentre, open-label, phase 2 randomised controlled trial. Lancet, 2016;387(10030):1837–1846; doi: 10.1016/S0140-6736(16)00587-0

13. Rittmeyer, Barlesi, Waterkamp, et al. Atezolizumab versus docetaxel in patients with previously treated non-small-cell lung cancer (OAK): A phase 3, open-label, multicentre randomised controlled trial. Lancet, 2017;389(10066):255–265; doi: 10.1016/S0140-6736(16)32517-X

14. Mazieres, Rittmeyer, Gadgeel, et al. Atezolizumab versus docetaxel in pretreated patients with NSCLC: Final results from the randomized phase 2 POPLAR and phase 3 OAK clinical trials. J Thorac Oncol, 2021;16(1):140–150; doi: 10.1016/j.jtho.2020.09.022

15. Borghaei, Gettinger, Vokes, et al. Five-year outcomes from the randomized, phase III trials CheckMate 017 and 057: Nivolumab versus docetaxel in previously treated non-small-cell lung cancer. J Clin Oncol, 2021;39(7):723–733; doi: 10.1200/JCO.20.01605

16. Herbst, Garon, Kim D-W, et al. Five year survival update from KEYNOTE-010: Pembrolizumab versus docetaxel for previously treated, programmed death-ligand 1-positive advanced NSCLC. J Thorac Oncol, 2021;16(10):1718–1732; doi: 10.1016/j.jtho.2021.05.001

17. Sezer, Kilickap, Gümüş, et al. Cemiplimab monotherapy for first-line treatment of advanced non-small-cell lung cancer with PD-L1 of at least 50%: A multicentre, open-label, global, phase 3, randomised, controlled trial. Lancet, 2021;397(10274):592–604; doi: 10.1016/S0140-6736(21)00228-2

18. Gogishvili, Melkadze, Makharadze, et al. Cemiplimab plus chemotherapy versus chemotherapy alone in non-small cell lung cancer: A randomized, controlled, double-blind phase 3 trial. Nat Med, 2022;28(11):2374–2380; doi: 10.1038/s41591-022-01977-y

19. Akinboro, Larkins, Pai-Scherf, et al. FDA Approval Summary: Pembrolizumab, atezolizumab, and cemiplimab-rwlc as single agents for first-line treatment of advanced/metastatic PD-L1-high NSCLC. Clin Cancer Res, 2022;28(11):2221–2228; doi: 10.1158/1078-0432.CCR-21-3844

20. Hendry, Salgado, Gevaert, et al. Assessing tumor-infiltrating lymphocytes in solid tumors: A practical review for pathologists and proposal for a standardized method from the international immunooncology biomarkers working group: Part 1: Assessing the host immune response, TILs in invasive breast carcinoma and ductal carcinoma in situ, metastatic tumor deposits and areas for further research. Adv Anat Pathol, 2017;24(5):235–251; doi: 10.1097/PAP.0000000000000162

21. Desbois, Udyavar, Ryner, et al. Integrated digital pathology and transcriptome analysis identifies molecular mediators of T-cell exclusion in ovarian cancer. Nat Commun, 2020;11(1):5583; doi: 10.1038/s41467-020-19408-2

22. Braun, Hou, Bakouny, et al. Interplay of somatic alterations and immune infiltration modulates response to PD-1 blockade in advanced clear cell renal cell carcinoma. Nat Med, 2020;26(6):909–918; doi: 10.1038/s41591-020-0839-y

23. Kather, Suarez-Carmona, Charoentong, et al. Topography of cancer-associated immune cells in human solid tumors. Elife, 2018;7:e36967; doi: 10.7554/eLife.36967

24. Liu, Chen, Moore, et al. Exploiting canonical TGFβ signaling in cancer treatment. Mol Cancer Ther, 2022;21(1):16–24; doi: 10.1158/1535-7163.MCT-20-0891

25. Tian, Schiemann. The TGF-β paradox in human cancer: An update. Future Oncol, 2009;5(2):259–271; doi: 10.2217/14796694.5.2.259

26. Batlle, Massagué. Transforming growth factor-β signaling in immunity and cancer. Immunity, 2019;50(4):924–940; doi: 10.1016/j.immuni.2019.03.024

27. Tauriello DVF, Palomo-Ponce, Stork, et al. TGFβ drives immune evasion in genetically reconstituted colon cancer metastasis. Nature, 2018;554(7693):538–543; doi: 10.1038/nature25492

28. Punekar, Shum, Grello, et al. Immunotherapy in non-small cell lung cancer: Past, present, and future directions. Front Oncol, 2022;12:877594; doi: 10.3389/fonc.2022.877594

29. Hirsch, McElhinny, Stanforth, et al. PD-L1 immunohistochemistry assays for lung cancer: Results from phase 1 of the blueprint PD-L1 IHC assay comparison project. J Thorac Oncol, 2017;12(2):208–222; doi: 10.1016/j.jtho.2016.11.2228

30. Tsao, Kerr, Kockx, et al. PD-L1 immunohistochemistry comparability study in real-life clinical samples: Results of blueprint phase 2 project. J Thorac Oncol, 2018;13(9):1302–1311; doi: 10.1016/j.jtho.2018.05.013

31. Amgad, Stovgaard, Balslev, et al. Report on computational assessment of Tumor Infiltrating Lymphocytes from the International Immuno-Oncology Biomarker Working Group. NPJ Breast Cancer, 2020;6(1):1–13; doi: 10.1038/s41523-020-0154-2

32. Brunnström, Johansson, Westbom-Fremer, et al. PD-L1 immunohistochemistry in clinical diagnostics of lung cancer: Inter-pathologist variability is higher than assay variability. Mod Pathol, 2017;30(10):1411–1421; doi: 10.1038/modpathol.2017.59

33. Van Bockstal, François, Altinay, et al. Interobserver variability in the assessment of stromal tumor-infiltrating lymphocytes (sTILs) in triple-negative invasive breast carcinoma influences the association with pathological complete response: The IVITA study. Mod Pathol, 2021;34:2130–2140; doi: 10.1038/s41379-021-00865-z

34. Tramm, Di Caterino, Jylling A-MB, et al. Standardized assessment of tumor-infiltrating lymphocytes in breast cancer: An evaluation of inter-observer agreement between pathologists. Acta Oncol, 2018;57(1):90–94; doi: 10.1080/0284186X.2017.1403040

35. Baxi, Edwards, Montalto, et al. Digital pathology and artificial intelligence in translational medicine and clinical practice. Mod Pathol, 2022;35(1):23–32; doi: 10.1038/s41379-021-00919-2

36. Phillips, Simmons, Inzunza, et al. Development of an automated PD-L1 immunohistochemistry (IHC) assay for non-small cell lung cancer. Appl Immunohistochem Mol Morphol, 2015;23(8):541–549; doi: 10.1097/PAI.0000000000000256

37. Beck, Glass, Elliott, et al. An empirical framework for validating artificial intelligence-derived PD-L1 positivity predictions applied to urothelial carcinoma. J Immunother Cancer, 2019;7(Suppl 1):283; doi: 10.1186/s40425-019-0764-0

38. Morris, Ritzert, Fey, et al. Weisfeiler and Leman Go Neural: Higher-Order Graph Neural Networks. arXiv [csLG], 2018; doi: 10.48550/arXiv.1810.02244

39. Diehl. Edge Contraction Pooling for Graph Neural Networks. arXiv [csLG], 2019; doi: 10.48550/arXiv.1905.10990

40. Lee, Lee, Kang. Self-Attention Graph Pooling. arXiv [csLG], 2019; doi: 10.48550/arXiv.1904.08082

41. Dobin, Davis, Schlesinger, et al. STAR: Ultrafast universal RNA-seq aligner. Bioinformatics, 2013;29(1):15–21; doi: 10.1093/bioinformatics/bts635

42. Trapnell, Roberts, Goff, et al. Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks. Nat Protoc, 2012;7(3):562–578; doi: 10.1038/nprot.2012.016

43. Greco, Qu, et al. Pan-TGFβ inhibition by SAR439459 relieves immunosuppression and improves antitumor efficacy of PD-1 blockade. Oncoimmunology, 2020;9(1):1811605; doi: 10.1080/2162402X.2020.1811605

44. Pomponio, Tang, Mei, et al. An integrative approach of digital image analysis and transcriptome profiling to explore potential predictive biomarkers for TGFβ blockade therapy. Acta Pharm Sin B, 2022;12(9):3594–3601; doi: 10.1016/j.apsb.2022.03.013

45. Hänzelmann, Castelo, Guinney. GSVA: Gene set variation analysis for microarray and RNA-seq data. BMC Bioinformatics, 2013;14:7; doi: 10.1186/1471-2105-14-7

46. Becht, Giraldo, Lacroix, et al. Estimating the population abundance of tissue-infiltrating immune and stromal cell populations using gene expression. Genome Biol, 2016;17(1):218; doi: 10.1186/s13059-016-1070-5

47. Mariathasan, Turley, Nickles, et al. TGFβ attenuates tumour response to PD-L1 blockade by contributing to exclusion of T cells. Nature, 2018;554(7693):544–548; doi: 10.1038/nature25501

48. Ayers, Lunceford, Nebozhyn, et al. IFN-γ-related mRNA profile predicts clinical response to PD-1 blockade. J Clin Invest, 2017;127(8):2930–2940; doi: 10.1172/JCI91190

49. Joshi, Cao. TGF-beta signaling, tumor microenvironment and tumor progression: The butterfly effect. Front Biosci, 2010;15(1):180–194; doi: 10.2741/3614

50. Wakefield, Hill. Beyond TGFβ: Roles of other TGFβ superfamily members in cancer. Nat Rev Cancer, 2013;13(5):328–341; doi: 10.1038/nrc3500

51. de Gramont, Faivre, Raymond. Novel TGF-β inhibitors ready for prime time in onco-immunology. Oncoimmunology, 2017;6(1):e1257453; doi: 10.1080/2162402X.2016.1257453

52. Zhang YE. Non-smad signaling pathways of the TGF-β family. Cold Spring Harb Perspect Biol, 2017;9(2):a022129; doi: 10.1101/cshperspect.a022129

53. Chakravarthy, Khan, Bensler, et al. TGF-β-associated extracellular matrix genes link cancer-associated fibroblasts to immune evasion and immunotherapy failure. Nat Commun, 2018;9(1):4692; doi: 10.1038/s41467-018-06654-8

54. [...], Soliman, Joehlin-Price, et al. High TGF-β signature predicts immunotherapy resistance in gynecologic cancer patients treated with immune checkpoint inhibition. NPJ Precis Oncol, 2021;5(1):101; doi: 10.1038/s41698-021-00242-8

55. Feliubadaló, Tonda, Gausachs, et al. Benchmarking of whole exome sequencing and ad hoc designed panels for genetic testing of hereditary cancer. Sci Rep, 2017;7:37984; doi: 10.1038/srep37984

56. Ascierto, Bifulco, Palmieri, et al. Preanalytic variables and tissue stewardship for reliable next-generation sequencing (NGS) clinical analysis. J Mol Diagn, 2019;21(5):756–767; doi: 10.1016/j.jmoldx.2019.05.004

57. Althammer, Tan, Spitzmüller, et al. Automated image analysis of NSCLC biopsies to predict response to anti-PD-L1 therapy. J Immunother Cancer, 2019;7(1):121; doi: 10.1186/s40425-019-0589-x

58. [...], Cui, Yang, et al. Using deep learning to predict anti-PD-1 response in melanoma and lung cancer patients from histopathology images. Transl Oncol, 2021;14(1):100921; doi: 10.1016/j.tranon.2020.100921

59. Echle, Ghaffari Laleh, Quirke, et al. Artificial intelligence for detection of microsatellite instability in colorectal cancer-a multicentric analysis of a pre-screening tool for clinical application. ESMO Open, 2022;7(2):100400; doi: 10.1016/j.esmoop.2022.100400

Classification of the Tumor Immune Microenvironment Using Machine-Learning-Based CD8 Immunophenotyping As a Potential Biomarker for Immunotherapy and TGF-β Blockade in Nonsmall Cell Lung Cancer