Abstract
Cancer stem cells (CSCs) are resistant to standard cancer treatments and are likely responsible for cancer recurrence, but few therapies target this subpopulation. Due to the difficulty in propagating CSCs outside of the tumor environment, previous work identified CSC-like cells by inducing human breast epithelial cells into an epithelial-to-mesenchymal transdifferentiated state (HMLE_sh_ECad). A phenotypic screen was conducted against HMLE_sh_ECad with 300 718 compounds from the Molecular Libraries Small Molecule Repository to identify selective inhibitors of CSC growth. The screen yielded 2244 hits that were evaluated for toxicity and selectivity toward an isogenic control cell line. An acyl hydrazone scaffold emerged as a potent and selective scaffold targeting HMLE_sh_ECad. Fifty-three analogues were acquired and tested; compounds ranged in potency from 790 nM to inactive against HMLE_sh_ECad. Of the analogues, ML239 was best-in-class with an IC50 = 1.18 µM against HMLE_sh_ECad, demonstrated a >23-fold selectivity over the control line, and was toxic to another CSC-like line, HMLE_shTwist, and a breast carcinoma cell line, MDA-MB-231. Gene expression studies conducted with ML239-treated cells showed altered gene expression in the NF-κB pathway in the HMLE_sh_ECad line but not in the isogenic control line. Future studies will be directed toward the identification of ML239 target(s).
Introduction
Breast cancer is a leading cause of cancer-related deaths among women, and the majority of breast cancer deaths are due to metastasis. Research has identified subpopulations of cell types that are thought to drive tumor growth and recurrence. Based on cell surface markers (CD44+/CD24–) and their apparent role in the origin and maintenance of tumors, this cell population has been termed cancer stem cells or CSCs.1–3 Often a minority subpopulation of cancer cells within a tumor, CSCs are resistant to standard chemotherapeutic and radiation therapies and are thought to be the cause of relapse in patients.1–3 Because of the important role of CSCs in breast cancer, we endeavored to identify compounds that could selectively kill CSCs to gain a better insight into CSC biology and breast tumor progression. Therefore, we conducted a high-throughput screen targeting CSCs to develop a chemical tool to understand the complex tumor biology and the cooperative nature of cancer cells.
Using the pilot screen by Gupta et al.4 as a starting point, we endeavored to test a larger chemical library to identify more selective, chemically tractable compounds for use as probes. Knocking down E-Cadherin in human mammary epithelial cells (HMLEs) induced an epithelial-to-mesenchymal transdifferentiation (EMT), propagating a CSC-enriched population in culture (HMLE_sh_ECad).4 Use of this surrogate system was necessary because primary CSCs comprise a small subpopulation of the tumor mass and are not readily propagated in culture. Furthermore, this screen benefited from the availability of an isogenic control cell line (HMLE_sh_GFP), created by introduction of a short hairpin RNA (shRNA) targeting the green fluorescent protein (GFP) gene, for the counterscreen of the primary screen hits. This isogenic control, which is absent in most screens, minimizes the probability of finding spurious compound hits that are not selective for CSCs but rather target unknown or uncharacterized genetic differences between the screened cell line and other commonly used cellular models.
The HMLE_sh_ECad cell line was tested in a phenotypic screen against a larger, more diverse set of compounds, the Molecular Libraries Small Molecule Repository (MLSMR) chemical library. In all, 300 718 compounds were tested, and 3188 compounds were found to be toxic to the HMLE_sh_ECad cell line. Of these, 2244 compounds were retested in dose from DMSO stock sources for CSC-selective toxicity as compared with the isogenic control cell line. From this list of compounds, an acyl hydrazone scaffold emerged as a potent and selective chemical compound against HMLE_sh_ECad. In addition, ML239 was selectively toxic toward another CSC-like cell line, HMLE_Twist, and the breast cancer line, MDA-MB-231.
Given the difficulty in identifying a target for probes identified in phenotypic screens, a gene expression study was undertaken to identify target pathway(s) affected by ML239. These studies found that ML239 perturbed nuclear factor–κB (NF-κB)–associated pathways after a 24-h treatment in the CSC-like cell line but not in the isogenic control cell line. These data suggest that ML239 may be used to identify pertinent cellular pathways for targeted toxicity, may further be used to dissect tumor biology, and may lead to identification of novel therapeutics.
Methods
Cell Culture
HMLE_sh_ECad, HMLE_Twist, HMLE_sh_GFP, and MDA-MB-231 cells were prepared as previously described.4,5 Briefly, HMLE cells expressing either shRNA targeting E-Cadherin (HMLE_sh_ECad), TWIST (HMLE_Twist), or enhanced GFP (HMLE_sh_GFP) were propagated in a 1:1 mixture of 10% fetal bovine serum (FBS; HyClone, Logan, UT), 1% penicillin/streptomycin (pen/strep; Cellgro, Manassas, VA), 1% Glutamax-1 (Invitrogen, Carlsbad, CA), 70 nM hydrocortisone (Sigma, St. Louis, MO), 12 µg/mL insulin (Sigma), 50 µg/mL gentamicin (Sigma), 12.5 µg/mL Plasmocin (InvivoGen, San Diego, CA), and 10 ng/mL epidermal growth factor (EGF) in Dulbecco’s modified Eagle’s medium (DMEM; Cellgro) with mammary epithelial cell growth medium (MEGM complete medium; Lonza, Basel, Switzerland) at 37 °C, 5% CO2.
Screening, Data Normalization, and Analysis
For screening, the cells were counted and resuspended in complete media without serum. Next, 2000 cells were plated per well in white, tissue culture–treated, 384-well plates (Corning, Corning, NY). The cells were incubated at 37 °C, 5% CO2 for at least 4 h and pinned with compounds from the MLSMR library (http://mli.nih.gov/mli/); final screening concentration was 7.5 µM. The cells were incubated for approximately 72 h, and then CellTiter-Glo (Promega, Madison, WI) diluted 1:3 with phosphate-buffered saline (PBS) was added to each well. Each plate was read using the EnVision (PerkinElmer, Waltham, MA; luminescence 0.1 s/well) after a 12-min incubation post CellTiter-Glo addition. Each assay plate was normalized to the 32 neutral control (DMSO-treated) wells and the 32 positive control (Puromycin-treated) wells on that plate.
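The per-plate normalization described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes the control wells are summarized by their mean and that the scale runs from 0% (neutral, DMSO) to 100% (positive, Puromycin); the function name and the luminescence counts are hypothetical.

```python
from statistics import mean


def percent_inhibition(raw, neutral_wells, positive_wells):
    """Scale one raw luminescence reading against the plate's own controls.

    Assumption: the mean of the neutral (DMSO) wells maps to 0% inhibition
    and the mean of the positive (Puromycin) wells maps to 100%.
    """
    neutral = mean(neutral_wells)
    positive = mean(positive_wells)
    if neutral == positive:
        raise ValueError("neutral and positive controls are indistinguishable")
    return 100.0 * (neutral - raw) / (neutral - positive)


# Hypothetical luminescence counts for one 384-well plate.
neutral_wells = [10000.0] * 32   # DMSO-treated wells
positive_wells = [200.0] * 32    # Puromycin-treated wells
compound_wells = {"cpd_A": 9800.0, "cpd_B": 1500.0}

inhibition = {
    name: percent_inhibition(raw, neutral_wells, positive_wells)
    for name, raw in compound_wells.items()
}
```

Normalizing each plate to its own controls keeps plate-to-plate differences in cell number or reagent signal from shifting the apparent activity of a compound.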
Tumorsphere Assay
Tumorsphere assays were performed as previously described with minor changes.6 SUM159 cells were propagated in 5% FBS (HyClone), 1% pen/strep, 1% Glutamax-1, 12 µg/mL insulin, and 50 µg/mL gentamicin in F12/DMEM (Cellgro). Compounds were pinned into media in 96-well, ultra-low-adhesion plates (Costar; Corning). Harvested cells were resuspended in their propagation media with 1% methylcellulose (ES-CultM3120; StemCell Technologies, Reston, VA). The resuspended cells were added to the plates containing media with compound for a final count of 2000 cells/well in 0.5% methylcellulose. Tumorspheres were allowed to form for 9 days at 37 °C, 5% CO2. Tumorspheres were imaged using a 2× objective on the ImageXpress Micro (Molecular Devices, Sunnyvale, CA). Cell clusters greater than 100 µm in diameter were identified and counted using the MetaXpress software (version 3.1; Molecular Devices).
Gene Expression Assays
Triplicate samples of HMLE_sh_ECad were treated with vehicle (DMSO) or probe compound ML239 (http://pubchem.ncbi.nlm.nih.gov/summary/summary.cgi?cid=49843203&loc=ec_rcs) at an IC50 dose determined from the 3-day toxicity assay for 24 h prior to isolation of RNA. Total RNA was isolated using the RNeasy Protect Mini Kit (Qiagen, Valencia, CA). Quality control (QC) processing of the RNA samples and the gene expression analysis were performed by the Genome Analysis Platform (GAP) at the Broad Institute. Briefly, RNA samples were analyzed for quality using Agilent Bioanalyzer Chips (Agilent, Santa Clara, CA). Total RNA samples passing QC were prepared for analysis on a HumanHT-12 Expression BeadChip (Illumina, San Diego, CA) according to the manufacturer’s instructions. Quantile normalization of the raw gene expression data, QC checks, and analyses were performed with GenePattern (genepattern.broadinstitute.org; Broad Institute). After the samples were normalized, gene expression was compared, and those genes with a statistically significant difference (p < 0.005) were identified. The raw data are available to the scientific community at http://www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-884. Genes identified to have approximately 2.5-fold change in expression were analyzed for gene set enrichment using Ingenuity (http://www.ingenuity.com/; Ingenuity Systems, Redwood City, CA).
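Quantile normalization, the step named above, forces every array to share the same intensity distribution. The sketch below shows the textbook procedure in plain Python. It is an illustration only and is not the GenePattern implementation; the function name and the toy intensities are hypothetical, and ties are broken by position rather than averaged.

```python
def quantile_normalize(samples):
    """Quantile-normalize a list of equal-length intensity lists.

    Each inner list is one array (sample); position i is the same probe in
    every sample. Simplification: tied values are ranked by position.
    """
    n_probes = len(samples[0])
    if any(len(s) != n_probes for s in samples):
        raise ValueError("all samples must contain the same number of probes")

    # Mean intensity at each rank, taken across the sorted samples.
    sorted_samples = [sorted(s) for s in samples]
    rank_means = [
        sum(s[rank] for s in sorted_samples) / len(samples)
        for rank in range(n_probes)
    ]

    normalized = []
    for s in samples:
        # Probe indices ordered from lowest to highest intensity.
        order = sorted(range(n_probes), key=lambda i: s[i])
        out = [0.0] * n_probes
        for rank, probe_index in enumerate(order):
            out[probe_index] = rank_means[rank]
        normalized.append(out)
    return normalized


# Toy example: two samples, three probes (hypothetical intensities).
normalized = quantile_normalize([[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]])
```

After normalization both samples contain the same set of values (the rank means), assigned according to each probe's rank within its own sample.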
Results and Discussion
To complete this study, a stem-like cell line, HMLE_sh_ECad, was chosen due to the innate difficulties in isolating and propagating native CSCs. Furthermore, the induction of EMT in neoplastic mammary epithelial cell populations has been shown to enrich for cells with stem-like properties.7,8 With the availability of this cell line, we were able to screen 300 718 novel compounds from the MLSMR collection (http://pubchem.ncbi.nlm.nih.gov/assay/assay.cgi?aid=2717&loc=ea_ras). Toxicity was assessed after treating 2000 HMLE_sh_ECad cells/well with compound for 72 h using CellTiter-Glo (Promega). The signal was normalized to the neutral (DMSO) and positive (Puromycin) controls. Hits were defined as showing >75% reduction in viability (i.e., adenosine triphosphate [ATP] level) at an average screening concentration of 7.5 µM. From this, 3190 hits were identified as inhibitors of HMLE_sh_ECad (Fig. 1). Duplicate samples correlated highly with one another, showing that the data were reproducible. To remove promiscuously active compounds, the list of hits was narrowed by discarding compounds that were active in >10% of assays conducted within the Molecular Libraries Probe Production Center Network (MLPCN). In addition, compounds that scored positively in mammalian cell toxicity screens listed in PubChem as of April 2010 were also removed from the list of hits as nonselective. The top 2500 compounds were requested, and of these, 2244 compounds in DMSO stock were available and sourced for retest. To determine selectivity, these cherry-picked compounds were retested in dose against the primary cell line, HMLE_sh_ECad, and the counterscreen control (HMLE_sh_GFP) cell line. Of the 2244 compounds retested in the primary assay against HMLE_sh_ECad at dose concentrations, an unprecedented 97% (2181 compounds) inhibited viability by at least 50% at the highest dose tested (20 µM).
In parallel, these 2244 compounds were included in HMLE_sh_GFP viability assays to determine if these compounds affected the viability of the control cell line. Taken together, only 26 compounds had an IC50 of ≤5 µM and were 25-fold more toxic toward HMLE_sh_ECad than toward HMLE_sh_GFP cells (Fig. 2A,B).
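The two retest criteria above (potency and selectivity) amount to a simple filter on paired IC50 values. The sketch below is a hedged illustration: it assumes fold selectivity is the ratio of the control-line IC50 to the HMLE_sh_ECad IC50 and reads "25-fold" as "at least 25-fold". The function names and compound values are hypothetical.

```python
def fold_selectivity(ic50_target_um, ic50_control_um):
    """Ratio of control-line IC50 to target-line IC50 (both in µM)."""
    return ic50_control_um / ic50_target_um


def is_selective_hit(ic50_target_um, ic50_control_um,
                     max_ic50_um=5.0, min_fold=25.0):
    """Apply the potency and selectivity cutoffs described in the text.

    Assumption: the cutoffs are inclusive (IC50 <= 5 µM, fold >= 25).
    """
    potent = ic50_target_um <= max_ic50_um
    selective = fold_selectivity(ic50_target_um, ic50_control_um) >= min_fold
    return potent and selective


# Hypothetical compounds: (IC50 vs HMLE_sh_ECad, IC50 vs HMLE_sh_GFP), in µM.
compounds = {
    "cpd_1": (1.0, 30.0),    # potent and 30-fold selective
    "cpd_2": (4.0, 40.0),    # potent but only 10-fold selective
    "cpd_3": (6.0, 200.0),   # selective but not potent enough
}

hits = [name for name, (target, control) in compounds.items()
        if is_selective_hit(target, control)]
```

Compounds with no measurable toxicity in the control line have no finite control IC50; a real analysis would need a convention for those (for example, using the highest dose tested as a lower bound), which this sketch leaves out.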

Replicate data of compounds tested in the primary screen. In total, 300 718 compounds (average concentration of 7.5 µM) run in duplicate were tested in the primary screen identifying HMLE_sh_ECad-toxic compounds. The signal was normalized to neutral (DMSO) and positive (Puromycin) controls, and a mean of 75% inhibition was used as a cutoff to define a hit. In total, 3190 compounds were identified as hits based on these criteria (blue squares).

Representative dose curve data for probe ML239. HMLE_sh_ECad (IC50 = 1.16 µM; n = 4) (black) and control cell line HMLE_sh_GFP (26.7 µM, n = 5) (gray) toxicity.
After completing the primary and counterscreen retest assays at dose from DMSO stocks, 19 exact or nearly identical compounds were ordered from commercial vendors and purified. These 19 dry-powder compounds were quality controlled prior to retesting in the primary assay and counterscreens at dose concentrations to determine if these compounds retained potency and selectivity. From this set of dry-powder compounds, two acyl hydrazone compounds (CID5417654 and CID24816775) retained selectivity, and this scaffold was prioritized for medicinal chemistry efforts (Figs. 2 and 3). Based on the potency and selectivity of this scaffold, 53 structurally modified compounds were tested against HMLE_sh_ECad and HMLE_sh_GFP cell lines.5,9 A single compound, ML239, was identified as the best in the series with an IC50 = 1.16 µM and 23-fold selectivity for HMLE_sh_ECad versus HMLE_sh_GFP cells (see Figs. 2 and 3). To confirm that the toxicity of ML239 was not restricted to the HMLE_sh_ECad cell line, another CSC-like cell line, HMLE_Twist, was treated with ML239. Twist is a transcription factor that downregulates the expression of E-Cadherin. Therefore, overexpression of Twist promotes an EMT-induced model of breast CSC-like cells.4,8 Similar to the results observed in the HMLE_sh_ECad cell line, ML239 was potently toxic, inhibiting HMLE_Twist with an IC50 ~0.1 µM (Fig. 3). Furthermore, ML239 displayed potent toxicity (IC50 = 2.81 µM) to the breast carcinoma cell line, MDA-MB-231 (Fig. 3). Taken together, this suggests that ML239 is potently toxic to CSC-like cells and at least one other cancer cell line but does not display toxicity to control mammary epithelial cells (HMLE_sh_GFP).

Summary of CID5417654, CID24816775, and ML239 toxicity results toward HMLE_sh_ECad (sh_ECad), HMLE_sh_GFP (sh_GFP), HMLE_Twist (_Twist), and MDA-MB-231 cell lines and SUM159 tumorspheres (n = 3–5).
As previously stated, the compounds CID24816775 and CID5417654 were toxic toward the HMLE_sh_ECad cells with IC50 values of 2.51 µM and 3.3 µM, respectively. To confirm that they retained this toxicity in other CSC-like cell lines, they were also tested against the HMLE_Twist line. Although CID5417654 and CID24816775 were not as potent as ML239 against HMLE_Twist, they were toxic in the low µM range (IC50 = 2.89 µM and 2.21 µM, respectively). Interestingly, CID5417654 did not significantly inhibit the control line HMLE_sh_GFP at the highest dose tested (20 µM), whereas CID24816775 was significantly toxic to HMLE_sh_GFP (IC50 = 14.95 µM). These two compounds were significantly less toxic to the MDA-MB-231 cells. CID24816775 inhibited viability with an IC50 of 14.7 µM, whereas CID5417654 was only toxic at very high concentrations (IC50 = 47.6 µM) (Fig. 3).
To obtain better insight into how these compounds might be affecting cell growth, each was tested for inhibition of tumorsphere formation in the SUM159 cells. The SUM159 cell model is a human breast carcinoma line, where the presence of cancer stem cell–like cells has been studied.4,10 These cells were grown in the presence or absence of compound for 9 days in a low-adhesion environment. Under these conditions, SUM159 cells, which contain a mixture of breast CSCs and differentiated breast cancer cells, form tumorspheres. The number of tumorspheres was counted in each of the conditions. The signal (number of tumorspheres) was normalized to neutral (DMSO) and positive (Puromycin) controls, and a 30% inhibition cutoff at an average screening concentration of 20 µM was used to define a hit. Neither the probe (ML239) nor CID5417654 significantly inhibited tumorsphere formation ( Fig. 3 ). However, CID24816775, the least selective toward breast cancer stem cell–like cells and most toxic toward HMLE_sh_GFP, inhibited SUM159 tumorsphere formation. Although it is unclear why this may be the case, SUM159 cells may have a different composition of CSCs or oncogenic dependencies as compared with MDA-MB-231 ( Fig. 3 ). This is supported by the fact that CID24816775, a less selective, more general toxic compound, was able to suppress tumorsphere formation. Additional work will have to be done to determine if ML239 can inhibit tumorsphere formation in other established cell lines.
To elucidate pathways in breast cancer stem cells that may be used for future target development and expand the understanding of how ML239 might be selectively toxic, a gene expression profiling study was undertaken. To get a snapshot of early gene expression changes, HMLE_sh_ECad cells were treated with ML239 or DMSO in triplicate for 24 h. The concentration chosen for this study was the IC50 of ML239 for HMLE_sh_ECad toxicity after a 72-h treatment with the compound (IC50 = 1.18 µM). By treating at this concentration for only 24 h, we hypothesized that we would detect gene expression changes directly affected by ML239, whereas waiting until 72 h might identify genes involved with death signatures or additional compensatory signaling events.
Numerous genes were either upregulated (blue) or downregulated (red) after treatment with ML239 as compared with DMSO-treated cells in HMLE_sh_ECad cells (Fig. 4A). To focus on the genes and pathways most strongly affected, we identified those genes with the most significant changes (p < 0.005) and greater than 2.5-fold change in expression. After analysis, 21 genes met these criteria, and the fold changes are plotted in Figure 4B. Genes identified are involved in free radical scavenging, cell death, and protein expression regulation, including ASNS, CBS, CDKN1A, FBXO32, GDF15, HERPUD1, HSPA5, MTHFD2, PCK2, PSAT1, RND3, SLC7A1, SQSTM1, and TRIB3.
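The gene selection step can be sketched as a joint filter on significance and fold change. This is an illustration under stated assumptions, not the GenePattern analysis itself: it assumes the significance cutoff is meant as p < 0.005, that intensities are on a linear (non-log) scale, and that fold change is reported as a signed ratio of treated to control means. The gene labels and numbers are hypothetical.

```python
def signed_fold_change(mean_treated, mean_control):
    """Signed fold change: positive if upregulated, negative if downregulated.

    Assumption: both means are positive, linear-scale intensities.
    """
    ratio = mean_treated / mean_control
    return ratio if ratio >= 1.0 else -1.0 / ratio


def select_genes(results, p_cutoff=0.005, min_fold=2.5):
    """Keep genes passing both the p-value and fold-change cutoffs."""
    selected = []
    for r in results:
        fold = signed_fold_change(r["mean_treated"], r["mean_control"])
        if r["p_value"] < p_cutoff and abs(fold) > min_fold:
            selected.append((r["gene"], fold))
    return selected


# Hypothetical per-gene summaries (not data from this study).
results = [
    {"gene": "GENE_A", "mean_treated": 500.0, "mean_control": 100.0, "p_value": 0.001},
    {"gene": "GENE_B", "mean_treated": 100.0, "mean_control": 400.0, "p_value": 0.002},
    {"gene": "GENE_C", "mean_treated": 220.0, "mean_control": 100.0, "p_value": 0.0001},
    {"gene": "GENE_D", "mean_treated": 600.0, "mean_control": 100.0, "p_value": 0.05},
]

selected = select_genes(results)
```

Requiring both criteria removes genes with large but noisy changes (GENE_D) as well as genes with reproducible but small changes (GENE_C).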

Gene expression profiling. RNA samples from HMLE_sh_ECad cells, ML239 treated or control (DMSO), were isolated, and gene expression analysis was performed via the Illumina Bead Chip (Illumina, San Diego, CA) array. Normalized treated and untreated samples were compared to identify genes with significantly varied expression using GenePattern (genepattern.broadinstitute.org).
Similarly, we treated HMLE_sh_GFP cells with ML239 (1.18 µM) for 24 h. Interestingly, only five genes were differentially regulated in HMLE_sh_GFP after treatment with ML239 (ATP6V0C, PKM2, PPDPF, RPL23, and SERINC2) (data not shown), and none overlapped with genes regulated in HMLE_sh_ECad cells. The small number of genes altered is consistent with the little toxicity we saw in this cell line at this dose and indicates that the two cell types respond differently to the compound at the level of gene expression.
The gene expression profiling studies provided insight into ML239’s mechanism of action. Similar results were obtained previously, where we identified protein processing genes (ATF4 and HERPUD) and inflammatory signaling pathways.5 Due to insufficient sample size in control samples, we repeated the experiments (Fig. 4). Importantly, 13 of 24 of the genes (ASNS, FBXO32, GDF15, HERPUD1, HSPA5, NCRNA00219, P8, PSAT1, RND3, SLC7A1, SNHG1, SQSTM1, and TRIB3) were identified as significantly altered by ML239 in both experiments, suggesting these pathways are critical for ML239 selectivity.
For additional insight into the interaction of the gene set identified, an Ingenuity Network analysis was performed on the 24 genes. The most significant pathway associated with the affected genes was the NF-κB pathway, with several genes involved in free radical scavenging, cell death, and protein synthesis. Fourteen of the 24 genes have been linked to this pathway (Fig. 4C), primarily centering on TRIB3 (Tribbles 3 homolog). TRIB3 is a putative protein kinase that is induced by NF-κB and has been shown to be correlated with both proapoptotic and antiapoptotic features.11 High levels of TRIB3 have been correlated with poor prognosis in breast cancer, and TRIB3 expression is increased in hypoxic conditions.11 ML239 increases TRIB3 expression in HMLE_sh_ECad (Fig. 4), and this increase may induce TRIB3-regulated apoptosis. This hypothesis can be readily tested.
Taken together, these data show that we have developed a chemical probe enabling further study and delineation of the roles of CSCs in cancer. There has been a recent increase in research efforts targeting the selective killing of breast CSCs. In one approach, researchers targeted the breast CSC pathways involved in self-renewal and survival. These pathways included NF-κB, Notch, Hedgehog, and Wnt.12,13 Given the interest and the therapeutic value of developing probes, ML239 is a valuable tool to move the field forward. These gene-profiling experiments suggest that regulating the NF-κB pathway is critical to the probe’s selective killing of breast CSCs. Follow-up studies are required to confirm these results. Although the direct target of ML239 has not been identified, the collective results suggest that the NF-κB pathway may be a good pathway to explore for future therapeutics. Identification of the direct target of ML239 will aid in the development and study of the selective killing of breast CSCs and may identify novel proteins or pathways to target with future therapeutics aimed at breast CSCs.
Footnotes
Acknowledgements
We thank Piyush B. Gupta and Eric S. Lander for developing the cell lines, assays, and grant that led to this work (1 R03 MH089663-01, P.B.G. and E.S.L.). We also thank P.B.G. and Yuxiong Feng for contributions to these studies and critical reading of this manuscript. Furthermore, we thank Nicola Tolliday for editing and critiquing this manuscript. We also thank Supriya Gupta and the Genetics Analysis Platform at the Broad Institute for their work processing the gene expression profiling work. We thank Stuart L. Schreiber for his generous support and guidance (GM38627, S.L.S.).
Declaration of Conflicting Interests
L. C. C., J. R. P., and M. P. are co-inventors on a patent application containing the gene expression data in this article that is owned by the Broad Institute (Cambridge, MA). A. R. G. and B. M. are co-inventors on a patent application containing the chemical composition data in this article that is owned by the Broad Institute (Cambridge, MA) that is licensed to Verastem (Cambridge, MA). L. C. C. and B. M. have provided consulting services to Verastem (Cambridge, MA).
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The project has been funded with federal funds from the National Cancer Institute’s Initiative for Chemical Genetics, National Institutes of Health, under contract no. N01-CO-12400. The content of this publication does not necessarily reflect the views or policies of the Department of Health and Human Services, nor does mention of trade names, commercial products, or organizations imply endorsement by the U.S. government.
