A Multivariate Computational Method to Analyze High-Content RNAi Screening Data

Abstract

High-content screening (HCS) using RNA interference (RNAi) in combination with automated microscopy is a powerful investigative tool to explore complex biological processes. However, despite the plethora of data generated from these screens, little progress has been made in analyzing HC data using multivariate methods that exploit the full richness of multidimensional data. We developed a novel multivariate method for HCS, multivariate robust analysis method (M-RAM), integrating image feature selection with ranking of perturbations for hit identification, and applied this method to an HC RNAi screen to discover novel components of the DNA damage response in an osteosarcoma cell line. M-RAM automatically selects the most informative phenotypic readouts and time points to facilitate the more efficient design of follow-up experiments and enhance biological understanding. Our method outperforms univariate hit identification and identifies relevant genes that these approaches would have missed. We found that statistical cell-to-cell variation in phenotypic responses is an important predictor of hits in RNAi-directed image-based screens. Genes that we identified as modulators of DNA damage signaling in U2OS cells include B-Raf, a cancer driver gene in multiple tumor types, whose role in DNA damage signaling we confirm experimentally, and multiple subunits of protein kinase A.

Keywords

high-content screening, RNAi screening, multivariate data analysis, feature selection, hit identification

Introduction

Image-based high-content (HC) RNA interference (RNAi) screening is an effective experimental approach to elucidate the functions of genes on the systems level by investigating the phenotypes of a large number of living cells in culture. HC perturbation assays generate large amounts of high-resolution image data that are converted into multidimensional numeric data by image-processing software. The resulting features are numeric representations of a variety of phenotypic readouts of cells, such as cellular morphology measurements or the intensity of stains of cellular DNA or specific proteins, and they permit the identification of genes that are involved in complex biological pathways. Although univariate computational methods (i.e., methods that make use of only a single feature) can suffice to identify a limited number of the most salient hits in high-content screening (HCS), it was conclusively demonstrated that multivariate hit identification methods (i.e., methods that exploit multiple numeric features) outperform univariate methods.¹

A recent review lists more than a dozen studies in which multivariate techniques were applied to HC RNAi screens.² In the vast majority of these and other relevant studies, multivariate analysis either was a complicated, multistep procedure that involved an extended sequence of separate computational steps for hit identification and dimensionality reduction,^3–7 lacked nongreedy/ad hoc feature selection,^8,9 or both.^10–12

As a potential result, the majority of researchers pursuing HCS still rely on univariate methods to analyze multidimensional screening data. Analyzing multidimensional data with univariate methods is a self-imposed bottleneck that makes HC assays effectively low content. A meta-analysis of 118 published articles that investigated how many features were actually used for hit identification in different HC screens found that even as recently as 2012, only 25% of HC screens were analyzed using multivariate methods, rendering the information content of these experiments much lower than their potential.¹³ We believe that the reason for the underwhelming popularity of multivariate methods in HCS is their extensive complexity. A powerful but easy-to-use multivariate method would encourage more researchers to get more actual content out of their HC screens.

Moreover, to gain real insight into the biological mechanisms of identified hits from these HC screens, secondary screens and follow-up experiments are required. The majority of dimensionality reduction methods previously used in HCS, such as factor analysis¹⁴ or principal component analysis,⁵ construct novel “meta-features” by linear combination of the original features. Although these techniques successfully reduce the screening data’s dimensionality, they do not necessarily reduce the actual number of features (and therefore the number of phenotypic readouts) required for hit identification. Hence, even after dimensionality reduction, a large number of features need to be rescreened in secondary screens. Selecting the minimum number of the most informative original features at the most informative time points would greatly reduce the effort in biological follow-up experiments because only a subset of the original phenotypic readouts at a limited number of time points would have to be recorded without any significant loss of important biological information.

To address these challenges, we developed the multivariate robust analysis method (M-RAM), a novel computational technique for the multivariate analysis of biological perturbation screens, and applied this method to an HC screen recently performed in our laboratory to discover novel regulators of the DNA damage response (DDR).¹⁵ M-RAM consists of two components: dRIGER, which condenses the effects of all of the short hairpin RNAs (shRNAs) for any particular gene into a single value for each of the features analyzed in the images, and logistic regression paired with the least absolute shrinkage and selection operator (Lasso), an effective, integrated regularization method for feature selection.¹⁶ M-RAM predicts hits and simultaneously selects a limited number of the most informative original features and time points in one single step. Therefore, this method is fast, elegant, and easy to use. We anticipate that M-RAM will find wide acceptance in the HCS community because of its simplicity and interpretability.

Materials and Methods

HCS

To investigate new aspects of the DDR in molecular detail, an RNAi-based HC screen was performed, as described previously.¹⁵ In brief, U2OS cells in 44 plates of 384 wells each (Suppl. Table S1) were infected with varying numbers of lentiviral shRNAs (Suppl. Fig. S1), irradiated with 10 Gy of ionizing radiation (IR), and immunostained for phenotypic readouts of DNA double-strand breaks (histone H2AX phosphorylated on serine-139 [γH2AX]), progression through the cell cycle (DNA content), mitotic entry (phospho-histone H3 [pHH3]), apoptotic cell death (cleaved caspase 3 [CC3]), and cytoskeletal changes within cells (tubulin) immediately before IR (0 h) and at 1, 6, and 24 h after IR. Furthermore, each screened plate contained varying numbers of positive controls (caffeine and ataxia telangiectasia mutated [ATM]) and negative controls (green fluorescent protein [GFP], red fluorescent protein [RFP], and lacZ; Suppl. Table S2) that were used to normalize between plates. For each screened well, a robotic Cellomics Arrayscan automated microscope outfitted with Zeiss optics acquired six nonoverlapping images in four fluorescent channels at four time points. A 20× Plan Neofluar objective lens with N.A. 0.4 was used for all images, resulting in typical nuclei sizes between 20 and 100 pixels and damage foci between 2 and 6 pixels (see Supplemental Materials). In total, more than 1.2 million images that captured more than half a billion single cells and nuclei were generated. Assay validation suggests high reproducibility of our screen (Suppl. Fig. S2).

Directional RIGER

A new, extended derivation of RNAi Gene Enrichment Ranking (RIGER),¹⁷ directional RIGER (dRIGER), was used to transform normalized (see Supplemental Materials) shRNA-level data into gene-level data by computing directional normalized enrichment scores (dNES). dRIGER quantifies both the magnitude and the consistency of the phenotypic effects of multiple shRNAs targeting the same specific gene using a Kolmogorov-Smirnov–motivated running-sum test statistic. Multiple shRNAs inducing a moderate but consistent phenotypic effect receive a higher dNES than a set of highly inconsistent shRNAs with one very strong outlier. Briefly, dRIGER, like RIGER, first rank orders all screened shRNA values from largest to smallest and sequentially traverses each rank in this list from beginning to end (top to bottom) to compute a list of positional enrichment scores (ES) (one positional ES for each rank). A rank’s positional ES reflects how many shRNAs from the set targeting the gene of interest were previously encountered in the list and how many are still ahead in the list. This procedure quantifies whether the shRNAs targeting a gene of interest are clustered toward the top/beginning of the list. In dRIGER, but not in RIGER, the rank-ordered list is then similarly traversed from end to beginning (bottom to top) to compute a second list of positional ES that quantify whether the shRNAs of interest are clustered toward the bottom/end of the list. Therefore, two positional ES, henceforth called directional positional ES, are computed for every single rank. Finally, the single largest directional positional ES is normalized as in Gene Set Enrichment Analysis¹⁸ and selected as dNES. If the dNES was found by traversing from end to beginning (bottom to top), its sign is set to negative to indicate bottom-of-list enrichment. dNES were computed for each feature and each gene at each time point.

Mathematically, positional hit scores (P_H) and miss scores (P_M) were calculated at each position i in a rank-ordered list of length L based on the ranks of the screened shRNAs targeting the gene of interest, $G_{f, t} = (h_{1}, \dots, h_{| G_{f, t} |})$ , where each $h \in G_{f, t}$ represents the rank of an shRNA targeting gene G in the rank-ordered list for feature f at time point t, and $| G_{f, t} |$ refers to the number of shRNAs targeting gene G:

P_{H} (G_{f, t}, i) = \sum_{h_{j \leq i} \in G_{f, t}} \frac{h_{j}}{\sum_{h \in G_{f, t}} h}

P_{M} (G_{f, t}, i) = \sum_{h_{j \leq i} \notin G_{f, t}} \frac{1}{L - | G_{f, t} |} .

Similarly, inverse positional ES were computed to test for rank enrichment at the bottom/end of the rank-ordered list using an inverse shRNA rank set $G_{f, t}^{I}$ , where

G_{f, t}^{I} = L - G_{f, t} + 1 .

Finally, dES were computed as

ε_{d} (G_{f, t}) = \max [\begin{array}{l} \max ({\vec{P}}_{H} (G_{f, t}) - {\vec{P}}_{M} (G_{f, t})), \\ \max ({\vec{P}}_{H} (G_{f, t}^{I}) - {\vec{P}}_{M} (G_{f, t}^{I})) \end{array}]

and multiplied by −1 if the inverse directional ES was greater than the directional ES.
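The directional enrichment score defined by the equations above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation: the function names are hypothetical, ranks are assumed to be distinct 1-based positions in the rank-ordered list, and the normalization step that turns a dES into a dNES is omitted.

```python
def max_running_es(ranks, list_length):
    """Largest P_H - P_M over all positions i = 1..L for one shRNA rank set.

    Assumes distinct 1-based ranks and at least one shRNA outside the set.
    """
    rank_set = set(ranks)
    rank_total = sum(rank_set)                        # denominator of P_H
    miss_step = 1.0 / (list_length - len(rank_set))   # each miss adds 1 / (L - |G|)
    p_hit = p_miss = 0.0
    best = float("-inf")
    for position in range(1, list_length + 1):
        if position in rank_set:
            p_hit += position / rank_total            # hit weighted by its rank h_j
        else:
            p_miss += miss_step
        best = max(best, p_hit - p_miss)
    return best


def directional_es(ranks, list_length):
    """Signed dES: positive for top-of-list, negative for bottom-of-list enrichment."""
    top = max_running_es(ranks, list_length)
    inverse_ranks = [list_length - h + 1 for h in ranks]   # G^I = L - G + 1
    bottom = max_running_es(inverse_ranks, list_length)
    return -bottom if bottom > top else top


if __name__ == "__main__":
    print(directional_es([1, 2, 3], 10))    # clustered at the top: close to +1
    print(directional_es([8, 9, 10], 10))   # clustered at the bottom: close to -1
    print(directional_es([1, 5, 10], 10))   # scattered over the list: small magnitude
```

In this toy list of ten shRNAs, a rank set clustered at either end of the list yields a score near ±1, whereas a scattered set stays close to zero, which mirrors the behavior described for consistent versus inconsistent shRNAs.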

A Java implementation of the dRIGER algorithm is available at http://yaffelab.mit.edu/driger/.

Logistic Regression and Lasso

A logistic regression model with Lasso regularization¹⁶ (Lasso model) was used for integrated feature selection and hit identification. Feature weights were computed as

\arg \min_{\vec{β}} \sum_{i = 1}^{N} \log (1 + e^{- y_{i} \vec{β} {\vec{x}}_{i}}) + λ \sum_{j = 1}^{F} | β_{j} |,

where $\vec{β} = (β_{1}, \dots, β_{F})$ are the weights of the F features, $(y_{1}, \dots, y_{K})$ are the labels of the training set with K genes, ${\vec{x}}_{i} = (x_{i, 1}, \dots, x_{i, F})$ are the dNES of all features for gene i in the training set, λ is the Lasso tuning parameter, and log refers to the decadic logarithm here and in the rest of this article. If no convergence was achieved, the positive observations in the training set were up-sampled twofold and the model was refit. The optimal λ was identified by trying 100 different λ from a geometric sequence of values between 1 and 10⁻⁴. The Lasso then selected the λ that produced the model with the minimum expected model deviance (the optimal model) using 10-fold cross-validation. The model deviance was measured using the mean squared error, which is defined as

M S E = \frac{1}{n} \sum_{i = 1}^{n} {({\hat{Y}}_{i} - Y_{i})}^{2},

where n is the number of observations in the test data, ${\hat{Y}}_{i}$ is the model’s prediction for observation i, and Y_i is the actual label of observation i. In addition to the optimal λ, a suboptimal, larger λ was selected to produce a sparser Lasso model. To compute this suboptimal λ, the standard error of the model deviances for all λ was computed. Then, the largest λ whose model deviance remained within 1 standard error of the minimum deviance was chosen as the λ for the selective model. The selective model tolerated a worse fit in exchange for fewer selected features. Finally, each selected set of features formed a readout profile whose statistical significance was evaluated based on the profile’s entropy (see Supplemental Materials).
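The choice between the optimal and the selective λ described above can be sketched as follows. The sketch assumes that the cross-validated mean deviance and its standard error are already available for each λ, and that the 1-standard-error band is taken at the minimum-deviance λ (a common convention that the text does not spell out). Function and variable names are hypothetical, and the cross-validation values are made up.

```python
def lambda_grid(n_values=100, largest=1.0, smallest=1e-4):
    """Geometric sequence of tuning parameters from `largest` down to `smallest`."""
    ratio = (smallest / largest) ** (1.0 / (n_values - 1))
    return [largest * ratio ** k for k in range(n_values)]


def select_lambdas(lambdas, mean_deviance, se_deviance):
    """Return (optimal_lambda, selective_lambda).

    optimal:   lambda with the minimum cross-validated deviance
    selective: largest lambda whose deviance is within one standard error
               of that minimum (assumed 1-SE convention)
    """
    best = min(range(len(lambdas)), key=lambda k: mean_deviance[k])
    threshold = mean_deviance[best] + se_deviance[best]
    within = [lam for lam, dev in zip(lambdas, mean_deviance) if dev <= threshold]
    return lambdas[best], max(within)


if __name__ == "__main__":
    grid = lambda_grid()
    print(len(grid), grid[0], grid[-1])

    # Invented cross-validation results for four candidate values
    lambdas = [1.0, 0.1, 0.01, 0.001]
    mean_dev = [0.30, 0.22, 0.20, 0.21]
    se_dev = [0.03, 0.03, 0.03, 0.03]
    print(select_lambdas(lambdas, mean_dev, se_dev))
```

With these invented numbers the minimum deviance occurs at λ = 0.01, while λ = 0.1 is the largest value still inside the one-standard-error band, so it would define the sparser, selective model.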

Network Analysis

SteinerNet,¹⁹ an implementation of the Prize-Collecting Steiner Tree (PCST) algorithm, was used to produce a focused view of a protein-protein interaction network of interest. Interactions and genes were annotated with edge costs and node prizes, respectively, and fed into SteinerNet (see Supplemental Materials).

Results and Discussion

To identify novel molecular components of the DDR after IR, we performed an image-based HC RNAi screen, looking for unknown DDR modulators in seven functional categories (kinases, phosphatases, chromatin modifiers, RNA binding proteins, DDR modulators, oncogenic regulators, and miRNA machinery). For this multidimensional HC assay, we screened five distinct phenotypic readouts (DNA content, γH2AX, pHH3, CC3, and tubulin) at four time points (before IR and 1, 6, and 24 h after IR) to systematically quantify both temporal and spatial changes in the DDR, thus enabling a sophisticated understanding of the signal transduction network that governs the cell’s response to DNA damage.

dRIGER Transforms shRNA-Level into Gene-Level Data

To capture the consistency of the differential knock-down effects of multiple shRNAs targeting the same specific gene, we developed dRIGER, an extension of the Gene Set Enrichment Analysis–based RIGER.^17,18 We developed this method because RIGER was originally designed for continuous signal-to-noise ratios or (log) fold-changes. Inherently, RIGER does not capture the enrichment of ranks of shRNAs targeting the same specific gene toward the bottom of a rank-ordered list of all screened shRNAs. Our new method, dRIGER, computes dNES to quantify the enrichment of ranks of shRNAs targeting the same specific gene toward both the top and the bottom of a rank-ordered list of all screened shRNAs, therefore capturing the consistency of both increased and decreased phenotypic knock-down responses.

We applied dRIGER to all genes on all screened plates to compute dNES for each feature at each time point. To demonstrate how dRIGER captures both statistical location and statistical spread of differential knock-down phenotypes of shRNAs targeting specific genes, we computed dNES for the integrated γH2AX intensity feature 1 h after IR for a small number of selected genes. We chose Brd4, H2AFX, and the negative control luciferase because the phenotypic responses to H2AFX and Brd4 knock-down are well characterized.^15,20 As expected, knock-down of H2AFX substantially decreased recorded γH2AX intensity 1 h after IR, and Brd4 knock-down substantially increased it ( Fig. 1A ). Although the majority of shRNAs targeting Brd4 and H2AFX induced a consistent phenotypic effect, outliers existed in both cases. Negative control knock-downs induced a wide range of phenotypic effects, from increased to decreased γH2AX intensities ( Fig. 1A ). dRIGER effectively captured these variable phenotypic effects and assigned high dNES to the H2AFX and the Brd4 knock-down but a low dNES to the negative control knock-down ( Fig. 1B ). dRIGER successfully quantified statistical location and statistical spread—or the lack thereof—for known DDR modulators and negative controls. At the same time, dRIGER transformed shRNA-level data (67,584 rows) into gene-level data (10,892 rows), which led to a more than sixfold reduction of our data’s dimensionality. All subsequent analyses were performed on gene-level data.

Figure 1.

Directional RNA interference gene enrichment ranking (dRIGER) captures the consistency of differential effects of multiple short hairpin RNAs (shRNAs) and transforms shRNA-level data into gene-level data. (A) S-curves of shRNAs targeting the genes Brd4, H2AFX, and the negative control luciferase, for integrated γH2AX intensity 1 h after ionizing radiation (IR). As expected, knock-down with most shRNAs targeting the chromatin modifier Brd4 leads to vastly increased γH2AX intensity, whereas most shRNAs targeting H2AFX have the opposite knock-down effect. shRNAs targeting the negative control luciferase surprisingly induce a wide variety of different phenotypic responses, including increased and decreased γH2AX intensities. (B) Directional ES (dES) of Brd4, H2AFX, and the negative control luciferase for integrated γH2AX intensity 1 h after IR. dRIGER rewards strong, consistent knock-down phenotypes with high dES. shRNAs targeting Brd4 and H2AFX are enriched at the top and bottom of the rank-ordered list of all screened shRNAs, respectively, resulting in high dES. shRNAs targeting luciferase are widely spread over the entire list, resulting in a substantially lower dES.

Feature Selection with the Lasso

As M-RAM’s explicit purpose is to generate more reliable hypotheses for follow-up experiments, we wanted to select the phenotypic readouts and time points for which these follow-up experiments would prove most successful from the set of all screened readouts and time points ( Fig. 2A ). To select the features that were most predictive for DDR modulators and discard features mainly capturing noise, we used a logistic regression model with Lasso regularization (Lasso model).

Figure 2.

Logistic regression with least absolute shrinkage and selection operator (Lasso) regularization selects the most informative phenotypic readouts and time points that best capture the differences between knocked-down genes and negative controls. (A) Images of four fluorescent channels recorded by automated microscopy capturing five phenotypic readouts at four time points. Each readout is used to identify different biological objects (DNA: nuclei; γH2AX: ionizing radiation–induced foci; phospho-histone H3/cleaved caspase 3 (pHH3/CC3): pHH3/CC3-positive cells; tubulin: cells). CellProfiler was used to generate 60 numeric features capturing morphological and intensity characteristics of each recorded object. (B) Readout profiles for feature sets selected by the optimal and selective Lasso models for four different sets of genes. A readout profile describes how many features were selected for each phenotypic readout at each time point. Only functionally coherent gene sets (DNA damage initiation signaling and checkpoint signaling) led to models that selected statistically significant feature sets with a confidence level of 95%. Readout–time point combinations with more than two selected features were additionally labeled for improved readability. p values reflect the statistical significance of a readout profile’s Shannon entropy. (C) Readout traces for different Lasso models as a function of the tuning parameter λ at four time points (0, 1, 6, 24 h). Colored lines represent the number of selected features per readout for any given λ. They indicate which readouts and time points best capture the phenotypic characteristics that differentiate knocked-down genes from negative controls. As λ increases, fewer features are selected. For DNA damage initiation signaling genes, the γH2AX readout at the 1 h time point and the pHH3 readout at the 6 h time point are most predictive.

First, we wanted to investigate whether a feature set existed that was able to capture a putative “über phenotype” shared among all the knock-downs of a large set of functionally diverse DDR modulators. We trained our Lasso model on a training set consisting of 17 genes known to play a prominent role in various aspects of the DDR (Suppl. Table S4) and three negative control genes (GFP, RFP, lacZ). To determine the optimal Lasso tuning parameter λ, we 10-fold cross-validated our model and identified the minimum-deviance model (the optimal model) by selecting the λ that produced the model with the optimal fit (Suppl. Fig. S4). We then selected a larger tuning parameter λ to produce an even sparser model with suboptimal fit (the selective model). The models selected 16 and 10 of the 60 features, respectively (Suppl. Fig. S5). Surprisingly, in both cases, the extracted readout profile, a tabular representation of the selected features grouped by phenotypic readout and time point, was not statistically significant with a confidence level of 95% ( Fig. 2B ).

We hypothesized that the different and diverse functions of the various DDR modulators used as positive observations in the Lasso model’s training set were the reason for the lack of statistical significance of the selected feature set and postulated that our predictive models might be better at successfully capturing knock-down phenotypes of more functionally coherent gene sets. Knock-down of genes that are functionally coherent (i.e., that participate in a similar process within the larger set of molecular events that constitute the DDR) is likely to induce similar phenotypic responses that can be captured by automated microscopy and subsequently numerically captured in the extracted features. We therefore trained Lasso models for DNA damage initiation signaling, checkpoint signaling, and, as a stringent control, the union of these two, to test our hypothesis. As before, GFP, RFP, and lacZ served as negative observations in the training sets. All Lasso models converged (Suppl. Fig. S4), selecting varying numbers of features (Suppl. Fig. S5) from different phenotypic readouts ( Fig. 2C ). However, only the readout profiles identified by the selective model for DNA damage initiation signaling alone and checkpoint signaling alone were statistically significant with a confidence level of 95% ( Fig. 2B ). We therefore focused all subsequent analyses on the selective models. The selective model for DNA damage initiation signaling identified five features ( Table 1 ), resulting in a dimensionality reduction by a factor of 12. Four of these five features belonged to the γH2AX readout 1 h after IR. This statistically significant feature set reconfirmed the extreme importance of γH2AX intensity as a marker of DNA damage initiation signaling activity, consistent with our prior selection of γH2AX metrics for a more basic analysis of the high-throughput RNAi screen for DDR genes.¹⁵
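To make the notion of a readout profile concrete, the sketch below tabulates the five features of the selective DNA damage initiation signaling model (Table 1) by readout and time point and computes the Shannon entropy of that profile. The logarithm base and the function name are assumptions for illustration. The significance test itself is described in the Supplemental Materials and is not reproduced here.

```python
import math
from collections import Counter


def profile_entropy(selected_features, base=2.0):
    """Shannon entropy of a readout profile.

    `selected_features` holds one (readout, time point) pair per selected
    feature; the profile is the count of features in each pair.
    The base-2 default is an assumption made for this sketch.
    """
    counts = Counter(selected_features)
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total, base) for c in counts.values())


if __name__ == "__main__":
    # Four gammaH2AX features at 1 h and one pHH3 feature at 6 h (Table 1)
    initiation_profile = [("gammaH2AX", 1)] * 4 + [("pHH3", 6)]
    print(round(profile_entropy(initiation_profile), 3))
```

A profile whose features are concentrated in few readout and time point combinations has low entropy, whereas a profile spread evenly over many combinations has high entropy.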

Table 1.

Features selected by the selective Lasso models trained on DNA damage initiation and checkpoint signaling genes.^a

Readout	Time	Feature	Weight	Scaled Weight
DNA damage initiation signaling
H2AX	1	Maximum nucleic intensity	0.039715	46
		Standard deviation of foci intensity	0.022996	27
		Standard deviation of nuclei intensity	0.012257	14
		Number of foci	0.010534	12
pHH3	6	Number of pHH3+ nuclei	0.000333	1
Checkpoint signaling
H2AX	0	Number of foci	0.007222	3
pHH3	0	Standard deviation of pHH3+ nuclei	0.047811	22
		Minimum nucleic intensity	0.015302	7
		Maximum pHH3+ nucleic intensity	0.014266	7
		Mean nucleic intensity	0.004452	2
	1	Number of pHH3+ nuclei	0.057373	26
	1	Integrated pHH3+ nucleic intensity	0.042539	19
DNA	24	Integrated nucleic intensity	0.006422	3
Tubulin	6	Minimum cellular intensity	0.013966	6
	24	Mean cellular intensity	0.011349	5

Scaled weights represent feature weights that were normalized to sum to 100 for better readability. Lasso, least absolute shrinkage and selection operator; pHH3, phospho-histone H3.
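The scaling described in the footnote amounts to dividing each weight by the sum of the weights in its model and multiplying by 100. A small sketch using the DNA damage initiation signaling weights from Table 1 follows. The rounding convention used for the table is not stated, so the sketch prints values with one decimal place, and the function name is hypothetical.

```python
def scale_weights(weights, total=100.0):
    """Rescale non-negative feature weights so that they sum to `total`."""
    weight_sum = sum(weights)
    return [total * w / weight_sum for w in weights]


if __name__ == "__main__":
    # Weights of the selective DNA damage initiation signaling model (Table 1)
    initiation_weights = [0.039715, 0.022996, 0.012257, 0.010534, 0.000333]
    scaled = scale_weights(initiation_weights)
    print([round(s, 1) for s in scaled])
```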

Surprisingly, only two of the five selected features were canonical features likely to be picked manually. These two features, the number of γH2AX foci 1 h after IR and the number of pHH3-positive nuclei 6 h after IR, received the lowest feature weights in the Lasso model. The three remaining features, all γH2AX features 1 h after IR (maximum nucleic intensity, standard deviation of IR foci intensity, and standard deviation of nucleic intensity), received substantially higher weights ( Table 1 ). These features either directly captured information about the statistical spread of γH2AX intensities (standard deviations) within segmented nuclei or foci or were highly sensitive to outliers and increased statistical spread (maximums). This analysis reveals that the statistical spread of intensities within each object better captured knock-down phenotypes of DNA damage initiation signaling genes than estimators for statistical location such as average γH2AX intensity.

One potential cause for the importance of statistical spread estimators over statistical location estimators is the wide variety of RNAi-induced changes on the single-cell level. The microenvironment of cells that are subject to RNAi can be a potential source of the stochasticity of differential phenotypic responses.¹² Additional contributors to this intranuclear or intrafoci variation in γH2AX intensity include varying levels of shRNA integration and expression or stochastic effects of equally expressed shRNAs on protein expression, particularly if these effects result in local alterations in chromatin structure or DNA damage repair efficiency. Indeed, image analysis on the single-cell level visually confirmed a high variability of phenotypes of single cells that were targeted by the same shRNA.²¹ Imperfect knock-down and puromycin selection can also lead to multiple subpopulations of cells that exhibit more variable and convoluted phenotypic effects at the single-cell level. We therefore propose that features that capture statistical spread might be able to better quantify the resulting variability of knock-down effects and thus better identify hits in RNAi screens.

It should be noted that the entirety of phenotypic information in an HC screen can be captured on the single-cell level only by analyzing distributions of populations of individual cells in screened wells. However, as each well frequently contains large numbers of individual cells, single-cell analysis would potentially increase the computational effort required to analyze the data by multiple orders of magnitude. We consider statistical spread in combination with statistical location as a good compromise to estimate characteristics of cell population distributions in HC screens such as ours, in which the richness and amount of data render single-cell analysis impractical.

The selective Lasso model for DNA damage initiation signaling did not select γH2AX intensity features at the 6 h time point, meaning that γH2AX features did not reliably differentiate DNA damage initiation signaling genes from negative control genes at this time point. In our previous study,¹⁵ we simply hand selected three image features (integrated γH2AX intensity, number of IR foci per nucleus, and mean IR foci area) at 1 h and 6 h after IR as a metric to rank chromatin modifier genes using quartile thresholding. This basic thresholding method selected Brd4 as the top hit, as did M-RAM. However, some of the chromatin modifier genes ranked in the top and bottom quartile using these metrics were not in the top or bottom quartile of M-RAM–ranked genes. Because half of the thresholds in our previous analysis were applied to imaging features from the 6 h time point, a time point at which M-RAM selected no γH2AX features, the lack of perfect agreement in gene ordering between these two lists can be attributed to the ad hoc nature of our selection process for image features and time points used in our previous analysis.

To learn if a simple method such as quartile thresholding would exhibit increased performance after automatic feature selection, we dropped the 6 h time point as suggested by our model. Quartile thresholding of the three features at the 1 h time point alone led to a relative increase in sensitivity by 11.2% and a relative decrease in specificity by 0.53% as compared with thresholding at both time points. Therefore, even simple hit identification methods such as quartile thresholding may benefit from a priori feature selection.

The selective Lasso model for checkpoint signaling also produced a statistically significant readout profile with a confidence level of 95% ( Fig. 2B ). Sixty percent of the profile’s features were derived from the pHH3 readout, and two-thirds of these specifically captured pHH3 before IR, although the highest-scoring features were selected at the 1 h time point ( Table 1 ). The high prevalence of pHH3 features before IR likely reflects the importance of CHEK1 and CHEK2 in cell cycle control even in the absence of exogenous DNA damage. This finding suggests that intrinsic DNA damage in an unperturbed cell cycle in these cells is already sufficient to control cell cycle progression rates through CHEK1 and CHEK2.

Furthermore, we investigated whether images of cells treated with shRNAs targeting genes in the training set for DNA damage initiation signaling (H2AFX, ATM) and cells treated with shRNAs targeting genes in the training set for checkpoint signaling (CHEK1, CHEK2) would have similar phenotypes within their respective functionally coherent groups. Indeed, knock-down of H2AFX and ATM led to a decrease in IR-induced γH2AX foci 1 h after IR, whereas knock-down of CHEK1 and CHEK2 resulted in an increased number of mitotic cells (Suppl. Fig. S6).

The readout profiles of the control model trained on the union of DNA damage initiation and checkpoint signaling genes were not statistically significant ( Fig. 2B ). Therefore, we conclude that statistical significance of the selected feature sets depends on functional coherence of the positive observations in the training sets. This finding is important because it shows that broad computational approaches to identify complex phenotypes cannot be blindly performed using a diverse set of genes, which are important in various different parts of a biological process. Instead, only genes that function together to control a limited portion of a complex phenomenon are likely to be useful in training predictive models that capture their more well-defined phenotypes. To capture a complex biological process in its entirety, it will likely be necessary to use smaller subsets of the whole, each representing a functionally coherent subcomponent.

M-RAM Identifies DDR Modulators Missed by Univariate Methods

We used the selective Lasso model for DNA damage initiation signaling with the selected feature set ( Table 1 ) to identify novel DDR modulators. Intuitively, this Lasso model ranked all screened genes based on how much their knock-down phenotype resembled the knock-down phenotypes of genes in the DNA damage initiation signaling training set (Suppl. Table S4). Genes were ranked from strongest phenotypic resemblance (intuitively corresponding to low γH2AX 1 h after IR) to strongest opposite phenotype. Genes at the top and bottom of the list are therefore likely to be true hits. In agreement with this, the top 10 and bottom 10 ranked genes ( Table 2 ) contained numerous canonical DDR signaling components, many of which were not part of the training set.

Table 2.

Best-ranked hits that resemble the knock-down phenotype of DNA damage initiation signaling genes (top) or the opposite phenotype (bottom).^a

Top 10 Hits
Gene Symbol	Gene Name	M-RAM Rank	2BHM Rank
H2AFX	Histone H2A.X	1	120
ATM	Ataxia telangiectasia mutated	2, 5	203, 1232
PRKACG	cAMP-dependent protein kinase catalytic subunit γ	3	537
TEX14	Testis expressed 14	4	2338
BRCA2	Breast cancer 2	6	8
PRKAR1A	cAMP-dependent protein kinase type I-α regulatory subunit	7	442
EXO1	Exonuclease 1	8	11
CCND1	Cyclin D1	9	18
CHEK2	Checkpoint kinase 2	10	17

Bottom 10 Hits
Gene Symbol	Gene Name	M-RAM Rank	2BHM Rank
BRD4	Bromodomain-containing protein 4	1, 4	12, 49
EPHA2	EPH receptor A2	2	1
GRK1	Rhodopsin kinase	3	254
PI4K2A	Phosphatidylinositol 4-kinase type 2 α	5	246
PFKFB1	6-phosphofructo-2-kinase/fructose-2,6-biphosphatase 1	6	98
PIKFYVE	PI-3-phosphate/PI 5-kinase, type III	7	25
PRKCI	Protein kinase C ι	8	633
MID2	Midline 2	9	29
BRAF	V-raf murine sarcoma viral oncogene homolog B1	10	82

M-RAM, multivariate robust analysis method; 2BHM, second-best hairpin method; cAMP, cyclic adenosine monophosphate.
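The ranking step behind Table 2 can be illustrated with a minimal sketch. It assumes an already-fitted sparse logistic model; the coefficient values, the readout names other than the γH2AX 1 h readout, and the gene names are hypothetical placeholders, not values from the screen. Genes are ordered by their predicted probability of resembling the training-set phenotype, and readouts with a zero coefficient drop out of the score entirely.

```python
import math

# Hypothetical sparse coefficients, as an L1 penalty might leave them:
# most readouts carry a weight of exactly zero and are thereby deselected.
coefficients = {
    "gH2AX_1h": -1.8,      # low gH2AX 1 h after IR raises the score
    "readout_b_6h": 0.0,
    "readout_c_1h": 0.0,
    "readout_d_24h": 0.4,
}
intercept = -0.5

# Made-up, standardized gene-level readouts for three placeholder genes.
genes = {
    "GENE_A": {"gH2AX_1h": -2.0, "readout_b_6h": 0.3, "readout_c_1h": 1.0, "readout_d_24h": 0.5},
    "GENE_B": {"gH2AX_1h": 0.1, "readout_b_6h": -1.2, "readout_c_1h": 0.4, "readout_d_24h": 0.0},
    "GENE_C": {"gH2AX_1h": 1.9, "readout_b_6h": 0.8, "readout_c_1h": -0.6, "readout_d_24h": -0.2},
}


def probability(features):
    """Logistic score: probability of resembling the training-set phenotype."""
    z = intercept + sum(coefficients[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))


# Strongest resemblance first, strongest opposite phenotype last.
ranked = sorted(genes, key=lambda gene: probability(genes[gene]), reverse=True)

# The readouts that actually contribute to the ranking.
selected_features = [name for name, weight in coefficients.items() if weight != 0.0]
```

In this toy setting, both ends of `ranked` are of interest: the head resembles the training-set knock-downs and the tail shows the opposite phenotype.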

To establish a baseline for comparisons with M-RAM, we applied a popular method for identifying hits in HC RNAi screens, the second-best hairpin method (2BHM; see Supplemental Materials), to our normalized HC data set. Using 2BHM on the integrated γH2AX intensity 1 h after IR ranked the negative control lacZ as the top hit (Suppl. Fig. S8). Moreover, other negative control shRNAs were widely scattered across the rank-ordered list of second-best shRNAs.

No sound justification exists to select the second-best shRNA, and not the best, third best, or any other. Selecting one arbitrary, single shRNA makes the implicit assumption that all other shRNAs with stronger or weaker effects do not contribute useful information. A single shRNA, by definition, can be a measure of only statistical location but not statistical spread. High spread implies inconsistent knock-down effects that should decrease the confidence in an identified hit. This highly important aspect of hit identification is completely lost using 2BHM but captured by shRNA aggregation methods such as dRIGER.
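The distinction between statistical location and spread can be made concrete with a small sketch. The shRNA scores below are made up, the gene names are placeholders, and the median and median absolute deviation are used only as generic stand-ins for location and spread; they are not the dRIGER statistic.

```python
import statistics

# Made-up normalized scores for the shRNAs targeting two placeholder genes
# (more negative = stronger effect in this toy example).
shrna_scores = {
    "GENE_X": [-2.9, -2.5, 0.1, 0.2, 0.3],     # two strong hairpins, the rest inactive
    "GENE_Y": [-1.6, -1.5, -1.5, -1.4, -1.4],  # moderate but highly consistent
}


def second_best(scores):
    """Score of the second-strongest hairpin, the only value 2BHM looks at."""
    return sorted(scores)[1]


def location_and_spread(scores):
    """Median as location and median absolute deviation (MAD) as spread."""
    median = statistics.median(scores)
    mad = statistics.median(abs(score - median) for score in scores)
    return median, mad


summary = {
    gene: (second_best(scores), *location_and_spread(scores))
    for gene, scores in shrna_scores.items()
}
```

Here the second-best hairpin favors GENE_X, whereas its median sits near zero and its spread is large; GENE_Y shows the consistent effect that a single-hairpin statistic cannot see.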

To compare M-RAM’s and 2BHM’s classification performance, we performed leave-one-out cross-validation on the training set of DNA damage initiation signaling genes. The selective Lasso model outperformed 2BHM (area under the receiver-operating characteristic curve of 0.83 and 0.77, respectively; Fig. 3A ). M-RAM consistently ranked independent caffeine controls closer to the top of the hit list (where one would expect knock-downs that decrease γH2AX) than 2BHM ( Fig. 3B ). In addition, it ranked Brd4 and selected protein phosphatase 2 (PP2A) subunits²² closer to the bottom of the list (where one would expect knock-downs that increase γH2AX) than 2BHM ( Fig. 3C ).
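A minimal sketch of leave-one-out cross-validation followed by an AUC computation is given here for orientation. The one-feature centroid scorer is an assumed stand-in chosen to keep the example short; it is not the selective Lasso model, and the feature values and labels are made up.

```python
def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count half."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


def leave_one_out_scores(features, labels, fit, score):
    """Score every observation with a model fitted on all other observations."""
    held_out = []
    for i in range(len(features)):
        train_x = features[:i] + features[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        model = fit(train_x, train_y)
        held_out.append(score(model, features[i]))
    return held_out


# Stand-in model (an assumption): class centroids of a single feature.
def fit_centroids(xs, ys):
    positives = [x for x, y in zip(xs, ys) if y == 1]
    negatives = [x for x, y in zip(xs, ys) if y == 0]
    return sum(positives) / len(positives), sum(negatives) / len(negatives)


def centroid_score(model, x):
    positive_mean, negative_mean = model
    # Higher score = closer to the positive centroid than to the negative one.
    return abs(x - negative_mean) - abs(x - positive_mean)


# Made-up single-feature data: 1 = training-set gene, 0 = other gene.
features = [-2.1, -1.7, -0.2, 0.1, 0.4, -1.2, 0.9, -0.1]
labels = [1, 1, 0, 0, 0, 1, 0, 1]

loo_scores = leave_one_out_scores(features, labels, fit_centroids, centroid_score)
loo_auc = auc(loo_scores, labels)
```

Because each score comes from a model that never saw the held-out gene, the resulting AUC estimates how well the ranking generalizes rather than how well it fits the training set.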

Figure 3.

Multivariate robust analysis method (M-RAM) outperforms second-best hairpin method (2BHM). (A) ROC curve comparing M-RAM’s (selective Lasso model for DNA damage initiation signaling) and 2BHM’s performance using leave-one-out cross-validation. M-RAM provides superior sensitivity and specificity. AUC refers to area under the receiver-operating characteristic (ROC) curve. (B) Q-Q plot comparing the ranking of independent caffeine controls by M-RAM and 2BHM. M-RAM ranks the independent caffeine controls more accurately than 2BHM (closer to zero, which indicates lower γH2AX and higher similarity to knock-down of genes belonging to DNA damage initiation signaling). In addition, M-RAM ranks the controls more precisely (ranks provided by M-RAM have less than half the statistical spread of ranks provided by 2BHM). The p value was computed using a Wilcoxon rank-sum test. (C) Dot plot of Brd4 and selected protein phosphatase 2 subunit ranking as provided by M-RAM and 2BHM. As in (B), M-RAM ranks the genes more precisely and more accurately than 2BHM. Brd4 and PPP2R5A are displayed more than once because they were independently screened on multiple different plates.

Lastly, we used dRIGER to rank screened genes based on their integrated γH2AX intensity 1 h after IR to investigate whether dRIGER provides potential performance gains over 2BHM, even in the absence of systematic feature selection. The top and bottom of the list of screened genes ranked by their dNES were heavily enriched in genes known to be involved in the DDR and oncogenic processes (Suppl. Table S6). In addition, dRIGER alone ranked independent caffeine controls better than 2BHM but worse than M-RAM (Suppl. Fig. S9). Therefore, although dRIGER alone outperforms 2BHM, M-RAM provides even better performance than dRIGER because of its integrated feature selection step.

Network Analysis Puts Identified Hits into Context

To generate more reliable hypotheses about how the hits identified by M-RAM might interact among themselves and with known DDR modulators, we investigated how these hits could be tied into known protein-protein interaction networks that were enriched with kinase-substrate predictions. We anticipated that the most tightly connected network structures would suggest potential mechanisms of DDR signaling. For this purpose, we employed the PCST, a network flow algorithm successfully applied in the biological domain,²³ and used M-RAM to assign prize values to genes in the shRNA screen.

First, we constructed a base network from four sources: a prior knowledge network (PKN), the screened genes, the filtered STRING interactome, and Scansite²⁴ predictions. We defined a small, tightly connected network, the PKN, that represented well-established DNA damage initiation signaling genes^20,25,26 (Suppl. Fig. S10). We speculated that genes closely connected to the PKN were more likely to play a role in DNA damage initiation signaling. To connect putative M-RAM hits with the PKN, we filtered the STRING interactome²⁷ for experimentally verified, high-confidence interactions. The filtered STRING interactome had 9857 nodes and 483,940 edges. We placed our screened genes and the PKN in this large network (see the Materials and Methods section). To expand our network analysis beyond static protein-protein interactions, we used 70 position-specific scoring matrices (PSSMs) in Scansite to predict putative substrates of kinases and putative binding partners of proteins for which PSSMs were available. A total of 4517 high-confidence interactions were predicted and added to our base network. Because the resulting base network was of prohibitively high complexity, we reduced it to screened genes and STRING interactome genes that were closely connected to the PKN. The extracted subnet had 4719 nodes (half of the number of the base network’s nodes) and 52,834 edges (nearly 10 times fewer edges than the base network), representing a substantial reduction of complexity.
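One simple way to read "closely connected to the PKN" is a hop limit around the PKN nodes. The sketch below makes that assumption explicit: the breadth-first search, the two-hop cutoff, and the toy graph with placeholder node names are illustrative choices, not the procedure described in the Materials and Methods section. The last two lines only recompute the size ratios from the node and edge counts quoted above.

```python
from collections import deque


def within_hops(adjacency, seeds, max_hops):
    """All nodes reachable from any seed in at most max_hops edges (breadth-first search)."""
    distance = {seed: 0 for seed in seeds if seed in adjacency}
    queue = deque(distance)
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in adjacency[node]:
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    return set(distance)


def induced_edges(adjacency, nodes):
    """Undirected edges whose two endpoints both lie in the node set."""
    return {frozenset((u, v)) for u in nodes for v in adjacency[u] if v in nodes}


# Toy undirected graph; PKN_1 and PKN_2 stand in for prior-knowledge nodes.
adjacency = {
    "PKN_1": {"A"},
    "PKN_2": {"A"},
    "A": {"PKN_1", "PKN_2", "B"},
    "B": {"A", "C"},
    "C": {"B", "D"},
    "D": {"C"},
}

near_pkn = within_hops(adjacency, {"PKN_1", "PKN_2"}, max_hops=2)
subnet_edges = induced_edges(adjacency, near_pkn)

# Size ratios implied by the counts reported in the text.
node_fraction = 4719 / 9857   # extracted subnet nodes / filtered STRING nodes
edge_fold = 483940 / 52834    # filtered STRING edges / extracted subnet edges
```

With the reported counts, the node fraction comes out slightly below one half and the edge reduction slightly above ninefold, in line with the description in the text.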

Because the filtered base network was still far too complex to allow its intuitive interpretation and visualization, we employed the PCST to extract the most confident subnetwork. We rewarded high confidence in a gene with high node prizes and high confidence in an interaction with low edge cost. The PCST extracted a network consisting of the 6 genes from the PKN, 35 screened genes, and 6 genes from the filtered STRING interactome, which we visualized with a hive plot²⁸ (Suppl. Fig. S11). Three of the extracted screened genes were originally ranked below 100 by the selective Lasso model. All three were previously described as being involved in the DDR (Suppl. Table S7). Furthermore, all six genes extracted from the filtered STRING interactome were implicated in the DDR (Suppl. Fig. S11; Suppl. Table S8).
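The trade-off between node prizes and edge costs can be sketched with the commonly used prize-collecting objective, in which a candidate tree pays the cost of every edge it contains and forfeits the prize of every node it leaves out. The scaling factor `beta`, the toy prizes and costs, and the brute-force comparison of three hand-picked candidates are assumptions for illustration; a real PCST solver searches the space of trees itself.

```python
def pcst_objective(tree_nodes, tree_edges, prizes, costs, beta=1.0):
    """Edge costs paid for the tree plus (scaled) prizes forfeited for excluded nodes."""
    paid = sum(costs[edge] for edge in tree_edges)
    forfeited = sum(beta * prize for node, prize in prizes.items() if node not in tree_nodes)
    return paid + forfeited


# Toy inputs: a high prize means high confidence in a gene,
# a low cost means high confidence in an interaction.
prizes = {"PKN_node": 5.0, "strong_hit": 3.0, "weak_hit": 0.2}
costs = {
    ("PKN_node", "strong_hit"): 0.1,   # well-supported interaction
    ("strong_hit", "weak_hit"): 0.9,   # poorly supported interaction
}

# Three hand-picked candidate trees, each given as (nodes, edges).
candidates = [
    ({"PKN_node"}, []),
    ({"PKN_node", "strong_hit"}, [("PKN_node", "strong_hit")]),
    (
        {"PKN_node", "strong_hit", "weak_hit"},
        [("PKN_node", "strong_hit"), ("strong_hit", "weak_hit")],
    ),
]

objective_values = [pcst_objective(nodes, edges, prizes, costs) for nodes, edges in candidates]
best_nodes, best_edges = candidates[objective_values.index(min(objective_values))]
```

In this toy case the cheapest tree connects the strong hit through its well-supported edge and leaves out the weak hit, whose small prize does not justify the expensive edge.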

B-Raf Is Involved in the DDR

M-RAM identified the knock-down phenotype of B-Raf in U2OS cells as similar to that of Brd4 ( Table 2 , bottom 10 hits), a chromatin-modifying protein whose knock-down we recently showed results in expanded chromatin architecture and enhanced DNA damage signaling with elevated γH2AX foci intensity.¹⁵ Importantly, B-Raf’s knock-down phenotype was ranked significantly worse by 2BHM because no single shRNA induced an exceptional signal, although the vast majority of its shRNAs had a highly consistent, positive effect (Suppl. Fig. S7).

To independently verify the knock-down effects of B-Raf on γH2AX foci intensity, we used additional shRNA sequences against B-Raf that differed from those used in the HC screen (see Supplemental Materials). Note that M-RAM’s selection of the most informative phenotypic readouts at the most informative time points vastly reduced the effort of follow-up experiments by allowing us to focus on the γH2AX readout at the 1 h time point. shRNA-II, but not shRNA-I, resulted in a 74% reduction in B-Raf protein levels and a corresponding decrease in the levels of phospho-Erk ( Fig. 4A ). Importantly, the B-Raf shRNA-II knock-down cells showed a marked increase in γH2AX foci intensity at 1 h following application of 10 Gy of IR ( Fig. 4B, C ) when compared with the control shRNA knock-down cells at the same time point after IR. Quantification of the resultant images verified a statistically significant ~40% increase in the intensity of γH2AX foci per nucleus or per nuclear area (thereby excluding the possibility that the B-Raf knock-down caused an increase in γH2AX indirectly through changing the nuclear size).

Figure 4.

Validation of B-Raf as a modifier of the DNA damage response in U2OS cells. (A) U2OS cells were infected with retroviruses encoding control and B-Raf–directed short hairpin RNAs (shRNAs), harvested 72 h after the final infection, and lysates analyzed for B-Raf and phospho-Erk levels by immunoblotting. (B) Control and BRAF–shRNA-II infected cells were fixed before and 1 h after irradiation with 10 Gy of ionizing radiation (IR) and stained for γH2AX. (C) γH2AX staining intensity was quantified using four representative fields containing greater than 100 nuclei total, from three independent experiments. Shown are integrated γH2AX intensity per nuclear area, normalized to that measured in the control shRNA-infected cells at 1 h after IR. Values are mean ± SEM, with p values calculated using a Student unpaired t test.
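The summary statistics named in panel (C) can be sketched as follows. The per-experiment intensities are made-up numbers chosen for easy arithmetic and are not the measured values; the sketch only shows normalization to the control mean, the standard error of the mean, and the pooled-variance t statistic of an unpaired Student t test.

```python
import math
import statistics


def sem(values):
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    return statistics.stdev(values) / math.sqrt(len(values))


def student_t(a, b):
    """Unpaired Student t statistic with pooled variance (equal variances assumed)."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * statistics.variance(a) + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    return (statistics.mean(a) - statistics.mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))


# Made-up intensities per nuclear area, one value per independent experiment.
control_raw = [0.8, 1.0, 1.2]
knockdown_raw = [1.2, 1.5, 1.8]

# Normalize both groups to the mean of the control group.
reference = statistics.mean(control_raw)
control = [value / reference for value in control_raw]
knockdown = [value / reference for value in knockdown_raw]

knockdown_mean = statistics.mean(knockdown)
knockdown_sem = sem(knockdown)
t_stat = student_t(knockdown, control)
degrees_of_freedom = len(control) + len(knockdown) - 2
```

Turning `t_stat` into a p value requires the t distribution with `degrees_of_freedom` degrees of freedom, which the Python standard library does not provide; a statistics package such as SciPy (`scipy.stats.ttest_ind`) performs the complete test.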

M-RAM also identified DNA damage signaling alterations following knock-down of various components of protein kinase A (PKA) complexes. These components were missed by 2BHM for the same reasons described above (Suppl. Fig. S7). Knock-down of the PKA less-active catalytic subunit γ²⁹ and PKA type I-α regulatory subunit closely resembled the knock-down phenotype of DNA damage initiation signaling components ( Table 2 , top 10 hits), whereas knock-down of the more active catalytic α and β subunits displayed the opposite phenotype, showing increased γH2AX and DNA damage signaling (Suppl. Table S5). Although the role of PKA signaling in the DDR is complex and remains poorly understood, our findings are consistent with two recent studies. Cho et al. (2014)³⁰ reported that PKA activity stimulates PP2A to dephosphorylate γH2AX and suppress ATM signaling after IR, whereas Jarrett et al. (2014)³¹ showed that PKA phosphorylation of ATR promotes recruitment of xeroderma pigmentosum complementation group A to ultraviolet-induced DNA damage sites to enhance DNA repair and clear DNA lesions. Both of these studies can rationalize the lower levels of γH2AX signals when the net catalytic activity of PKA complexes within the cell is high and the converse when it is low. Additional studies, however, are clearly required to more thoroughly characterize the involvement of specific PKA subcomplexes in the DDR.

In conclusion, we conducted an image-based HC RNAi screen to identify novel regulators of the DDR. We then developed M-RAM, a novel computational method to tap the full potential of this and similar HC screens. Employing dRIGER, an enhanced version of RIGER, we significantly reduced the dimensionality of the screening data. We transformed shRNA-level data into gene-level data, capturing consistency and variability of differential shRNA effects, and achieved a nearly sevenfold reduction in dimensionality. Lasso models selected the most predictive features at the applicable time points. In the case of DNA damage initiation signaling, the feature selection step resulted in a more than 50-fold dimensionality reduction. Functional coherence of training sets (that is, the specific selection of genes for a training set that function together within a single process within a much larger multiprocess phenomenon such as the DDR) was required to select statistically significant feature sets. The resulting selective logistic regression model generated a rank-ordered list from which hits could be selected for further analysis and verification. Canonical DDR regulators were highly clustered toward the top and bottom of this hit list. Comparison of the sensitivity and specificity of M-RAM with those of 2BHM demonstrated that our method provides superior performance. In addition, our method ranked independent controls better than 2BHM. Lastly, we applied a PCST to a network consisting of our weighted hits, Scansite predictions, a PKN, and the filtered STRING interactome to narrow down the hit list and generate hypotheses about how the hits might modulate the DDR. M-RAM identified both B-Raf and specific subunits of PKA as hits that were missed by 2BHM. Follow-up experiments further verified that B-Raf knock-down in U2OS cells indeed markedly increased γH2AX 1 h after IR, a finding that has potentially important clinical applications for the addition of radiation therapy in the treatment of B-Raf mutant tumors that are being concurrently treated with B-Raf inhibitors.³²

We believe that M-RAM has two important advantages over other published multivariate approaches for the analysis of HC screens. First, M-RAM elegantly combines hit identification and feature selection in one single computational step. Other multivariate approaches treat feature selection and hit identification as a multistep procedure in the HCS data analysis pipeline, complicating implementation and interpretation of results. Furthermore, M-RAM requires only the selection of the Lasso tuning parameter λ. The appropriate λ can be easily determined using cross-validation and by computing the statistical significance of the resulting readout profiles. Although we did not implement a rigid binary computational classification for hit selection, as we believe this is best done in consultation with biologists familiar with the process being studied, we do provide an explicit method for doing so using the resulting rank-ordered list to place the genes in a relevant signaling network based on preexisting knowledge with a PCST algorithm.
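How the tuning parameter λ controls the number of selected features can be illustrated with the soft-thresholding operator, which gives the Lasso solution in closed form in the textbook special case of an orthonormal design. That special case is an assumption made here for brevity; the logistic Lasso used in the screen has no closed form, and the coefficient values and feature names below are hypothetical.

```python
def soft_threshold(value, lam):
    """Shrink a coefficient toward zero by lam and set it to zero if it is within lam of zero."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


# Hypothetical unpenalized coefficients for four placeholder features.
unpenalized = {"f1": 2.0, "f2": -0.8, "f3": 0.3, "f4": -0.1}


def selected(lam):
    """Names of the features whose coefficient survives the penalty lam."""
    return [name for name, value in unpenalized.items() if soft_threshold(value, lam) != 0.0]


# Larger penalties leave fewer features in the model.
path = {lam: selected(lam) for lam in (0.0, 0.5, 1.5)}
```

Cross-validation then amounts to evaluating held-out performance at each candidate λ along such a path and choosing among them, so a single parameter fixes both the ranking model and the feature set.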

Second, our method provides integrated feature selection rather than dimensionality reduction such as principal component analysis or factor analysis. The inherent objective of computational methods for the analysis of HC screens is to generate hypotheses for follow-up experiments from primary HC data. It is essential to reduce the number of screened phenotypic readouts and the number of time points without losing essential biological information, to save experimentalists the effort of rescreening all of them. Our method efficiently selects the most predictive phenotypic readouts at the most predictive time points, thereby vastly simplifying confirmatory experiments. Hence, we believe that M-RAM will find more widespread adoption than other published multivariate approaches.

Footnotes

Supplementary material for this article is available on the Journal of Biomolecular Screening Web site.

Declaration of Conflicting Interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This project was supported by National Institutes of Health (NIH) grants U54-CA112967, R21-NS063917, and R01-ES015339 to M.B.Y., and a pilot grant from the Center for Environmental Health Sciences NIH grant P30-ES002109. J.R. was supported by the International Fulbright Science and Technology Award, the Howard Hughes Medical Institute International Student Research Fellowship, the Hugh Hampton Young Memorial Fund Fellowship, and a David H. Koch Graduate Fellowship. Y.D. was supported by a postdoctoral fellowship from the Mazumdar-Shaw International Oncology Fellows Program at Koch Institute for Integrative Cancer Research at MIT. K.K. and T.E. were supported by Marshall Plan Scholarships.

References

1. Dürr; Duval; Nichols; et al. Robust Hit Identification by Quality Assurance and Multivariate Data Analysis of a High-Content, Cell-Based Assay. J. Biomol. Screen. 2007, 12, 1042–1049.
2. Liberali; Snijder; Pelkmans. Single-Cell and Multivariate Approaches in Genetic Perturbation Screens. Nat. Rev. Genet. 2014, 16, 18–32.
3. Collinet; Stöter; Bradshaw, C. R.; et al. Systems Survey of Endocytosis by Multiparametric Image Analysis. Nature 2010, 464, 243–249.
4. Bakal; Aach; Church; et al. Quantitative Morphological Signatures Define Local Signaling Networks Regulating Cell Morphology. Science 2007, 316, 1753–1756.
5. Nir; Bakal; Perrimon; et al. Inference of RhoGAP/GTPase Regulation Using Single-Cell Morphological Data from a Combinatorial RNAi Screen. Genome Res. 2010, 20, 372–380.
6. Loo, L.-H.; Wu, L. F.; Altschuler, S. J. Image-Based Multivariate Profiling of Drug Responses from Single Cells. Nat. Methods 2007, 4, 445–453.
7. Yin; Sadok; Sailem; et al. A Screen for Morphological Complexity Identifies Regulators of Switch-Like Transitions between Discrete Cell Shapes. Nat. Cell Biol. 2013, 15, 860–871.
8. Zhang; Boutros. A Novel Phenotypic Dissimilarity Method for Image-Based High-Throughput Screens. BMC Bioinformatics 2013, 14, 1–9.
9. Singh, D. K.; Ku, C.-J.; Wichaidit; et al. Patterns of Basal Signaling Heterogeneity Can Distinguish Cellular Populations with Different Drug Sensitivities. Mol. Syst. Biol. 2010, 6, 1–10.
10. Fuchs; Pau; Kranz; et al. Clustering Phenotype Populations by Genome-Wide RNAi and Multiparametric Imaging. Mol. Syst. Biol. 2010, 6, 370.
11. Chia; Goh; Racine; et al. RNAi Screening Reveals a Large Signaling Network Controlling the Golgi Apparatus in Human Cells. Mol. Syst. Biol. 2012, 8, 1–20.
12. Snijder; Sacher; Rämö; et al. Single-Cell Analysis of Population Context Advances RNAi Screening at Multiple Levels. Mol. Syst. Biol. 2012, 8, 1–18.
13. Singh; Carpenter, A. E.; Genovesio. Increasing the Content of High-Content Screening: An Overview. J. Biomol. Screen. 2014, 19, 640–650.
14. Young, D. W.; Bender; Hoyt; et al. Integrating High-Content Screening and Ligand-Target Prediction to Identify Mechanism of Action. Nat. Chem. Biol. 2008, 4, 59–68.
15. Floyd, S. R.; Pacold, M. E.; Huang; et al. The Bromodomain Protein Brd4 Insulates Chromatin from DNA Damage Signalling. Nature 2013, 498, 246–250.
16. Tibshirani. Regression Shrinkage and Selection via the Lasso. J. R. Stat. Soc. Ser. B Stat. Methodol. 1996, 58, 267–288.
17. Luo; Cheung, H. W.; Subramanian; et al. Highly Parallel Identification of Essential Genes in Cancer Cells. Proc. Natl. Acad. Sci. U.S.A. 2008, 105, 20380–20385.
18. Subramanian; Tamayo; Mootha, V. K.; et al. Gene Set Enrichment Analysis: A Knowledge-Based Approach for Interpreting Genome-Wide Expression Profiles. Proc. Natl. Acad. Sci. U.S.A. 2005, 102, 15545–15550.
19. Tuncbag; McCallum; Huang, S.-S. C.; et al. SteinerNet: A Web Server for Integrating “Omic” Data to Discover Hidden Components of Response Pathways. Nucleic Acids Res. 2012, 40, W505–W509.
20. Sancar; Lindsey-Boltz, L. A.; Unsal-Kaçmaz; et al. Molecular Mechanisms of Mammalian DNA Repair and the DNA Damage Checkpoints. Annu. Rev. Biochem. 2004, 73, 39–85.
21. Jones, T. R.; Carpenter, A. E.; Lamprecht, M. R.; et al. Scoring Diverse Cellular Morphologies in Image-Based Screens with Iterative Feedback and Machine Learning. Proc. Natl. Acad. Sci. U.S.A. 2009, 106, 1826–1831.
22. Kalev; Simicek; Vazquez; et al. Loss of PPP2R2A Inhibits Homologous Recombination DNA Repair and Predicts Tumor Sensitivity to PARP Inhibition. Cancer Res. 2012, 72, 6414–6424.
23. Huang, S.-S. C.; Fraenkel. Integrating Proteomic, Transcriptional, and Interactome Data Reveals Hidden Components of Signaling and Regulatory Networks. Sci. Signal. 2009, 2, ra40.
24. Obenauer, J. C.; Cantley, L. C.; Yaffe, M. B. Scansite 2.0: Proteome-Wide Prediction of Cell Signaling Interactions Using Short Sequence Motifs. Nucleic Acids Res. 2003, 31, 3635–3641.
25. Harper, J. W.; Elledge, S. J. The DNA Damage Response: Ten Years After. Mol. Cell 2007, 28, 739–745.
26. Reinhardt, H. C.; Yaffe, M. B. Phospho-Ser/Thr-Binding Domains: Navigating the Cell Cycle and DNA Damage Response. Nat. Rev. Mol. Cell Biol. 2013, 14, 563–580.
27. Franceschini; Szklarczyk; Frankild; et al. STRING v9.1: Protein-Protein Interaction Networks, with Increased Coverage and Integration. Nucleic Acids Res. 2013, 41, D808–D815.
28. Krzywinski; Birol; Jones, S. J.; et al. Hive Plots-Rational Approach to Visualizing Networks. Brief. Bioinform. 2012, 13, 627–644.
29. Zhang; Morris, G. Z.; Beebe, S. J. Characterization of the cAMP-Dependent Protein Kinase Catalytic Subunit Cgamma Expressed and Purified from sf9 Cells. Protein Expr. Purif. 2004, 35, 156–159.
30. Cho, E.-A.; Kim, E.-J.; Kwak, S.-J.; et al. cAMP Signaling Inhibits Radiation-Induced ATM Phosphorylation Leading to the Augmentation of Apoptosis in Human Lung Cancer Cells. Mol. Cancer 2014, 13, 1–15.
31. Jarrett, S. G.; Horrell, E. M. W.; Christian, P. A.; et al. PKA-Mediated Phosphorylation of ATR Promotes Recruitment of XPA to UV-Induced DNA Damage. Mol. Cell 2014, 54, 999–1011.
32. Rahman, M. A.; Salajegheh; Smith, R. A.; et al. BRAF Inhibitors: From the Laboratory to Clinical Trials. Crit. Rev. Oncol. Hematol. 2014, 90, 220–232.
