Functional Genomics for Target Identification

Abstract

A major objective of biomedical research is to define how genes and their products function together in physiology and disease. Functional genomics is the study of how genes and pathways contribute to disease. Functional genomics screens allow researchers to systematically perturb large numbers of genes or proteins, revealing cellular phenotypes from which gene function can be inferred.

The first functional genomics screens were run in simple organisms, such as yeast, and these efforts have revealed basic principles of eukaryotic biology and cell homeostasis.¹ However, despite the enormous value of functional genomics efforts in model eukaryotes, human pathology cannot be fully modeled in single-cell organisms such as yeast. Therefore, in order to realize the full potential of functional genomics, particularly in the setting of human disease, there has been a continuous drive to employ more advanced and predictive models in functional genomics screens. This trend is exemplified by several reports in this special issue, where more translatable cell and animal models are reviewed and demonstrated in functional genomics screens.

From a technological perspective, before the advent of CRISPR, three main technologies enabled functional genomics efforts in mammalian systems: (1) random mutagenesis using chemicals, transposons, or retroviruses; (2) RNA interference (RNAi) technology in its different iterations (siRNA, shRNA, etc.); and (3) cDNA overexpression screens that allowed researchers to associate the effect of the overexpression of any given gene with a cellular phenotype. Together, these three approaches have revealed basic biological principles that are relevant for enhanced understanding of human disease mechanisms, but their impact has been restricted by issues such as off-target effects and incomplete penetrance.

Based on the perturbation modality of choice, functional genomics screens can be further classified as (1) genetic (which use approaches such as gRNA, siRNA, or cDNA) or (2) nongenetic (which use external agents such as small molecules or proteins). Although the latter strictly belong to the category of “functional proteomics” screens, for simplicity they are often grouped under the umbrella of “functional genomics.”

While RNAi-based genetic screens are still in use, the majority of genetic functional genomics screens are now run using CRISPR/Cas9 technology in its various forms, allowing gene knockout (CRISPRn), gene downregulation (CRISPRi), and gene upregulation (CRISPRa) in mammalian cells at scale.² Phenotypic screening using small-molecule libraries requires significant investment in target deconvolution experiments; consequently, functional genomics efforts employing protein libraries are gaining traction due to the relatively small size of these libraries compared with other screening libraries and the lack of cell manipulation needed to enable screening. Notably, in this special issue we have collected reports on both genetic and nongenetic functional genomics screens, illustrating the great variety of tools available to researchers in this field.

Genetic functional genomics screens can be subdivided into pooled and arrayed screens. In pooled screens (almost exclusively performed using lentivirus-based genetic libraries), genetic agents (gRNA or shRNA) are synthesized and cloned as a pool and then delivered to cells en masse by infection, that is, lentiviruses targeting all genes are applied together in one flask. Deconvoluting screen outputs to correlate gene and phenotype requires next-generation sequencing (NGS), which limits pooled screening to assays in which the desired phenotype is selectable. These are primarily growth-related phenotypes, where the depletion or enrichment of a given genetic agent in the population is used as the method of selection.
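The depletion or enrichment readout described above can be illustrated as a per-guide log2 fold change between read counts taken before and after selection. This is a minimal sketch under stated assumptions, not any specific published pipeline: the guide names, read counts, counts-per-million normalization, and pseudocount of 1 are all hypothetical choices made for illustration.

```python
import math

def log2_fold_changes(before, after, pseudocount=1.0):
    """Per-guide log2 fold change of library-size-normalized read counts."""
    total_before = sum(before.values())
    total_after = sum(after.values())
    lfc = {}
    for guide, count_before in before.items():
        count_after = after.get(guide, 0)
        # Scale to counts per million so the two samples are comparable.
        cpm_before = count_before / total_before * 1e6
        cpm_after = count_after / total_after * 1e6
        # The pseudocount avoids division by zero for guides that drop out.
        lfc[guide] = math.log2((cpm_after + pseudocount) / (cpm_before + pseudocount))
    return lfc

# Hypothetical NGS read counts for three guides (illustrative only).
before = {"gRNA_A": 500, "gRNA_B": 500, "gRNA_C": 500}
after = {"gRNA_A": 50, "gRNA_B": 500, "gRNA_C": 950}

lfc = log2_fold_changes(before, after)
# List guides from most depleted to most enriched.
for guide in sorted(lfc, key=lfc.get):
    print(guide, round(lfc[guide], 2))
```

In this toy example gRNA_A is depleted, gRNA_B is unchanged, and gRNA_C is enriched. A real analysis would typically combine several guides per gene and account for replicate variability.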

In contrast, in arrayed screens the agents of choice (gRNA and siRNA) are already distributed into individual wells such that in a single well only a single gene will be targeted. Libraries are available in numerous formats, including chemically synthesized (synthetic), plasmid-based in vitro transcribed (IVT), and arrayed lentivirus. Synthetic libraries are currently gaining traction in the field due to their ease of use, reproducibility, high efficiency, and compatibility with the high-throughput screening platforms that are most commonly used in industry. Most critically, the use of synthetic libraries is enabled and made more reproducible by commercial manufacture, improvement of synthesis methods, and provision of genome-wide libraries as off-the-shelf products. However, it is likely that arrayed lentivirus libraries will provide particular benefit when arrayed screening moves into harder-to-transfect models such as primary cells, where efficient lipid delivery of synthetic RNA could be challenging.

A key advantage of arrayed screens is that complex phenotypes can be measured, allowing the full power of high-content microscopy to be harnessed. These screens are, however, more expensive and labor-intensive and require substantial investment in automation infrastructure. By comparison, pooled screens are relatively inexpensive and can generally be performed by a single person without specialized equipment. The use of pooled screens for target discovery is well established, with institutes around the globe having delivered genome-wide screens across several hundred cell lines. Widespread recognition of the value of these screens has driven several important and innovative collaborative initiatives, for example, the recently established Functional Genomics Centre in Cambridge, United Kingdom, a partnership between AstraZeneca and Cancer Research UK. A constraint on the application of pooled screens is that they have predominantly been limited to cell proliferation and cell death phenotypes, or to fluorescent reporters coupled to fluorescence-activated cell sorting.

Despite the proven power of functional genomics screens, their interpretation can be a challenge. A number of groups are now attempting to address this gap by creating computational pipelines to (1) assess library quality, (2) perform quality control (QC) and analysis of screening outputs using various machine learning approaches, and (3) perform hit identification and ranking with the aim of discovering novel targets.

Additionally, the majority of functional genomics studies run to date interrogate mainly coding regions of the human genome, despite increasing evidence indicating a driver role of noncoding regions in predisposition to common human diseases. This highlights the need for the development of functional genomics approaches and reagents that are capable of interrogating the involvement of noncoding genomic regions in disease mechanisms, opening a whole new paradigm for researchers in this area.

It is clear that functional genomics holds great promise not only for dissecting the molecular mechanisms of human diseases but also for reducing the very high attrition rate in the drug discovery process, recently highlighted in a number of review articles.³ The long-term success of functional genomics platforms in addressing these challenges will be driven by advancements in three strategic pillars that form the foundation of the functional genomics discipline: (1) development of more translatable models of disease, (2) creation of validated screening libraries and technologies to perturb gene and protein function, and (3) establishment of “end-to-end” computational pipelines that allow the quantitative analysis of cellular phenotypes resulting from genetic or nongenetic perturbations.

Translatable Models

The use of the “right” biological models is a cornerstone of the functional genomics discipline. The value generated by functional genomics efforts (measured as the number of newly identified drug targets with a reduced clinical attrition rate) will be directly proportional to the translatability of the cellular models employed in the screening campaigns in which they were discovered. Consequently, there is a strong drive among researchers in this field to use cellular models that can closely recapitulate the disease-relevant phenotypes (e.g., co-culture systems, 3D, organoids, and iPSC-derived models), rather than “easy-to-screen” 2D cell lines with limited physiological relevance. A key challenge here is the balance that needs to be struck between the relevance of the models used in functional genomics campaigns and the feasibility and robustness of running such models with the automation and workflows currently employed in the high-throughput screening environment.

The manuscript from Gee et al.⁴ illustrates efforts in this regard and describes the development of a co-culture system utilizing primary T cells and tumor cells to identify immuno-oncology targets important for tumor cell evasion of immune cell killing. The authors demonstrate that this co-culture system is fit for purpose in arrayed genetic screening using a focused synthetic gRNA library. Their approach identifies the gene ICAM1, whose knockout enhances resistance to T-cell killing, highlighting the promise of tumor/immune co-culture systems for the identification of important mechanisms and demonstrating the value of further investigation of different co-culture models that can provide more robust phenotypic output.

Turner et al.⁵ describe the development of a high-content assay to study kidney fibrosis using primary human kidney fibroblasts. To identify novel targets that can modulate kidney fibrosis, the authors performed genome-wide arrayed genetic screening using lentiviral gRNA libraries. While several reports have shown successful utilization of CRISPR technology in primary fibroblasts from different origins,⁶ to our knowledge this work describes the first whole genome-wide arrayed CRISPR screen in such a model.

The perspective from Rubbini et al.⁷ highlights how the zebrafish model might represent an attractive solution for functional genomics studies in in vivo models that allow the retention of some level of throughput. One of the key advantages of the use of zebrafish in target discovery is the availability of well-characterized phenotypic readouts related to several human diseases (including neurological disorders, inflammation, cancer, and cardiovascular disease) at a fraction of the cost compared with other animal models. In addition, the relatively high degree of conservation of the genome and physiology with humans, as well as the compatibility with CRISPR technology to rapidly generate gene knockouts at medium or high throughput, potentially offers an alternative way forward to accelerate discovery of targets with a reduced clinical attrition rate.

Screening Libraries and Technologies

While small-molecule and siRNA libraries have traditionally been used in phenotypic screening campaigns to identify novel targets, gRNA libraries have more recently attracted considerable attention in the functional genomics field. There are some key advantages in the use of gRNA libraries; for example, (1) CRISPR technology is less affected by off-target effects, which are commonly cited as a downside of RNAi technology,⁸ and (2) CRISPR screening provides a relatively immediate association between gene perturbation and phenotypic readout without needing challenging and resource-intensive target deconvolution efforts, a hallmark of small-molecule-based phenotypic screening.

The paper from Ross-Thriepland et al.⁹ reports the first end-to-end functional genomics CRISPR arrayed screen using synthetic gRNA libraries aimed at identifying novel targets that modulate productive delivery of lipid nanoparticle (LNP) encapsulated mRNA. The authors screened the druggable genome using synthetic gRNA libraries and validated 44 genes that modulated the productive delivery of LNP-mRNA. They applied pathway analysis to show that these genes clustered into families involved in host cell transcription, protein ubiquitination, and intracellular trafficking. Focusing on the latter and using orthogonal genetic perturbation and small-molecule inhibition, the authors elegantly validated two genes (UDP-glucose ceramide glucosyltransferase [UGCG] and V-type proton ATPase [ATP6V]) that significantly modulate the productive delivery of LNP-mRNA. These findings show for the first time how the combination of functional genomics and CRISPR screening technology can increase our understanding of mechanisms modulating productive LNP-mRNA delivery.

O’Shea et al.¹⁰ describe the establishment of a lentiviral-based arrayed CRISPR kinome screening platform capable of identifying a number of canonical modulators of the NF-κB signaling pathway. After stochastic rank aggregation of primary hits, the authors performed follow-up on a 152-gene subset. The authors identified the majority of kinase genes with known regulatory roles in TNF-α-mediated NF-κB signaling at a higher success rate than observed with previous RNAi-based studies. As with the work from Turner et al.,⁵ these data demonstrate that lentiviral-based arrayed screening reagents can be supplied at scale and with sufficient consistency to obtain meaningful screening results.
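To illustrate the general idea of aggregating ranks across replicate screens before selecting genes for follow-up, the sketch below uses a plain mean-rank aggregation. This is explicitly not the rank aggregation method used by the authors, whose details are not given here; it substitutes the simplest possible scheme, and the gene names and scores are hypothetical.

```python
def mean_rank_aggregate(score_lists):
    """Rank genes within each replicate, then order them by average rank.

    score_lists: list of dicts mapping gene -> score (higher = stronger effect).
    All replicates are assumed to contain the same genes.
    """
    genes = list(score_lists[0])
    rank_sums = {gene: 0 for gene in genes}
    for scores in score_lists:
        # Rank 1 is the strongest effect within this replicate.
        ordered = sorted(genes, key=lambda g: scores[g], reverse=True)
        for rank, gene in enumerate(ordered, start=1):
            rank_sums[gene] += rank
    n_replicates = len(score_lists)
    return sorted(genes, key=lambda g: rank_sums[g] / n_replicates)

# Hypothetical effect scores from three replicate screens.
replicates = [
    {"KINASE_1": 9.1, "KINASE_2": 2.0, "KINASE_3": 5.5},
    {"KINASE_1": 7.4, "KINASE_2": 3.1, "KINASE_3": 8.0},
    {"KINASE_1": 8.0, "KINASE_2": 1.0, "KINASE_3": 6.0},
]

aggregated = mean_rank_aggregate(replicates)
print(aggregated)
```

Averaging ranks rather than raw scores makes the ordering less sensitive to differences in scale between replicates, which is the motivation behind rank aggregation in general.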

The paper from Ding et al.¹¹ reports a phenotypic screen using a small-molecule library comprising compounds that are annotated as pharmacological regulators of target genes that were identified in a previous siRNA screen to significantly affect the replication of human rhinovirus. Two hundred seventy small-molecule compounds selected for interaction with 122 target gene hits were screened in human bronchial epithelial cells. This led to the identification of Fms-related tyrosine kinase 4 (FLT4) as a novel target regulating rhinovirus replication. This elegant study demonstrates that a combination of siRNA and small-molecule compound screening can be an effective and straightforward approach for the identification of targets that regulate rhinovirus replication. This is particularly thought-provoking in the context of ongoing efforts to identify targets that can prevent infection by SARS-CoV-2, the causative agent of the COVID-19 pandemic.

An alternative functional genomics approach to the use of genetic or small-molecule libraries is the use of secretome libraries for target discovery, reviewed here by Ding et al.¹² While a secretome library can only capture biology orchestrated from the plasma membrane, this modality of screening offers a number of benefits over other libraries, including (1) the small size of the library (typically a few thousand active proteins), (2) the lack of cellular manipulation typical of functional genomics screens employing various CRISPR libraries, and (3) shorter deconvolution (when the cognate receptor is known). Ideally, a secretome library should be used as a complementary approach alongside other genetic and nongenetic libraries.

Finally, as alluded to earlier, the vast majority of commercially available genetic libraries for functional genomics screens target coding regions of the human genome. As described in the perspective from Papanicolaou and Bonetti,¹³ increasing evidence indicates that genetic variants associated with susceptibility to common diseases are often located in noncoding regions of the genome, such as tissue-specific enhancers or long noncoding RNAs. This highlights a clear limitation for researchers in this field wishing to interrogate large regions of the human genome, namely the lack of validated reagents. To circumvent this gap, the thought-provoking frameshift proposed by the authors deserves special mention: to use recently described genome-wide “omics” technologies to study DNA-DNA and RNA-DNA interactions to inform the development of genetic libraries capable of targeting noncoding regions of the human genome.

Computational Pipeline and Data Analytics

Functional genomics has a proven power to annotate gene function at scale, which promises to accelerate the discovery of novel therapeutic targets and the elucidation of new biology, yet the interpretation of such screens can be a challenge. This highlights the need in functional genomics platforms for end-to-end computational pipelines encompassing (1) assessment of library quality, (2) QC and analysis of screening outputs, and (3) hit identification and ranking.
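The QC and hit-ranking stages of such a pipeline can be sketched with two statistics that are widely used in plate-based screening: the Z'-factor for assay quality and the robust z-score for hit calling. Neither statistic is named in the text above, so their use here is an assumption, as are the control values, gene names, and the |z| ≥ 3 threshold.

```python
import statistics

def z_prime(positive, negative):
    """Z'-factor: 1 - 3 * (sd_pos + sd_neg) / |mean_pos - mean_neg|."""
    spread = 3 * (statistics.stdev(positive) + statistics.stdev(negative))
    window = abs(statistics.mean(positive) - statistics.mean(negative))
    return 1 - spread / window

def robust_z_scores(readouts):
    """Robust z-score per gene: (x - median) / (1.4826 * MAD)."""
    values = list(readouts.values())
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    if mad == 0:
        raise ValueError("MAD is zero; robust z-scores are undefined")
    return {gene: (v - med) / (1.4826 * mad) for gene, v in readouts.items()}

def rank_hits(scores, threshold=3.0):
    """Genes whose |z| meets the threshold, strongest effect first."""
    hits = [gene for gene, z in scores.items() if abs(z) >= threshold]
    return sorted(hits, key=lambda gene: abs(scores[gene]), reverse=True)

# Hypothetical control wells and per-gene readouts from one plate.
positive_controls = [95.0, 100.0, 105.0]
negative_controls = [8.0, 10.0, 12.0]
readouts = {
    "GENE_A": 50.0, "GENE_B": 52.0, "GENE_C": 48.0,
    "GENE_D": 51.0, "GENE_E": 49.0, "GENE_F": 20.0,
}

plate_quality = z_prime(positive_controls, negative_controls)
scores = robust_z_scores(readouts)
hits = rank_hits(scores)
print(round(plate_quality, 3), hits)
```

The median and MAD are used instead of the mean and standard deviation so that a handful of strong hits does not inflate the spread estimate and mask itself.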

The paper from Guerriero et al.¹⁴ describes a computational pipeline to support the analysis and interpretation of internal arrayed genetic screens using gRNA libraries. This end-to-end pipeline integrates the evaluation of the quality of guide RNA libraries, image analysis, the evaluation of assay result quality, data processing, hit identification, ranking, visualization, and biological interpretation. Of note, this pipeline also includes a deep learning approach to perform segmentation of nuclear and cytosolic fractions from label-free phase-contrast images, which are captured alongside the screen-specific fluorescent markers, enabling researchers to extract more useful information from their image-based functional genomics screens.

Omta et al.¹⁵ describe an approach utilizing unsupervised analysis followed by a supervised analysis carried out on a previously analyzed data set from an image-based genetic screen using siRNA libraries. The authors show that the combination of unsupervised and supervised data analytics methods has the potential to enhance the ability to identify new knowledge in functional genomics screens when compared with the use of unsupervised methods alone.
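The general pattern of an unsupervised step followed by a supervised step can be illustrated with a toy example: k-means clustering groups wells by phenotype without labels, and the resulting clusters then define a nearest-centroid classifier for new wells. This is a generic sketch and not the method of Omta et al.; the two-feature profiles, the choice of k-means, and the nearest-centroid classifier are all assumptions made for illustration.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iterations=20):
    """Plain k-means; centroids start at the first k points (deterministic)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels, centroids

def classify(point, centroids):
    """Supervised step: assign a new well to the nearest learned centroid."""
    return min(range(len(centroids)), key=lambda c: sq_dist(point, centroids[c]))

# Hypothetical per-well image features (e.g., two normalized measurements).
profiles = [(1.0, 1.1), (5.0, 5.2), (0.9, 1.0), (5.1, 4.9), (1.1, 0.9), (4.9, 5.0)]

labels, centroids = kmeans(profiles, k=2)
new_well_class = classify((4.8, 5.1), centroids)
print(labels, new_well_class)
```

In practice the unsupervised step would be used to propose phenotype classes for expert review, and the supervised step would be trained on the curated labels rather than directly on raw cluster assignments.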

Concluding Remarks

The exciting and innovative science showcased in this special issue clearly demonstrates the potential for functional genomics to transform our understanding of biological pathways and mechanisms of human disease. These contributions also highlight the multidisciplinary nature of the field, and how experts from different areas of science and technology have driven recent successes. Additionally, we believe that this special issue demonstrates how biology, particularly the evolution of more predictive biological models of disease, remains central to the current and future success of functional genomics.

Footnotes

Declaration of Conflicting Interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The authors received no financial support for the research, authorship, and/or publication of this article.

References

1. Costanzo; Vander Sluis; Koch, E. N.; et al. A Global Genetic Interaction Network Maps a Wiring Diagram of Cellular Function. Science 2016, 353.

2. Kampmann. CRISPRi and CRISPRa Screens in Mammalian Cells for Precision Biology and Medicine. ACS Chem. Biol. 2018, 13, 406–416.

3. Morgan; Brown, D. G.; Lennard; et al. Impact of a Five-Dimensional Framework on R&D Productivity at AstraZeneca. Nat. Rev. Drug Discov. 2018, 17, 167–181.

4. Gee; Nelson; Bornot; et al. Developing an Arrayed CRISPR-Cas9 Co-Culture Screen for Immuno-Oncology Target ID. SLAS Discov. 2020, 25, 581–590. DOI: 10.1177/2472555220916457.

5. Turner; Golz; Wollnik; et al. A Whole Genome-Wide Arrayed CRISPR Screen in Primary Organ Fibroblasts to Identify Regulators of Kidney Fibrosis. SLAS Discov. 2020, 25, 591–604.

6. Weigle; Martin; Voegtle; et al. Primary Cell-Based Phenotypic Assays to Pharmacologically and Genetically Study Fibrotic Diseases In Vitro. J. Biol. Methods 2019, 6, e115.

7. Rubbini; Cornet; Terriente; et al. CRISPR Meets Zebrafish: Accelerating the Discovery of New Therapeutic Targets. SLAS Discov. 2020, 25, 552–567.

8. Evers; Jastrzebski; Heijmans, J. P.; et al. CRISPR Knockout Screening Outperforms shRNA and CRISPRi in Identifying Essential Genes. Nat. Biotechnol. 2016, 34, 631–633.

9. Ross-Thriepland; Bornot; Butler; et al. Arrayed CRISPR Screening Identifies Novel Targets That Enhance the Productive Delivery of mRNA by MC3-Based Lipid Nanoparticles. SLAS Discov. 2020, 25, 605–617.

10. O’Shea; Wildenhain; Leveridge; et al. A Novel Screening Approach for the Dissection of Cellular Regulatory Networks of NF-κB Using Arrayed CRISPR gRNA Libraries. SLAS Discov. 2020, 25, 618–633.

11. Ding; Tyrchan; Bäck; et al. Combined siRNA and Small-Molecule Phenotypic Screening Identifies Targets Regulating Rhinovirus Replication in Primary Human Bronchial Epithelial Cells. SLAS Discov. 2020, 25, 634–645. DOI: 10.1177/2472555220909726.

12. Ding; Tegel; Hober; et al. Secretome-Based Screening in Target Discovery. SLAS Discov. 2020, 25, 535–551.

13. Papanicolaou; Bonetti. The New Frontier of Functional Genomics: From Chromatin Architecture and Noncoding RNAs to Therapeutic Targets. SLAS Discov. 2020, 25, 568–580.

14. Guerriero, M. L.; Corrigan; Bornot; et al. Delivering Robust Candidates to the Drug Pipeline through Computational Analysis of Arrayed CRISPR Screens. SLAS Discov. 2020, 25, 646–654.

15. Omta, W. A.; Van Heesbeen, R. G.; Shen; et al. Combining Supervised and Unsupervised Machine Learning Methods for Phenotypic Functional Genomics Screening. SLAS Discov. 2020, 25, 655–664.