Targeting Human Long Noncoding Transcripts by Endoribonuclease-Prepared siRNAs

Abstract

Broad sequencing enterprises such as the FANTOM or ENCODE projects have substantially extended our knowledge of the human transcriptome. They have revealed that a large portion of genomic DNA is actively transcribed and have identified a plethora of novel transcripts. Many newly identified transcripts belong to the class of long noncoding RNAs (lncRNAs), which range from a few hundred bases to multiple kilobases in length and harbor no protein-coding potential. Although the biological activity of some lncRNAs is understood, the functions of most lncRNAs remain elusive. Tools that allow rapid and cost-effective access to functional data of lncRNAs are therefore essential. Here, we describe the construction and validation of an endoribonuclease-prepared siRNA (esiRNA) library designed to target 1779 individual human lncRNAs by RNA interference. We present a compendium of lncRNA expression data for 11 human cancer cell lines. Furthermore, we show that the resource is suitable for combined knockdown and localization analysis. We discuss challenges in sequence annotation of lncRNAs with respect to their often low and cell type–specific expression and specify esiRNAs that are suitable for targeting lncRNAs in commonly used human cell lines.

Keywords

lncRNA, noncoding RNA, esiRNA, RNAi, localization

Introduction

The full sequence of the human genome has been available for more than a decade,¹ and we have since learned a great deal about the genetic information stored in almost every cell of the body. In contrast, our understanding of the human transcriptome is much less complete. Recent advances in sequencing technologies have paved the way for a more comprehensive catalog of transcribed genomic loci. Major contributions came from large sequencing projects such as FANTOM,² ENCODE,³ or GENCODE,^4,5 which revealed that about 75% of genomic DNA is actively transcribed⁶ but less than 2% of the RNAs contain exons with protein-coding potential.³ The most frequently encoded transcripts, with a total count of 113,513,⁷ belong to the class of long noncoding RNAs (lncRNAs), which range from a few hundred bases up to multiple kilobases in length.^3,5 Thus far, only a few lncRNAs have been studied in great detail, but their functional relevance in the regulation of diverse biological processes including chromatin modification,⁸ imprinting,⁹ gene expression,¹⁰ and splicing¹¹ has been demonstrated. For example, the lncRNA Xist mediates global changes in the chromatin structure of the X-chromosome, leading to its inactivation for dosage compensation in females.⁸ In genomic imprinting, maternally and paternally inherited alleles are differentially regulated, a process that is partially mediated by the lncRNA AIR.⁹ AIR recruits the H3 lysine 9 methylase G9a in cis, leading to transcriptional silencing of the respective genomic loci in the paternal allele.⁹ Long noncoding transcripts have also been found to directly regulate gene transcription. The lncRNA PANDA was shown to associate with the transcription factor NF-YA, displacing it from its promoter sites and leading to transcriptional shutdown of the respective genes.¹⁰ In addition, lncRNAs have been implicated in the regulation of alternative splicing.
MALAT1 interacts with serine/arginine splicing factors and influences their distribution in nuclear speckles.¹¹ Furthermore, lncRNAs have been shown to play an important role in the onset of diverse human diseases such as cancer^12,13 and Alzheimer’s disease.¹⁴ These and other features make lncRNAs a relevant and interesting subject for future functional studies. Biological characterization of lncRNAs, however, comes with substantial technical challenges. In contrast to coding transcripts, noncoding RNAs are more difficult to annotate because of their often reduced and tissue-specific expression^5,15,16 and low phylogenetic conservation.⁵ These challenges influence the methods applied for their functional dissection. Therefore, the available technologies for the systematic functional characterization of lncRNAs are limited.

Endoribonuclease-prepared siRNA (esiRNA) technology has been widely used as an efficient tool for mediating RNA interference (RNAi) for coding transcripts in mammalian cells, ranging from single transcript targeting to genome-wide loss-of-function (LOF) screening.^17–26 Furthermore, esiRNAs have been shown to be efficient in various cell types, including diverse human cancer cell lines,^17–24 mouse embryonic stem cells,²⁵ and cells in the developing mouse brain.²⁶ esiRNAs are pools of several hundreds of individual siRNAs generated by enzymatic digestion of a typically 300- to 600-base-pairs-long double-stranded RNA (dsRNA) derived from a single target transcript.²⁷ Pooling of siRNAs has been shown to increase on-target specificity by decreasing off-target effects,^27,28 because the siRNAs that make up the pool exist in comparable amounts and have the same on-target competence. Therefore, the silencing capacity for the intended target is additive, whereas off-target effects are diluted out. Genome-scale esiRNA libraries for coding transcripts of Mus musculus and Homo sapiens origins have been used intensively for LOF screening^{17–22,24,25} and are commercially available. Furthermore, an esiRNA library for targeting mouse lncRNAs has been successfully used in an LOF screen identifying genes implicated in the maintenance of pluripotency in embryonal stem cells.²⁹ We reasoned that the same advantages of the esiRNA technology should apply for silencing human lncRNAs. The generation of a library for the systematic screening of human lncRNAs by RNAi and localization profiling may therefore be a suitable method to perform a small-, medium-, or large-scale investigation of lncRNA function. Here, we present a first-generation human esiRNA library targeting 1779 lncRNAs. 
We demonstrate that esiRNAs designed against specific lncRNAs effectively deplete their targets, and fluorescence in situ hybridization (FISH) probes generated from the same source can detect the subcellular localization of long noncoding transcripts in fixed human cells. For screening, we recommend selecting target lncRNAs that are expressed in the chosen cell line to lower costs and labor. To facilitate screening, we provide the expression pattern of the targeted lncRNAs in 11 commonly used cell lines.

Materials and Methods

esiRNA Synthesis

esiRNAs were synthesized as described previously.³⁰ Briefly, we amplified the lncRNA transcript region of interest in a two-step PCR using transcript-specific primers with sequence tags added to their termini (left tag: 5′-TGACACTATAGAAGTG-3′, right tag: 5′-CTCACTATAGGGAGA-3′). The sequences of all primers are given in Supplementary Table S1. As template, a cDNA mixture from different human cancer cell lines and human induced pluripotent stem cells was used. For reverse transcription, the enzyme Superscript III (Life Technologies, Carlsbad, CA) was applied together with either random hexamers or oligo-(dT)_16-20 (Life Technologies) primers (ratio 1:1). The product of the first PCR was used in a subsequent PCR reaction (primer left: 5′-GCTAATACGACTCACTATAGGGAGATGACACTATAGAAGTG-3′; right: 5′-GCTAATACGACTCACTATAGGGAGA-3′), which introduced full T7-promoter regions at both termini, allowing for bidirectional in vitro transcription and generation of long dsRNAs using T7 polymerase. Note that the left primer contains a part of the SP6-promoter (the left tag sequence). Therefore, the PCR products may also be applied for combined knockdown and localization analysis of noncoding RNAs (c-KLAN),²⁹ a method that allows the synthesis of single-stranded hybridization probes and esiRNAs from a single source. All PCR products were sequenced to verify their identity. The long dsRNAs were digested by RNase III and purified by anion-exchange chromatography as described previously.³⁰ The esiRNA sequences are given in Supplementary Table S1. For transfection, all esiRNAs were brought to the same concentration by diluting in TE-buffer and were arrayed in 96-well plates. Each esiRNA was assigned an identifier consisting of a prefix that indicates the species of origin (“HNC-”) and a unique five-digit number for each target gene. All esiRNAs for human lncRNAs are commercially available (www.eupheria.com). Alternative resources for targeting lncRNA are available from General Electric (www.ge.com) or Exiqon (www.exiqon.com).
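The nested primer design can be checked with a few lines of Python. The sketch below uses only the tag and primer sequences quoted in this section, written without hyphens; the function names are illustrative and not part of any published pipeline. It confirms that each second-round primer ends with the corresponding first-round tag, so that it can prime on first-round products, and reports the 5′ overhang that the second round adds.

```python
# Sequences as quoted in the text, written 5'->3' without hyphens.
LEFT_TAG = "TGACACTATAGAAGTG"
RIGHT_TAG = "CTCACTATAGGGAGA"
T7_LEFT_PRIMER = "GCTAATACGACTCACTATAGGGAGATGACACTATAGAAGTG"
T7_RIGHT_PRIMER = "GCTAATACGACTCACTATAGGGAGA"


def primes_on_tag(primer: str, tag: str) -> bool:
    """Return True if the 3' end of the primer is identical to the tag."""
    return primer.endswith(tag)


def added_overhang(primer: str, tag: str) -> str:
    """Return the 5' part of the primer that extends beyond the tag.

    Raises ValueError if the primer does not end with the tag.
    """
    if not primes_on_tag(primer, tag):
        raise ValueError("primer does not end with the given tag")
    return primer[: len(primer) - len(tag)]


if __name__ == "__main__":
    for name, primer, tag in [
        ("left", T7_LEFT_PRIMER, LEFT_TAG),
        ("right", T7_RIGHT_PRIMER, RIGHT_TAG),
    ]:
        print(name, primes_on_tag(primer, tag), added_overhang(primer, tag))
```

With these sequences, the overhang added by the left primer is identical to the complete right primer, which is consistent with the statement that the second PCR places T7-promoter sequence at both termini.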

Transcriptome Analysis

Analysis of gene expression was carried out using RNA-Seq data generated by the Cold Spring Harbor Laboratory as a part of the ENCODE project.⁶ Raw sequencing reads obtained from polyA+ cytosolic fractions of the 11 analyzed cell lines were downloaded from the UCSC repository and aligned to the human reference genome (hg19 assembly, February 2009) using STAR version 2.4.0e.³¹ The read alignment process was guided by a splice junction database constructed from a set of protein-coding transcripts obtained from the Ensembl database (release 74) combined with a set of noncoding transcripts from LNCipedia (www.lncipedia.org, release 3.0). Gene expression levels were estimated using an in-house developed application, which calculates fragments per kilobase of expressed exons per million mapped reads (FPKM values) in a manner similar to NEUMA.³² In our approach, to calculate an effective length of genes, instead of using simulated data, we used a pooled set of aligned RNA-Seq reads for assessing genome mappability. Further analyses of gene expression results and generation of plots were performed in R (version 3.1.2), with the aid of the “plyr” and “ggplot2” packages. Two-dimensional kernel density estimations were calculated using the “kde2d” function from the “MASS” package, with 500 grid points in each direction.
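For orientation, the FPKM measure can be written as a short function. The in-house application is not described in detail here, so the sketch below only shows the standard FPKM definition; the function name, its parameters, and the idea of passing a mappability-corrected effective length as a plain number are assumptions made for illustration.

```python
def fpkm(fragments: int, effective_length_bp: float, total_mapped_fragments: int) -> float:
    """Standard FPKM: fragments per kilobase of exon per million mapped fragments.

    effective_length_bp is assumed to be the exonic length of the gene after
    any correction (for example, for mappability) has been applied upstream.
    """
    if effective_length_bp <= 0:
        raise ValueError("effective length must be positive")
    if total_mapped_fragments <= 0:
        raise ValueError("total mapped fragments must be positive")
    # 1e9 combines the per-kilobase (1e3) and per-million (1e6) scaling factors.
    return fragments * 1e9 / (effective_length_bp * total_mapped_fragments)


if __name__ == "__main__":
    # Hypothetical gene: 500 fragments, 2 kb effective length, 25 million mapped fragments.
    print(fpkm(500, 2000, 25_000_000))
```

Because the effective length sits in the denominator, a gene with poorly mappable exons receives a shorter effective length and therefore a higher FPKM for the same fragment count.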

Quantification of lncRNAs

For quantitative PCR assays, HeLa cells were transfected in 12-well cell culture plates with 300 ng esiRNA using 4.2 µL oligofectamine (Life Technologies) per well. After 24 h incubation, the cells were harvested and total RNA was extracted using the RNeasy Kit (Qiagen, Venlo, the Netherlands) including an on-column DNase I digest (Qiagen) as given in the manufacturer’s protocol. Total RNA was reverse transcribed using SuperScript III reverse transcriptase (Life Technologies) and Oligo-(dT)_16-20 or random hexamer primers. Quantification of the targeted RNA was conducted using the ABsolute QPCR SYBR Green Kit (Thermo Scientific, Waltham, MA) according to the manufacturer’s protocol using a CFX96 Touch real-time PCR machine (Bio-Rad, Hercules, CA). The expression levels of either GAPDH or TBP transcripts were used as endogenous controls. All primer sequences are provided in Supplementary Table S2. For Nanostring³³ experiments, HeLa cells were transfected with esiRNAs as mentioned above. Cells were harvested and total RNA was extracted using the RNeasy Kit (Qiagen) including an on-column DNase I digest (Qiagen). Three micrograms of total RNA was hybridized for 22 h according to the manufacturer’s protocol with a custom-made probe set (Integrated DNA Technologies, Coralville, IA) composed of two adjacent single-stranded DNA oligomers of 60 to 85 nucleotides in length (Suppl. Table S3). The probes were attached to a unique fluorescent barcode composed of six segments, each labeled in one of four colors, or to a biotin for capturing. Barcodes were counted on an nCounter Digital Analyzer (Nanostring Technologies, Seattle, WA) according to the manufacturer’s protocol using the highest possible resolution. A total of six probes were included as technical positive and negative hybridization controls. Raw counts were compiled and analyzed using nSolver (Nanostring Technologies) and normalized to TBP expression. Transfection of a nontargeting control (Renilla Luciferase [RLUC] esiRNA) served as a reference for calculating knock-down levels.
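The knock-down calculation described above can be sketched in Python. Normalization to an endogenous control (TBP or GAPDH) and the use of RLUC-treated cells as the reference come from the text; the function names are illustrative, and the Ct-based variant assumes the common 2^-ΔΔCt model with perfect amplification efficiency, which the text does not specify.

```python
def remaining_percent(target_treated: float, ref_treated: float,
                      target_control: float, ref_control: float) -> float:
    """Remaining transcript (%) from counts, e.g. Nanostring barcode counts.

    Target counts are first normalized to the endogenous reference in each
    sample, then expressed relative to the nontargeting (RLUC) control.
    """
    treated = target_treated / ref_treated
    control = target_control / ref_control
    return 100.0 * treated / control


def remaining_percent_from_ct(ct_target_treated: float, ct_ref_treated: float,
                              ct_target_control: float, ct_ref_control: float) -> float:
    """Remaining transcript (%) from qRT-PCR Ct values (assumed 2^-ddCt model)."""
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    return 100.0 * 2.0 ** -(delta_treated - delta_control)


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only.
    print(remaining_percent(180, 1000, 1200, 1200))
    print(remaining_percent_from_ct(27.0, 20.0, 25.0, 20.0))
```

The knock-down level is then 100 minus the remaining percentage; values above 100% remaining correspond to an apparent up-regulation.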

Combined Immunofluorescence and RNA FISH

HeLa cells were grown on Labtek chamber slides (Thermo Scientific Nunc) until they reached confluency. The chambers were washed with 1× phosphate-buffered saline (PBS) and fixed with 4% paraformaldehyde for 10 min at room temperature. The cells were then washed two times with 1× PBS and permeabilized with 0.5% (v/v) Triton X-100 in PBS and 2 mM vanadyl ribonucleoside for 5 min. Following permeabilization, the cells were washed three times with 1× PBS and blocked with 1% bovine serum albumin (BSA) in PBS for 15 min at room temperature. Primary antibodies were diluted in 1% BSA, and cells were incubated in a humidity chamber with the antibody solution for 20 to 30 min. Subsequently, the chambers were washed three times for 5 min each with PBS and incubated with secondary antibodies (diluted in 1% BSA) in a dark humidity chamber for 1 h. Thereafter, the cells were postfixed with freshly made 4% paraformaldehyde for 10 min at room temperature. The cells were then washed twice in 2× saline-sodium citrate (SSC) buffer for 5 min, and RNA FISH was performed as described previously.²⁹ To prepare probes for RNA FISH using the c-KLAN approach,²⁹ the first-round PCR products were amplified with a forward primer containing the full SP6 sequence (5′-GAATTTAGGTGACACTATAGAAGTG-3′) and a reverse primer containing the full T7 sequence (5′-GCTAATACGACTCACTATAGGGAGA-3′). The amplicons were transcribed in vitro using Chromatide Alexa Fluor-546–tagged UTPs (Invitrogen, Carlsbad, CA) with either T7 polymerase for the antisense riboprobe or SP6 polymerase for the sense riboprobe (according to the manufacturer’s instructions). The RNA probes were purified using the RNeasy mini kit (Qiagen) according to the manufacturer’s instructions and diluted in 2× hybridization buffer (20% 20× SSC, 50% dextran sulfate, 20% BSA, and 10% vanadyl ribonucleoside). For hybridization, 15 to 20 ng probe, 10 µg salmon sperm DNA (Invitrogen), 10 µg human Cot-1 DNA (Invitrogen), and 10 µg yeast tRNA (Sigma Aldrich, St.
Louis, MO) were mixed with two volumes of 100% ethanol and dried. The dried mixture was then resuspended in 5 µL of 100% formamide at 37 °C for 10 min followed by denaturation at 74 °C for 7 min and then resuspended in 5 mL hybridization buffer. Twenty microliters of probe was then added to the cells and placed in a humidifying chamber overnight with gentle rocking. Slides were washed the next day with 4× SSC followed by washes (three times each) with 2× SSC, 50% formamide, at 39 °C for 5 min and 2× SSC at 39 °C for 5 min and washed with 1× SSC at room temperature for 10 min followed by staining with DAPI. FISH images were acquired using Delta Vision Core Widefield deconvolution fluorescence microscope (Applied Precision Inc., Mississauga, Ontario, Canada) using an Olympus UPlanSApo 100×/1.4 oil immersion lens. The refractive index of immersion oil (Applied Precision Inc.) used was 1.518. For RNA FISH, the cells were covered with 4× SSC buffer, and images were taken at a resolution of 15.528 pixels per micron and voxel size of 0.06 × 0.06 × 0.20 micron. Other parameters of image acquisition for RNA FISH are as follows: bits per pixel = 16, dimension order = XYCZT, camera type = CoolSNAPHQ2/HQ2-ICX285, illumination = SSI-lumencor transmitted light (LED), dichromatic mirrors = DAPI/FITC/TRITC/Cy5, excitation bands = 381-399 (DAPI), 529-556 (Alexa 546), 650-670 (Cy5). Image processing for all RNA FISH images was done as follows: images were maximally projected using the Z Project tool of Fiji. The maximum and minimum displayed values were then manually adjusted to make the background invisible using the brightness and contrast settings tool. The same value ranges were applied to all figures wherever comparisons were made. The images were cropped to zoom into specific regions in the figure followed by a conversion to an eight-bit image. Merged images were generated using LUTs green for Alexa 546, red for Cy5, and blue for DAPI.

Results and Discussion

esiRNA Library Design

Systematic investigations of lncRNAs require resources to functionally test the roles of these molecules in cells. Indeed, some companies have developed reagents to study lncRNAs at larger scale. General Electric offers a Lincode collection of predesigned siRNAs against 2231 characterized human lncRNAs, and Exiqon applies chemically derived single-stranded LNA-based antisense oligomers complementary to the target lncRNA. To build a first-generation esiRNA library targeting human lncRNAs, we investigated the current annotation of long noncoding transcripts. We used LNCipedia,⁷ an integrated database, which compiles the content of several other resources. Currently, the LNCipedia database reports a total of 113,513 long noncoding transcripts expressed from 63,038 genes of the human genome.⁷ However, not every cell or tissue expresses the full set of lncRNAs. To investigate the expression pattern of lncRNAs in different human cells, we analyzed RNA-sequencing data generated by the ENCODE consortium³ for 11 commonly used cell lines derived from different tissues and cell types (GM12878, K562, A549, HelaS3, HepG2, HUVEC, IMR90, MCF-7, SK-N-SH, NHEK, H1hESC; Table 1 ) and compared their transcriptome composition. We found that noncoding transcripts were on average lower in expression than protein-coding transcripts in all cell lines tested ( Fig. 1A ). We also observed a significant number of lncRNAs being expressed at comparable levels to highly expressed coding RNAs. However, a larger number of lncRNAs are expressed at a much lower level than most coding RNAs ( Fig. 1A ), suggesting that noncoding RNAs are generally more versatile in their expression than translated RNAs. The broad range of expression was observed between different cell types and also between lncRNA and mRNA levels in the same cell type ( Fig. 1A ; Suppl. Fig. S1). 
For instance, the lncRNA lnc-ITGB3BP-1, which is expressed at rather high levels in the cell lines K562, A549, MCF-7, and H1hESC, is not expressed in any of the other seven lines (Fig. 1B). We conclude that lncRNAs have a stronger tissue- and cell-type dependency than mRNAs and that their expression covers a wider range. For a first-generation esiRNA library, we nominated a collection of 2944 long noncoding transcripts covering the full range of expression levels found in the analyzed 11 cell lines (Suppl. Fig. S2). Because of the high tissue specificity of lncRNA expression, we asked how many target transcripts from the esiRNA library are expressed in different cell lines. By analyzing expression data from the 11 cell lines, we found that only 697 lncRNAs (cytosolic) from the library (39%) were shared among all cell types analyzed (Table 1; Suppl. Table S4). Furthermore, the number of esiRNAs targeting expressed lncRNAs varied significantly (Table 1). For example, we observed that 1263 esiRNAs (71% of the library) targeted lncRNAs expressed in H1hESC cells, whereas 1042 (59%) transcripts were targeted in IMR90 cells. This observation illustrates the large tissue specificity of lncRNA expression but also poses a challenge for systematic RNAi screening. Because a significant portion of the target transcripts may not be present in the cell type of interest, an adaptation of the library composition to the respective cell line may be advisable. Therefore, we analyzed the expression data of all esiRNA target transcripts and put together a compendium of expressed lncRNAs in the 11 cell lines (Suppl. Table S4). These data may serve as a guide for library composition for screening in cells of different origins. In addition, the resource may be used for choosing nonexpressed lncRNAs as negative controls.

Table 1.

Long noncoding RNA (lncRNA) expression analysis.^a

Cell Line	Tissue of Origin	Number of Cytosolic Expressed lncRNAs Covered in the Library	Number of Nuclear Expressed lncRNAs Covered in the Library
GM12878	Blood, lymphoblastoid carcinoma	1168	1279
K562	Blood, leukemia	1050	1195
A549	Lung, epithelial carcinoma	1227	1254
HelaS3	Cervical carcinoma	1162	1263
HepG2	Liver carcinoma	1113	1224
HUVEC	Blood vessel endothelium	1138	1286
IMR90	Lung fibroblast	1042	1315
MCF-7	Breast carcinoma	1261	1343
SK-N-SH	Brain, neuroblastoma	1233	1317
NHEK	Skin, epidermal keratinocytes	1095	1292
H1hESC	Embryonic stem cells	1263	1291
Common	All of the above tissues	697	880

Cell lines (tissue of origin as indicated) were analyzed for the expression of transcripts (cytosolic and nuclear) covered by endoribonuclease-prepared siRNAs (esiRNAs). The total number of suitable esiRNAs for the respective cell line and compartment is given.
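The library coverage percentages quoted in the text follow directly from the cytosolic counts in Table 1 and the library size of 1779 esiRNAs. The short Python sketch below reproduces that arithmetic; the dictionary and function names are illustrative.

```python
LIBRARY_SIZE = 1779  # number of esiRNAs in the library

# Cytosolic expressed lncRNAs covered in the library, as listed in Table 1.
CYTOSOLIC = {
    "GM12878": 1168, "K562": 1050, "A549": 1227, "HelaS3": 1162,
    "HepG2": 1113, "HUVEC": 1138, "IMR90": 1042, "MCF7": 1261,
    "SK-N-SH": 1233, "NHEK": 1095, "H1hESC": 1263, "Common": 697,
}


def library_percent(count: int, library_size: int = LIBRARY_SIZE) -> int:
    """Share of the library (rounded to whole percent) targeting expressed lncRNAs."""
    return round(100 * count / library_size)


if __name__ == "__main__":
    for cell_line, count in CYTOSOLIC.items():
        print(f"{cell_line}: {count} esiRNAs ({library_percent(count)}% of the library)")
```

The same calculation can be applied to the nuclear column to decide which compartment-specific subset of the library is worth screening in a given cell line.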

Figure 1.

(A) Two-dimensional kernel density plot of protein-coding genes (red) and long noncoding RNAs (lncRNAs; blue) is shown as a function of mean (x-axis) and standard deviation (y-axis) of their expression levels across 11 cell lines (FPKM = fragments per kilobase of expressed exons per million mapped reads). An increasing transparency of contour lines shows a decrease in the density value (0.05–0.15). (B) A UCSC genome browser view with an example of differential expression of lncRNA lnc-ITGB3BP is shown. The topmost tracks show structures of all annotated isoforms in the LNCipedia database as denoted. The following pile-up tracks represent RNA sequencing reads from polyA+ cytosolic fractions of 11 cell lines. The bottommost track indicates the position of the designed endoribonuclease-prepared siRNA.

To design esiRNAs for the silencing of the 2944 lncRNAs, we analyzed the transcript sequences in more detail. We found that 1963 (67%) genes have more than one splicing variant and 129 genes have close paralogs in the human genome (Suppl. Table S4). To cover all splicing variants of a particular lncRNA, we decided to design the esiRNAs against the longest common sequence stretches. Consequently, all variants of a particular lncRNA can be knocked down using a single esiRNA. To determine the regions with the highest susceptibility for RNAi within the target transcripts or common stretches, we used the DEQOR algorithm.³⁴ DEQOR is an algorithm that uses empirical design criteria³⁴ and has proven its utility in the design of efficient esiRNAs for coding transcripts.²⁷ The 216- to 684-bp DEQOR regions for the respective lncRNAs were designed and amplified with specific primers from human cDNA using PCR (Fig. 2A). Because of the low phylogenetic conservation⁵ as well as the low and tissue-specific expression^5,15,16 of lncRNAs, uncertainties remain in their sequence annotation. We therefore verified the sequences of all PCR products using Sanger sequencing. In 60.4% (1779) of the cases, the sequences matched the annotation given in LNCipedia (Fig. 2A; Suppl. Table S1). All of these PCR products were converted to esiRNAs, constituting a library of 1779 esiRNAs. Among the PCR products that did not match the sequences on LNCipedia, we identified numerous cases in which a product of unexpected size was generated. For example, the esiRNA HNC-02141-1 was designed to target the lncRNA lnc-CLDN20-3 encoded on chromosome 6. Based on the transcript annotation of lnc-CLDN20-3, a PCR product of 375 base pairs in length should be generated. However, we observed amplification of a product of 497 base pairs.
By blasting the amplified sequence against the human genome, we found an additional exon of 122 bases in length inserted after the first exon (Fig. 2B). In contrast, the PCR product for the esiRNA HNC-00635-1 designed against the lncRNA lnc-ADCY3-1:4 was 142 bases shorter in length than expected. By blasting the amplified sequences against the human genome, we found a splicing site different from the annotated site, leading to a shorter product (Suppl. Fig. S3). Hence, the current annotation of many lncRNAs is still preliminary and will likely change in the future. These examples also illustrate the advantage of the esiRNA technology as the only mammalian RNAi mediator that uses cDNA as a source (Fig. 2A). This feature ensures that esiRNAs are generated for authentic transcripts only. In contrast, chemically synthesized siRNAs, vector-expressed shRNAs, and antisense oligomers depend on correct sequence annotation in a database. The advantage of a cDNA source applies even more to lncRNAs, which still show more ambiguity in sequence annotation than coding RNAs. We conclude that the first-generation esiRNA library constitutes a useful and valuable resource for probing lncRNA function.
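The idea of targeting the longest sequence stretch shared by all splicing variants can be illustrated with a naive Python sketch. This is not the procedure used to build the library, which is not described at the code level here; the function name and the toy sequences are hypothetical, and a real design would more likely work on annotated exon coordinates before passing the common region to DEQOR.

```python
from typing import Sequence


def longest_common_stretch(isoforms: Sequence[str]) -> str:
    """Return the longest substring present in every isoform sequence.

    Brute-force search over substrings of the shortest isoform, longest
    first. Suitable for short illustrative sequences only.
    """
    if not isoforms:
        return ""
    shortest = min(isoforms, key=len)
    for length in range(len(shortest), 0, -1):
        for start in range(len(shortest) - length + 1):
            candidate = shortest[start:start + length]
            if all(candidate in seq for seq in isoforms):
                return candidate
    return ""


if __name__ == "__main__":
    # Toy isoforms sharing one internal stretch (hypothetical sequences).
    variants = ["AAACCCGGGTTT", "CCCGGGTTTAAA", "TTCCCGGGTTTC"]
    print(longest_common_stretch(variants))
```

If the annotation misses an exon, as in the lnc-CLDN20-3 case (375 bp expected, 497 bp observed, a difference of 122 bp), the common stretch computed from the database may differ from the transcript actually present in the cell, which is why sequencing the cDNA-derived PCR product is a useful check.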

Figure 2.

Endoribonuclease-prepared siRNA (esiRNA) synthesis and validation. (A) The desired target transcript regions (2944) are amplified by PCR using gene-specific primers and cDNA as template (step 1). Primers contain an overhang (black), which introduces a T7-promoter on both termini of the PCR product to allow bidirectional in vitro transcription, generating a corresponding long double-stranded RNA (dsRNA; step 2). Digestion of the long dsRNA by RNase III (step 3) results in a pool of short dsRNAs. Purification by ion exchange chromatography (step 4) removes longer and undigested dsRNA fragments, yielding the final esiRNA products (here: 1779). Final concentrations are determined by optical density measurements, and samples are adjusted to the same concentration and arrayed in multiwell plates (step 5). (B) The transcript of the long noncoding RNA (lncRNA) lnc-CLDN20-3 as given in the LNCipedia database (exon 1 and 2, black) and with an alternative exon (1a, red) as identified by PCR and sequencing (electropherogram, bottom) for the indicated esiRNA (HNC-02141-1). The genomic location is shown (top part) with the numbers indicating the chromosomal base position and the sequence details of the alternative exon 1a (bottom part, red) with the numbers indicating the base position in the transcript number 1 of the lncRNA lnc-CLDN20-3. (C) Knock-down efficiency of esiRNAs against lncRNAs measured by quantitative real-time PCR. The percentage of remaining lncRNA normalized to the expression level of negative control–treated cells (Renilla Luciferase, black) is shown. The error bars indicate standard deviation (n = 3).

esiRNA Library Validation

To validate the quality of our library and to quantify its RNAi capability, we nominated 46 esiRNAs that target lncRNAs expressed in HeLa cells for quantitative real-time PCR (qRT-PCR) analysis ( Fig. 2C ). To ensure that the 46 chosen target genes are representative of the library, we selected transcripts that covered the full range of lncRNA expression levels (Suppl. Fig. S4). We found that on average, the respective target transcripts could be depleted by 60.4%. The esiRNA HNC-02372-1 (LNCipedia-ID: lnc-ATP6AP2-5) exhibited the highest knockdown efficiency by decreasing the target transcript levels to 7.5%. In addition, 35 esiRNAs (76%) reduced the target transcript levels at least by half. We observed a moderate knockdown (38%–48%) of targets by four esiRNAs and no significant knockdown by the other five esiRNAs. These knockdown efficiencies are comparable with an earlier study, which employed esiRNAs for the silencing of lncRNAs in the mouse genome.²⁹ However, the average silencing efficiency of esiRNAs against lncRNAs is slightly lower compared with those that target protein-coding transcripts. The reason for this reduced efficiency is currently not clear, but targeting might be improved by incorporating secondary structure predictions³⁵ for the lncRNAs into the DEQOR algorithm. Interestingly, in two cases, the esiRNA transfection led to an unexpected strong up-regulation of the respective target-transcript levels ( Fig. 2C , HNC-02733-1 and HNC-02773-1) as measured by qRT-PCR. Although such cases have been reported earlier,²⁹ we wondered if this up-regulation may be due to pitfalls in our experimental setup or may represent a biological phenomenon. qRT-PCR, a widely and successfully applied method, comes with the drawback that it might introduce amplification bias. Furthermore, qRT-PCR is inherently sensitive to genomic DNA contamination. 
To investigate this possibility, we applied the Nanostring technology as an alternative method for the quantification of transcripts. The Nanostring technology is based on an amplification-free detection method that also discriminates against genomic DNA contamination.³³ It also abolishes the need for DNA digestion by DNase I typically done prior to qRT-PCR analysis. For the quantification, we used two oligonucleotides for each targeted lncRNA (probe A and B, Suppl. Table S3). Both probes were designed to be complementary to the target transcripts ( Fig. 3A ). The probes were composed of a transcript-specific part as well as a reporter or a capture tag. These tags allow hybridization of two oligonucleotides, which are either biotinylated for capturing on a chip surface or barcoded with a unique fluorescent barcode ( Fig. 3A ). The detection of the hybridized oligomers is conducted by microscopy on a single molecule level. The total count of each captured oligomer provides an absolute measure for the expression level of the cognate transcript ( Fig. 3A ). To compare the Nanostring and qRT-PCR results, we nominated three esiRNAs based on our qRT-PCR data for Nanostring analysis. To avoid biological variation, we used the same sample preparation used in qRT-PCR analysis for the Nanostring experiments. To cover the full range of the observed qRT-PCR results ( Fig. 2C ), we picked esiRNAs, which showed a strong (HNC-02773-1) and a moderate (HNC-02354-1) up-regulation, as well as one esiRNA, which showed an effective knock-down (HNC-02360-1, 87%) of the target transcripts. In contrast to the qRT-PCR results, Nanostring analysis was able to detect transcript depletion by esiRNAs HNC-02773-1 (84%) and HNC-02354-1 (28%; Fig. 3B ). The esiRNA HNC-02360-1 showed a comparable depletion down to 18% as observed by qRT-PCR. Hence, the unexpected up-regulation as measured by qRT-PCR is most likely a technical artifact. 
We checked the melting curves for the up-regulated transcripts, but this analysis did not reveal a plausible explanation for the erroneous result. Generally, qRT-PCR experiments rely on a specific and efficient amplification of the intended transcript. Especially, lncRNAs, which are typically low in expression, are more demanding in qRT-PCR quantification because many PCR cycles need to be applied to detect them. Therefore, primer design is critical, and each primer pair should be carefully validated before being used in a qRT-PCR experiment. The results also demonstrate that the Nanostring technology is useful for measuring the knockdown levels of lncRNAs after esiRNA-mediated RNAi. Amplification-free methods such as the Nanostring technology appear to be highly suitable for high- to medium-throughput quantification of lncRNAs and possibly other transcripts, in particular when their expression level is low.

Figure 3.

Transcript quantification using Nanostring technology. (A) Long noncoding transcripts are detected and quantified by two hybridization probes (A and B) labeled with a fluorescent barcode (probe A, colored circles) or a biotin moiety (yellow pentagon) for immobilization on a chip surface (probe B). Following immobilization, the fluorescent barcodes are counted by microscopy, and the counts for the negative control–treated cells (Renilla Luciferase [RLUC], left part) are used as reference for the normalization of the counts obtained after endoribonuclease-prepared siRNA (esiRNA) treatment (right part). (B) Knock-down efficiency of esiRNAs against long noncoding RNAs measured by quantitative real-time PCR (hatched) or Nanostring technology (gray). The percentage of remaining RNA normalized to the expression level of negative control–treated cells (RLUC, black) is shown. The error bars indicate standard deviation (n = 3).

We conclude that our esiRNA library is capable of silencing the majority of human lncRNA transcripts. Hence, the presented esiRNA resource represents a valuable screening tool that should expedite the molecular characterization of human lncRNAs.

c-KLAN

The generation of LOF phenotypes by RNAi provides a valuable approach to gaining insights into the biological functions of lncRNAs. Additional understanding may come from determining the subcellular localization of lncRNAs.^18,29,36 To combine knockdown and localization experiments, we previously established a method, c-KLAN,²⁹ that allows the generation of sense- and antisense-labeled riboprobes for RNA FISH and dsRNAs for esiRNA synthesis from a single source. c-KLAN was successfully applied to the functional dissection of mouse long noncoding transcripts.²⁹ To extend c-KLAN to human lncRNAs, we selected from our library two PCR products (HNC-02360-1, HNC-02586-1) designed to target the lncRNAs lnc-APC-6 and lnc-HSP90AA1-9 and generated labeled FISH probes for hybridization in HeLa cells. We detected distinct signals for both probes using fluorescence microscopy and observed different localization patterns for the two lncRNAs. lnc-APC-6 mainly showed nucleolar localization, whereas lnc-HSP90AA1-9 was found dispersed as spots in the nucleus and cytoplasm (Fig. 4). In both cases, we did not observe a signal for the sense probes (Suppl. Fig. S5), and the antisense probe signal was sensitive to esiRNA-mediated RNAi (Fig. 4), demonstrating that the probes specifically detected their targets and revealed their cellular localization. Because all of the PCR products used to generate esiRNAs are suitable for the production of hybridization probes, we conclude that this resource may also be valuable to screen for localization patterns of human lncRNAs.

Figure 4.

Localization of long noncoding RNAs (lncRNAs) by fluorescence in situ hybridization. HeLa cells stained by fluorescently labeled RNA probes (green), anti-tubulin antibody (red), and DAPI (blue) after transfection with a negative control endoribonuclease-prepared siRNA (esiRNA; Renilla Luciferase, left) and an esiRNA (right) designed against the target lncRNAs lnc-APC-6 (HNC-02360-1, top panel) and lnc-HSP90AA1-9 (HNC-02586-1, bottom panel). All images were processed using the same exposure settings and with identical threshold adjustments to represent the actual signal intensities. Scale bars: 5 µm.

In conclusion, we present the synthesis and utility of a renewable resource of esiRNAs and FISH probes for dissecting human lncRNA functions by RNAi and localization profiling. This scalable resource allows the systematic characterization of lncRNAs and can be extended to a genome-wide collection covering the entire noncoding transcriptome, enabling a comprehensive investigation of lncRNA function in human biology and disease.

Footnotes

Acknowledgements

The authors thank Sebastian Rose and Romy Heinze for technical assistance and Annett Erkes for computational support.

Supplementary material for this article is available on the Journal of Biomolecular Screening Web site.

Declaration of Conflicting Interests

The authors M.T., M.P.-R., and I.W. declare a conflict of interest due to an affiliation with Eupheria Biotech GmbH. F.B. is an adviser for Eupheria and therefore declares a conflict of interest. D.C. declares no conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The work was funded by the Bundesministerium für Bildung und Forschung (BMBF) grant Go-Bio 2 (0315980).

References

1. Lander, E. S.; Linton, L. M.; Birren; et al. Initial Sequencing and Analysis of the Human Genome. Nature 2001, 409, 860–921.

2. Maeda; Kasukawa; Oyama; et al. Transcript Annotation in FANTOM3: Mouse Gene Catalog Based on Physical cDNAs. PLoS Genet. 2006, 2, e62.

3. Dunham. An Integrated Encyclopedia of DNA Elements in the Human Genome. Nature 2012, 489, 57–74.

4. Harrow; Frankish; Gonzalez, J. M.; et al. GENCODE: The Reference Human Genome Annotation for the ENCODE Project. Genome Res. 2012, 22, 1760–1774.

5. Derrien; Johnson; Bussotti; et al. The GENCODE v7 Catalog of Human Long Noncoding RNAs: Analysis of Their Gene Structure, Evolution, and Expression. Genome Res. 2012, 22, 1775–1789.

6. Djebali; Davis, C. A.; Merkel; et al. Landscape of Transcription in Human Cells. Nature 2012, 489, 101–108.

7. Volders, P. J.; Helsens; Wang; et al. LNCipedia: A Database for Annotated Human lncRNA Transcript Sequences and Structures. Nucleic Acids Res. 2013, 41, D246–D251.

8. Wutz. Gene Silencing in X-Chromosome Inactivation: Advances in Understanding Facultative Heterochromatin Formation. Nat. Rev. Genet. 2011, 12, 542–553.

9. Nagano; Mitchell, J. A.; Sanz, L. A.; et al. The Air Noncoding RNA Epigenetically Silences Transcription by Targeting G9a to Chromatin. Science 2008, 322, 1717–1720.

10. Hung; Wang; Lin, M. F.; et al. Extensive and Coordinated Transcription of Noncoding RNAs within Cell-Cycle Promoters. Nat. Genet. 2011, 43, 621–629.

11. Tripathi; Ellis, J. D.; Shen; et al. The Nuclear-Retained Noncoding RNA MALAT1 Regulates Alternative Splicing by Modulating SR Splicing Factor Phosphorylation. Mol. Cell 2010, 39, 925–938.

12. Tsai, M. C.; Spitale, R. C.; Chang, H. Y. Long Intergenic Noncoding RNAs: New Links in Cancer Progression. Cancer Res. 2011, 71, 3–7.

13. Gupta, R. A.; Shah; Wang, K. C.; et al. Long Non-Coding RNA HOTAIR Reprograms Chromatin State to Promote Cancer Metastasis. Nature 2010, 464, 1071–1076.

14. Faghihi, M. A.; Modarresi; Khalil, A. M.; et al. Expression of a Noncoding RNA Is Elevated in Alzheimer’s Disease and Drives Rapid Feed-Forward Regulation of Beta-Secretase. Nat. Med. 2008, 14, 723–730.

15. Mercer, T. R.; Dinger, M. E.; Sunkin, S. M.; et al. Specific Expression of Long Noncoding RNAs in the Mouse Brain. Proc. Natl. Acad. Sci. U.S.A. 2008, 105, 716–721.

16. Cabili, M. N.; Trapnell; Goff; et al. Integrative Annotation of Human Large Intergenic Noncoding RNAs Reveals Global Properties and Specific Subclasses. Genes Dev. 2011, 25, 1915–1927.

17. Kittler; Pelletier; Heninger, A. K.; et al. Genome-Scale RNAi Profiling of Cell Division in Human Tissue Culture Cells. Nat. Cell Biol. 2007, 9, 1401–1412.

18. Theis; Slabicki; Junqueira; et al. Comparative Profiling Identifies C13orf3 as a Component of the Ska Complex Required for Mammalian Cell Division. EMBO J. 2009, 28, 1453–1465.

19. Slabicki; Theis; Krastev, D. B.; et al. A Genome-Scale DNA Repair RNAi Screen Identifies SPG48 as a Novel Gene Associated with Hereditary Spastic Paraplegia. PLoS Biol. 2010, 8, e1000408.

20. Fazzio, T. G.; Huff, J. T.; Panning. An RNAi Screen of Chromatin Proteins Identifies Tip60-p400 as a Regulator of Embryonic Stem Cell Identity. Cell 2008, 134, 162–174.

21. Raychaudhuri; Loew; Korner; et al. Interplay of Acetyltransferase EP300 and the Proteasome System in Regulating Heat Shock Transcription Factor 1. Cell 2014, 156, 975–985.

22. Collinet; Stoter; Bradshaw, C. R.; et al. Systems Survey of Endocytosis by Multiparametric Image Analysis. Nature 2010, 464, 243–249.

23. Zhu; Lawo; Bird; et al. The Mammalian SPD-2 Ortholog Cep192 Regulates Centrosome Biogenesis. Curr. Biol. 2008, 18, 136–141.

24. Roguev; Talbot; Negri, G. L.; et al. Quantitative Genetic-Interaction Mapping in Mammalian Cells. Nat. Methods 2013, 10, 432–437.

25. Ding; Paszkowski-Rogacz; Nitzsche; et al. A Genome-Scale RNAi Screen for Oct4 Modulators Defines a Role of the Paf1 Complex for Embryonic Stem Cell Identity. Cell Stem Cell 2009, 4, 403–415.

26. Calegari; Haubensak; Yang; et al. Tissue-Specific RNA Interference in Postimplantation Mouse Embryos with Endoribonuclease-Prepared Short Interfering RNA. Proc. Natl. Acad. Sci. U.S.A. 2002, 99, 14236–14240.

27. Kittler; Surendranath; Heninger, A. K.; et al. Genome-Wide Resources of Endoribonuclease-Prepared Short Interfering RNAs for Specific Loss-of-Function Studies. Nat. Methods 2007, 4, 337–344.

28. Myers, J. W.; Chi, J. T.; Gong; et al. Minimizing Off-Target Effects by Using Diced siRNAs for RNA Interference. J. RNAi Gene Silencing 2006, 2, 181–194.

29. Chakraborty; Kappei; Theis; et al. Combined RNAi and Localization for Functionally Dissecting Long Noncoding RNAs. Nat. Methods 2012, 9, 360–362.

30. Kittler; Heninger, A. K.; Franke; et al. Production of Endoribonuclease-Prepared Short Interfering RNAs for Gene Silencing in Mammalian Cells. Nat. Methods 2005, 2, 779–784.

31. Dobin; Davis, C. A.; Schlesinger; et al. STAR: Ultrafast Universal RNA-seq Aligner. Bioinformatics 2013, 29, 15–21.

32. Lee; Seo, C. H.; Lim; et al. Accurate Quantification of Transcriptome from RNA-Seq Data by Effective Length Normalization. Nucleic Acids Res. 2011, 39, e9.

33. Geiss, G. K.; Bumgarner, R. E.; Birditt; et al. Direct Multiplexed Measurement of Gene Expression with Color-Coded Probe Pairs. Nat. Biotechnol. 2008, 26, 317–325.

34. Henschel; Buchholz; Habermann. DEQOR: A Web-Based Tool for the Design and Quality Control of siRNAs. Nucleic Acids Res. 2004, 32, W113–W120.

35. Shao; Chan, C. Y.; Maliyekkel; et al. Effect of Target Secondary Structure on RNAi Efficiency. RNA 2007, 13, 1631–1640.

36. Theis; Paszkowski-Rogacz; Buchholz. SKAnking with Ska3: Essential Role of Ska3 in Cell Division Revealed by Combined Phenotypic Profiling. Cell Cycle 2009, 8, 3435–3437.
