Abstract
Keywords
INTRODUCTION
Inherited myopathies exhibit a wide range of genetic and phenotypic heterogeneity, making their molecular diagnoses particularly challenging [1–3]. Molecular diagnostics is an increasingly important part of the clinical workup, but the extensive genetic heterogeneity and the requirement to test some of the largest and most complex genes in the genome [e.g. DMD (2.4 Mb, largest gene), TTN (largest protein, 363 exons)] can lead to costly and time-consuming molecular diagnostic workups [4, 5]. Single-gene testing is now being replaced by faster and less expensive highly parallel next-generation sequencing methods that have recently entered the clinical realm [6–9].
There have been concerns regarding adequate sequence coverage of the highly complex myopathy candidate genes using whole genome sequencing [10]. Targeted re-sequencing panels containing candidate genes for muscular dystrophies and myopathies have been developed and proven successful in providing molecular diagnoses [11, 12]. Jones et al. used a targeted PCR-based enrichment panel for 24 genes causing congenital disorders of glycosylation (CDG) on 12 previously diagnosed CDG patients and were able to confirm diagnoses for all patients [13]. Valencia et al. compared two different targeted sequencing enrichment strategies, i.e. solution-based hybrid capture (SureSelect) and microdroplet-based PCR (RainDance), for 12 congenital muscular dystrophy (CMD) genes in 12 patients [14]. They recommended microdroplet-based PCR technology as a more efficient enrichment strategy for molecular diagnostics in the clinical laboratory. To further test their microdroplet PCR congenital muscular dystrophy panel, Valencia et al. showed its application in 20 undiagnosed patients and 6 controls [15]. Vasli et al. tested a targeted sequencing panel using Agilent SureSelect and Illumina sequencing for 267 genes known to cause neuromuscular diseases (NMD) in 16 patients with known and unknown molecular diagnoses [16]. They were able to characterize pathogenic mutations in several NMD genes, including large genes like TTN, DMD and RYR1. Lim et al. reported a panel for congenital muscular dystrophy with dystroglycanopathies covering 26 genes, and reported data for 4 patients [17]. More recently, Savarese et al. reported a panel for non-syndromic muscle disorders covering 93 genes, including larger genes, using the Haloplex enrichment system (Agilent) for 177 myopathy patients with a mutation success rate of ∼29% [18]. All these studies reported lower sequence coverage in certain exons and recommended supplemental Sanger sequencing for low coverage regions.
Here, we describe the development and testing of a targeted sequencing panel of 45 genes associated with muscular dystrophy and congenital myopathies in 94 undiagnosed muscle disease patients. For sequence enrichment, we utilized the RainDance emulsion PCR based microdroplet technology system [19] followed by next-generation sequencing on the Illumina platform [20]. Whole exome sequencing was done in parallel for a subset (n = 10) of these patients, to compare the sensitivity of targeted and exome sequencing methods for molecular diagnosis of patients with muscle disease.
MATERIALS AND METHODS
Patients
Patients were recruited under the institutional review board (IRB) protocol 2405, which has been reviewed and approved by the Office for the Protection of Human Subjects at Children’s National Medical Center, Washington DC, USA. Forty-two patients were recruited through the Institute for Neuroscience and Muscle Research, Sydney, Australia. Ethics approval was obtained from the Human Research Ethics Committee of the Sydney Children’s Hospitals Network (Approval No: 10.CHW.45) and written informed consent was obtained from all participants. Genomic DNA was extracted from blood using the Gentra Puregene blood kit (Qiagen, Germantown, MD) following manufacturer’s instructions.
Design and use of a targeted sequencing panel for muscle disease
Candidate genes for our muscle disease targeted sequencing panel were chosen based on literature reports and the Leiden muscular dystrophy database to compile genes associated with non-syndromic muscular dystrophies and congenital myopathies. RainDance emulsion microdroplet PCR technology was selected for sequence enrichment [19], and a list of 45 muscle disease-causing genes was submitted for primer design. The target library consisted of 1851 amplimers targeting 1426 exons of the 45 candidate muscle disease genes (Table 1). Details of the target library, including disease association, number of exons targeted, and amplicon sizes, are included in the supplementary information (Supplementary Table 1).
Sequence enrichment
The primer library was custom-designed by RainDance (RainDance Technologies, Lexington, MA) using the National Center for Biotechnology Information’s GRCh36/hg18 human reference sequence. Primers were designed to amplify 1426 unique exons and exon/intron boundaries for the 45 muscular dystrophy and myopathy genes listed in Table 1 (total library size of ∼82 kb), resulting in 1851 PCR amplimers. The amplimers ranged in size from 200–600 bp, with guanine cytosine (GC) content ranging from 24–80%. Genomic DNA (2.5–3 μg) was sheared to 2–4 kb fragments using a Covaris S220 instrument (Covaris, Woburn, MA). For target sequence amplification, the input DNA template mixture was prepared and loaded onto the RDT1000 system (RainDance Technologies, Lexington, MA) for merging with the primer library. After the merge, the emulsion mixture was transferred to a standard thermocycler (Applied Biosystems 9700) for PCR amplification using the following conditions: initial denaturation at 94°C for 2 min; 55 cycles of 94°C for 15 s, 54°C for 15 s, and 68°C for 30 s; final extension at 68°C for 10 min; and a 4°C hold. The emulsion was broken to release amplified PCR droplets, which were purified using Qiagen MinElute column purification (Qiagen, Germantown, MD). The purified PCR product was then run on a DNA high sensitivity chip (Agilent Technologies, Santa Clara, CA) to confirm that the amplicon profile matched the predicted histogram profile. End repair and ligation were performed using the NEBNext end repair module (NEB, Ipswich, MA) for 30 min at 25°C followed by sample purification. The PCR fragments were concatenated using the NEB quick ligation kit (NEB, Ipswich, MA) and incubated at 20°C for 2 h followed by sample purification. The concatenated samples were sheared to 200 bp using the Covaris S220 instrument (Covaris, Woburn, MA). The ligated-sheared product was then run on a DNA high sensitivity chip (Agilent Technologies, Santa Clara, CA) to confirm sample quality.
The Qubit dsDNA HS Assay Kit (Life Technologies, Carlsbad, CA) was used for sample quantification. For exome enrichment, the Illumina TruSeq exome capture kits were used following manufacturer’s instructions (Illumina, San Diego, CA).
Sequencing and data analysis
The standard Illumina TruSeq (Illumina, San Diego, CA) sample preparation protocol for sequencing was followed. Concentrations of indexed samples were determined by quantitative PCR (KAPA Biosystems, Woburn, MA). Enriched DNA was denatured and diluted to a concentration of ∼12 pM. Cluster generation and 100 bp paired-end sequencing were performed using standard HiScanSQ and MiSeq (Illumina, San Diego, CA) protocols with version 3 and version 2 kits, respectively. Multiplex paired-end sequencing of 17–24 samples per lane of the flow cell was performed for targeted sequencing. For exome sequencing, 3–4 samples per flow cell were multiplexed on the Illumina HiScanSQ sequencer. Only reads with a Q score > 30 were used for further analysis. CASAVA 1.8.1 (Illumina, San Diego, CA) was used to convert the raw .bcl files to .fastq files and to de-multiplex the barcoded samples in each lane. Alignment to the human genome reference (hg19), variant calling, and annotation were performed using NextGene (SoftGenetics, State College, PA) software.
Variant filtering and classification
Variants were filtered to include only non-synonymous, novel (not present in dbSNP, <1% frequency in 1000 Genomes) variants present in coding regions +/–5 bp, as shown in Fig. 1. Prediction scores of mutation pathogenicity and conservation were referenced using dbNSFP [21] within NextGene. The potential novel variants were cross-referenced using Alamut software (Interactive Biosoftware, Rouen, France) for HGVS naming and classification. We further filtered variants using our in-house sequencing database as well as the National Heart, Lung, and Blood Institute (NHLBI) exome variant server [22] and ExAC [23] databases. Interpretation of variants was done using the recommendations from the ACMG guidelines [24]. A report of the novel variants was sent to the referring physician, including whether the variant had been previously reported, mutation pathogenicity predictions by dbNSFP [21], and brief summaries and web links for phenotypes typically associated with the probable causative genes. To keep the approach unbiased, we did not have access to phenotype data before variant analysis. The referring physician then helped limit the likely candidate pathogenic genes by cross-referencing to patient phenotype. Variants were classified as likely pathogenic or variants of unknown significance (VOUS/VUS). Patients with no mutations found in our targeted sequencing panel (26/94) were listed as no mutations found.
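The first filtering step described above can be illustrated with a minimal sketch. It is not the NextGene pipeline used in this study; the `Variant` record, its field names, and the way the criteria are combined (all must hold) are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Variant:
    # Hypothetical record; field names are illustrative, not from NextGene.
    gene: str
    effect: str                  # e.g. "missense", "synonymous", "frameshift"
    in_dbsnp: bool               # True if the variant has a dbSNP entry
    freq_1000g: Optional[float]  # 1000 Genomes allele frequency, None if absent
    distance_to_coding: int      # 0 inside coding sequence, else bp to nearest exon edge


MAX_FREQ = 0.01      # keep variants below 1% frequency
CODING_WINDOW = 5    # coding regions +/- 5 bp


def passes_filter(v: Variant) -> bool:
    """Return True if the variant survives the initial filter (assumed AND logic)."""
    if v.effect == "synonymous":
        return False
    if v.in_dbsnp:
        return False
    if v.freq_1000g is not None and v.freq_1000g >= MAX_FREQ:
        return False
    return v.distance_to_coding <= CODING_WINDOW
```

Variants that pass such a filter would still need the database cross-referencing and phenotype correlation described in this section before any classification.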
Variants were classified based on the following features: (1) whether the variant has been previously reported (pathogenic or VOUS) in databases such as the Leiden muscular dystrophy database and ClinVar; (2) zygosity, with homozygous or compound heterozygous mutations prioritized as more likely to be disease causing; (3) variant type, with frame-shifting variants prioritized over missense variants; (4) the conservation score of the nucleotide or amino acid, as well as the physicochemical difference between the changed amino acids; (5) the protein domain in which the mutation is present; and (6) whether the mutation correlates with the phenotype and inheritance pattern observed. Based on these criteria and literature reviews, we assigned pathogenicity to variants. Graphical representation of variants in all patients (Fig. 2) was performed using open-source software d3 supported on d3pie.org.
Processing samples for comparisons of depth vs. dropouts
Illumina .fastq files of all samples (targeted and exome) were mapped to the human genome (ucsc.hg19.fasta); forward and reverse (paired-end reads) files were paired into one SAM file using the Burrows-Wheeler Aligner (BWA, v0.5.9-t26-dev). After alignment, the SAM files were converted to BAM using Samtools (v0.1.18) and all files per sample were merged into a single BAM file using Picard tools (v1.58). The BAM files were sorted using Samtools, duplicate reads were marked and removed with Picard, and the BAM files were indexed using Samtools. Lastly, we performed covariate analysis and table recalibration using the Genome Analysis Toolkit (GATK, v1.3).
Coverage calculations
To analyze coverage per exon, we used a combination of Samtools and bedtools (v2.16.2) to extract the reads in each BAM file. The BED file contained the coordinates for titin variant 1 C (NM_001267550.1). The average coverage of each sample was determined using the extracted reads for each of the 20 samples (10 targeted and 10 exome). The samples were then normalized to 100x average coverage by multiplying the read count of each exon by a scalar factor. Next, for each normalized sample we counted how many exons had reads equal to or below 30x, 20x, 10x, 5x, 1x, and 0. These counts were averaged across samples, their standard deviations were calculated, and the results were graphed. The frequency distribution histogram of coverage per exon was created using the average coverage of each exon across the 10 samples sequenced in the panel, binned in 10x intervals. For the percent coefficient of variation histogram, the average for each exon across the 10 samples was calculated for each method, along with the standard deviation and coefficient of variation. The table with the percentage of covered exons was extracted from bedtools output. The Kolmogorov-Smirnov test was used to compare the distribution of reads between exome and TS exon populations.
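The normalization and threshold counting described above can be sketched as follows. This is a minimal illustration, not the scripts used in the study; the function names are hypothetical, and the scalar factor is assumed to be the target coverage divided by the sample's mean per-exon coverage, which the text does not state explicitly.

```python
from statistics import mean

# Coverage cut-offs named in the text (30x, 20x, 10x, 5x, 1x, 0).
THRESHOLDS = (30, 20, 10, 5, 1, 0)


def normalize_to_target(exon_cov, target=100.0):
    """Scale per-exon coverage so the sample mean equals `target`.

    Assumption: the scalar factor is target / mean(exon_cov).
    """
    avg = mean(exon_cov)
    if avg == 0:
        raise ValueError("sample has zero average coverage")
    scale = target / avg
    return [c * scale for c in exon_cov]


def count_at_or_below(norm_cov, thresholds=THRESHOLDS):
    """Count exons whose normalized coverage is <= each threshold."""
    return {t: sum(1 for c in norm_cov if c <= t) for t in thresholds}
```

Applying `count_at_or_below` to each normalized sample and then averaging the counts per threshold across samples would give the kind of summary plotted for the two methods.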
Validation of variants by Sanger sequencing
Select variants were validated with Sanger sequencing as noted in the online clinical supplementary table. Primer sequences can be made available upon request. Amplicons were sequenced using the Big Dye Terminator v3.1 cycle sequencing kit (Life Technologies, Carlsbad, CA) on the ABI 3730 DNA Analyzer (Life Technologies, Carlsbad, CA). Sequences were compared to reference sequences using Sequencher 4.8 (Gene Codes Corporation, Ann Arbor, MI).
RESULTS
Targeted sequencing results
Mutation detection
We analyzed sequencing data covering 45 genes (1851 amplimers targeting 1426 exons) known to cause myopathies and muscular dystrophies in 94 patients that were referred to us for molecular diagnostics under a research based diagnostic protocol. For detection of mutations in our targeted panel data we used an analysis and filtering pipeline as shown in Fig. 1. Based on phenotype information and discussions with the referring physician, we classified variants into two categories that reflected pathogenicity likelihood i.e. mutations that were likely pathogenic (Table 2) and variants of unknown significance (VOUS/VUS; Table 3). There were 24 patients with no mutations detected in our targeted panel; these are described in Table 4, including their current diagnostic status (if any). Information on phenotype and other confirmatory studies has been provided in the online clinical table and uploaded on ClinVar.
Mutation distribution for our targeted sequencing data from 94 patients is represented in Fig. 2. We were able to provide likely diagnoses to 33 of 94 patients (approximately 35% molecular diagnostic rate). Variants of unknown significance were seen in 35/94 (∼37%) of patients and represent the largest group of mutations detected. No mutations were found in our targeted panel for 26/94 (∼28%) patients. Mutation type distribution (missense, splice site, frameshifting, nonsense) was compared between the two groups (likely pathogenic and VOUS) in the inner pie chart of Fig. 2. Missense mutations were the most common mutation type in both groups. In the VOUS group, ∼71% of mutations were of the missense type, compared to ∼51% in the pathogenic/likely pathogenic group. The percentage of splice site mutations was similar in both groups; this is likely due to our relaxed definition of splice sites as +/–5 bp of coding sequences. Using a more stringent definition restricted to splice site acceptor/donor (+/–2) mutations, we eliminated all splice site mutations found in the VOUS group.
Mutations in TTN
Titin variants were reported using titin skeletal muscle isoform (NM_133378.4) as well as full length titin isoform (NM_001267550.1) using the Mutalyzer program [25]. The region of the protein containing the variant is noted i.e. Z-disk, I-band, A-band, M-band. For patients with potentially pathogenic titin mutations with phenotypes resembling reported titinopathies, we ranked variants as likely pathogenic pending further protein and segregation analysis studies. Table 5 lists all patients with likely pathogenic mutations in titin along with phenotype observed.
Depth and exon dropouts in targeted panel
To determine the optimal sequencing depth needed to minimize the number of exon dropouts (i.e. missing exons of targeted genes), we first divided our samples into two groups according to the sequencer used. Depth depended on the sequencer, with higher depth obtained for samples sequenced on the Illumina HiScanSQ (range of 293x–2901x) than on the Illumina MiSeq benchtop sequencer (range of 71x–211x). Depth and dropouts were compared for a few samples that were sequenced using both sequencers (twelve MiSeq and eight HiScan sequenced patient samples) as shown in Fig. 3. Average read depth varied between samples and with the sequencer used. Increasing average read depth decreased the number of exon dropouts until it reached a saturation point of ∼200x, where fewer than 2% of exons had a depth of 10x or less, compared to 7% at 71x depth.
Comparing sensitivity of targeted vs. exome panels
Design comparisons
To compare the regions covered by our targeted panel and exome methods, we compared the designs (BED files) for our 45 targeted myopathy genes in both methods. However, the design was created in 2011 (Table 1), and updates to the RefSeq database have since introduced new exons and isoforms to the list of targeted myopathy genes. To ensure that we obtained the most accurate coverage of the genes, we updated our list of exons for each gene at the time of analysis (2014). We found that the updated 45 targeted myopathy genes contain 1483 unique exons (57 more exons than counted in the design). Of the 1483 exons covered by the targeted panel, 41 exons were not targeted by exome (Supplementary Table 2A). In the inverse analysis, a single exon in the exome panel was not covered in the targeted panel (Supplementary Table 2B).
Most of the 41 exons not targeted by the exome panel were in genomic regions with high sequence similarity to other regions in the genome. Most of these non-targeted loci were part of three genes: SMN1 (9 exons), SMN2 (9 exons), and NEB (16 exons); each of these genes contains exon repeats that are part of intragenic sequence duplications. Other non-targeted regions were due to discrepancies in annotation, GC content, and isoform information present at the time of design. For example, we found four 5’ CAPN3 exons on the TS panel annotated under the 3’ region of an upstream gene, GANC, in the exome panel. Additional annotation discrepancies include MATR3 exons that were annotated as SNHG4 exons due to sequence overlap. Four missing exons were likely omitted due to particularly high or low GC content: SEPN1 (87%, exon 1, NM_020451.2), POMGNT1 (83%, exon 1, NM_017739.3), FHL1 (80%, exon 1, NM_001159702.2) and FKTN (37%, exon 2, NM_001079802.1) (Supplementary Table 2A). Another source of discrepancy between the panels may be the isoform choice and EST (expressed sequence tag) evidence available at the time of design (see below for an example with TTN), or the exons could simply have been missed.
Coverage comparisons
Twelve patient samples were enriched for target exons using both targeted sequencing (TS) and exome methods. Two of the twelve samples had poor quality for either targeted or exome data and were discarded from further coverage comparisons. The exome samples averaged approximately 30x coverage, and overall coverage ranged from 11x–50x (Supplementary Table 3). The targeted samples averaged 1016x coverage and ranged from 53x–2525x overall coverage (Supplementary Table 4).
Comparing TTN coverage
To compare performance of the two enrichment methods, we tested coverage of TTN transcript variant 1 C (TTN-1 C, NM_001267550.1) for targeted and exome data. We used titin as an exemplar as it has 363 coding exons (largest protein) and extensive alternative splicing. TTN-1 C was chosen for comparison as it is inclusive of most exons of the TTN gene. The specific TTN isoform used for design of the exome panel (Illumina TruSeq exome enrichment kit) was not disclosed in the product information. Before determining coverage of exons, we determined how many exons of TTN-1 C each panel was designed to target. After intersecting the coordinates of TTN-1 C with the targeted amplicons, we discovered that 47/363 exons of TTN-1 C were not targeted by our targeted panel. Intersecting the TTN-1 C exon coordinates with the exome panel targets showed 49/363 untargeted TTN-1 C exons. The ∼50 exons untargeted by both panels were nearly identical (Supplementary Table 5). We determined that the lack of these 50 exons was due to the timing of EST/genome builds, as the TTN-1 C transcript was established about 3 years after the targeted and exome panels were designed. These exons were excluded from subsequent analyses.
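The intersection step described above, finding exons that no panel target overlaps, can be sketched as below. This is an illustrative stand-in for a BED-file intersection rather than the tooling used in the study; the function names are hypothetical, coordinates are treated as BED-style half-open intervals, and any overlap of at least 1 bp is assumed to count as "targeted".

```python
def overlaps(a_start, a_end, b_start, b_end):
    """True if two half-open intervals [start, end) share at least one base."""
    return a_start < b_end and b_start < a_end


def untargeted_exons(exons, targets):
    """Return exons with no overlapping target.

    exons, targets: lists of (chrom, start, end) tuples, half-open coordinates.
    """
    # Group targets by chromosome so each exon is only compared to local targets.
    by_chrom = {}
    for chrom, start, end in targets:
        by_chrom.setdefault(chrom, []).append((start, end))

    missing = []
    for chrom, start, end in exons:
        hits = by_chrom.get(chrom, [])
        if not any(overlaps(start, end, ts, te) for ts, te in hits):
            missing.append((chrom, start, end))
    return missing
```

Running this once against the amplicon coordinates and once against the exome bait coordinates, then comparing the two result lists, would show how far the untargeted exon sets coincide.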
To directly compare the distribution of coverage across the 314 TTN-1 C loci, the coverage of TTN-1 C exons by both panels was extracted and normalized to 100x (Fig. 4A). The distribution of exon coverage showed that targeted samples had a larger spread than exome samples, with more exons in the 0–10x and >100x ranges. The exome samples, in contrast, had a tighter distribution with more exons concentrated in the 30–80x range. We performed a two-sample Kolmogorov-Smirnov test to determine whether the distributions of normalized exon reads in TS and exome samples are similar (Fig. 4A, Table 6) and found the two methods (targeted and exome) to be significantly different, with a corrected combined K-S p-value of 0.002. We calculated the mean, standard deviation and percent coefficient of variation (% CV) for the ten samples that were enriched by both methods. We found that the % CV for the exons in the exome samples ranged from 5 to 35% (15% CV considered good). In contrast, the targeted samples contained 33/314 TTN-1 C exons whose % CV exceeded the exome range, from 36 to 294% (Fig. 4B); 23/33 of these targeted exons were above 50% CV and 15/33 had standard deviations that were significantly different from the exome samples (Supplementary Table 6), indicating high sample-to-sample variability in targeted exons. The normalized samples enriched by both methods were counted for exons containing reads equal to or below 30x, 20x, 10x, 5x, 1x and zero reads (excluding TTN-1 C exons untargeted by both panels). On average, the targeted samples contained more exons with coverage equal to or below 30x than the exome samples (Fig. 4C). Similar results were also observed in a previous study comparing a custom RainDance congenital myopathy panel to an alternative in-solution hybrid capture method, Agilent SureSelect [14]. In addition, the standard deviation of the number of exons below 30x was higher in targeted samples than in exome samples.
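The two statistics used in this comparison can be sketched in a few lines. This is an illustration only: the function names are hypothetical, the % CV uses the sample (n − 1) standard deviation as an assumption, and only the two-sample K-S statistic D is computed, not the corrected p-value reported above.

```python
from bisect import bisect_right
from statistics import mean, stdev


def percent_cv(values):
    """Percent coefficient of variation: 100 * SD / mean.

    Assumption: sample standard deviation (n - 1 denominator).
    """
    m = mean(values)
    if m == 0:
        raise ValueError("mean is zero; % CV is undefined")
    return 100.0 * stdev(values) / m


def ks_statistic(x, y):
    """Two-sample Kolmogorov-Smirnov statistic D.

    D is the largest absolute difference between the two empirical
    cumulative distribution functions, evaluated at every observed value.
    """
    xs, ys = sorted(x), sorted(y)
    d = 0.0
    for v in xs + ys:
        fx = bisect_right(xs, v) / len(xs)
        fy = bisect_right(ys, v) / len(ys)
        d = max(d, abs(fx - fy))
    return d
```

Here `percent_cv` would be applied per exon across the ten samples of one method, while `ks_statistic` would be applied to the two sets of normalized per-exon coverage values; a statistics package would be needed to turn D into a p-value.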
Notably, targeted samples also contained more exons that were completely missed or contained zero reads, which may be due to abnormal GC content. Exome samples contained fewer TTN-1 C exons with less than 100% base coverage (Table 7). From the raw coverage of TTN-1 C, we used bedtools coverage to determine the percent base coverage of each TTN-1 C exon in both TS and exome methods. TS had an average of 14.1 exons with less than 100% coverage, of which 1.9 were exons that were not covered at all. None of the targeted exons were missed in the exome samples for TTN-1 C.
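The notion of percent base coverage per exon can be made concrete with a short sketch. It is a pure-Python analogue written for illustration, not the bedtools implementation; the function name is hypothetical and all intervals are assumed to be half-open and on the same chromosome.

```python
def percent_bases_covered(exon_start, exon_end, reads):
    """Percentage of exon bases covered by at least one read.

    reads: iterable of (start, end) half-open intervals on the exon's chromosome.
    """
    length = exon_end - exon_start
    if length <= 0:
        raise ValueError("exon must have positive length")

    # Keep only reads that overlap the exon, clipped to the exon boundaries.
    clipped = sorted(
        (max(s, exon_start), min(e, exon_end))
        for s, e in reads
        if s < exon_end and e > exon_start
    )

    covered = 0
    cur_end = exon_start  # right edge of the region counted so far
    for s, e in clipped:
        if e > cur_end:
            covered += e - max(s, cur_end)
            cur_end = e
    return 100.0 * covered / length
```

Under this definition, an exon returning less than 100 has partial base coverage, and an exon returning 0 corresponds to one that was not covered at all.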
Comparison of coverage of exons in additional genes
The same analyses performed on TTN-1 C were replicated in five other large genes including DMD, NEB, DYSF, COL6A3 and RYR1 (Fig. 5). We found the results from these additional genes to be similar to the detailed studies of TTN-1 C above (Fig. 5 and Table 7).
DISCUSSION
Mutation distribution in targeted panel data
In this study, we carried out targeted sequencing with a myopathy candidate gene panel of 45 genes in a cohort of 94 undiagnosed muscle disease patients. These patients were recruited from a broad referral population and did not have a molecular defect identified prior to enrollment in the study protocol. We were able to provide likely diagnoses for 33/94 (35%) of patients with likely pathogenic mutations (Fig. 2). Most exome sequencing studies have seen a molecular diagnostic rate of approximately 22–25%, with better rates seen in cases of Mendelian disorders [26, 27]. Our molecular diagnostic rate is similar to that reported by Savarese et al. (29% pathogenic mutations), who used a candidate muscle-disease gene panel with 93 genes for 177 undiagnosed myopathy patients [18]. Twenty-five of the 33 patients with likely pathogenic variants had Sanger sequencing validation carried out by independent laboratories, and there was 100% concordance between our next-gen data and the independent validations. These data agree with previous studies showing the robust nature of next-gen sequencing approaches, and also raise the question of whether Sanger validations are absolutely necessary (when coverage is sufficient). The likely pathogenic mutations (33 patients) were detected in 19 of the 45 targeted genes. Savarese et al. reported a relatively high frequency of pathogenic mutations in the RYR1 (6/52 patients), NEB (3/52 patients) and TTN (2/52 patients) genes [18]. We observed a relatively high frequency of pathogenic and likely pathogenic mutations in the TTN (5/33 patients) and RYR1 (5/33 patients) genes. In most previous reports of sequencing of both TTN and RYR1 genes, Sanger sequencing was limited to mutational hot-spot regions, making Sanger-based molecular diagnostics problematic as these large genes also have many polymorphisms. Variant interpretation to determine pathogenicity for TTN and RYR1 mutations is reported below.
Titin mutations
Titin (TTN) mutations are known to be responsible for both cardiac and skeletal muscle diseases with wide-ranging phenotypes and modes of inheritance, although the complexity and large size of the gene have made interpretations of pathogenicity challenging [MIM # 188840]. Based on previous literature and newer TTN mutational surveys, the following generalizations emerge: (1) recessive loss-of-function mutations in TTN are causative of muscle diseases with or without heart defects [28, 29]; (2) dominant (late onset) and recessive mutations (early onset) in the last exons coding for the M-line region of titin protein lead to late onset tibial muscular dystrophy [30–32]; (3) dominant missense variants in the mutational hotspot exon 343 lead to hereditary myopathy with early respiratory failure (HMERF) without cardiac abnormalities [33–39]; and (4) dominant truncating mutations are generally responsible for adult onset cardiomyopathies [40–42].
As our current study was focused on myopathy patients, we first used titin loss of function (frameshift, indels, splice/likely splice, nonsense) variants for initial ranking to help identify potential pathogenic variants. We discussed variant-positive patients with the referring physicians to check whether our molecular data fit the clinical phenotype and inheritance patterns observed for the patient (e.g. tibial muscular dystrophy, HMERF phenotypes). Using this classification scheme, we were able to shortlist five patients with likely pathogenic titin variants (recessive LOF, Table 5) from a total of nineteen patients with TTN variants (rest classified as VOUS). Our mutation classification for titin was more stringent than the probably pathogenic titin missense mutations reported by Vasli et al. using a targeted sequencing panel, where 2 out of 16 patients were reported with possible pathogenic missense TTN mutations [16].
Chauveau et al. reported a comprehensive meta-analysis of all 127 coding titin pathogenic variants reported to date [43]. Of the 127 variants, 19 were located in the carboxyl-terminal M-line region corresponding to a skeletal muscle-specific isoform, and were thought to contribute to skeletal muscle disease. However, the authors pointed out that this region (coded by six exons, ex 358–363) has been preferentially screened in the past, and until the rest of the titin gene has been screened equally it is not accurate to refer to it as a hotspot. We found a total of 29 variants in the titin gene in 19 patients (19/94), of which 10 variants were classified as likely pathogenic for 5 patients (Table 5) whose clinical picture resembled that of titin recessive mutations. Three of the 10 likely pathogenic variants were in the M-line region, despite this domain representing only ∼6.5% of the TTN gene. Thus, our data are consistent with a probable mutational hotspot in the M-line domain for skeletal muscle disease.
RYR1 mutations
Mutations in the ryanodine receptor type 1 gene are related to malignant hyperthermia susceptibility and myopathies including central core, multiminicore, centronuclear, congenital fiber type disproportion, and King-Denborough syndrome [MIM# 180901]. Prior to next-gen sequencing, three hotspots for RYR1 mutations were recognized, located in the N-terminal, central, and C-terminal domains of the RYR1 protein; however, recent studies have shown that mutations are located throughout the gene [44, 45]. Generally, central core disease and malignant hyperthermia susceptibility present as dominant mutations in the C-terminus [46]. Recessive mutations can cause core and non-core related myopathy and are located throughout the gene, usually with one mutation causing hypomorphic expression of RYR1 protein [47, 48]. In our cohort, RYR1 compound heterozygous mutations were seen in two patients with a congenital myopathy phenotype; RYR1 missense mutations were seen in three patients, two showing a congenital fiber type disproportion phenotype and another with core-like myopathy. Some myopathies are seen with malignant hyperthermia susceptibility; however, the risk of malignant hyperthermia for most patients with RYR1 variants remains unclear [49]. RYR1 is included in the ACMG (American College of Medical Genetics and Genomics) ‘medically actionable’ incidental findings list for whole genome and exome sequencing; it is required to report known pathogenic mutations that cause autosomal dominant malignant hyperthermia [50]. However, evaluating malignant hyperthermia susceptibility mutations is challenging as the disease shows variable expressivity and incomplete penetrance, and is not easily diagnosed in the clinic [51, 52].
Comparing targeted and whole exome sequencing strategies
Our results suggest that the exome panel (Illumina TruSeq) showed better performance than our targeted panel (RainDance) in terms of sensitivity, coverage uniformity, and target design. This is likely due to the hybrid capture technology used in the exome panel compared to PCR-based enrichment in the targeted panel. Our results are similar to those obtained by Valencia et al. comparing a hybrid capture (Agilent SureSelect) and a PCR enrichment (RainDance) method [14]. In contrast, Savarese et al. reported higher coverage using their targeted ‘motorplex’ panel of 93 genes (Agilent) compared to whole exome sequencing, with discovery of 20–30% more variants with targeted sequencing [18]. The targeted Agilent hybrid-capture approach utilized by Savarese et al. in their motorplex assay is methodologically more similar to exome hybrid-capture than to the emulsion PCR targeted sequencing approach used by Valencia et al. and in our studies here. Thus, it is likely that hybrid-capture approaches in general lead to better sensitivity and coverage than targeted PCR approaches. For clinical diagnostic settings, the quality criteria for next-gen panels, including coverage and diagnostic yield, should be clearly defined and validated before a panel is offered as a diagnostic test [53]. One of the major limitations of both targeted and exome sequencing is the inability to detect large structural variations, i.e. large insertions/deletions and copy number variations (CNVs) [54]. With newer bioinformatics tools, the sensitivity of CNV/indel detection in neuromuscular diseases has increased [55–57], although this still remains dependent on the depth of data and the sequencing batches utilized.
CONFLICT OF INTEREST
The authors have no conflict of interest to report.
Footnotes
ACKNOWLEDGMENTS
We would like to thank all the patients and families that participated in this study. This work was supported by grants from the National Institutes of Health 3R01NS29525 (EPH), National Institutes of Health Bedside to Bench program 5R01NS029525-18A1 (CB, EPH), National Institute of Arthritis and Musculoskeletal and Skin Diseases T-32 AR056993 (AK), Neurological Sciences Academic Developmental Award (K12 awarded to CT-R), National Health and Medical Research Council of Australia grants 1022707 and 1031893 (NFC and KNN), and the Center for Genetic Medicine Research at Children’s National Medical Center.
JP is a pre-doctoral student in the Molecular Medicine Program of the Institute for Biomedical Sciences at the George Washington University. This work is from a dissertation to be presented to the above program in partial fulfillment of the requirements for the Ph.D. degree.
