Abstract
Next-generation sequencing refers to a high-throughput technology that determines the nucleic acid sequences and identifies variants in a sample. The technology has been introduced into clinical laboratory testing and produces test results for precision medicine. Since next-generation sequencing is relatively new, graduate students, medical students, pathology residents, and other physicians may benefit from a primer to provide a foundation about basic next-generation sequencing methods and applications, as well as specific examples where it has had diagnostic and prognostic utility. Next-generation sequencing technology grew out of advances in multiple fields to produce a sophisticated laboratory test with tremendous potential. Next-generation sequencing may be used in the clinical setting to look for specific genetic alterations in patients with cancer, diagnose inherited conditions such as cystic fibrosis, and detect and profile microbial organisms. This primer will review DNA sequencing technology, the commercialization of next-generation sequencing, and clinical uses of next-generation sequencing. Specific applications where next-generation sequencing has demonstrated utility in oncology are provided.
Introduction
Although nucleic acid sequencing technology has only existed for about 40 years, the technology represents an outstanding example of progress resulting from continuous improvement and increases in cost efficiency. The newest sequencing technologies are frequently referred to as next-generation sequencing (NGS). The results from NGS testing have been translated into clinical laboratories to produce clinically relevant information that directly impacts patient care. Some molecular tests employ a “one-gene one-test” approach by using specific sets of primers and polymerase chain reaction (PCR) to detect one specific mutation. In contrast, NGS is able to detect thousands or even hundreds of thousands of genetic variants in a single test run. This primer is written to provide an introduction to NGS for those health-care professionals who may have heard of the technology in the lay press or in grand rounds. While not an exhaustive review, it does lay the foundation for understanding the power of this innovative technology.
First-Generation DNA Sequencing Technology
After the discovery of the chemical composition of DNA in the late 19th century, nearly 50 years passed before the structure of DNA was elucidated 1 and another quarter of a century elapsed before methods to sequence DNA were developed. 2,3 The principal method published in 1977 involves sequencing by synthesis (SBS) of a radioactively labeled DNA strand complementary to the interrogated template strand using the dideoxy chain termination technique. The resulting fragments were then analyzed by polyacrylamide gel electrophoresis. This method, known as Sanger sequencing, became the basis for the “first-generation” sequencing technology. The original Sanger sequencing method has subsequently been automated and commercialized. 4,5 Major innovations include the introduction of fluorescent-labeled nucleotides instead of radioactivity, 6 replacement of gel electrophoresis with capillary electrophoresis, 7,8 and improvement of the DNA polymerases. 9 Additional progress was achieved through adoption of molecular biology techniques, such as recombinant DNA technology 10 and the PCR, 11 which allowed production and amplification of DNA fragments. The Sanger method-based sequencing technology was used to sequence a number of increasingly large genomes starting with bacteria and phages, 12 –15 and eventually mammalian 16,17 and human genomes. 18,19
One of the major limitations of Sanger sequencing is that only one sequence reaction can be analyzed per electrophoresis lane or capillary tube, hence the necessity to divide the DNA from a biological sample into individual template fragments. This was achieved by randomly cloning the fragmented DNA from a biological sample (by insertion into vectors, transformation of the bacteria, and extraction of pure individual fragments from the resulting colonies). This very labor-intensive process was one of the reasons why the first human genome project took more than 10 years and cost US$2.7 billion (https://www.genome.gov/sequencingcosts/). Subsequent improvements allowed another human genome to be sequenced using the same technology for approximately US$10 million. 20
Despite these advances, the efficiency of this method approached its limit, and further scaling of the technology was considered prohibitive in both time and cost. Nevertheless, Sanger sequencing remains the gold standard for confirming DNA sequences due to the stability of the technology and is still broadly used for targeted resequencing in research and clinical laboratories.
Next-Generation Sequencing
The terms NGS (sometimes subdivided into second- and third-generation sequencing), massively parallel sequencing, and high-throughput sequencing usually refer to technologies that allow sequencing without the physical separation of individual reactions into separate tubes, capillaries, or lanes. Instead, the sequencing reactions occur in parallel on a solid surface (such as glass or beads, depending on the technology) and are only spatially separated. Thus, billions of sequencing reactions occur and are analyzed simultaneously, dramatically improving the throughput and decreasing the labor compared to Sanger sequencing. Regardless of the platform, NGS involves several common steps (see reviews 21 –23 for details), which are outlined in Figure 1.

Figure 1. Schematic of sample processing for next-generation sequencing. See Table 1 for definitions of specific terms.
Commercialization of Next-Generation Sequencing Technology
These novel approaches, introduced early in the 21st century, were rapidly adopted, resulting in strong competition in the NGS market. There are several technical differences among the technologies. 21 –23 A glossary of molecular biology terms is provided in Table 1.
Table 1. Glossary of Molecular Biology Terms.
The first commercial NGS technology was introduced in 2004 by 454 Life Sciences (later purchased by Roche). This technology 24 utilized luminescent detection of a pyrophosphate released upon incorporation of a correct nucleotide during SBS and produced relatively long sequences (called “reads”). This technology was used to sequence the genome of James Watson, and the price of sequencing a human genome dropped from US$10 million with Sanger sequencing to about US$2 million. 25 Within 2 years, other platforms emerged (Illumina/Solexa 26 and ABI SOLiD); however, they only produced very short reads. Illumina utilizes an SBS technology originally developed by a company called Solexa which uses reversibly terminated fluorescently labeled nucleotides. 26 Illumina scientists managed to significantly increase the sequencing read length and dramatically improve accuracy and throughput. As a result, the costs were decreased and several protocols were developed for a variety of NGS applications. 27,28 More recently, Illumina introduced 2 new instruments, the HiSeq X and NextSeq 500. The first is able to sequence a human genome at 30× coverage for less than US$1000, 29,30 and the latter does the same for a slightly higher price but in less than 20 hours. Moreover, a new series of instruments introduced in 2017 (NovaSeq) should reduce the costs by almost another order of magnitude.
A conceptually different sequencing platform called Ion Torrent 31 was introduced in 2011. This SBS technology detects the minute changes in pH caused by H+ ions released during the incorporation of the correct nucleotide in the microenvironment around the beads with the attached clonally amplified DNA template molecules. Consequently, it does not require fluorescently labeled nucleotides and expensive optics to detect the fluorescence (see the reviews by Heather and Chain, 21 Reuter et al, 22 and Morey et al 23 for details). Currently, the most popular applications from this company (now part of ThermoFisher) are targeted disease panels used in clinical settings (eg, cancer).
In the last decade, the amount of sequencing data has increased exponentially, accelerating translational research, clinical usage of genomics findings, and development of new genomics tests to support precision medicine approaches. 32,33 Progress in sequencing technologies was also facilitated by the expansion and adoption of improved molecular biology methods. While early protocols required microgram quantities of high-quality nucleic acid, now samples with very low (ie, picograms) quantities of nucleic acid may be sequenced. Sequencing of samples from formalin-fixed paraffin-embedded material has become routine. These advances also led to the discovery of new circulating biomarkers, 34,35 a revolution in prenatal diagnostics (genetic testing of fetal DNA from mother’s blood samples 36,37 ) and single-cell genomics approaches. 38 –40
Comparison of Ion Torrent and Illumina
Illumina has developed an impressive line of instruments that differ in their productivity, speed, and price tags from small benchtop sequencers (producing 1.65-7.5 Gb per run) to production scale systems (producing thousands of Gb of data per run). All these instruments implement similar chemistry with the sequencing performed by synthesis using reversibly terminated fluorescently labeled nucleotides and capturing the fluorescent images after each nucleotide incorporation event. The sequencing data are deconvoluted from the image data based on the color of the labels.
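The color-based deconvolution described above can be illustrated with a minimal sketch. It assumes a simplified 4-channel model in which each sequencing cycle yields one intensity per nucleotide; the channel order, the `call_bases` function, and the `min_purity` threshold are hypothetical names chosen for illustration and are not part of any vendor software. Production base callers also correct for effects such as spectral crosstalk and phasing, which are ignored here.

```python
from typing import Sequence

# Hypothetical channel order: one fluorescence intensity per nucleotide per cycle.
CHANNELS = ("A", "C", "G", "T")


def call_bases(intensities: Sequence[Sequence[float]], min_purity: float = 0.6) -> str:
    """Call one base per cycle by picking the brightest channel.

    A cycle is reported as 'N' when the brightest channel accounts for
    less than `min_purity` of the total signal (an illustrative cutoff).
    """
    read = []
    for cycle in intensities:
        if len(cycle) != len(CHANNELS):
            raise ValueError("each cycle needs one intensity per channel")
        total = sum(cycle)
        best = max(range(len(CHANNELS)), key=lambda i: cycle[i])
        if total <= 0 or cycle[best] / total < min_purity:
            read.append("N")  # ambiguous signal, no confident call
        else:
            read.append(CHANNELS[best])
    return "".join(read)


if __name__ == "__main__":
    cycles = [
        [0.90, 0.05, 0.03, 0.02],  # clearly A
        [0.10, 0.10, 0.70, 0.10],  # clearly G
        [0.30, 0.30, 0.20, 0.20],  # mixed signal
    ]
    print(call_bases(cycles))
```

Because each cycle incorporates exactly one reversibly terminated nucleotide, this model produces one call per cycle regardless of whether the template contains a repeated base.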
In the clinical setting, the majority of the existing NGS tests provide a limited amount of sequence information. MiSeq is an instrument with relatively low productivity which fits the current need. Depending on the specific test, the instrument may produce from 500 Mb up to 15 Gb of data in 4 to 56 hours. Currently, Illumina produces a validated, Food and Drug Administration (FDA)-regulated custom amplicon kit that enables clinical laboratories to design custom NGS assays for the FDA-approved MiSeqDx and NextSeq550Dx instruments.
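Whether a given run output fits a clinical test is largely an arithmetic question. The sketch below estimates how many samples could share one run, using the 500 Mb and 15 Gb output figures quoted above. The panel size, target depth, and on-target fraction are illustrative assumptions rather than values from any specific assay, and `samples_per_run` is a hypothetical helper.

```python
import math


def samples_per_run(run_output_bp: float, panel_size_bp: float,
                    mean_depth: float, on_target_fraction: float = 1.0) -> int:
    """Estimate how many samples fit in one run at a desired mean depth.

    Bases needed per sample = panel size x depth, inflated by the share of
    reads that fall outside the targeted regions.
    """
    if not 0 < on_target_fraction <= 1:
        raise ValueError("on_target_fraction must be in (0, 1]")
    if panel_size_bp <= 0 or mean_depth <= 0:
        raise ValueError("panel size and depth must be positive")
    bases_per_sample = panel_size_bp * mean_depth / on_target_fraction
    return math.floor(run_output_bp / bases_per_sample)


if __name__ == "__main__":
    # Assumed example: 1 Mb panel, 1000x mean depth, 50% of reads on target.
    for output_bp in (500e6, 15e9):
        n = samples_per_run(output_bp, panel_size_bp=1e6,
                            mean_depth=1000, on_target_fraction=0.5)
        print(f"{output_bp / 1e9:g} Gb run: {n} sample(s)")
```

Under these assumptions the smallest run configuration cannot accommodate even one sample at the requested depth, which is why the kit and run mode are chosen to match the test.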
Ion Torrent has 3 instruments in its portfolio. The sequence is determined by measuring the change in pH in the microenvironment around the beads with the attached clonally amplified identical template molecules immediately after addition of a nucleotide (one at a time). There is no definitive stop at each position, and the synthesis immediately continues in case of repeats on the template. The pH change in such cases is stronger than when just 1 nucleotide is incorporated, allowing the calculation of the number of nucleotides in the repeat. However, this technology is more prone to miscalling the length of homopolymers and to the resulting frameshift errors. 41 The first instrument by this company, the Personal Genome Machine, is also approved by the FDA for clinical NGS tests (Ion PGM Dx). This instrument produces up to 2 Gb of sequencing data (200-400 bp) in 2 to 4 hours. The newer S5 instrument uses the same sequencing approach and produces up to 15 Gb of data (200-400 bp) in 2- to 4-hour runs. Both instruments require additional time and instrumentation for library preparation prior to sequencing.
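A minimal sketch can show why homopolymers are the weak point of this detection scheme. It assumes an idealized model in which nucleotides are flowed one at a time in a fixed cyclic order and the normalized signal for each flow is proportional to the number of bases incorporated. The flow order, the `decode_flows` function, and the signal values are hypothetical and chosen only for illustration.

```python
from typing import Sequence

# Assumed cyclic order in which nucleotides are flowed over the chip.
FLOW_ORDER = "TACG"


def decode_flows(signals: Sequence[float], flow_order: str = FLOW_ORDER) -> str:
    """Convert per-flow signals into a base sequence.

    Each signal is rounded to the nearest whole number of incorporations:
    ~0 means the flowed nucleotide was not incorporated, ~1 a single base,
    ~2 or more a homopolymer run of that base.
    """
    bases = []
    for i, signal in enumerate(signals):
        nucleotide = flow_order[i % len(flow_order)]
        run_length = max(0, round(signal))
        bases.append(nucleotide * run_length)
    return "".join(bases)


if __name__ == "__main__":
    clean = [1.02, 0.03, 2.10, 0.00, 0.90, 3.40]
    noisy = [1.02, 0.03, 2.10, 0.00, 0.90, 3.60]  # last flow drifts upward
    print(decode_flows(clean))
    print(decode_flows(noisy))
```

In this toy model a small shift in the final signal (3.4 versus 3.6) changes the called run of A from 3 to 4 bases, producing a single-base insertion. In a coding region, an insertion of this kind would be read as a frameshift.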
Currently, Illumina has the largest market share and was the first to obtain FDA approval for its MiSeq instrument. A comparison of the 2 platforms reveals advantages and disadvantages of each. The initial cost of both instruments is similar. The Ion Torrent will generate sequence data faster than Illumina, an important consideration for a clinical diagnostic test with urgent requests for results, such as prenatal samples. The Ion Torrent system offers automated library preparation, template preparation, and automated chip loading with the purchase of a separate piece of equipment, the Ion Chef, and does not depend on a technician once the Ion Chef has been loaded. This should lead to more reproducible results by removing the variability between technologists. The manual library preparation workflow for the Illumina occupies the technician’s time and requires considerable molecular biology expertise. The actual applications for both instruments appear to be comparable, and both will perform targeted resequencing and whole-exome sequencing (WES). The Illumina system has a lower cost per base of sequence. The FDA-approved Illumina MiSeqDx is able to generate a complete report for its cystic fibrosis assay, while the Ion Torrent needs manual analysis. An attractive feature of the Ion Torrent in the cancer field is the ability to generate a report that matches the sequence results with ongoing clinical trials. While this may be done with the Illumina data, the report requires third-party software. The S5 is the latest instrument from Ion Torrent; consequently, there are limited user data for comparison. It should be noted that the Ion Torrent’s underlying technology has not changed with the release of the S5. Comparisons of the Illumina and Ion Torrent systems have been published in several articles in areas such as clinical microbiology, germline variant detection, and prenatal testing, as well as somatic variant detection in oncology. Both platforms performed well and the results are comparable. 42,43
Requirements for Performing Clinical Next-Generation Sequencing
Next-generation sequencing has substantially increased the amount of information available for clinical purposes by generating massive amounts of sequence data. As a result, the overall detection rate of disease-causing alterations has grown significantly. Next-generation sequencing has allowed the creation of targeted gene panels that will sequence hundreds of genes at once for less expense compared to the Sanger method or PCR assays. A clinical setting has many factors to consider in choosing a platform compared to a research setting. Factors to consider in a clinical laboratory system include specificity and sensitivity, reproducibility, and analytical accuracy to ensure clinicians receive accurate results. 44,45 Next-generation sequencing technology yields massive amounts of data that require substantial analysis to produce a clinically relevant, concise result. This analysis requires appropriate infrastructure including analysis software, data storage, and accessibility. 46 –48 The report provided to the clinicians needs to be appropriately formatted. For example, a genetics or oncology report should include the classification of the variant (eg, pathogenic), literature describing the reportable variant(s), recommendations for further testing, and for oncology reports, indicate whether the variant is inherited or somatic. The data analysis usually draws on extensive genetic, genomic, and oncologic bioinformatics research and data resources. Fortunately, there are jointly proposed guidelines from professional organizations for NGS testing, validation, proficiency testing, 42 reporting, and quality assurance/quality control requirements and documentation. 49 These articles will help standardize the proper application and interpretation of the NGS data for clinical utilization. Interested parties should refer to the extensive documentation in these articles for additional information.
Next-Generation Sequencing for Hereditary Disorders
Next-generation sequencing testing for hereditary disorders faces technical challenges, data management issues, reporting on incidental findings, and variant interpretation. 50 Despite these challenges, NGS has been valuable in identifying the underlying molecular cause of disorders, especially for those diseases which are genetically heterogeneous. In addition, NGS has been valuable in the detection of rare variants after single-gene analysis has been negative or when multigene panels were too labor-intensive and costly for Sanger sequencing. 51 For complex, rare phenotypes, NGS has also been a powerful tool in reducing the diagnostic odyssey often required to arrive at a diagnosis. 52,53 For example, in Usher syndrome, an autosomal recessive disorder characterized by sensorineural hearing loss, retinitis pigmentosa, and vestibular dysfunction (in a subset of cases), 54 NGS has allowed the development of targeted gene panels to survey all causative genes associated with Usher syndrome. Previously, a comprehensive analysis was limited by the labor intensity, high cost, and long turnaround times of Sanger sequencing. Providing an earlier diagnosis for children with Usher syndrome affords the opportunity for earlier medical management for patients and their families. 54
Next-generation sequencing has enabled diagnostic laboratories to offer targeted disease panels for genetic disorders, such as connective tissue diseases, in addition to whole-genome sequencing (WGS) and WES. For patients with nonspecific clinical presentations, such as moderate to severe intellectual disability, 55 WGS is recommended. The diagnostic rate of WES is approximately 25% to 31%, 53,56 similar to WGS 57 ; however, WGS has the additional advantage of detecting larger numbers of copy number variations. 57,58 Additionally, the reporting of incidental findings, defined as variants unrelated to the primary medical reason for testing, 59 needs to be addressed in the context of exome and genome sequencing. The American College of Medical Genetics and Genomics updated its guidelines in 2017 for reporting incidental or secondary findings in 59 medically actionable genes in which known or expected pathogenic variants were identified. 59 Reporting known (or expected) disease-causing mutations in conditions where preventive measures and/or treatments are available highlights the benefit of returning incidental findings to patients. 59 However, there are also limitations and challenges in identifying and reporting such findings including the consent process, follow-up diagnostic evaluation, and additional laboratory resources. 51,60,61
Next-Generation Sequencing for Detecting Microbial Organisms
The utility of NGS has been demonstrated for several applications involving pathogen biology and genomic epidemiology. These include targeted sequencing and unbiased interrogation of clinical samples for pathogen detection and identification (regardless of whether the organism can be cultivated or is viable), drug resistance profiling, strain typing and epidemiological outbreak investigation, microbiome studies, genomic determinant analysis of microbial functions including metabolism, and comparative ribosomal RNA phylogenetic studies. Next-generation sequencing brings added throughput, sensitivity, and informatics-based prowess to pathogen interrogation. It is emerging as a valuable diagnostic alternative when other methods fail to identify an organism or cannot decipher complex specimens such as those from patients with polymicrobial infections. The use of NGS to detect evidence of Leptospira in the cerebrospinal fluid of a critically ill pediatric patient was a landmark case that demonstrated the clinical utility of unbiased NGS to achieve an actionable diagnostic result when other approaches failed, including phenotypic, immunologic, and targeted PCR-based assays. 62 Also, NGS can perform complete de novo genome sequencing for pathogens not yet fully characterized, providing reference genomes for further study. 63
HIV-1 genotyping for drug resistance prediction is a prototypical example of another value added by an NGS approach since it is more sensitive than Sanger sequencing and can detect small percentages of mutant quasi-species of potential importance to clinical management. A caveat is the technical and informatics challenges associated with authenticating minor variant calling in these applications. 63 There are several challenges for pathogen testing such as separating microbial nucleic acid from human DNA, library preparation from nonsterile site samples, de novo sequence assembly of uncharacterized organisms, and assigning clinical significance to microbial sequences. Microbial profiling by NGS for clinical purposes is currently limited to laboratories with the expertise and resources to support independently developed assays since none have yet been commercialized to the extent necessary for widespread adoption. Several academic and commercial reference laboratories now offer NGS services to laboratories without the means to employ NGS technology on their own.
Next-generation sequencing can be applied to comparative microbiome characterizations in healthy and disease states or pre- and postinterventions. Next-generation sequencing characterization of the intestinal microbiome before and after fecal microbiota transplant (FMT) in cases of Clostridium difficile colitis has added to our knowledge of microbiome protection, microbial pathogenesis, and therapeutic efficacy of FMT. 63,64 What we learn from NGS studies may create “personalized medicine for infectious diseases” by informing clinical management options and prognostication. 65
The utility of next-generation sequencing has been demonstrated for microbial strain typing in epidemiological outbreak investigations, at least for those involving a limited number of strains or a single strain. A highly publicized example is the 2011 European Shiga toxin-producing Escherichia coli outbreak, during which NGS provided real-time de novo characterization of a novel outbreak strain. 63,65
Improvements are needed in order to make NGS an effective or adjunct tool for routine use in clinical microbiology, including commercialized, cost-competitive, user-friendly library preparation and instrumentation and software, standardized protocols and proficiency testing, well-curated reference genomes, and regulatory mandates revised to align with changing technology and practice. As previously mentioned, appropriate improvements in infrastructure may be required to accommodate the complex data to produce a succinct clinical report.
Next-Generation Sequencing Applications in Oncology
Next-generation sequencing tests have been used to diagnose and manage oncology patients since the technology was first applied to solid tumors 66 –68 and hematologic abnormalities. 69 The advantage of NGS lies in its ability to conduct large-scale inquiries for many sequence variants that are comprehensive, inclusive, and sensitive. 70,71 Consequently, the technology can actually save costs compared to multiple, individual nucleic acid-based tests (such as fluorescent in situ hybridization, PCR/sequencing, etc). The small amount of tissue required may also obviate the need for an additional procedure, such as a repeat biopsy, to obtain sufficient material for analysis. Older methods required more nucleic acid than could often be extracted from biopsies, but the smaller amounts of tissue necessary for NGS may allow successful sequencing of the original biopsy. 72 Next-generation sequencing offers clear advantages compared to the traditional one-gene one-test approach. Germline or somatic variants can be detected by NGS depending on the goal of testing. Libraries for somatic changes may be created from off-the-shelf panels or customized for individual types of malignancies. Previously, WGS and WES were not considered practical for routine clinical use; however, many academic laboratories and commercial vendors are developing test panels, covering several genes to hundreds of genes, 73 to detect a variety of genetic/somatic variants in cancers. Testing somatic variants in tumor specimens requires sequencing at a higher depth (eg, 1000× average coverage) that is offered by targeted panels, in contrast to germline testing in which a lower sequencing depth (eg, 30× average coverage) may be undertaken to reliably detect variants. 42 To assess the clinical relevance of sequencing results, several types of alterations are considered, including single-nucleotide polymorphisms, point mutations or single-nucleotide variants, nucleotide insertions or deletions, gene fusions/rearrangements, and copy number variations (see Table 1 for definitions). Next-generation sequencing tests can be DNA or RNA based, or both, depending on the purpose and design of the test. As the technology matures, test panel costs are becoming affordable for routine clinical use and are being rapidly deployed in laboratories. This is especially true in the field of oncology for diagnostic and prognostic purposes, as well as the selection of appropriate therapies. 74,75 The NGS tool has become an important part of a personalized medicine approach to target therapy.
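The difference between the 30× and 1000× depths mentioned above can be made concrete with a simple binomial model. The sketch below treats each read covering a position as an independent draw that carries the variant with probability equal to the variant allele fraction (VAF). It assumes a heterozygous germline variant is present in about 50% of reads and a subclonal somatic variant in about 5%. The minimum of 5 supporting reads is an illustrative threshold, not a published standard, and sequencing error is ignored.

```python
from math import comb


def prob_at_least_k_alt_reads(depth: int, vaf: float, k: int) -> float:
    """P(X >= k) for X ~ Binomial(depth, vaf).

    X is the number of reads at a position that carry the variant allele.
    """
    if not 0.0 <= vaf <= 1.0:
        raise ValueError("vaf must be between 0 and 1")
    p_fewer = sum(
        comb(depth, i) * vaf**i * (1.0 - vaf) ** (depth - i) for i in range(k)
    )
    return 1.0 - p_fewer


if __name__ == "__main__":
    MIN_ALT_READS = 5  # hypothetical caller threshold
    scenarios = [
        ("germline, 30x", 30, 0.50),
        ("somatic, 30x", 30, 0.05),
        ("somatic, 1000x", 1000, 0.05),
    ]
    for label, depth, vaf in scenarios:
        p = prob_at_least_k_alt_reads(depth, vaf, MIN_ALT_READS)
        print(f"{label}: expected alt reads = {depth * vaf:.1f}, "
              f"P(>= {MIN_ALT_READS}) = {p:.4f}")
```

Under these assumptions, 30× is ample for a heterozygous germline variant (about 15 supporting reads expected) but yields only 1.5 expected supporting reads for a 5% somatic variant, whereas 1000× yields about 50.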
Next-generation sequencing has been used as a molecular diagnostic test for many solid tissue cancers as well as hematologic malignancies. Test results can be helpful for the initial diagnosis, tumor classification, determining the origin of the cancer, and prognosis. 76 Table 2 77 –130 provides a partial list of cancers where NGS information has provided value for managing patients. Thyroid nodules are a specific example where fine needle aspiration and cytologic examination may not yield a definitive diagnosis, while NGS has been shown to have high specificity and sensitivity for cancer detection. 77 However, not all patients will derive enough clinical benefit to justify the cost of using NGS testing and careful test utilization is prudent.
Table 2. Examples Where NGS Provides Additional Information.*
Abbreviations: GIST, gastrointestinal stromal tumor; NGS, next-generation sequencing.
*Partial list of tumors/cancers where NGS has been shown to provide additional information.
A clear, clinically important use of NGS is to identify the most appropriate therapy for the individual patient. 131 The National Cancer Institute's Molecular Analysis for Therapy Choice trial is a good example of how NGS technology can be utilized in clinical practice. 74,75 Despite the promising clinical utility of NGS, the influence of molecular profiling on targeted therapy for individual patients has yet to reach its full potential. For example, the Integrated Molecular Profiling in Advanced Cancers Trial and Community Oncology Molecular Profiling in Advanced Cancers Trial (IMPACT/COMPACT) showed that only 5% of patients received targeted treatments based on their profiling results. 132 While this is a relatively low number, the study did not comprehensively evaluate factors that may have influenced the use of targeted therapy. The trial was limited to specimens obtained many years prior to the molecular testing and did not profile the metastatic lesions, which may have yielded a different molecular profile. Also, some of the patients included in the study were heavily pretreated and were not well enough to receive further treatment based on the results of molecular testing. Molecular testing also did not include copy number variation or recurrent translocations, which may have influenced the therapeutic decision.
With the availability of clinical trials matching drugs targeting specific genetic alterations, many academic medical centers 133 and even larger community hospitals have begun to adopt NGS into their routine practice. Companion tests for targeted therapy are also in development. Among the current obstacles preventing even wider adoption are the initial cost to purchase the instrument and the complex bioinformatics needed to interpret the sequence data. Recently, commercial laboratories have entered this market, and competition will ultimately lower the cost and improve the quality of products. Future development will allow the NGS technology to be more affordable with wider applications such as cell-free DNA for circulating tumor DNA detection or liquid biopsy. 134 These can potentially be used for monitoring disease progression, finding secondary mutations (such as mutations in epidermal growth factor receptor), minimal residual disease management, 135 and occult tumor detection.
Conclusion
Next-generation sequencing has become a widely used technology in the field of pathology. Several advances have reduced the time and cost of the test, while data analysis has extended the utility. Routine histopathology and diagnostic work will not be replaced in the near future, but NGS offers significant advantages in selected cases.
Footnotes
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported in part by NIH grant R21 AI112887-01, R01GM117519, and T32 GM86308 (DGR).
