Abstract
Contemporary molecular biology research tools have enriched numerous areas of biomedical research that address challenging diseases, including endocrine cancers (pituitary, thyroid, parathyroid, adrenal, testicular, ovarian, and neuroendocrine cancers). These tools have placed several intriguing clues before the scientific community. Endocrine cancers pose a major challenge in health care and research despite considerable attempts by researchers to understand their etiology. Microarray analyses have provided gene signatures from many cells, tissues, and organs that can differentiate healthy states from diseased ones, and even show patterns that correlate with stages of a disease. Microarray data can also elucidate the responses of endocrine tumors to therapeutic treatments. The rapid progress in next-generation sequencing methods has overcome many of the initial challenges of these technologies, and their advantages over microarray techniques have enabled them to emerge as valuable aids for clinical research applications (prognosis, identification of drug targets, etc.). A comprehensive review describing the recent advances in next-generation sequencing methods and their application in the evaluation of endocrine and endocrine-related cancers is lacking. The main purpose of this review is to illustrate the concepts that collectively constitute our current view of the possibilities offered by next-generation sequencing technological platforms, challenges to relevant applications, and perspectives on the future of clinical genetic testing of patients with endocrine tumors. We focus on recent discoveries in the use of next-generation sequencing methods for clinical diagnosis of endocrine tumors in patients and conclude with a discussion on persisting challenges and future objectives.
Introduction
Modern challenges in biomedical research demand more data-generating capacity than traditional DNA sequencing technologies, such as Sanger-based methods, can offer. Next-generation sequencing (NGS) methods, with their extraordinary speed and scalability, enable biological researchers to study systems at a higher resolution than was previously possible. The first success in DNA sequencing occurred in the 1970s, and the advent of sequencing using capillary electrophoresis made the technology much more economical.1–3 In the 2000s, high-throughput sequencing platforms, collectively known as NGS, were developed, which led to a significant and rapid expansion of genomic information. These sequencing technologies are categorized as short-read or long-read NGS. 4
Short-read NGS
Short-read NGS constitutes the sequencing of approximately 25–100 bp. 5 Both the sequencing by ligation (SBL) and sequencing by synthesis (SBS) types of short-read NGS require DNA fragmentation (physical, chemical, or enzymatic methods) and clonal template generation. 6 The size of the inserts is dependent on the specific NGS instrumentation and sequencing application. 7 Most of the sequencing technologies utilize library construction after DNA fragmentation. The inserts/fragments obtained are converted to double-stranded DNA (dsDNA) and ligated with specific adapters. 8 For RNA sequencing by NGS, the RNA is fragmented before its conversion to complementary DNA (cDNA). 9 In the case of library preparation from microRNA (miRNA) and other small RNAs, gel-based size enrichment of the ligated fragments is essential to separate the miRNA library (120 bp adapters and ~30 bp miRNAs) from adapter dimer side products generated during ligation reactions. 9 In addition, miRNA-sequencing libraries can be made using the small RNA expression kit (SREK) available from Ambion and Applied Biosystems. 10 A similar workflow can be adopted when preparing libraries for sequencing whole genomes or targeted regions within genomes (e.g. exome sequencing), and for chromatin immunoprecipitation sequencing (ChIP-seq) experiments. 11
The available strategies for clonal template generation are bead, solid-state, and DNA nanoball methods. In emulsion polymerase chain reaction (emPCR), the emulsion oil is mixed with library fragments, polymerase chain reaction (PCR) mix, and adapter-conjugated beads. Effective mixing forms microscopic water-based reaction chambers in an oil emulsion, with each chamber having one bead, one library construct, primers, and the PCR mix. In the first cycle of PCR, the reverse strand of the library construct is extended on the adapter site of the bead. In the second PCR cycle, both the original (reverse) strand and the newly synthesized forward strands are extended. Subsequent cycles lead to the amplification of each single library construct on a single bead.12,13 These beads, each carrying nearly 10^7 clonal copies of a unique template, are then distributed in wells or on glass slides of the sequencing platforms.14,15 Sequencing platforms such as 454 (Roche), SOLiD (supported oligonucleotide ligation and detection system; Thermo Fisher), GeneReader (Qiagen), and Ion Torrent (Thermo Fisher) use emPCR for clonal amplification. 4 Solid-state bridge amplification is used by Illumina; in this method, the DNA template is flanked by adapters on either side. The adapter on one side attaches to the solid surface, and the adapter on the other side hybridizes to a complementary adapter on the solid surface, leading to the formation of the bridge. This template DNA is amplified using unlabeled deoxynucleotide triphosphates (dNTPs) and enzymes. The dsDNA thus formed is used for subsequent PCR cycles, forming millions of copies of a single sequence, which is then used for sequencing.8,15 Solid-phase template walking, used in SOLiD Wildfire (Thermo Fisher), involves hybridization of the free DNA template to a primer on a solid support followed by amplification. The dsDNA is then partially denatured, hybridized to a neighboring primer, and amplified, leading to cluster generation. 4
DNA nanoball generation, used by Complete Genomics (BGI), is the only “in solution” clonal template amplification method. It involves a series of adapter ligations, circularization, a cleavage reaction, rolling circle amplification of the DNA, and the formation of complex DNA concatemers. These DNA nanoballs are then immobilized on a patterned flow cell for sequencing.4,16,17
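The cycle-by-cycle amplification described for emPCR can be illustrated with a short calculation. The following Python sketch assumes an idealized model in which every cycle multiplies the number of copies by (1 + efficiency); the function names, the efficiency parameter, and the target of ten million copies are illustrative assumptions and are not taken from the cited studies. Real bead-based amplification is limited by reagents and bead capacity, so the model only indicates an order of magnitude.

```python
def copies_after(cycles, efficiency=1.0, start=1):
    """Idealized PCR yield: each cycle multiplies copies by (1 + efficiency).

    efficiency=1.0 means perfect doubling; this is an assumption, not a
    measured property of any platform.
    """
    return start * (1.0 + efficiency) ** cycles


def cycles_needed(target, efficiency=1.0, start=1):
    """Smallest number of cycles whose idealized yield reaches `target`."""
    if efficiency <= 0:
        raise ValueError("efficiency must be positive")
    cycles = 0
    copies = float(start)
    while copies < target:
        copies *= 1.0 + efficiency
        cycles += 1
    return cycles


if __name__ == "__main__":
    # Hypothetical target: about ten million clonal copies on one bead.
    for eff in (1.0, 0.8):
        print(eff, cycles_needed(1e7, efficiency=eff))
```

Under perfect doubling, a single template would need a little over 20 cycles to reach this scale, and a lower per-cycle efficiency increases that number.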
SBS
The 454 pyrosequencer, Illumina, and Ion Torrent are SBS-dependent sequencing platforms. In 454 pyrosequencing, picotiter plates are used to accommodate thousands of reactions (Figure 1). Here, the DNA is immobilized and dNTPs are added sequentially. The incorporation of a nucleotide leads to the release of pyrophosphate (PPi). The PPi drives a chemiluminescent enzymatic reaction that produces light, which is detected, and the sequencing output is recorded as a pyrogram.15,18,19 Illumina sequencing depends on the bridge amplification of single DNA templates on a solid surface. The incorporation of dNTPs labeled with different fluorescent dyes allows each base to be individually identified. In addition, the dye acts as a terminator and is cleaved after base detection so that the next base can be identified. 20 The Ion Torrent NGS chip has a solid-state pH sensor and microwells for individual templates. The dNTPs are applied sequentially, and base incorporation causes the release of H+. The change in pH is detected by a sensor in the well. 21 Currently, Illumina dominates the short-read sequencing platforms with an accuracy rate of >99.5%, although it is reported to underrepresent AT- and GC-rich regions.4,22,23
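The sequential addition of dNTPs used by 454 pyrosequencing and Ion Torrent can be sketched as a simple simulation. The Python example below is a toy model under stated assumptions: the flow order "TACG" is hypothetical, the signal is assumed to scale exactly with the number of bases incorporated in a flow (real signals are noisy, especially for homopolymers), and the function names are illustrative rather than part of any vendor software.

```python
def flowgram(read, flow_order="TACG", n_flows=16):
    """Simulate idealized flow signals for the strand being synthesized.

    `read` is the sequence of the newly synthesized strand. In each flow,
    one nucleotide type is offered; the signal equals the number of
    consecutive matching bases incorporated (0 if none).
    """
    pos = 0
    signals = []
    for i in range(n_flows):
        base = flow_order[i % len(flow_order)]
        run = 0
        while pos < len(read) and read[pos] == base:
            run += 1
            pos += 1
        signals.append((base, run))
    return signals


def call_bases(signals):
    """Convert idealized (base, signal) pairs back into a sequence."""
    return "".join(base * count for base, count in signals)


if __name__ == "__main__":
    flows = flowgram("TTAGC", n_flows=8)
    print(flows)
    print(call_bases(flows))
```

In this model, a homopolymer such as "TT" is read in a single flow with a doubled signal, which is why accurate signal quantification matters for flow-based chemistries.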

Figure 1. Overview of sequencing technologies discussed in the review. The left panel represents short-read sequencing technologies. The clonal amplification methods required for short-read sequencing, namely emPCR, solid-state bridge amplification (SSBA), and solid-phase template walking (SPTW), are depicted, along with a conceptual indication of the various sequencing by synthesis and sequencing by ligation methodologies. Illumina: the bridge-amplified DNA template is sequenced using differently labeled nucleotides on a solid surface. 454 pyrosequencing: a clonally amplified DNA bead with enzyme and packing beads is shown in a picotiter plate. Ion Torrent: the H+ released on dNTP incorporation on a DNA bead is detected by a pH sensor. SOLiD: the DNA template on a bead with its adapters, primer, and ligase is displayed. cPAL: an anchor sequence and several combinations of probes are used for sequencing in a series of ligation reactions. The right panel represents long-read sequencing technologies. In PacBio technology, the DNA is sequenced in a zero-mode waveguide chip; the nucleotide analogs, DNA template, and polymerase are depicted. In Oxford nanopore technology, the DNA is sequenced as it enters a nanopore, and each nucleotide affects ion flow differently; the DNA adapter molecule and ion flow are shown.
SBL
SOLiD and Complete Genomics (BGI) sequencing technologies are based on SBL. In SOLiD, a series of hybridization and ligation reactions are used for sequencing. DNA template fragments are ligated to oligonucleotide adapters and clonally amplified by emPCR on a bead. A primer is annealed to the adapter, and ligation sequencing starts with the binding of fluorescently labeled interrogation probes. These probes are DNA octamers with two probe-specific bases and six degenerate bases. They are used in 16 possible two-base combinations, and they bind to template DNA adjacent to the primer.4,24 Ligation of the interrogation probe to a primer is followed by the detection of a fluorescent signal and cleavage of the probe leaving a 5′-P end that can bind to a new set of probes. These cycles are repeated seven times to extend the primer. The process is repeated with a complete set of reactions offset by one base in the adapter sequence. 24
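The 16 two-base combinations interrogated by SOLiD probes are commonly described as mapping onto four fluorescent colors. The Python sketch below illustrates one widely described form of this two-base ("color space") encoding, in which each base is assigned a 2-bit code and the color of a dinucleotide is the XOR of its two codes. This mapping comes from general descriptions of the platform rather than from the text above, so it should be read as an assumption; the function names are illustrative.

```python
BASE_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
CODE_BASE = "ACGT"


def encode_colors(seq):
    """Encode a base sequence as colors, one color per adjacent base pair."""
    return [BASE_CODE[a] ^ BASE_CODE[b] for a, b in zip(seq, seq[1:])]


def decode_colors(first_base, colors):
    """Recover the base sequence from a known first base and its colors."""
    code = BASE_CODE[first_base]
    bases = [first_base]
    for color in colors:
        code ^= color  # XOR with the color yields the next base code
        bases.append(CODE_BASE[code])
    return "".join(bases)


if __name__ == "__main__":
    colors = encode_colors("ATGGC")
    print(colors)
    print(decode_colors("A", colors))
```

Because every base contributes to two adjacent colors, decoding requires a known starting base (in practice, the last base of the adapter), and an isolated color error changes every downstream base call, which is one reason such data are usually analyzed in color space.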
Complete Genomics uses DNA nanoballs for clonal amplification and cPAL (combinatorial probe anchor ligation) for sequencing. 17 Here, the anchor sequence and a probe hybridize to the DNA at several places on a nanoball. The probe has a single known base and degenerate bases with a fluorophore. After imaging, the probe and anchor sequences are removed. These sequencing cycles are repeated on both sides of the adapter using a probe with a known base at the n + 1 position. 17
Long-read NGS
Long-read sequencing covers kilobases of DNA; hence, it can provide in-depth information about transcripts, long repetitive elements, and copy number variations. 25 Long-read NGS reactions are performed using single-molecule real-time (SMRT) sequencing and synthetic approaches. SMRT sequencing does not require clonal amplification. Pacific Biosciences (PacBio) uses the SMRT approach. In SMRT, a single molecule is sequenced in a zero-mode waveguide (ZMW) well on a SMRT Cell. The associated optics are sufficiently sensitive to detect the incorporation of one fluorescently labeled nucleotide. Library preparation generates a circular SMRTbell template in which the template DNA is ligated to adapters with hairpin structures. Phi29 DNA polymerase is bound to the bottom of the ZMW wells. The wells contain polymerase, template molecule, sequencing primer, and four nucleotides tagged with different fluorescent dyes. An optical system illuminates the wells from the bottom, and a parallel confocal recording system records single nucleotide incorporations in real time. The extension step leads to cleavage of the fluorescent dye, which then diffuses out of the illumination zone. 26
Oxford nanopore technology (ONT) is a single-molecule and base-free sequencing technology, in which the DNA molecule is unwound and passed through a nanopore. Each base blocks the flow of ions into the pore differently, enabling identification of the base. 27 Synthetic long-read NGS requires partitioning large DNA fragments into microtiter wells or the reaction chambers of an emulsion. Within each partition, the template DNA fragment is sheared, barcoded, sequenced on short-read instrumentation, and digitally reassembled using the barcodes (Figure 1). 4 The Illumina synthetic long-read sequencing platform and the 10x Genomics emulsion-based system offer synthetic long-read technology. Synthetic long-read NGS is compatible with the sequencing of repetitive elements and can be performed with minimal starting material. 4 Computational tools for NGS data analysis include Bowtie2, MAQ, NextGENe, and SpliceMap, and targeted NGS can be performed without the involvement of bioinformaticians.28,29 Recent applications of NGS in various fields include metagenomics, prenatal testing, epigenetic studies, and identification of cancer variants.8,30
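The barcode-guided reassembly used in synthetic long-read approaches can be sketched in a few lines. The Python example below is a toy illustration under stated assumptions: the barcodes, read sequences, minimum overlap, and the greedy suffix-prefix merging strategy are all hypothetical choices for demonstration and do not represent the algorithms used by any commercial platform. Reads are first grouped by barcode, and only reads sharing a barcode are merged.

```python
from collections import defaultdict


def group_by_barcode(reads):
    """Group (barcode, sequence) pairs into a dict of barcode -> sequences."""
    groups = defaultdict(list)
    for barcode, seq in reads:
        groups[barcode].append(seq)
    return dict(groups)


def overlap(a, b, min_len=3):
    """Longest suffix of `a` equal to a prefix of `b` (0 if below min_len)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(fragments, min_len=3):
    """Repeatedly merge the pair of fragments with the largest overlap."""
    frags = list(dict.fromkeys(fragments))  # drop exact duplicates
    while len(frags) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # no remaining overlaps; leave fragments unmerged
        merged = frags[best_i] + frags[best_j][best_k:]
        frags = [f for idx, f in enumerate(frags)
                 if idx not in (best_i, best_j)] + [merged]
    return frags


if __name__ == "__main__":
    reads = [("BC1", "ACGTAC"), ("BC2", "TTTGGA"), ("BC1", "TACGGA"),
             ("BC2", "GGACCC"), ("BC1", "GGATTT")]
    for barcode, seqs in group_by_barcode(reads).items():
        print(barcode, greedy_assemble(seqs))
```

In this toy data set, a read from one partition ends with a motif that begins a read from the other partition; restricting assembly to reads that share a barcode avoids joining them, which hints at why partitioning helps with repetitive sequence.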
NGS for thyroid cancer
Thyroid tumors are among the most common endocrine cancers. According to the SEER (Surveillance, Epidemiology, and End Results) cancer statistics of the US National Cancer Institute (NCI), an estimated 64,300 new cases of thyroid cancer and 1980 deaths from the disease were projected for 2016 (http://seer.cancer.gov/statfacts/html/thyro.html). The increase in the number of diagnosed cases can be attributed to enhanced techniques, including ultrasonography, that have improved the detection of thyroid cancers. 31 Thyroid cancers are classified into different types, of which papillary thyroid cancer comprises 80% of all the cancer cases, followed by follicular thyroid cancer (20%). Both these types are differentiated thyroid cancers that arise from thyroid follicular cells. Medullary thyroid cancer arises from parafollicular C cells and represents 6%–8% of the thyroid cancer cases. Anaplastic thyroid cancer is an aggressive type of thyroid cancer, and it develops by dedifferentiation from differentiated thyroid cancer. 31 Most types of thyroid cancer behave indolently, and caregivers of patients with thyroid cancer must opt for the newer molecular diagnostic methods and targeted therapies.
Applications of NGS techniques in thyroid cancer diagnosis are discussed below. In the past few years, significant developments in NGS technologies have facilitated the identification of molecular markers, including gene mutation and gene expression classifiers. Routine sequencing of a large region of the genome is not possible, so enrichment of genomic regions of interest before NGS is desirable and can be achieved by various enrichment methods. For example, anchored multiplex PCR, a rapid and efficient target enrichment method for NGS that requires less nucleic acid input from formalin-fixed paraffin-embedded (FFPE) specimens, identified the gene fusion PPL-NTRK1 in thyroid carcinoma after analysis of 986 clinical FFPE samples. This suggests an important clinical and research application of anchored multiplex PCR in NGS. 32 Thyroid cancer represents a malignant neoplasm of the follicular or parafollicular thyroid cells. Papillary thyroid carcinoma (PTC) is the most common adult thyroid malignancy and is often characterized by the presence of multiple anatomically distinct foci within the gland. 33 To understand the clonal relationship between multiple tumors within the thyroid gland, Lu et al. performed whole-exome sequencing and targeted region sequencing in eight patients with multifocal PTC. 33 They concluded that multifocal PTCs might arise from metastases of a single clone of a malignant cell or from multiple independent origins. 33 The clinical management of patients is hampered by indeterminate cytologic diagnoses, such as follicular (or oncocytic) neoplasm/suspicious for a follicular neoplasm (FN/SFN). Nikiforov et al. analyzed 143 fine-needle aspiration (FNA) samples with a cytologic diagnosis of FN/SFN by using a targeted and customized ThyroSeq v2 NGS panel. 34 This panel tests for point mutations in 13 genes and 42 types of gene fusions that are present in thyroid cancer. 
Genotyping of thyroid nodules using this panel provided accurate diagnoses for nodules with FN/SFN cytology and indicated the optimal clinical management of these patients. 34 Even though this panel was customized for FN/SFN, it can be used for a variety of thyroid tissues, and its most important advantages are that it requires a lower amount of DNA and provides a quantitative assessment of mutant alleles. 35 Moreover, unnecessary surgery linked to 34 indeterminate classifications of thyroid nodule diagnosis by FNA could have been better managed if NGS analyses of mutations in 50 genes had been considered in the diagnostic process. 36 This study by Mercier et al. shows that NGS data can add important value in difficult diagnoses. 36 All the above tests are helpful for patients who avoid surgery because of surgical risk. However, the unclear natural history of follicular lesions and the lack of any new guidelines for the follow-up of patients who avoid surgery because of their NGS molecular profile results hamper final clinical decision making. Tests detecting mutations or RNA fusion transcripts in thyroid cancer can identify malignant nodules with high positive predictive value. 37 Microarray-based interrogation of different genes can detect benign nodules with a high negative predictive value. 37 These techniques are limited by their sensitivities, and the diagnostic yield can be further enhanced by adding new sensitive molecular markers, including miRNAs. In this regard, Labourier et al. 37 employed a novel diagnostic algorithm combining mutation detection and miRNA expression in the analysis of 109 undetermined FN/SFN cytology samples to improve the diagnostic yield of molecular cytology. In addition, analysis of miRNA isoforms called isomiRs has improved the understanding of thyroid tumorigenesis. 38
The NGS study of how the miR-146b-3p/PAX8/NIS regulatory axis controls differentiation in thyroid cells provided an important insight into a therapeutic target that may allow modulation of differentiation to enhance antitumor responses to radio-iodide treatment. 39 A comprehensive analysis of anaplastic thyroid carcinomas using targeted massively parallel sequencing identified several novel mutations. 40
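The positive and negative predictive values mentioned above depend not only on a test's sensitivity and specificity but also on how common malignancy is among the nodules tested. The Python sketch below applies the standard Bayes relationship; the function name and the numerical inputs (90% sensitivity, 90% specificity, 25% prevalence) are purely illustrative assumptions and are not results from any of the cited studies.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given disease prevalence.

    All three inputs are fractions strictly between 0 and 1.
    """
    for name, value in (("sensitivity", sensitivity),
                        ("specificity", specificity),
                        ("prevalence", prevalence)):
        if not 0.0 < value < 1.0:
            raise ValueError(f"{name} must be between 0 and 1 (exclusive)")
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


if __name__ == "__main__":
    # Hypothetical test characteristics, for illustration only.
    ppv, npv = predictive_values(0.90, 0.90, 0.25)
    print(f"PPV = {ppv:.3f}, NPV = {npv:.3f}")
```

With these hypothetical inputs, the same test yields a higher NPV than PPV, which illustrates why a panel may be better suited to ruling malignancy out than to ruling it in at low prevalence.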
NGS for parathyroid tumors
Parathyroid tumor is a rare malignant neoplasm that occurs in one or more of the parathyroid glands. Unlike primary hyperparathyroidism, the parathyroid tumor is rare with an estimated incidence of less than one person per million and occurs with equal frequency in men and women. 41 The 10-year overall survival rates for parathyroid tumor vary from 49% to 77%. 41 Usually, high circulating levels of parathyroid hormone and hypercalcemia associated with parathyroid tumors cause morbidity and mortality rather than the tumor burden itself. 41 Various genomic alterations including mutation and alteration in the oncogenes
NGS for ovarian cancer
Ovarian carcinoma is a heterogeneous disease and is the most fatal among all reproductive tumors. Current therapies for ovarian carcinoma are often ineffective because of delayed diagnosis, recurrence of the tumors, resistance to platinum-based chemotherapy, and side effects from various drugs. As per NCI SEER statistics (http://seer.cancer.gov/statfacts/html/ovary.html), the number of estimated new cases is 22,280 and the number of estimated deaths is 14,240 in the US population. Mutations in breast cancer genes
NGS for testicular cancer
Testicular cancer develops in the testicles and has one of the highest cure rates, compared with other types of cancers, with an average 5-year survival rate of 95%. According to the US NCI, testicular cancer represents 0.5% of all new cancer cases in the United States (http://seer.cancer.gov/statfacts/html/testis.html). Estimated new cases and deaths in the year 2016 are projected to be 8720 and 380, respectively. It is more common in young adults, and data from studies between 2006 and 2012 suggested that 95.4% patients survive 5 years with testicular cancer (http://seer.cancer.gov/statfacts/html/testis.html). In comparison with the prevalence of many other cancers, the occurrence of testicular cancer is rare. Seminoma is a germ cell tumor of the testicle and is typically haploid with amplification involving chromosome arm 12p. Functional studies to identify the driver genes or genes in 12p are currently lacking, but mutations in a few genes such as the
NGS for adrenal tumors
Malignant neoplasms in the cells of the adrenal gland can cause adrenocortical carcinoma in the cortex region and neuroblastoma or pheochromocytoma in the medullary region. Although many of these tumors do not metastasize, they may cause significant problems because of their increased production of hormones that are normally kept under strict control and in balance with other hormonal signals. Pheochromocytomas and paragangliomas are rare neural crest-derived catecholamine-producing tumors that originate from adrenal chromaffin cells or extra-adrenal sympathetic and parasympathetic tissues. 58 Both these tumors lead to clinical symptoms caused by the cardiovascular effects of excess catecholamine secretion and are mostly associated with clinical syndrome von Hippel–Lindau disease, multiple endocrine neoplasia 2A (MEN2A), and neurofibromatosis type I caused by
NGS for neuroendocrine tumors
These are highly heterogeneous neoplasms that arise from the dispersed neuroendocrine system, which is composed of the endocrine and nervous systems (reviewed by Suresh et al. 61). Gastroenteropancreatic neuroendocrine tumors (NETs) secrete peptides and neuroamines. Pancreatic NETs are extremely rare, with an incidence of one per 100,000 per annum. They are classified into functional (such as insulinomas, glucagonomas, somatostatinomas, and gastrinomas) and nonfunctional tumors (reviewed by Suresh et al. 61) based on the secretion of hormones by the tumors. Midgut carcinoids arise from enterochromaffin cells in the mucosa of the jejunum, ileum, cecum, and the ascending colon (reviewed by Suresh et al. 61). Understanding the genetics of these tumors is a major hurdle in the development of more efficient therapeutic options. Neuroendocrine lung tumors comprise 20%–25% of all lung cancers, and no effective treatments are available. 62 Lung NETs are classified into typical carcinoids (TCs), atypical carcinoids (ACs), large-cell neuroendocrine carcinomas (LCNECs), and small-cell lung carcinomas (SCLCs). 63 Their histological classification is based upon cell features, mitosis count, and necrosis. 63 NGS techniques were used to search for molecular alterations that differentiate between these tumors and revealed differences in gene signatures, including chromatin remodeling genes, cell cycle checkpoint genes, and cell differentiation regulators. 63 Non-synonymous mutations in the proto-oncogene
NGS for multiple endocrine neoplasia, pituitary, and pancreatic tumors
Multiple endocrine neoplasia (MEN) represents a group of disorders affecting multiple endocrine glands. MEN type 1 mainly involves the parathyroid glands, the pituitary gland, and the pancreas, whereas type 2 involves the thyroid gland (medullary thyroid carcinoma) and the adrenal gland. Type 2 is further sub-classified into type 2A, 2B, and familial medullary thyroid carcinoma. Type 3 is not a major form of MEN; however, type 4 has signs and symptoms similar to those of type 1, although it has mutations in a different gene. Margraf et al. 72 identified
Challenges of NGS and future directions
Although NGS is increasingly expected to revolutionize the clinical diagnosis and management of endocrine tumors, several challenges remain: (1) the quality (paraffin-embedded, variably degraded, and heterogeneous) and quantity of endocrine tumor specimens available for NGS analysis are often limiting; (2) most NGS data are in the form of short reads, whose interpretation is complicated by base calling, sequence alignment and assembly, and variant calling; (3) NGS generates a huge volume of data, so data processing, storage, and management hinder the translation of NGS analysis into clinical practice; and (4) before NGS assays developed in the research setting can be applied clinically, successful validation requires establishing sensitivity, specificity, limit of detection, reproducibility, precision, and accuracy as per regulatory guidelines. Several reports provide information regarding the validation of NGS in terms of sequencing platforms, target capture technology, gene panels, reporting criteria, and overall performance in clinical laboratories. Addition of new gene markers to an existing tumor panel also requires revalidation. Multiplexed PCR-based target amplification approaches, such as Ion AmpliSeq and GeneRead target capture technology, are well suited to the selective capture and amplification of genomic areas in targeted NGS; however, there is an intrinsic bias in the capture of AT- and GC-rich areas. NGS has empowered geneticists with a plethora of genetic information about endocrine tumors, which can revolutionize personalized medicine and aid differential diagnosis, clinical decisions, drug development, tumor classification, and disease management (Table 1). However, storage and analysis of the massive amount of data generated by NGS is a major challenge. 
Although several bioinformatics tools are available for assigning base quality scores, demultiplexing, and identifying variants, the raw data should still be stored for future analysis with more advanced techniques, and sequencing centers will need storage systems that can accommodate them. Furthermore, sequencing of longer fragments yields more variants, some of which may be harmless polymorphisms rather than pathogenic changes. The unpredictability of patient responses to such information adds yet another layer of complexity. Hence, patient consent forms should acknowledge the possibility that such genetic knowledge will be revealed, and pre-test genetic counseling is essential. In addition, clinicians need to be trained to understand the genetics of a disease to correctly interpret the NGS data for translational medicine. Short-read NGS library preparation can also introduce complexity, bias, and batch effects that give rise to false variants and can therefore affect translational research. 9 The challenges of NGS testing must be weighed against the advantages of implementing it in clinical laboratories. The decrease in the cost of NGS testing is expected to further encourage its application in clinical laboratories. Moreover, the powerful third-generation technologies are expected to further simplify and revolutionize clinical genomics. For the best use of NGS in clinical genomics, establishing effective communication among clinical geneticists, cancer clinicians, and patients is essential.
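The base quality scores mentioned above are conventionally expressed on the Phred scale, where a score Q corresponds to an estimated error probability of 10^(-Q/10). The Python sketch below decodes a quality string and summarizes it; it assumes the Phred+33 ASCII encoding used in many FASTQ files, which is an assumption about the input format and should be checked for any given data set. The function names and the example quality string are hypothetical.

```python
def phred_to_error_prob(q):
    """Convert a Phred quality score to an estimated error probability."""
    return 10 ** (-q / 10)


def decode_quality(quality_string, offset=33):
    """Decode an ASCII quality string into Phred scores (assumes Phred+33)."""
    scores = [ord(ch) - offset for ch in quality_string]
    if any(s < 0 for s in scores):
        raise ValueError("quality string inconsistent with the given offset")
    return scores


def expected_errors(quality_string, offset=33):
    """Sum of per-base error probabilities: expected miscalls in the read."""
    return sum(phred_to_error_prob(q)
               for q in decode_quality(quality_string, offset))


if __name__ == "__main__":
    qual = "IIII5+"  # hypothetical quality string for a 6-base read
    print(decode_quality(qual))
    print(round(expected_errors(qual), 4))
```

A summary such as the expected number of errors per read is one simple way to decide which reads to filter before alignment and variant calling, although production pipelines apply more elaborate criteria.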
Table 1. Applications of NGS in endocrine tumors discussed in the review.
NGS: next-generation sequencing; FFPE: formalin-fixed paraffin-embedded; ATC: anaplastic thyroid carcinoma; miRNA: microRNA; FNA: fine-needle aspiration; MPTC: micropapillary thyroid cancer; PGM: Personal Genome Machine.
Footnotes
Acknowledgements
P.S.S. and T.V. contributed equally to this work.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: P.S.S. thanks the University Grants Commission (Faculty recharge programme), India, and the Department of Science and Technology-Science and Engineering research board (DST-SERB), India (YSS/2014/000020), and T.V. thanks DST-SERB (YSS/2014/000061).
