Abstract
Background:
Thiopurines are a class of immunosuppressive drugs that are often the mainstay of treatment for diverse immunological and malignant conditions. The protein product of the gene NUDT15 (nudix hydrolase 15) has been linked to thiopurine metabolism as it negatively regulates thiopurine activation and toxicity. A missense variant of the gene, NUDT15*3 (rs116855232, c.415C>T, and p.R139C), has been the most reported NUDT15 single nucleotide polymorphism in the literature.
Objective:
In this study, we devised an allele-specific polymerase chain reaction (ASPCR) assay to determine genotype at the rs116855232 locus and employed the method to estimate allele frequencies in the Bangladeshi population.
Materials and methods:
A total of 184 randomly selected individuals were recruited in this study. The allele frequencies of the NUDT15*3 variant (rs116855232) in the Bangladeshi population and genotypes at the locus were determined by ASPCR. The ASPCR genotype data were validated by Sanger sequencing to ascertain the reliability of the assay.
Results:
NUDT15 *1/*1 and NUDT15 *1/*3 genotype frequencies were 83.70% and 16.30%, respectively, while no instances of the NUDT15 *3/*3 genotype were observed. Estimated allele frequencies of C and T alleles were 0.924 and 0.076, respectively.
Conclusion:
Overall, this study describes the distribution of this clinically important SNP in the Bangladeshi population and proposes a time- and cost-efficient pharmacogenetic testing procedure that may inform personalized thiopurine therapy.
Introduction
Thiopurines are purine antimetabolites comprising azathioprine, mercaptopurine, and thioguanine that are indicated for autoimmune diseases, e.g., inflammatory bowel disease (IBD) and rheumatoid arthritis,1 for acute lymphocytic leukemia (ALL),2 and to prevent rejection after a solid organ transplant.3 The enzyme encoded by the thiopurine S-methyltransferase (TPMT) gene has long been implicated in metabolizing thiopurines to modulate their cytotoxic effects,4,5 and TPMT genetic variations have been consistently associated with thiopurine-induced hematopoietic toxicity.6 While TPMT presents a well-studied example of a pharmacogene with the potential to individualize thiopurine therapy, new candidate polymorphisms have been identified in nudix hydrolase 15 (NUDT15) that independently contribute to thiopurine drug toxicity.7 The NUDT15 gene (chromosomal location: 13q14.2) encodes an 18.6 kDa enzyme that belongs to the Nudix hydrolase superfamily. Members of this protein superfamily catalyze the hydrolysis of nucleoside diphosphates.8 Substrate screening has shown that the NUDT15 enzyme can hydrolyze 6-thio-dGTP and 6-thio-GTP into their respective monophosphates,9 preventing their incorporation into nucleic acid chains and/or their inhibitory effect on Rac1.10 This suggests a role for NUDT15 in thiopurine drug metabolism, tempering thiopurine activity and toxicity.
Several NUDT15 missense variants are associated with complications in the clinical use of thiopurines, such as *3 (rs116855232, c.415C>T, p.R139C), *4 (rs147390019, c.416G>A, p.R139H), and *5 (rs186364861, c.52G>A, p.V18I).7 These germline variants of NUDT15 diminish enzyme activity and have been reported to induce leukopenia in patients with ALL and IBD.7,11,12 Of these SNPs, rs116855232 is the most abundant. Life-threatening myelosuppression has been observed and is largely explained by this genetic variation.13 Administration of thiopurine drugs, therefore, calls for prior pharmacogenetic testing to pre-empt drug-related myelotoxicity.14 Devising highly accurate and affordable pharmacogenetic tests is imperative, especially in a lower-middle income country like Bangladesh.
Bangladesh is experiencing an increasing cancer burden, with 156,775 new cancer cases in 2020 as estimated by WHO.15 Acute leukemias are the most prevalent form (42.4%) of hematological malignancies in Bangladesh,16 and ALL is the most common childhood malignancy affecting the population.16,17 Furthermore, IBDs including ulcerative colitis18 and rheumatological disorders19 are being diagnosed more often, altogether projecting an increase in thiopurine administration for treating these clinical conditions. However, no extensive pharmacogenetic study has so far examined the frequency of NUDT15 variant alleles contributing to thiopurine toxicity in this population.
In this study, an allele-specific polymerase chain reaction (ASPCR)-based genotyping method has been developed for the rs116855232 SNP locus, which is inexpensive and simple to perform in the existing diagnostic set-ups of Bangladesh. We also investigated the genetic architecture of the Bangladeshi population based on the genetic profile at this locus.
Materials and Methods
Sample size estimation, sample collection, and DNA extraction
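Cochran’s method20 for determining an adequate sample size in cross-sectional studies with large populations was used in this study. The formula used for determining the minimum sample size for our study was as follows:

$$ n = \frac{Z^{2} P (1 - P)}{d^{2}} $$

where n is the minimum sample size, Z is the standard normal deviate for the chosen confidence level, P is the expected prevalence, and d is the margin of error.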
Although the formula returned a minimum sample size of 160, we collected liquid blood samples from 184 randomly selected Bangladeshi individuals between April 2019 and November 2019. Genomic DNA was extracted from whole blood using the FlexiGene® DNA Kit (QIAGEN) following the manufacturer’s protocol. The concentration and purity of the extracted DNA were measured using a NanoDrop™ 2000 spectrophotometer (Thermo Fisher Scientific).
Primer design and ASPCR
Allele-specific primers were designed for the rs116855232 locus following the principle described by Wangkumhang et al.22 The primer set for the locus included two separate allele-specific forward primers and one common reverse primer (Table 1). Target specificity of all the primers was checked in silico using the Primer-BLAST tool.23 All primers were purchased from Macrogen Inc. (South Korea). Temperature-gradient PCR was performed in a SimpliAmp™ thermal cycler (Thermo Fisher Scientific) to optimize annealing conditions for ASPCR. An amount of 10–50 ng of genomic DNA template was used for amplification in a final reaction volume of 25 μL with 2.5 μL of 10× PCR buffer (B71; Thermo Fisher Scientific), 0.75 μL of 10 mM dNTP mix (R0191; Thermo Fisher Scientific), 0.5 μL of 10 μM rs116855232 allele-specific forward primer, 0.5 μL of 10 μM common reverse primer for rs116855232, 0.2 μL of 5 U/μL Taq DNA polymerase (EP0702; Thermo Fisher Scientific), and PCR grade water. The reaction cycle conditions were as follows: an initial denaturation step at 95°C for 3 min, then 32 cycles each with denaturation at 95°C for 30 s, annealing at 60°C for 30 s, and elongation at 72°C for 30 s, followed by a hold at 4°C. The PCR-amplified products were resolved in 1.5% (w/v) agarose gel using 0.5× Tris-acetate-EDTA (TAE) buffer along with DNA size markers (SM0241; Thermo Fisher Scientific). Amplicons were observed and photographed in a gel documentation system (Fusion Solos S; Vilber), following incubation with EZ-Vision dye (97064-190; VWR Life Science) in TAE buffer. Genotype at the rs116855232 locus was determined from the banding pattern in the gel photographs.
Table 1. Sequences of the primers used in this study
WF: wild-type-specific, forward; MF: mutant-specific, forward; CR: common, reverse; Seq_For: sequencing primer, forward; Seq_Rev: sequencing primer, reverse.
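The banding-pattern interpretation described above can be sketched in a few lines. This is a minimal illustration, assuming each sample is run in two parallel reactions (one with the WF primer and one with the MF primer, each paired with CR) and that a band of the expected size has already been scored as present or absent; the function name `call_genotype` is hypothetical and not part of the published protocol.

```python
from typing import Optional


def call_genotype(wf_band: bool, mf_band: bool) -> Optional[str]:
    """Map band presence in the two allele-specific reactions to a genotype.

    wf_band: product seen with the wild-type-specific (C allele) forward primer
    mf_band: product seen with the mutant-specific (T allele) forward primer
    """
    if wf_band and mf_band:
        return "CT"  # heterozygous, NUDT15 *1/*3
    if wf_band:
        return "CC"  # homozygous wild type, NUDT15 *1/*1
    if mf_band:
        return "TT"  # homozygous variant, NUDT15 *3/*3
    return None  # no product in either reaction: uninformative, repeat the PCR


# Example: a sample with a band in both reactions is called heterozygous.
print(call_genotype(True, True))  # CT
```

Returning `None` when neither reaction yields a product keeps a failed amplification from being mistaken for a genotype.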
DNA sequencing
To confirm the genotype data obtained through ASPCR, randomly selected DNA samples were amplified by PCR in a SimpliAmp™ thermal cycler (Thermo Fisher Scientific) using human NUDT15 gene-specific primers that encompass the rs116855232 locus (Table 1). An amount of 10–50 ng of genomic DNA template was used for amplification in a final reaction volume of 25 μL with 2.5 μL of 10× PCR buffer (B71; Thermo Fisher Scientific), 0.75 μL of 10 mM dNTP mix (R0191; Thermo Fisher Scientific), 0.5 μL of 10 μM sequencing forward primer, 0.5 μL of 10 μM sequencing reverse primer, 0.2 μL of 5 U/μL Taq DNA polymerase (EP0702; Thermo Fisher Scientific), and PCR grade water. The reaction cycle condition was as follows: an initial denaturation step at 95°C for 3 min, then 32 cycles each with denaturation at 95°C for 30 s, annealing at 58°C for 45 s, and elongation at 72°C for 1 min followed by a hold at 4°C. The PCR amplified sequences were resolved in 1.5% (w/v) agarose gel and observed in a gel documentation system (Fusion Solos S; Vilber). PCR products were purified using the FavorPrep™ GEL/PCR Purification Kit (FAGCK 001; Favorgen Biotech Corp.) following the manufacturer’s protocol. The concentration and purity of the purified PCR products were checked in a NanoDrop™ 2000 spectrophotometer (Thermo Fisher Scientific). Cycle sequencing was performed using the BigDye™ Terminator v3.1 Cycle Sequencing Kit (Thermo Fisher Scientific), and products from cycle sequencing were purified using the BigDye XTerminator™ Purification Kit (Thermo Fisher Scientific) following the manufacturer’s protocol. Capillary electrophoresis was performed on the ABI Prism 310 Genetic Analyzer (Thermo Fisher Scientific). Sequences were analyzed using the Unipro UGENE (version 1.29.0) software.24
Data analysis
The allele frequencies of NUDT15 rs116855232 in the Bangladeshi population and genotypes at the locus were calculated and presented using the Microsoft Excel 2019 (version 1808)25 and GraphPad Prism® (version 6) software.26
Results
ASPCR genotypes and their validation by Sanger sequencing
In this study, an ASPCR assay was developed and applied to determine individual genotypes at the rs116855232 locus. Fig. 1 shows the ASPCR results from rs116855232 genotyping. ASPCR data were confirmed by sequencing the region spanning the SNP in randomly selected samples. Fig. 2 shows the sequence chromatograms alongside the corresponding ASPCR results. The DNA sequence chromatograms were consistent with the ASPCR results and thus validated the ASPCR genotypes.

Fig. 1. ASPCR of NUDT15 rs116855232 allelic variants. Amplified products were separated in 1.5% (w/v) agarose gel in 0.5× TAE buffer. Amplicon size of rs116855232 allele-specific products was 559 bp.

Fig. 2. Targeted sequencing to validate ASPCR results. DNA sequence chromatograms of representative samples alongside the corresponding rs116855232 ASPCR data are presented.
Allele frequency and genotype distribution of the study SNP among Bangladeshi population
A total of 184 Bangladeshi individuals were enrolled in this study. Of them, 77 participants were male (41.85%) and 107 were female (58.15%). In this representative Bangladeshi population, allele frequencies of the C and T alleles were 0.918 and 0.082, respectively (Fig. 3a). One hundred and fifty-four individuals had the homozygous C (CC) genotype, accounting for 83.70% of the study population. The heterozygous CT genotype was observed in 30 study participants, comprising 16.30% of the population (Fig. 3b). The proportions of T allele carriers (CT genotype) among males and females were 0.156 and 0.168, respectively. None of the individuals tested homozygous for the rare T allele (TT).

Fig. 3. Allele and genotype distribution at the NUDT15 rs116855232 locus in the Bangladeshi population. (a) Frequencies of C and T alleles at the rs116855232 locus. (b) Percentages of all three possible genotypes at the rs116855232 locus.
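The allele frequencies reported for the 184 study participants follow from the genotype counts by direct allele counting. The short sketch below illustrates that arithmetic; the function name `allele_frequencies` is an illustrative assumption, and the counts are those given in the text (154 CC, 30 CT, 0 TT).

```python
def allele_frequencies(n_cc: int, n_ct: int, n_tt: int) -> tuple:
    """Return (C frequency, T frequency) from genotype counts by allele counting."""
    total_alleles = 2 * (n_cc + n_ct + n_tt)  # each individual carries two alleles
    freq_c = (2 * n_cc + n_ct) / total_alleles
    freq_t = (2 * n_tt + n_ct) / total_alleles
    return freq_c, freq_t


freq_c, freq_t = allele_frequencies(n_cc=154, n_ct=30, n_tt=0)
print(round(freq_c, 3), round(freq_t, 3))  # 0.918 0.082
```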
When we merged our experimental data with the genotype data of 86 Bangladeshi individuals in the 1000 Genomes database,21 C and T allele frequencies among 270 Bangladeshi individuals (184 individuals from our study plus 86 individuals from 1000 Genomes) were 0.924 and 0.076, respectively (Fig. 3a). In this combined population, 229 individuals were homozygous (CC) for the C allele (84.81%) and 41 individuals were heterozygous (15.19%). The 1000 Genomes data21 also contained no individual homozygous for the variant T allele (Fig. 3b). A Hardy–Weinberg equilibrium (HWE) test showed no significant deviation of the observed genotype distribution in the Bangladeshi population from Hardy–Weinberg expectations (p-value = 0.18).
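The text does not state which HWE test was applied. As a hedged illustration, a standard chi-squared goodness-of-fit test with one degree of freedom, applied to the combined genotype counts (229 CC, 41 CT, 0 TT), gives a p-value that rounds to the reported 0.18. The function name `hwe_chi_square` is an assumption made for this sketch.

```python
import math


def hwe_chi_square(n_cc: int, n_ct: int, n_tt: int) -> tuple:
    """Chi-squared goodness-of-fit test for Hardy-Weinberg equilibrium (1 df)."""
    n = n_cc + n_ct + n_tt
    p = (2 * n_cc + n_ct) / (2 * n)  # frequency of the C allele
    q = 1.0 - p                      # frequency of the T allele
    observed = (n_cc, n_ct, n_tt)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of the chi-squared distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


chi2, p_value = hwe_chi_square(n_cc=229, n_ct=41, n_tt=0)
print(f"chi2 = {chi2:.2f}, p = {p_value:.2f}")
```

With an expected TT count below five, an exact test would usually be the more cautious choice; the chi-squared form is shown here only because it is the simplest to follow.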
Frequencies of the variant T allele in different populations were also retrieved from the 1000 Genomes browser and compared with our estimate in Fig. 4. The estimated rare allele frequency for the Bangladeshi population was significantly higher than the global frequency (p-value = 0.00078) and those for the African, American, and European populations (p-value < 0.05). No significant difference in T allele frequency was observed between the Bangladeshi and the South Asian population comprising Gujarati, Punjabi, Tamil, and Telugu people (p-value = 0.36). A nonsignificant difference was also observed when the Bangladeshi population was compared to the East Asian population (p-value = 0.095).

Fig. 4. Distribution of the rs116855232 rare allele (T) in world populations. The T allele frequency in the Bangladeshi population has been compared to the corresponding global estimate and also to those of the African, American, European, East Asian, and South Asian populations individually using the chi-squared test. *p < 0.05, **p < 0.01, ***p < 0.001, ****p < 0.0001, and ns: p > 0.05.
Discussion
Pharmacogenomics combines conventional pharmaceutical sciences with annotated knowledge of genes, proteins, and single nucleotide polymorphisms (SNPs), which are the most common variations in the human genome.27 There has been growing interest in the study of pharmacologically important SNPs as these help determine drug responses. While the NUDT15 enzyme plays a crucial role in the treatment of common childhood leukemia (ALL) and some autoimmune disorders by inactivating thiopurine metabolites, SNPs in this gene have been shown to cause loss of enzymatic activity and to predispose patients to thiopurine-related hematopoietic toxicity.7,11,12
In this study, our objective was to devise an ASPCR assay that could determine the rs116855232 genotype in individuals. We selected the SNP rs116855232 as it is the most reported NUDT15 SNP in the literature. This common missense variant has been shown to be consistently associated with thiopurine-induced leukopenia.11,12
Previous studies that focused on rs116855232 genotyping detected the alleles mostly using direct DNA sequencing.7,12,28 Other detection techniques were employed in some studies, such as genome-wide genotyping on BeadChip,11 MALDI-TOF MS-based minisequencing,29 and TaqMan hybridization assay.30,31 However, high-throughput hybridization- and minisequencing-based techniques are very expensive and entail the use of sophisticated equipment. Direct sequencing and TaqMan allelic discrimination methods, though cheaper than array-based systems, are still too costly for routine diagnostic use in Bangladesh.
Ho et al. developed tetra-primer ARMS assays for rs116855232 genotyping and showed that the method has a lower cost per sample and a more straightforward interpretation than Sanger sequencing.32 We intended to devise a similar ASPCR procedure to genotype rs116855232, requiring only a thermal cycler and an agarose gel electrophoresis system, which would be simple, rapid, and cost-effective enough for routine diagnosis in Bangladesh.
Allele-specific polymerase chain reaction (ASPCR) is a simple method for screening a known single-base mutation. ASPCR makes use of sequence-specific PCR primers that allow amplification of test DNA only when the sample contains the target allele.33,34 The presence or absence of a PCR product following the amplification reaction is indicative of the presence or absence of the allele under investigation. The specificity of the allele-specific primers was enhanced by inserting a mismatch at the penultimate base (second to the 3’-terminus) of the primers as suggested by Wangkumhang et al.22 ASPCR results from randomly selected samples were duly validated against Sanger sequencing data to ascertain the reliability of the ASPCR method.
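The primer design principle described above, a 3’-terminal base that matches one allele plus a deliberate mismatch at the penultimate base, can be sketched as follows. This is an illustration only: the template is a made-up placeholder sequence (not the NUDT15 sequence), the substitution table `MISMATCH` is an arbitrary choice rather than the rule set of Wangkumhang et al., and the function name is hypothetical.

```python
# Arbitrary illustrative substitutions for the penultimate base (an assumption,
# not the published mismatch rules).
MISMATCH = {"A": "C", "C": "A", "G": "T", "T": "G"}


def allele_specific_primer(template: str, snp_index: int, allele: str,
                           length: int = 20) -> str:
    """Build a forward primer whose 3' base is `allele` at the SNP position and
    whose penultimate base is deliberately mismatched to the template."""
    start = snp_index - length + 1
    if start < 0:
        raise ValueError("not enough upstream sequence for the requested length")
    bases = list(template[start:snp_index] + allele)  # 3' end sits on the SNP
    bases[-2] = MISMATCH[bases[-2]]                   # destabilizing mismatch
    return "".join(bases)


# Placeholder template; index 24 (0-based) plays the role of the C>T SNP.
template = "ACGTTGCAAGGCTTAACCGTAGCTCGGATC"
wild_type_forward = allele_specific_primer(template, 24, "C")
mutant_forward = allele_specific_primer(template, 24, "T")
print(wild_type_forward)
print(mutant_forward)
```

The two primers differ only at the 3’-terminal base, so each is doubly mismatched against the non-target allele and singly mismatched against its own target, which is the source of the added specificity.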
Although thiopurine therapy in the treatment of ALL and autoimmune diseases should be guided by NUDT15 genetic testing, the practice is largely absent in Bangladesh. As mentioned earlier, the 1000 Genomes Project data contain the genotype information at the study SNP locus for 86 individuals from Bangladesh, with the MAF at 0.064.21 Other than that, no study from Bangladesh has investigated this locus, and to our knowledge, this study is the first to report the distribution of any pharmacologically important NUDT15 allele among the Bangladeshi population. In this study, we applied the ASPCR method to investigate the genetic architecture of the population at the study locus.
There are several methods for determining an adequate sample size for cross-sectional studies that aim to estimate an unknown population parameter,35,36 in this case, the MAF of the study SNP. The formulas provided by Cochran20 and Yamane37 are two of the most popular for qualitative variables, the first being appropriate for large populations and the second for finite populations.38 That is why we used the Cochran formula to determine the minimum necessary sample size for our study. In the formula, “P” is the expected prevalence, which can be obtained from similar studies or a pilot study. According to the 1000 Genomes Project,21 the MAF for the study SNP in the Bangladeshi population is 0.064, and that for the South Asian population is 0.043 according to the ALFA project,39 which provides aggregate allele frequencies from the database of Genotypes and Phenotypes (dbGaP). To be conservative, we used the higher of the two values (0.064) as “P” in the formula, which returned 160 as the required sample size. Nonetheless, we genotyped 184 samples in total, which further reduced the margin of error to ±4.66%.
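The sample size reasoning above can be checked numerically with Cochran's formula for large populations, n = Z²P(1 − P)/d². The confidence level and target margin of error are not stated in the text; the values Z = 2.58 (roughly 99% confidence) and d = 0.05 used below are assumptions chosen because they reproduce the reported figures of 160 samples and a margin of about ±4.66% at n = 184. Function names are illustrative.

```python
import math


def cochran_sample_size(p: float, z: float, d: float) -> int:
    """Minimum sample size for estimating a proportion in a large population."""
    return math.ceil(z ** 2 * p * (1.0 - p) / d ** 2)


def margin_of_error(p: float, z: float, n: int) -> float:
    """Margin of error achieved with n samples (Cochran's formula solved for d)."""
    return z * math.sqrt(p * (1.0 - p) / n)


P = 0.064  # expected MAF from the 1000 Genomes Bangladeshi data (from the text)
Z = 2.58   # assumed standard normal deviate, not stated in the text
D = 0.05   # assumed target margin of error, not stated in the text

print(cochran_sample_size(P, Z, D))  # 160
print(f"{100 * margin_of_error(P, Z, 184):.2f}%")
```

Under the more common choice of Z = 1.96, the same inputs would give a smaller required sample size, so the reported numbers are sensitive to these assumed parameters.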
The estimated frequencies of C and T alleles in the Bangladeshi population were 0.924 and 0.076, respectively. This rare allele (T) frequency is similar to those reported in the neighboring populations, especially South Asians and East Asians. However, the T allele frequency in the Bangladeshi population is significantly different from those observed in the African, American, and European populations as well as the overall global estimate. Observed CC (NUDT15 *1/*1), CT (NUDT15 *1/*3), and TT (NUDT15 *3/*3) genotype frequencies in the Bangladeshi population were 84.81%, 15.19%, and 0.00%, respectively.
While allele frequencies are often compared between larger populations, a recent study has revealed the difference in T allele frequency between Natives and Mestizos from Mexico (0.100 and 0.065, respectively), implying considerable variation of rare allele frequency across cohabiting ethnic groups.40
To best represent the much larger Bangladeshi population, participants were randomly enrolled in the study, ensuring unbiased inclusion of individuals in terms of their settlement location and status. As the study population has a much higher proportion of female participants, it presumably does not reflect the male-to-female ratio observed in the Bangladeshi population. However, being an autosomal germline variant (chromosome 13), rs116855232 is likely to be distributed independently of sex.
A limitation of this study was the lack of validation of the assay against homozygous mutant (TT) samples, owing to the rarity of this genotype in our population. Large-scale future studies are needed to capture the TT genotype and validate the assay against it.
Overall, it is hoped that the introduction of this inexpensive and reliable genetic testing for the rs116855232 risk allele will ensure greater patient safety and better treatment outcome in lower-middle income countries like Bangladesh.
Conclusion
We anticipate that our genotyping scheme will provide clinically actionable genetic information to help clinicians determine effective therapies and appropriate dosing of thiopurines. Indicative of a transforming healthcare system, the practice of timely NUDT15 pharmacogenetic assessment on a case-by-case basis may help prevent undesired outcomes of thiopurine-based treatment regimens. We sincerely hope that our study will thereby pave the way for precision medicine in Bangladesh.
Acknowledgments
The study was supported by the National Science and Technology (NST) fellowship from the Ministry of Science and Technology, Government of the People’s Republic of Bangladesh.
