Abstract
Pharmacogenetics is the study of inherited variation in drug response. The goal of pharmacogenetics is to develop novel ways of maximizing drug efficacy and minimizing toxicity for individual patients. Personalized medicine has the potential to allow for a patient's genetic information to predict optimal dosage for a drug with a narrow therapeutic index, to select the most appropriate pharmacological agent for a given patient and to develop cost-effective treatments. Although there is supporting evidence in favour of pharmacogenetics, its adoption in clinical practice has been slow because of sometimes conflicting findings among studies. This failure to replicate findings may result from a lack of high-quality pharmacogenetic studies, as well as unresolved methodological and statistical issues. The objective of this review is to discuss the benefits of incorporating pharmacogenetics into clinical practice. We will also address outstanding methodological and statistical issues that may lead to heterogeneity among reported pharmacogenetic studies and how they may be addressed.
DECLARATIONS
GP reports receiving consulting and speaker fees from Sanofi-Aventis, Bristol-Myers Squibb and Boehringer-Ingelheim, and research grant support from Bristol-Myers Squibb and Sanofi-Aventis. SSA reports receiving lecture fees from Bristol-Myers Squibb.
GP holds the Canada Research Chair in Genetic and Molecular Epidemiology. SSA holds the Michael G. DeGroote and Heart and Stroke Foundation of Ontario Endowed Chair in Population Health and the May Cohen Eli Lilly Endowed Chair in Women's Health.
Not applicable
GP
SR and GP wrote the first draft. SSA and PJ provided comments on all drafts and supplied additional content and relevant references
Introduction
It is widely recognized that there is interindividual variability in drug response, where subgroups of patients either experience adverse drug reactions or do not respond properly to treatment. 1 While the definition of individualized response to drug treatment is not yet fully understood 2 and there is uncertainty as to whether certain patients are consistent non-responders or simply inconsistent responders, this variability may be attributed to biological factors (e.g. age, sex, nature of disease), behavioural factors (e.g. smoking, drug interactions) or genetic factors (e.g. genetic variants). Furthermore, lack of patient adherence is also recognized as an important contributor to variability of response. For example, the discontinuation of antiplatelet therapy is the strongest risk factor for stent thrombosis in percutaneous coronary intervention. 3 Nonetheless, it is estimated that genetic factors can account for 20-95% of individual variation in drug response; 4 however, the amount of explained variation depends on the class of drugs.
The wide variability in drug response emphasizes the need for a more ‘personalized’ approach to medical treatment. It is possible that pharmacogenetics can address this need by providing a better understanding of how genetic variants influence drug response. 5 This review will focus primarily on pharmacogenetics, which assesses how genetic variants influence drug metabolism and effect. The ultimate goal of pharmacogenetics is to develop novel ways of minimizing harmful drug effects and optimizing care for individual patients. More specifically a patient's genetic information may be used to predict the optimal dosage for a drug with a narrow therapeutic index, to select the most appropriate pharmacological agent and to develop cost-effective treatment plans.
Despite the promise of personalized medicine, there has been little methodological consistency among pharmacogenetic studies. This may be due to modest effect sizes, heterogeneity among study designs and patient populations, as well as a lack of standardization among biological and phenotypic measures. 6-8 Holmes et al. 9 performed a systematic review and a field synopsis of pharmacogenetic studies. They reported that the lack of consistency among studies may be a result of the preponderance of reviews over primary research, small sample sizes, reliance on a candidate gene approach, the use of surrogate markers, an excess of nominally positive over truly positive associations and a paucity of meta-analyses. Therefore, there is an urgent need for properly designed pharmacogenetic studies to advance the discovery and development of medical strategies for individualized treatment. The objective of this review is to discuss the potential benefits of incorporating pharmacogenetics into clinical practice, as well as methodological and statistical challenges faced in pharmacogenetic studies. In this review, we will first identify potential clinical applications of pharmacogenetics and illustrate these with promising contemporary examples. In the second part, we will summarize some of the major methodological challenges facing pharmacogenetic studies.
Potential applications of pharmacogenetics
Personalized medicine has the potential to improve drug safety and efficacy for a specific individual. Adoption of pharmacogenetics in clinical practice promises more effective decision-making with regard to diagnostic testing, drug selection and dosing. In this section, we describe some future applications of pharmacogenetics and provide contemporary examples that reflect how these topics may be applied to a clinical setting. It should be noted by readers that, in many instances, further research is needed to unequivocally recommend pharmacogenetic testing. It is widely believed that a better understanding of the genetic mechanisms in drug response has the potential to help clinicians predict an individualized drug dosage; however, to date, there are few examples that illustrate this hypothesis with improved clinical outcomes.
Individualized drug dosage
The genetic variants that influence the observed differences in drug response can be classified into two groups: those affecting pharmacokinetics (PK) and those affecting pharmacodynamics (PD). The genes that influence the PK properties of a drug affect the mechanisms of how the drug is absorbed, distributed, metabolized and excreted by the body. The genes that influence the PD of a drug affect the mechanism of the drug's target and how it affects the body. One underlying principle of individualized drug dosage is that it must be faster and more effective than the use of a PK or PD assay alone. In other words, genetic testing may not be required if the therapeutic level of the drug or a surrogate can be measured, and the measurement is rapidly and widely available, as is the case with certain antibiotics (notwithstanding the genetic susceptibility of the pathological agent).
For example, warfarin has a narrow therapeutic index, and inadequate or excessive anticoagulation can lead to an increased risk of adverse cardiovascular events or bleeding complications. Thus, warfarin dosing is complicated by individual variability and requires regular monitoring to achieve proper anticoagulation effects. Initial warfarin therapy is administered as a fixed dosage, or as an estimated regimen based on the patient's clinical characteristics, with further adjustments based on the patient's anticoagulation response measured by laboratory assays, such as the international normalized ratio (INR). However, it may be more beneficial to use both genetic factors and clinical covariates rather than relying on frequent INR monitoring alone, because genetic polymorphisms account for 30-35% of the variability in warfarin metabolism and clinical factors account for 17-21% of variation in warfarin dosing. 10 Therefore, an algorithm that incorporates a combination of these factors could shorten the time required to establish a stable maintenance dose.
The principal genes involved in the metabolism of warfarin are the cytochrome P450 (CYP) 2C9 enzyme and the vitamin K epoxide reductase complex, subunit 1 (VKORC1) gene. Carriers of one or more variant alleles of CYP2C9 are prone to over-anticoagulation and an increased risk of bleeding while on warfarin therapy, 11-13 whereas those who possess the variant VKORC1 genotype experience warfarin treatment resistance and an increased risk of adverse cardiac events. 14 The International Warfarin Pharmacogenetics Consortium developed a pharmacogenetic algorithm for an appropriate warfarin dosage. 15 The study reported that, among 5000 participants, the pharmacogenetic algorithm identified a larger proportion of patients who required a lower dose (≤21 mg per week) of warfarin and those who required a higher dose (≥49 mg per week) to maintain stable therapeutic anticoagulation. The genetically guided treatment benefited 46.2% of the entire cohort, specifically those for whom the standard dosage of warfarin would not be appropriate. It is important to properly identify this proportion of patients because some (i.e. those who require ≤21 mg per week) are at risk of excessive anticoagulation, whereas others (i.e. those who require ≥49 mg per week) are at risk of inadequate anticoagulation. Data on adverse events such as thromboembolic events or bleeding were not collected for this study.
In another study, patients who were treated using a pharmacogenetic algorithm had 28% fewer hospitalizations after six months of warfarin therapy compared with a control group (18.5% versus 25.5%, P < 0.001). 16 The ability to increase the accuracy of dose prediction may help to enhance drug efficacy and reduce the safety risks associated with under-dosing or over-dosing patients. Although promising, it should be emphasized that the implementation of pharmacogenetic testing ultimately depends on clear evidence of improved clinical outcomes.
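The relation between the two absolute hospitalization rates and the relative reduction quoted above can be checked with a short calculation. The sketch below is illustrative only; the function name is our own, and the inputs are the rounded percentages given in the text rather than the study's raw counts.

```python
def relative_reduction(control_rate: float, treated_rate: float) -> float:
    """Relative reduction in an event rate, expressed against the control rate."""
    if control_rate <= 0:
        raise ValueError("control_rate must be positive")
    return (control_rate - treated_rate) / control_rate


# Rounded rates quoted in the text (proportion hospitalized within six months).
control_rate = 0.255   # control group
guided_rate = 0.185    # pharmacogenetic-algorithm group

absolute_difference = control_rate - guided_rate
relative = relative_reduction(control_rate, guided_rate)

print(f"absolute difference: {absolute_difference:.3f}")
print(f"relative reduction:  {relative:.1%}")
```

With these rounded inputs the relative reduction comes out at roughly 27%, which is in line with the reported 28% once rounding of the published percentages is allowed for. The absolute difference (7 percentage points) is the quantity that matters most when weighing the clinical value of testing.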
Individualized drug selection
Personalized medicine can help to guide individualized treatment when the clinical effect of a drug is expected to vary according to genotype. Under these conditions, the risk-benefit balance of a drug might depend on the variant allele carrier status. This balance can be affected by pharmacogenetic effects on safety, efficacy, or both. For example, the incidence of adverse clinical events may differ according to genotypic groups, if for instance, slow metabolizers accumulate a toxic metabolite. Thus, prior knowledge of a patient's genotype may be used to guide clinical decision-making because the patient may benefit from an alternative pharmacological regimen, such that they receive a reduced dose of a standard therapy or a different drug altogether. Conversely, patients who are classified as fast metabolizers may experience increased drug efficacy from a higher dose if their genetic status results in accelerated clearance of the active metabolite.
Chronic infection with hepatitis C virus (HCV) is treated with a combined therapy of peginterferon-α-2a (PegIFN-α-2a) or PegIFN-α-2b and ribavirin. However, less than half of treated patients achieve a sustained virological response (SVR). 17 A genome-wide association study of 1671 chronic HCV patients reported that a genetic polymorphism in the IL28B gene region (rs12979860) was strongly associated with SVR. 18 The authors reported that the polymorphism was associated with a two-fold change in treatment response among Caucasians (P = 1.06 × 10⁻²⁵) and African-Americans (P = 2.06 × 10⁻³). Interestingly, the differences in allelic frequency of the IL28B genetic variant may explain about half of the difference in treatment response between these two ethnic groups.
Another study assessed whether accounting for the human leukocyte antigen C (HLA-C) and the killer immunoglobulin-like receptors improved the predictive value of the IL28B genotype. 19 The authors found that the carriers of the variant IL28B genotype were associated with absence of treatment-induced HCV infection clearance and absence of spontaneous HCV infection clearance.
Furthermore, carriers of the variant HLA-C genotype were associated with failed treatment but not spontaneous HCV infection clearance. Thus, the prediction of treatment failure among HCV patients was improved from 66% using the IL28B genotype to 80% with the use of the IL28B and HLA-C genotypes. Incorporating this information can help clinicians to improve the clinical management of patients infected with chronic HCV because they will be able to better predict those who will respond the best to PegIFN treatment, which will help to reduce the adverse side-effects associated with this treatment.
Pharmacoeconomy
Pharmacogenetics has the potential to reduce the costs associated with inappropriate drug treatments or serious adverse drug reactions that require hospitalization. 20 Pharmacoeconomic considerations are especially important given the moderate effects of genetic determinants typically reported in pharmacogenetic studies. In other words, if a more expensive drug has a slightly decreased benefit in individuals with a certain genotype, then careful evaluation of the costs associated with an alternative therapy or the cost of genotyping is necessary before recommending further pharmacogenetic testing.
One example of using genetic testing to improve cost-effectiveness is the treatment of HIV-positive patients with abacavir, a nucleoside reverse-transcriptase inhibitor. Abacavir hypersensitivity syndrome (AHS) is a potentially lethal side-effect, affecting 5-8% of patients in the first six weeks of treatment. 21 It presents with a constellation of symptoms such as fever and rash, and rechallenge with abacavir after initial therapy may result in worsening AHS symptoms with an increased risk of mortality. 22 AHS is strongly associated with the major histocompatibility complex class I allele HLA-B*5701, which is present in 2-6% of Caucasians. 23
Mallal et al. 23 observed that, in a double-blind prospective randomized study, Prospective Randomized Evaluation of DNA screening in a Clinical Trial (PREDICT-1), selective abacavir use informed by HLA-B*5701 testing reduced the risk of AHS. The authors of this study reported that screening eliminated AHS (0% in the prospective-screening group versus 2.7% in the control group, P < 0.001), and had a negative predictive value of 100% and a positive predictive value of 47.9%. This led to the recommendation that prospective HLA-B*5701 screening should be adopted in clinical care.24,25
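The predictive values quoted above are simple functions of a 2 × 2 table of test result against outcome. The sketch below shows the calculation; the cell counts are illustrative values chosen so that the output matches the reported percentages, and should not be read as the trial's data. The class and field names are our own.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScreeningTable:
    """2 x 2 table of a genetic screening test against an adverse reaction."""
    true_pos: int    # carrier, reaction occurred
    false_pos: int   # carrier, no reaction
    false_neg: int   # non-carrier, reaction occurred
    true_neg: int    # non-carrier, no reaction

    def ppv(self) -> float:
        """Probability of a reaction given a positive test."""
        return self.true_pos / (self.true_pos + self.false_pos)

    def npv(self) -> float:
        """Probability of no reaction given a negative test."""
        return self.true_neg / (self.true_neg + self.false_neg)

    def sensitivity(self) -> float:
        return self.true_pos / (self.true_pos + self.false_neg)

    def specificity(self) -> float:
        return self.true_neg / (self.true_neg + self.false_pos)


# Illustrative counts only (assumed, not taken from the trial report).
table = ScreeningTable(true_pos=23, false_pos=25, false_neg=0, true_neg=794)

print(f"PPV: {table.ppv():.1%}")
print(f"NPV: {table.npv():.1%}")
```

A negative predictive value of 100% means that no test-negative patient developed the reaction, which is what makes the test useful for ruling patients in for treatment. A positive predictive value near 50% means that roughly half of the carriers who are denied the drug would have tolerated it, a cost that is acceptable only when an alternative therapy exists.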
Furthermore, several studies have evaluated the cost of prospective HLA-B*5701 screening.26-28 Kauf et al. 28 analysed the cost-effectiveness of HLA-B*5701 screening by assessing the cost of prior genetic screening and the cost of using an alternative medication, tenofovir, within short-term and lifetime models. The authors reported that the short-term costs of prospective screening were dependent on the cost of the genetic test, the cost associated with AHS treatment and screening performance. The lifetime models showed that genetically guided abacavir treatment was more effective and less costly than alternative treatment with tenofovir. Furthermore, as of 2009, the patent for abacavir has expired in the United States. Thus, the cost-effectiveness of HLA-B*5701 screening prior to abacavir-based treatment is now highly dependent on the prevalence of the HLA-B*5701 genotype, the cost of prescribing a generic medication compared with a non-generic one, screening costs and the method of healthcare funding.
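The dependence of short-term cost on test price, reaction cost and carrier prevalence can be made concrete with a small expected-cost calculation. This is a deliberately simplified sketch and not the model used in the cited analyses: the reaction rates (2.7% without screening, 0% with screening) are those quoted above, the carrier prevalence is taken from within the 2-6% range quoted above, and every monetary value is a hypothetical placeholder.

```python
def expected_cost_unscreened(p_reaction: float, cost_reaction: float) -> float:
    """Expected per-patient cost of hypersensitivity when nobody is tested."""
    return p_reaction * cost_reaction


def expected_cost_screened(test_cost: float,
                           carrier_prevalence: float,
                           extra_cost_alternative: float,
                           p_reaction_screened: float,
                           cost_reaction: float) -> float:
    """Expected per-patient cost when everyone is tested and carriers
    receive a (possibly more expensive) alternative drug."""
    return (test_cost
            + carrier_prevalence * extra_cost_alternative
            + p_reaction_screened * cost_reaction)


# Rates quoted in the text.
p_reaction_unscreened = 0.027
p_reaction_screened = 0.0
carrier_prevalence = 0.05          # within the quoted 2-6% range

# Hypothetical costs in arbitrary currency units (assumptions).
cost_reaction = 3000.0             # treating one hypersensitivity reaction
extra_cost_alternative = 500.0     # added cost of the alternative drug
test_cost = 50.0                   # one genotyping test

unscreened = expected_cost_unscreened(p_reaction_unscreened, cost_reaction)
screened = expected_cost_screened(test_cost, carrier_prevalence,
                                  extra_cost_alternative,
                                  p_reaction_screened, cost_reaction)

# Patients who must be tested to prevent one reaction.
number_needed_to_screen = 1 / (p_reaction_unscreened - p_reaction_screened)

# Test price at which screening and no screening cost the same.
break_even_test_cost = unscreened - (screened - test_cost)

print(f"expected cost without screening: {unscreened:.2f}")
print(f"expected cost with screening:    {screened:.2f}")
print(f"number needed to screen:         {number_needed_to_screen:.1f}")
print(f"break-even test cost:            {break_even_test_cost:.2f}")
```

Under these assumed costs screening is slightly cheaper per patient, but the conclusion reverses if the test costs more than the break-even value or if the alternative drug is much more expensive, which is why the text stresses that the result depends on local prices and prevalence.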
Methodological issues in pharmacogenetics
Although pharmacogenetics has the potential to address variability in drug response and improve drug efficacy and safety, the adoption of pharmacogenetics in clinical practice has been slow. This resistance may stem from sometimes conflicting findings among pharmacogenetic studies. The failure to replicate these findings may result from a lack of high-quality studies and unresolved methodological issues. In this section, we address methodological issues pertaining to pharmacogenetic study design and provide specific examples of pharmacogenetic studies that illustrate potential challenges that the reader may encounter.
Study design
Table 1 provides a brief description of each study design.
Table 1. Study designs for pharmacogenetic studies and their main strengths and limitations
Randomized controlled trials
Randomized controlled trials (RCTs) remain the ‘gold standard’ in epidemiological study design. In the field of pharmacogenetics, there are two ways in which RCTs can be used to establish pharmacogenetic determinants of drug safety and efficacy. First, patients can be randomized to a genetically guided therapy versus standard care. While this design offers the best level of evidence to support the use of genetic data, it may be impractical in some situations. For example, the speed of genotyping may cause delays in treatment or randomization for trials that require known pharmacogenetic determinants. Alternatively, if the genotype of interest is rare and the aim of the study is to compare response between two or more therapeutic regimens among carriers, participants may be stratified based on their genotype and then randomized to the intervention or control group.
Substudies within RCTs can be used to determine the impact of genetic variants on drug response. In these studies, stored biological samples from pre-existing clinical trials are genotyped with power comparable with that of a prospectively planned pharmacogenetic cohort study. This appears to be an optimal design to discover and characterize pharmacogenetic determinants prior to an evaluation of gene-guided therapy versus standard care.
Pharmacogenetic RCTs are able to measure the independent effects of the genotype, the drug response and the gene-drug interaction in the active drug and placebo/control groups. With this approach, it is then possible to distinguish the differences between simple markers of disease progression and true pharmacogenetic markers, whose effect on disease progression is only seen in the presence of a drug. This can also be assessed by developing a ‘gene score’ (i.e. combining information from many single nucleotide polymorphisms [SNPs]) and testing for a drug-gene interaction.
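The distinction between a prognostic marker and a true pharmacogenetic marker corresponds to a test of gene-drug interaction: the drug effect is estimated separately in carriers and non-carriers, and the two estimates are compared. The sketch below does this on the odds-ratio scale with a Wald test. It is one simple way to frame the comparison, not a method prescribed by any of the cited trials; all counts are hypothetical, the names are our own, and the calculation assumes no empty cells.

```python
import math
from typing import NamedTuple


class Arm(NamedTuple):
    events: int
    non_events: int


def odds_ratio(drug: Arm, placebo: Arm) -> float:
    """Odds ratio for the outcome, drug versus placebo."""
    return (drug.events / drug.non_events) / (placebo.events / placebo.non_events)


def interaction_test(carrier_drug: Arm, carrier_placebo: Arm,
                     noncarrier_drug: Arm, noncarrier_placebo: Arm):
    """Ratio of odds ratios (carriers / non-carriers) with a two-sided Wald test.

    A ratio of 1 means the drug effect is the same in both genotype groups.
    """
    ratio = (odds_ratio(carrier_drug, carrier_placebo)
             / odds_ratio(noncarrier_drug, noncarrier_placebo))
    cells = [*carrier_drug, *carrier_placebo,
             *noncarrier_drug, *noncarrier_placebo]
    # Standard error of the log ratio of odds ratios: sum of reciprocal cells.
    se = math.sqrt(sum(1 / c for c in cells))
    z = math.log(ratio) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return ratio, z, p_value


# Hypothetical trial: the drug halves the odds in non-carriers
# but has almost no effect in carriers.
ratio, z, p_value = interaction_test(
    carrier_drug=Arm(48, 452), carrier_placebo=Arm(50, 450),
    noncarrier_drug=Arm(50, 950), noncarrier_placebo=Arm(100, 900),
)

print(f"ratio of odds ratios: {ratio:.2f}")
print(f"z = {z:.2f}, p = {p_value:.3f}")
```

Note that the event rate on placebo is the same in carriers and non-carriers in this example, so the genotype has no effect on disease progression by itself; its influence appears only in the presence of the drug. The same test applies when carrier status is replaced by a dichotomized gene score.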
One major limitation of pharmacogenetic RCTs is the cost and time required to conduct the study. These studies require a large sample size to be powered enough to detect a modest effect size. Furthermore, a post hoc analysis of an RCT may be inappropriate for a pharmacogenetic study because the initial cohort was designed using a specific null hypothesis, estimated effect size and study power, and may underestimate the true gene-drug interaction.
An example of a genetically guided RCT is the Clarification of Optimal Anticoagulation through Genetics (COAG) trial. 29 The COAG trial is a randomized, double-blinded clinical trial that compares genotype-guided dosing and clinical-guided dosing for the initiation of warfarin treatment. The objective of the trial is to determine whether genetic information improves drug treatment. This trial is ongoing and final results of the study are yet to be published.
Another example of a pharmacogenetic RCT is the Statin Response Examined by Genetic Haplotype Markers (STRENGTH) study. 30 The purpose of the STRENGTH study was to explore the association between genetic polymorphisms and low-density lipoprotein cholesterol (LDLc) lowering in statin-treated patients. The STRENGTH study was a 16-week, randomized, open-label study of three statins in 509 outpatients with hypercholesterolaemia. Study participants were initially randomized to eight weeks of 10 mg/day atorvastatin, 20 mg/day simvastatin or 10 mg/day pravastatin followed by eight weeks of 80 mg/day atorvastatin, 80 mg/day simvastatin or 40 mg/day pravastatin. Voora et al. 30,31 reported that carriers of the ABCA1 variant (rs12003906) had a reduced LDLc-lowering effect and carriers of the loss-of-function SLCO1B1 allele had an increased risk of statin therapy discontinuation. The use of pharmacogenetic RCTs will be instrumental in the understanding of how genetic variants contribute to drug therapy and lay a solid foundation for tailored medical therapy.
Another recent RCT example is the effect of the CYP2C19 genotype on the safety and efficacy of clopidogrel. Dual antiplatelet therapy with clopidogrel and aspirin has been shown to reduce adverse vascular events among patients with acute coronary syndromes. 32,33 Several studies have observed that carriers of the loss-of-function allele have a reduced response to clopidogrel and an increased risk of adverse cardiovascular outcomes. 34,35 Based on these findings, in 2010 the Food and Drug Administration added a boxed warning to the prescribing information for clopidogrel, indicating that dose adjustment or use of a different drug may be required. 36 However, analysis of a genotyped subgroup from the Clopidogrel in Unstable angina to prevent Recurrent Events (CURE) study showed that the safety and efficacy of clopidogrel did not differ by carrier status of the loss-of-function CYP2C19 allele. 37 These findings were also replicated in a subgroup of the Atrial Fibrillation Clopidogrel Trial with Irbesartan for Prevention of Vascular Events (ACTIVE) A trial. While patients in the CURE study were mostly noninvasively managed, another distinguishing feature of the analysis is the inclusion of the placebo group. The addition of the placebo group provides evidence of the efficacy of the experimental treatment. It also helps to reduce sources of confounding, such as potential pleiotropic genetic effects and population stratification. The results of this study have also been confirmed by a systematic review and meta-analysis consisting of 32 studies and 42,016 patients. 38 The authors reported a significant association between loss-of-function carrier status and risk of cardiac events using ‘treatment-only’ studies. However, they found no significant association when using ‘effect-modification’ studies or studies with more than 200 cardiovascular events.
These analyses show the importance of using large RCTs with both placebo and drug arms to guide validated recommendations on pharmacogenetic findings and medication use.
Prospective cohort studies
Prospective cohort studies follow a group of participants who are self-selected into a drug treatment group and assess how genetic distribution corresponds to the risk of developing the study outcome. Prospective cohort studies are able to examine causality through the temporal effects of drug exposure and genetic variants on disease risk. 39 However, prospective cohort designs are expensive and time-consuming because they often require a large sample size to detect a relatively modest drug-gene interaction. Moreover, this study design is more subject to confounding because the assignment of drug therapy is subject driven rather than randomly allocated.
Selection bias occurs in prospective cohort studies if loss to follow-up is differential by drug exposure or by genotype. For example, loss to follow-up and drug use may vary by age, and loss to follow-up and genetic polymorphisms may vary by ethnic group. Furthermore, if individuals who were lost to follow-up tended to have different risks associated with the study outcome compared with those who remained for the entire length of the study, then the overall incidence estimates would be biased.
Prospective cohort studies are more subject to non-differential misclassification as compared with case-control studies. Non-differential misclassification occurs when exposure measurement errors are independent of the outcome and result in dilution of the measure of association and bias estimates towards the null. This may occur if drug use is not collected at multiple timepoints throughout the study. During the study, participants may begin a new medication or discontinue their current treatment because of adverse drug events. An increase in data collection over the study period will help to ensure improved accuracy of patient behaviour and improved measurements.
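The attenuation caused by non-differential misclassification can be shown with expected counts. In the sketch below, drug exposure is recorded with the same imperfect sensitivity and specificity in participants who do and do not develop the outcome, and the risk ratio is recomputed from the misclassified table. All counts and error rates are hypothetical, and the function names are our own.

```python
def misclassify(exposed: float, unexposed: float,
                sensitivity: float, specificity: float):
    """Expected numbers recorded as exposed / unexposed after measurement error."""
    recorded_exposed = sensitivity * exposed + (1 - specificity) * unexposed
    recorded_unexposed = (1 - sensitivity) * exposed + specificity * unexposed
    return recorded_exposed, recorded_unexposed


def risk_ratio(events_exp: float, total_exp: float,
               events_unexp: float, total_unexp: float) -> float:
    return (events_exp / total_exp) / (events_unexp / total_unexp)


# Hypothetical true cohort: 1000 drug users (200 events), 1000 non-users (100 events).
true_rr = risk_ratio(200, 1000, 100, 1000)

# Exposure recorded with the SAME error rates regardless of outcome,
# which is what makes the misclassification non-differential.
sensitivity, specificity = 0.80, 0.90

events_exp, events_unexp = misclassify(200, 100, sensitivity, specificity)
nonevents_exp, nonevents_unexp = misclassify(800, 900, sensitivity, specificity)

observed_rr = risk_ratio(events_exp, events_exp + nonevents_exp,
                         events_unexp, events_unexp + nonevents_unexp)

print(f"true risk ratio:     {true_rr:.2f}")
print(f"observed risk ratio: {observed_rr:.2f}")
```

The observed risk ratio moves from 2.0 towards 1.0 even though the error rates are identical in both outcome groups, which illustrates why recording drug use at a single timepoint tends to dilute a real association rather than create a false one.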
As mentioned previously, subgroups of participants from prospective cohorts can be analysed in nested case-control studies. These studies select participants who experienced the study outcome, and compare them with randomly selected controls from the original study cohort. The advantages of using this design are that the cases are compared with the same comparison group, which helps to reduce bias and confounding. Furthermore, this design allows researchers to use small sample sizes and allows for a more cost-effective approach.
One such example in pharmacogenetics is the examination of the CYP2D6*4 allele in tamoxifen-treated patients from the Rotterdam Study. 40 The CYP2D6 gene is involved in the formation of endoxifen from tamoxifen, which is used for the treatment of oestrogen receptor-positive breast cancer in post-menopausal women. 41 The study assessed the association between carriage of the CYP2D6*4 allele and breast cancer mortality among all incident users of tamoxifen. The study reported that the hazard ratio of breast cancer mortality in patients with the *4/*4 genotype was 4.1 (95% CI 1.1-15.9; P = 0.041) compared with those with the wild-type genotype. Although these results are subject to potentially more bias as the exposed and unexposed groups were not randomized, there is greater generalizability in this study as compared with an RCT. This represents a trade-off between optimal internal validity in the RCT design and external applicability in the prospective cohort design. It would be wiser to report the more robust estimates of the RCT and use subsequent studies to explore the generalizability of these findings than to rely on the estimates from a prospective cohort study.
Case-control studies
Case-control studies are the most common study design in pharmacogenetics. Under this model, cases are defined as those who have had a specific adverse drug event or a poor therapy outcome. The genetic variant frequencies in the cases are compared with the controls who have a comparable level of drug exposure but are also free of the study outcome. These studies are able to measure the effect of the gene-drug interaction but the independent effects of the genotype and drug response cannot be ascertained.
Case-control studies can be performed quickly and they are more cost-effective than large prospective studies. Case-control studies may be the only feasible study design when it is not possible to conduct an RCT. For example, it may not be possible to use a prospective study design to assess rare adverse drug outcomes or rare variants because they require a very large sample size. 42 Furthermore, it is unethical to conduct an RCT with a priori unequivocal knowledge of severe drug-gene interaction, in which carriers of a variant allele are known to be susceptible to adverse events.
The retrospective design of case-control studies makes them more prone to confounding, selection bias and information bias. Selection bias is the product of inappropriate choice of study controls and differential participation rates between cases and controls. Ideally, controls should represent cases with respect to potential exposures and have the same risk of developing the outcome phenotype. For example, pooled hospital-based controls may include participants whose allelic frequencies correspond to another underlying disease, which will distort the exposure-disease association. Selection bias may also result from differential non-participation among cases and controls if failure to participate is related to genotype or drug exposure.
Information bias in case-control studies is most likely to result from differential misclassification. Differential misclassification occurs when there is systematic error in the degree of misclassification between cases and controls, which will distort the true magnitude of association in any direction. One common type of information bias in case-control studies is recall bias. Recall bias occurs when cases remember past exposures differently than controls. For example, cases may recall past drug exposures better than those who did not experience the outcome because they have more motivation to identify possible causes of their disease. It is important to note that there is no recall bias with genetic exposure because participants’ genotypes are fixed.
Genetic epidemiology considerations
Phenotype definition
In pharmacogenetic studies, the selection of the study endpoint and the patient response phenotype are crucial for interpreting drug efficacy. 43 However, since many pharmacogenetic studies use data from prospective studies, the study endpoints and patient population may not be precise enough to identify functional genes that are associated with the drug response. For example, it may be more appropriate to measure clinical outcomes, such as adverse bleeding events, when studying the association between safety measures and genetic markers. Nevertheless, physiological and biochemical measures, such as platelet count or clotting time, may be more appropriate phenotypes to represent the underlying gene function in the drug-gene interaction. 44 These phenotypes represent stronger biological or causal evidence of the functional activity of the gene or protein in question.
However, across studies, there is great heterogeneity in the biological measurement and definitions of outcomes or phenotypes. For example, the reported prevalence of aspirin resistance ranges from 5% to 45%, 45 which is thought to result from small sample sizes and heterogeneity within the methodologies used to measure the biochemical and functional components of aspirin resistance. 46,47 Goodman et al. 48 performed a systematic review of all the genetic studies of aspirin resistance, and observed that the effect of the PlA1/PlA2 polymorphism in the GPIIIa receptor appears to differ according to the technique used to measure aspirin resistance. The lack of standardization among laboratory tests leads to imprecise effect estimates of the polymorphism and drug response. Therefore, to decrease heterogeneity among studies and to obtain more reliable estimates of pharmacogenetic associations, there must be consistent and functionally relevant phenotypic definitions.
Genetic polymorphisms
The associated genetic variants are either directly functional or they are indirectly correlated with another variant that is the actual cause of the drug response. Linkage disequilibrium (LD) ‘is the tendency for a pair of alleles at two linked loci to be associated with each other in the population more than would be expected by chance’. 49 LD is useful in genetic association studies because high LD allows for a smaller subset of marker SNPs to be genotyped while capturing most of the genetic information. However, LD varies among ethnic populations and this may affect cross-subpopulation comparisons when causal SNPs are not directly genotyped but rather captured by ‘proxy’ SNPs.50,51
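The definition of LD quoted above translates directly into the standard summary statistics D, D′ and r², all of which are computed from two-locus haplotype frequencies. The sketch below uses hypothetical frequencies for a marker SNP (alleles A/a) and a causal SNP (alleles B/b); the function name is our own, and both loci are assumed to be polymorphic.

```python
def ld_statistics(p_AB: float, p_Ab: float, p_aB: float, p_ab: float):
    """Return (D, D', r^2) for two biallelic loci from haplotype frequencies."""
    total = p_AB + p_Ab + p_aB + p_ab
    if abs(total - 1.0) > 1e-9:
        raise ValueError("haplotype frequencies must sum to 1")

    p_A = p_AB + p_Ab          # frequency of allele A at the first locus
    p_B = p_AB + p_aB          # frequency of allele B at the second locus

    # D: excess of the AB haplotype over its expectation under independence.
    d = p_AB - p_A * p_B

    # D': D scaled by the largest magnitude it could take at these allele frequencies.
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = d / d_max if d_max > 0 else 0.0

    # r^2: squared correlation between the alleles at the two loci.
    r_squared = d ** 2 / (p_A * (1 - p_A) * p_B * (1 - p_B))
    return d, d_prime, r_squared


# Hypothetical haplotype frequencies in one population.
d, d_prime, r_squared = ld_statistics(p_AB=0.4, p_Ab=0.1, p_aB=0.1, p_ab=0.4)

print(f"D = {d:.3f}, D' = {d_prime:.2f}, r^2 = {r_squared:.2f}")
```

The r² value is the most relevant of the three for ‘proxy’ SNPs, because it indicates how well the genotyped marker stands in for the ungenotyped causal variant. Running the same calculation with haplotype frequencies from a second population would typically give a different r², which is the source of the cross-population difficulty described above.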
Population stratification
A source of confounding within-population-based pharmacogenetic studies can result from population stratification. 52 Population stratification occurs when ethnic subpopulations within the entire study population differ in terms of genotype frequency and risk of disease. 53 Population stratification confounds pharmacogenetic associations when differences in the prevalence of an allele parallels the incidence of study outcomes 52 and may bias both the strength of the association and estimates of precision of the genetic variant-outcome association. In other words, clinical outcomes might vary among genetically distinct populations for reasons other than the variant being tested and thus bias pooled drug-gene interaction effects. 54 Stratification can also occur in apparently homogeneous populations, for example, Davey Smith et al. 55 observed an increasing north-south gradient in the frequency of the variant allele for lactase persistence across Britain.
One approach to minimizing the confounding effects of population stratification is to match participants based on geographical region and by markers of ethnic origin. 56 Stratifying the study sample by ethnic group allows for fair comparisons within homogeneous groups; however, dividing the sample into too many strata decreases the power to detect an effect within each stratum. Genetic principal components are also widely used to minimize confounding by stratification. This method corrects for spurious associations in traits that differ among populations and have different allelic frequencies for the genotype of interest. Most differences in allelic frequencies are thought to have occurred because of genetic drift and may not represent functional variants. 57 Thus, the principal component technique is used to detect and correct for population heterogeneity to minimize false-positive associations. 58 Variance component methods have also recently been developed to adjust for population stratification. 59 Importantly, randomized studies are protected against this bias because randomization balances, on average, the population strata between the drug of interest and placebo groups.
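The principal component idea can be sketched on a toy dataset. The example below standardizes a small, hypothetical genotype matrix (individuals in rows, SNPs coded as 0, 1 or 2 copies of the minor allele in columns) and extracts the first principal component by power iteration on the individual-by-individual similarity matrix. This is a simplified illustration under stated assumptions, not the implementation used in the cited methods; in practice the leading components are computed from genome-wide data and entered as covariates in the association model.

```python
from math import sqrt


def standardize_columns(genotypes):
    """Centre and scale each SNP column; monomorphic SNPs are dropped."""
    n = len(genotypes)
    columns = []
    for col in zip(*genotypes):
        mean = sum(col) / n
        sd = sqrt(sum((x - mean) ** 2 for x in col) / n)
        if sd > 0:
            columns.append([(x - mean) / sd for x in col])
    return [list(row) for row in zip(*columns)]


def first_principal_component(genotypes, iterations=200):
    """Leading eigenvector (unit length) of the similarity matrix Z Z^T / m."""
    z = standardize_columns(genotypes)
    n, m = len(z), len(z[0])
    similarity = [
        [sum(a * b for a, b in zip(z[i], z[j])) / m for j in range(n)]
        for i in range(n)
    ]
    # A non-constant start vector is needed: a constant vector is mapped
    # to zero because every standardized column sums to zero.
    v = [float(i + 1) for i in range(n)]
    for _ in range(iterations):
        w = [sum(s * x for s, x in zip(row, v)) for row in similarity]
        norm = sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


# Hypothetical data: individuals 0-3 and 4-7 come from two subpopulations
# whose allele frequencies differ at the first four SNPs only.
GENOTYPES = [
    [0, 0, 1, 0, 1, 2],
    [0, 1, 0, 0, 2, 1],
    [1, 0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 2],
    [2, 2, 1, 2, 1, 1],
    [2, 1, 2, 2, 2, 1],
    [1, 2, 2, 2, 1, 2],
    [2, 2, 2, 1, 1, 1],
]

pc1 = first_principal_component(GENOTYPES)
print([round(score, 2) for score in pc1])
```

In this toy example the sign of the first component separates the two hypothetical subpopulations, which is why adjusting for it can remove associations that arise only from ancestry differences. The overall sign of the component is arbitrary.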
Genetic pleiotropy
Genetic pleiotropy is the phenomenon in which a single gene is responsible for a number of distinct and seemingly unrelated phenotypic traits. 60 This phenomenon is of special importance to pharmacogenetics because it may confound the pharmacogenetic association. For instance, if the gene of interest is associated with multiple outcomes or intermediate phenotypes, the reported drug-gene interaction may be a result of the underlying gene mechanism and not a product of the drug response. 61 For example, the SH2B3 gene has been associated with multiple phenotypic traits, such as blood pressure,62,63 blood eosinophil number, 64 myocardial infarction, 64 celiac disease, 65 type 1 diabetes, 66 LDLc, 67 asthma, 64 blood platelet number, 68 haemoglobin concentration 69 and haematocrit. 69
Several large trials have observed that lowering LDLc levels, which can be achieved through statin therapy, decreases the risk of atherosclerotic events. 70 The proprotein convertase subtilisin/kexin type 9 (PCSK9) gene encodes a protein that promotes degradation of the LDL receptor, the receptor responsible for clearing LDLc from the circulation. Gain-of-function PCSK9 variants are associated with mild to severe hypercholesterolaemia, while loss-of-function variants are associated with decreased LDLc and decreased risk of cardiovascular events.71,72 Loss-of-function carriers also show a more pronounced decrease in LDLc with statin therapy, 73 and it is difficult to determine to what extent the observed relationship is driven by a pharmacogenetic effect or by the gene effect itself. Therefore, to distinguish whether there is an independent relationship and true effect modification, it is essential to use an RCT design with a control group to see whether the genetic effect occurs in the treatment group alone.
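To illustrate how a control group helps to separate a gene effect from a drug-gene interaction, the sketch below compares the treatment odds ratio in carriers with that in non-carriers and reports their ratio with an approximate 95% Wald confidence interval. All counts are hypothetical, and the ratio of odds ratios is only one of several ways to express effect modification.

```python
from math import exp, log, sqrt


def log_or_and_variance(events_trt, n_trt, events_ctl, n_ctl):
    """Log odds ratio (treatment vs control) and its approximate variance."""
    a, b = events_trt, n_trt - events_trt
    c, d = events_ctl, n_ctl - events_ctl
    return log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d


def interaction_ratio(carriers, noncarriers, z=1.96):
    """Ratio of treatment odds ratios (carriers / non-carriers) with Wald CI."""
    log_or_c, var_c = log_or_and_variance(*carriers)
    log_or_n, var_n = log_or_and_variance(*noncarriers)
    diff = log_or_c - log_or_n
    se = sqrt(var_c + var_n)  # genotype groups are independent samples
    return exp(diff), exp(diff - z * se), exp(diff + z * se)


# Hypothetical trial: (events on drug, n on drug, events on placebo, n on placebo).
CARRIERS = (20, 200, 40, 200)       # placebo risk 20%: a gene effect without drug
NONCARRIERS = (80, 800, 100, 800)   # placebo risk 12.5%

ratio, lower, upper = interaction_ratio(CARRIERS, NONCARRIERS)
print("ratio of odds ratios:", round(ratio, 2))
print("95% CI:", round(lower, 2), "to", round(upper, 2))
```

In this hypothetical trial the placebo arm shows that carriers have a higher risk regardless of treatment, while the ratio of odds ratios addresses whether the drug effect itself differs by genotype. The wide confidence interval also illustrates how imprecise interaction estimates can be, even in a trial of 2000 participants.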
Statistical issues in pharmacogenetics
A major issue in pharmacogenetics is the lack of replication among population-based studies. Possible explanations for the sometimes inconsistent findings are modest effect sizes, small sample sizes and multiple hypothesis testing. In this section, we will discuss sample size and multiple testing issues, and how to address them.
Sample size
The ability to detect a clinically meaningful difference between groups depends on the study sample size. Pharmacogenetic studies must be large in order to have enough statistical power to detect a gene effect, a treatment effect and a drug-gene interaction. 74 The power to detect a statistical interaction depends on the number of SNPs, the allelic frequencies of each SNP and the type of study design. 75 It is unlikely that a common genetic variant will have a large effect on a complex trait, such as drug response. 76 Studies should thus be powered to detect a common or rare variant with a modest or very large effect size, respectively.76,77
Table 2 shows the approximate sample sizes needed to detect a significant gene-drug interaction (assuming 80% power and α= 0.05) by effect size and allelic frequency (among controls). Under these conditions, it is assumed that the genetic variant is causal; however, it is possible that the variant allele is in LD with the actual causal variant, which may require a larger sample size.79,80 If a rare genetic variant is anticipated with a small or modest effect, a sample size of more than 900,000 participants would be required. However, if a common variant with a large effect was expected, then a sample size of approximately 900 participants is needed. These results suggest that the majority of pharmacogenetic studies are underpowered, which may give rise to false-negative or false-positive estimates.
Table 2. Sample size required to detect a drug-gene interaction in a pharmacogenetic study based on minor allele frequency *
* Sample sizes have been calculated based on a drug-gene interaction assuming an additive genetic model. These estimates assume a type I error rate of 0.05, a power of 80% and a baseline risk of an adverse drug reaction among exposed subjects of 10%. Sample sizes were calculated using QUANTO 78
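The dependence of sample size on allele frequency and effect size can be explored with a rough approximation. The sketch below is not the QUANTO calculation and will not reproduce the values in Table 2; it assumes a simplified carrier versus non-carrier coding under Hardy-Weinberg equilibrium, 1:1 allocation to drug or control, and a Wald test of the log interaction odds ratio, whose large-sample variance is the sum of the reciprocals of the eight expected cell counts. The function name and default parameter values are hypothetical.

```python
from math import ceil, log
from statistics import NormalDist


def interaction_sample_size(maf, interaction_or, baseline_risk=0.10,
                            gene_or=1.0, drug_or=1.0, treated_fraction=0.5,
                            alpha=0.05, power=0.80):
    """Approximate total N needed to detect a drug-gene interaction odds ratio."""
    if interaction_or == 1:
        raise ValueError("interaction_or must differ from 1")
    carrier_freq = 1 - (1 - maf) ** 2  # carriers of at least one minor allele
    base_odds = baseline_risk / (1 - baseline_risk)
    inverse_cell_prob_sum = 0.0
    for g, p_g in ((0, 1 - carrier_freq), (1, carrier_freq)):
        for t, p_t in ((0, 1 - treated_fraction), (1, treated_fraction)):
            odds = base_odds * gene_or ** g * drug_or ** t * interaction_or ** (g * t)
            risk = odds / (1 + odds)
            # Probabilities of the 'event' and 'no event' cells in this group.
            inverse_cell_prob_sum += 1 / (p_g * p_t * risk)
            inverse_cell_prob_sum += 1 / (p_g * p_t * (1 - risk))
    normal = NormalDist()
    z_total = normal.inv_cdf(1 - alpha / 2) + normal.inv_cdf(power)
    return ceil(z_total ** 2 * inverse_cell_prob_sum / log(interaction_or) ** 2)


for maf in (0.01, 0.10, 0.30):
    for interaction_or in (1.2, 2.0):
        n = interaction_sample_size(maf, interaction_or)
        print(f"MAF={maf:.2f}, interaction OR={interaction_or}: N ~ {n}")
```

Under these simplifying assumptions the required sample size grows rapidly as the variant becomes rarer or the interaction effect smaller, which is the qualitative pattern described for Table 2.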
For some pharmacogenetic questions, the required sample sizes may be difficult to obtain. The need for large datasets has led to the creation of international consortia where data between investigators are pooled or analysed together, or large population-based biobanks, which store biological materials (i.e. blood or DNA) and demographic information, including drug use. 81 In addition, RCTs now incorporate genetic add-on studies which have the same high internal and external validity and large sample size of the parent RCT, while remaining cost-effective. 82
Multiple testing
Multiple testing refers to the repeated use of a statistical test and the resulting increase in the risk of an overall type I error. 83 Multiple testing arises when there are multiple comparisons in statistical models that contain multiple genes, multiple exposures and multiple interactions. 84 Within these models, it is inappropriate to use the standard P value threshold of 0.05 because as the number of tests increases so does the frequency of type I errors.
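The inflation of the overall type I error rate can be shown with a short calculation. Assuming the tests are independent (an assumption that rarely holds exactly for SNPs in LD), the probability of at least one false-positive among m tests performed at level α is 1 − (1 − α)^m. The function name below is hypothetical.

```python
def familywise_error_rate(num_tests, alpha=0.05):
    """Probability of at least one type I error among independent tests."""
    return 1 - (1 - alpha) ** num_tests


for m in (1, 10, 100):
    print(f"{m} tests: {familywise_error_rate(m):.3f}")
```

With ten independent tests at the 0.05 level the chance of at least one false-positive is already about 40%, and with 100 tests it is close to certainty.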
The most common approach to correct for multiple testing is to use the Bonferroni correction, in which the P value threshold that is used for one test is divided by the total number of tests in the analysis. However, the Bonferroni correction may be considered too conservative because it treats all tests as independent, whereas many SNPs are in LD and are therefore correlated; the resulting over-correction increases type II errors. Furthermore, since many pharmacogenetic studies are underpowered to detect a drug-gene interaction, the Bonferroni correction may nullify the study results. 85 Another possible approach to adjust for multiple testing is to use the false discovery rate (FDR), which is less conservative than the Bonferroni correction. 86 The FDR estimates the expected proportion of false-positives among associations that are declared significant, which is expressed as a q value.
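The difference between the two approaches can be seen in a minimal sketch. The example below applies the Bonferroni correction and the Benjamini-Hochberg step-up procedure, a widely used way of controlling the FDR, to a set of hypothetical P values. The choice of the Benjamini-Hochberg procedure is an assumption made for illustration; other FDR methods exist.

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Reject a test when its P value is at most alpha / number of tests."""
    threshold = alpha / len(p_values)
    return [p <= threshold for p in p_values]


def benjamini_hochberg_reject(p_values, q=0.05):
    """Benjamini-Hochberg step-up procedure controlling the FDR at level q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        # Keep the largest rank whose P value falls under its own threshold.
        if p_values[idx] <= rank * q / m:
            cutoff_rank = rank
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected


# Hypothetical P values from six SNP-by-drug interaction tests.
P_VALUES = [0.001, 0.008, 0.012, 0.041, 0.20, 0.74]

bonferroni = bonferroni_reject(P_VALUES)
fdr = benjamini_hochberg_reject(P_VALUES)
print("Bonferroni significant:", sum(bonferroni))
print("FDR significant:", sum(fdr))
```

For these hypothetical P values the FDR procedure declares one more association significant than the Bonferroni correction, which reflects its less conservative nature at the cost of tolerating a controlled proportion of false-positives.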
Conclusions
Pharmacogenetic studies offer a promising future yet have a challenging present. Personalized medicine has the potential to maximize drug efficacy and minimize the toxic effects; however, there are many issues in study design and analysis that need to be addressed. Large collaborative efforts across biostatisticians, epidemiologists, pharmacologists and clinicians are needed to provide robust evidence to support individualized treatment for improved drug efficacy and safety.
