Abstract
A critical evaluation of several recent regulatory risk assessments has been undertaken. These relate to propylparaben (as a food additive, cosmetic ingredient or pharmaceutical excipient), cobalt (in terms of a safety-based limit for pharmaceuticals) and the cancer Threshold of Toxicological Concern as applied to food contaminants and pharmaceutical impurities. In all cases, a number of concerns can be raised regarding the reliability of the current assessments. Examples include the absence of data audits, the use of single-dose and/or non-good laboratory practice studies to determine safety metrics, the use of a biased data set and questionable methodology, and a lack of consistency with precedents and regulatory guidance. Drawing on these findings, a set of recommendations is provided to reduce uncertainty and improve the quality and robustness of future regulatory risk assessments.
Introduction
Regulatory authorities and other expert bodies generate a steady stream of toxicological risk assessments, most of which achieve widespread acceptance. In some cases, however, there may be valid outstanding questions regarding the reliability and robustness of one or more aspects of such assessments, relating, for example, to the methodology employed, choice of metrics and/or data integrity. This article first examines a number of examples where it is believed there are valid concerns and then, through a ‘lessons-learned’ approach, identifies a number of precautionary principles that should be taken into account when conducting a risk assessment that has regulatory applicability.
Propylparaben
Propylparaben (PP) is the n-propyl ester of p-hydroxybenzoic acid (PHBA; Figure 1; CAS no. 94-13-3; E 216). An alternative form is sodium PP (CAS no. 35285-69-9; E 217) which will revert to the phenolic form at stomach pH. Other parabens used as pharmaceutical/cosmetic preservatives are the methyl (MP), ethyl (EP) and n-butyl esters (BP) of PHBA. Parabens act synergistically in terms of preservative efficacy and a common combination is MP/PP, concentrations in oral pharmaceutical products generally ranging from 0.015% to 0.2% for MP and 0.02% to 0.06% for PP.

Figure 1. Propylparaben.
The current regulatory status of PP is complex depending on type of application and region/country. In the United States and Canada, there have been no changes in the status of PP for many years. In the United States, PP is affirmed as generally recognized as safe in terms of food use (US Code of Federal Regulations, Title 21 Section 184.1670 1 ) with a maximum level of 0.1% in food. In 2007, the Food and Drug Administration (FDA) issued a statement affirming the safety of parabens (including PP) when used in cosmetics. 2 In Canada, PP is considered to be an acceptable pharmaceutical excipient via the non-medical ingredients list. 3 PP is also a permitted class 2 food preservative in Canada. 4
In 2006, ADIs (acceptable daily intakes) for both forms of PP (E216 and E217) were withdrawn in European Union (EU) member states and they are no longer included in the list of authorised food preservatives. 5 Over the period 2001–2004, a Japanese university researcher (Oishi) published a series of single-author papers on MP/EP, 6 PP 7 and BP. 8 Potential adverse effects on male reproduction in juvenile rats at high oral dietary doses of PP and BP were claimed, but no similar effects were reported for MP or EP. In 2004, the European Food Safety Authority (EFSA) evaluated the Oishi data and concluded that PP (and BP) should no longer be part of the group ADI of 0–10 mg/kg/day for MP, EP, PP and BP. 9 At its 2006 meeting, the FAO/WHO Joint Expert Committee on Food Additives (JECFA) came to a similar conclusion which was published in 2007. 10
A number of subsequent reviews by EU expert committees are available. The Scientific Committee on Consumer Products (SCCP) and Scientific Committee on Consumer Safety (SCCS) evaluated the use of PP in cosmetics, taking into account comments by industry and additional new data: SCCP, 2005a 11 ; SCCP, 2005b 12 ; SCCP, 2006 13 ; SCCP, 2008 14 ; SCCS, 2010 15 ; SCCS, 2011 16 ; and SCCS, 2013. 17
Introductory comments on paraben safety assessment
PP possesses low oestrogenic potency. In a typical assay, the magnitude of the oestrogenic response of various parabens increased with alkyl group size, and it has been reported that oestrogen receptor-binding affinity is 10,000-, 30,000-, 150,000- and 2,500,000-fold weaker (for BP, PP, EP and MP, respectively) than that of the natural ligand, 17β-oestradiol. 9
Parabens are rapidly hydrolysed in vivo by carboxylesterases (mainly in skin, plasma and liver) to PHBA, a common metabolite for all parabens, which is non-toxic, ubiquitous in human nutrition and possesses no or negligible oestrogenic activity. 18
Concerns over potential carcinogenic activity of parabens have been consistently ruled out in the various expert reviews cited above. Kinetic investigations on several parabens have been undertaken in rats, 16,19 and the results are summarized in the most recent SCCS review, as follows: Systemic exposure to parent parabens is extremely low after all routes of administration (oral, subcutaneous (SC) and dermal). Bioavailability is high following oral and SC administration but low following dermal exposure. PP and BP are present in vivo as both free and conjugated entities. PP and BP show broadly similar systemic exposure (area under the curve (AUC) in terms of total radioactivity) following oral and SC administration.
Regulatory assessments have focused on reproductive safety end points.
PP as a food additive
A 4-week study undertaken by Oishi 7 provides data on the effects of PP resulting from exposure to PP at concentrations of 0.01%, 0.1% and 1.0% in the diet of male juvenile rats (eight animals/group), commencing shortly after weaning when the animals weighed about 50 g and were aged 19–21 days (postnatal day (PND) 19–21). The parameters examined were body weight, weights of the male sexual organs, sperm count in the cauda epididymis, sperm count in the testes and testosterone concentration in the blood serum. The dietary concentrations correspond to average body weight-related doses of 10, 100 and 1000 mg/kg/day at the midpoint of the study, whereas at the start of the study, the 50 g rats were receiving close to 20, 200 or 2000 mg/kg/day. The effect of PP treatment on the various parameter values (apart from body weight) is reported at one time point only – namely at the time of sacrifice after 4 weeks of consecutive daily administration of PP at PND 47–49.
None of the male sexual organs showed any decrease in absolute or relative weight that reached statistical significance at any dose. No histopathological evaluation was undertaken on any tissues however. Three main potentially adverse effects were reported as follows: Sperm counts in the cauda epididymis showed a trend towards a treatment-related decrease, although the reduction for the 10-mg/kg dose did not reach statistical significance, mainly because there was considerable variability in the control group. There was an apparent 30% reduction in daily sperm production (DSP) in the testes. However, there was no dose–response, sperm counts being almost identical at all three dose levels. There was an apparent dose-related trend to reduction of circulating testosterone from means of 9.08 ng/mL in control rats to 8.20 ng/mL at 10 mg/kg PP, 7.17 ng/mL at 100 mg/kg PP and 5.86 ng/mL at 1000 mg/kg PP, although statistical significance was achieved only at the highest dose.
On the basis of these findings, EFSA and JECFA withdrew the ADI for PP since no NOAEL could be determined for DSP and other reproductive parameters were adversely affected. However, a critical assessment of the findings casts doubt on their plausibility mainly because the concurrent control values for the affected parameters were significantly higher than those reported by Oishi in studies on MP and EP and well above published historical control values. 20 (The absence of a dose–response with respect to DSP, and the fact that values in test groups were almost identical to control values in the MP/EP studies and to historical control values, should have rung alarm bells at the time of the EFSA/JECFA reviews.) In addition, the mechanism of the alleged adverse effects proposed by Oishi (direct effect on sperm and/or an indirect effect based on oestrogenic effects and/or testosterone reduction) lacks plausibility when subjected to a detailed assessment, for example, based on the fact that the spermicidal concentration of 3.0 mg/mL for PP reported by Song et al. 21 would never be achieved in vivo. When these concerns were raised with SCCS/SCCP, a request was made for access to the raw data from the Oishi studies, but this was denied. The results of an 8-week study on PP in juvenile male rats are now available, the raw data being made available to SCCS for audit purposes. This study, which employed oral gavage administration to PND 21 male rats, higher animal numbers, inclusion of histopathology, toxicokinetics (TKs) and a 26-week reversibility segment, was sponsored by the French drug regulatory agency and showed no adverse effects at doses of 3, 10, 100 and 1000 mg/kg/day. 18 In SCCS’s judgement, the NOAEL is 1000 mg/kg/day and the results of the Oishi study 7 on PP can thus be refuted.
Although the EFSA and JECFA evaluations of PP were predicated almost entirely on the now-discredited Oishi data, neither body has so far indicated any intention of reassessing PP’s food-additive status.
Similar considerations apply to BP in that a re-evaluation by Hoberman et al. 22 found no adverse effects in juvenile male rats (PND 22) at oral doses of up to 1088 mg/kg/day over 8 weeks. SCCS was critical of this study and refused to accept it as a basis for regulatory action. However, all of the criticisms levelled at the study are considered highly unreasonable (particularly in comparison to the deficiencies with the corresponding Oishi study on BP) and have been comprehensively rebutted by Scialli 20 who suggested a conservative NOAEL of 10 mg/kg/day based on the toxicology database as a whole.
PP as a cosmetics ingredient
SCCS has consistently employed the results of a single-dose, non-good laboratory practice (GLP) screening study (Fisher et al. 23 ) on BP as the pivotal safety metric for evaluation of both PP and BP in cosmetics. The study in question involved the SC administration of BP at 2 mg/kg/day to neonatal rats from PND 2–18 with assessment of a variety of reproductive parameters (testis weight, distension of the rete testis and efferent ducts, epithelial cell height in the efferent ducts and immunoexpression of the water channel aquaporin-1). SCCS 17 considers 2 mg/kg/day to be a NOEL rather than a NOAEL in spite of the small number of end points monitored (excluding histopathology) and the absence of TK measurements.
SCCS’s risk assessment (based on a margin of safety of 100 between milligrams per kilogram exposure and the Fisher et al.’s no-effect dose of 2 mg/kg/day) includes two assumptions in relation to estimating worst-case exposure:
a high dermal absorption value of 3.7% and
a cumulative human exposure value of 17.4 g/day to cosmetic products containing lipophilic parabens.
As a consequence, the use of PP and BP as preservatives in cosmetic products is considered as safe to the consumer as long as the sum of their individual concentrations does not exceed 0.19% (SCCS 17 ).
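The exposure arithmetic behind a concentration limit of this kind can be sketched as follows. This is an illustrative calculation only, not the SCCS worksheet: the 60 kg adult body weight is an assumption that is not stated in the text above, and the function and variable names are hypothetical.

```python
def margin_of_safety(noel_mg_per_kg, product_g_per_day, conc_fraction,
                     dermal_absorption, body_weight_kg):
    """Return (systemic exposure dose in mg/kg/day, margin of safety)."""
    # Amount of paraben applied to the skin each day, in mg
    applied_mg = product_g_per_day * 1000.0 * conc_fraction
    # Fraction absorbed through the skin, normalised to body weight
    sed = applied_mg * dermal_absorption / body_weight_kg
    return sed, noel_mg_per_kg / sed


sed, mos = margin_of_safety(
    noel_mg_per_kg=2.0,       # Fisher et al. no-effect dose (mg/kg/day)
    product_g_per_day=17.4,   # cumulative exposure to cosmetic products
    conc_fraction=0.0019,     # 0.19% PP + BP
    dermal_absorption=0.037,  # 3.7% dermal absorption
    body_weight_kg=60.0,      # ASSUMPTION: body weight is not given in the text
)
print(f"Systemic exposure = {sed:.4f} mg/kg/day, margin of safety = {mos:.0f}")
```

Under the assumed body weight, the margin of safety comes out close to the target value of 100, which is consistent with 0.19% being the limiting concentration.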
NC: not calculable; PND: postnatal day; PP: propylparaben.
The GLP study on reproductive toxicity (of PP) has been well conducted and is considered appropriate to refute the study of Oishi which reported reproductive toxicity in juvenile male rats. The toxicokinetic data indicate a rapid and effective metabolism of propylparaben after oral exposure due to rapid and effective hydrolysis of the substance by carboxylesterases. Inactivation of propylparaben by conjugating enzymes plays a minor role.… The study does not cover the potentially sensitive period after birth until PND 21.
SCCS also believed that the relevance of the study for human risk assessment of dermally administered PP is limited because its rapid and effective metabolism in rats is unlike that in humans.
The latter remark is based on a conclusion by SCCS that dermal absorption of intact PP is significantly greater in humans than in rats, a view not shared by other bodies such as the US Cosmetic Ingredient Expert Panel, which estimated human dermal absorption of no more than 1% of unmetabolized parabens. 24 SCCS’s perceptions appear to be based mainly on data from a human TK study involving 26 young adult males with repeated dermal exposure to BP at a daily dose of 10 mg/kg body weight, together with two phthalate esters each at the same dose, for 5 days. 25 Under these extreme conditions (cream applied at 2 mg/cm² body surface area at a total dose of 34–48 g/subject each day over the entire body except for scalp and genitalia and left undisturbed for 20 min before the subjects dressed), the AUC for free BP was estimated at approximately 1600 ng·h/mL. Scialli 20 noted that these conditions are completely unrepresentative of human exposure patterns and that conclusions on free BP dermal bioavailability are flawed owing to the presence of high concentrations of phthalates that compete for the same esterases as BP. Moreover, a rough estimate of systemic exposure to free BP in the Fisher et al. study (2 mg/kg/day SC) can be made, given that the pharmacokinetics of PP and BP are remarkably similar using either oral or SC administration. At PND 21, free PP is ≤1.7% of total AUC, which can be estimated at 300 ng·h/mL at an SC dose of 2 mg/kg/day (Table 1). Assuming in the worst case a ratio of 10% free:total BP, to allow for the fact that younger animals were employed in the Fisher et al. study, the AUC for free BP was extremely unlikely to have exceeded 30 ng·h/mL during any part of the study (representing <2% of the systemic exposure to free BP found in the human dermal TK study). On the other hand, up to 87 times (2600/30) this exposure to free PP produced no adverse effects (admittedly in slightly older animals) when PP was administered by oral gavage. 18
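The rough exposure estimate in the preceding paragraph can be reproduced with a few lines of arithmetic. The sketch below uses only the figures quoted in the text; the variable names are hypothetical, and the 2600 ng·h/mL value is taken from the 2600/30 ratio given above rather than from the underlying study report.

```python
total_auc_sc = 300.0       # ng·h/mL, estimated total AUC at 2 mg/kg/day SC
worst_case_free = 0.10     # worst-case free:total ratio assumed for BP
human_free_auc = 1600.0    # ng·h/mL, free BP in the human dermal TK study
gavage_free_auc = 2600.0   # ng·h/mL, free PP exposure without adverse effects

# Worst-case free BP exposure in the Fisher et al. study
fisher_free_auc = total_auc_sc * worst_case_free

# Fisher et al. exposure as a percentage of the human dermal study exposure
pct_of_human = 100.0 * fisher_free_auc / human_free_auc

# How many times higher the no-effect free PP exposure was after oral gavage
exposure_multiple = gavage_free_auc / fisher_free_auc

print(f"Free BP AUC (worst case): {fisher_free_auc:.0f} ng·h/mL")
print(f"Share of human dermal exposure: {pct_of_human:.1f}%")
print(f"No-effect exposure multiple: {exposure_multiple:.0f}-fold")
```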
Overall, the data provided in the study reported by Gazin et al. 18 did not alter the SCCS evaluation in any way, and Commission Regulation (EU) No. 1004/2014 26 has now been amended (coming into force in April 2015) specifying a maximum applied concentration of 0.19% PP + BP (equivalent to 0.14% expressed as the equivalent PHBA concentration). In addition, PP and BP are prohibited in leave-on cosmetic products designed for application on the nappy area of children below 3 years. MP and EP are restricted individually to a concentration of ≤0.4% and in total to ≤0.8%. The Gazin study on PP showed no effects at doses up to 500 times higher than that used in the pivotal Fisher et al. study on BP, when administered to juvenile rats from PND 21 (which corresponds to a human age of 2 years). Thus the regulation seems unduly and unnecessarily restrictive of PP and BP use levels for children (and adults) ≥2 years.
A further concern relates to the much relaxed requirements in young infants and neonates for MP and EP in that neither of these compounds has been evaluated in neonatal rats commencing with animals aged <PND 21 (Oishi 6 (PND 25 for MP and EP) and Hoberman et al. 22 (PND 22)). If the lack of early-stage male reproductive toxicity data (pre-PND 21) is not considered to impact on the SCCS assessment of MP and EP, why is a similar criterion not applied to PP and BP? Conversely, if the Fisher et al. data are believed to apply to MP and EP, which is questionable since such shorter chain esters are hydrolysed more slowly than PP and BP, then why are there no restrictions with respect to concentration and age similar to those for PP and BP? Overall, there appears to be a less than even-handed treatment of MP/EP compared to PP/BP, and no proof of safety for the former in terms of freedom from male reproductive effects in neonatal patients appears to be available based on SCCS assessments.
PP as a pharmaceutical excipient
EFSA’s withdrawal of the ADI for PP has impacted on its use in pharmaceutical products. For example, the following comment was made in the European Public Assessment Report (2008) for a syrup formulation of lacosamide 27 containing MP + PP: the toxicological information available regarding propyl parahydroxybenzoate showed some inadequacies and uncertainties. Detection of effects on sex hormones and the male reproductive organs in juvenile rats led to concluded recently (sic) that no Acceptable Daily Intake can be recommended for propyl paraben because of the lack of a clear NOAEL. Although this concerns only one study and the human relevance of the adverse effects in juvenile rats is unknown (but cannot be excluded), the intake of propyl paraben when using lacosamide oral solution with the proposed formulation (0.20 mg/mL) cannot be considered as safe. Therefore, the CHMP requested the applicant to initiate a development program to remove the preservative propyl parahydroxybenzoate sodium from the formulation of the lacosamide syrup. The applicant committed to submit in an agreed time frame the necessary regulatory application as a post-approval Follow-Up measure in order to register the reformulated syrup.
A further example is a rufinamide syrup 28 for which the applicant was asked to make a post-approval commitment in respect of reformulating the drug product.
MP and PP are the most frequently employed parabens for preservation of orally administered liquid pharmaceutical products, and in 2013 the European Medicines Agency (EMA) released a draft reflection paper (prepared by the Safety Working Party (SWP)) recommending safe patient exposures. 29 The reflection paper notes that intact parabens bind to oestrogen receptors with an affinity that is at least four orders of magnitude lower than that of the natural ligand, 17β-oestradiol, and that PHBA and other downstream paraben metabolites are considered to be essentially devoid of oestrogenic activity. MP is considered to produce no effects on male reproductive organs in the rat (based on the publications of Oishi 6 and Hoberman et al. 22 ), and use of the EFSA ADI of 0–10 mg/kg/day is proposed. For risk assessment of PP, it is acknowledged that results from the comprehensive GLP-compliant oral toxicity study in PND 21 male rats undertaken by Gazin et al. 18 supersede those of the earlier evaluation by Oishi, 7 with a NOAEL of 1000 mg/kg/day. However, a permitted daily exposure (PDE) of 5 mg/kg/day for PP has been determined from data in a non-GLP study in PND 21 female rats (Vo et al. 30 ) in which PP was administered by oral gavage at 62.5, 250 and 1000 mg/kg/day for 20 days. SWP concluded that PP ‘seemed to induce myometrial hypertrophy without any effect on uterus weight with a NOEL of 250 mg/kg’. The SWP document asserts that studies of PP on embryofetal development are lacking, but this statement is contradicted by information contained in the Hazardous Substances Data Bank, 31 and moreover BP (considered to be a more potent surrogate for PP) at oral doses up to 1000 mg/kg/day produced no adverse effects on reproductive parameters in a conventional rat developmental toxicity study with dosing from gestational days 6 to 19. 32 (SCCS determined a NOAEL of 100 mg/kg/day for this study, which applies to maternal animals.)
The Vo et al. study is mentioned in the most recent SCCS review 17 in which the following comment is made: ‘No guideline study. Recent toxicokinetic data indicate low systemic exposure to BP even at high doses and raise doubt on the relevance of the study’. (No TK evaluation was included in the Vo et al. study.) In addition, the PP PDE of 5 mg/kg/day does not apply to neonates (age < 2 years) owing to concerns regarding the immature metabolic capacity in such individuals, whereas no such exclusion is made for MP. (Such concerns can be questioned since the available data indicate that carboxylesterase activity in infants reaches adult levels by age 3 months. 33 )
PP overview of risk assessments
The four main PP studies relating to potential effects on male reproductive function in neonatal/juvenile animals are shown in Table 2. It is striking that the most comprehensive study (Gazin et al. 18 ) has not been used as a basis of risk assessment for PP when used as a food additive, cosmetic ingredient or pharmaceutical excipient. The food-additive evaluation is still based on a discredited study (Oishi 7 ), which has been superseded by the more recent Gazin et al. study. The SCCS assessment of PP as a cosmetic ingredient is based on a single-dose, non-GLP screening study on a different paraben (BP), and the SWP draft assessment has recommended a PDE based on a non-GLP study that is considered by SCCS to be of doubtful relevance. The SCCS and SWP assessments contain an exclusion of use for PP (and BP but not MP or EP) in human neonates, the reasoning for which is difficult to understand since no evaluation of MP or EP has been undertaken in neonatal rats (<PND 21) whereas BP (and by read-across PP) produced no adverse effects in such animals (admittedly at a low dose of 2 mg/kg/day). In addition, none of the assessments has taken account of the context of exposure to compounds with weak oestrogenic activity, particularly in young children. Nohynek et al. 34 point out that the oestrogenic potency of a typical phytoestrogen such as genistein (an isoflavone) is around 200 times greater than that of PP. According to Soni et al., 35 around 75% of exposure to parabens arises from the use of personal-care products/cosmetics. Gosens et al. 36 have estimated that systemic exposure to PP in children aged 0–3 years is 0.41 mg/kg/day, equivalent to approximately 5 mg/day in a typical child in this age group. On the other hand, the intake of isoflavones in infants fed soy-based formula is reported to be 40 mg/day 37 which in terms of oestrogenic potency is at least 1600 times greater than that related to PP exposure. In a recent publication, Sasseville et al. 69 debunk a variety of myths on parabens (particularly in relation to topical use). They point out that “the food, pharmaceutical, and cosmetic industries are under pressure from scare campaigns in the media and are responding by replacing parabens with other biocides that cause multiple cases, and even worldwide epidemics, of allergic contact sensitization”, thus illustrating how a lack of balance and context in relation to toxicological risk assessments can have unintended adverse consequences.
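The potency-weighted comparison between isoflavone and PP exposure quoted above can be checked with a short calculation. This is a sketch under a simplifying assumption implied by the text, namely that the whole isoflavone intake is treated as having the oestrogenic potency of genistein; the variable names are hypothetical.

```python
genistein_vs_pp_potency = 200.0  # genistein is ~200 times more potent than PP
pp_mg_per_kg_per_day = 0.41      # systemic PP exposure, children aged 0-3 years
pp_mg_per_day = 5.0              # approximate daily PP exposure in a typical child
isoflavone_mg_per_day = 40.0     # intake in infants fed soy-based formula

# Body weight implied by the two PP exposure figures
implied_body_weight_kg = pp_mg_per_day / pp_mg_per_kg_per_day

# Ratio of daily doses, then weighted by relative oestrogenic potency
dose_ratio = isoflavone_mg_per_day / pp_mg_per_day
potency_weighted_ratio = dose_ratio * genistein_vs_pp_potency

print(f"Implied body weight: {implied_body_weight_kg:.1f} kg")
print(f"Dose ratio (isoflavones/PP): {dose_ratio:.0f}")
print(f"Potency-weighted ratio: {potency_weighted_ratio:.0f}")
```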
Table 2. Principal studies in neonatal/juvenile rats relating to regulatory risk assessment of PP.
GLP: good laboratory practice; TK: toxicokinetics; NOAEL: no observed adverse effect level; PND: postnatal day; Y: yes; N: no; PP: propylparaben; BP: n-butylparaben; JECFA: Joint Expert Committee on Food Additives; SCCS: Scientific Committee on Consumer Safety; SWP: Safety Working Party.
Cobalt
Cobalt (Co) is a transition metal that exists in oxidation states +2 and +3, compounds of biological interest being bivalent. Co is an essential element for humans as a component of cyanocobalamin (vitamin B12). The diet is the main source of Co exposure in the general population, dietary Co intake in the United States being estimated as 5–40 µg/day, 38 although levels as high as 82 µg/day have been reported. 39 Co dietary supplements, which are marketed generally in liquid form, have recommended daily doses ranging from 0.2 to 1 mg Co/day. 34
ICH Q3D 40 contains guidance on acceptable levels of metal impurities in pharmaceuticals. In this guidance, Co is classified as a class 2A element, and metal elements in this category are considered to have ‘a relatively high probability of occurrence in drug products and thus require risk assessment across all potential sources of elemental impurities and routes of administration’. The oral PDE for Co is set at 50 µg (parenteral PDE 5.0 µg), and the derivation of the limit for Co can be summarized as follows (slightly edited version):
The oral PDE is based on the available human data. Polycythemia was a sensitive end point in humans after repeated oral exposure to 150 mg of cobalt chloride (CoCl2) for 22 days (approximately 1 mg Co/kg/day). Polycythemia or other effects were not observed in a study of 10 human volunteers (5 men and 5 women) ingesting 1 mg Co/day as CoCl2 for 88–90 days (Tvermoes et al. 41 ). The oral PDE was determined on the basis of the NOAEL of 1 mg/day and a modifying factor of 20.
PDE = 1 mg/(1 × 10 × 2 × 1 × 1) = 0.05 mg = 50 µg, assessment factors being 10 for F2 (general safety factor) and 2 for F3 (short-duration human study).
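The PDE expression above is the NOAEL divided by the product of five modifying factors. A minimal sketch of that calculation is given below; the function name is hypothetical, and the default factor values are simply those used in the expression above (F1, F4 and F5 set to 1, F2 = 10, F3 = 2).

```python
def pde_mg_per_day(noael_mg_per_day, f1=1.0, f2=10.0, f3=2.0, f4=1.0, f5=1.0):
    """Permitted daily exposure: NOAEL divided by the product of F1 to F5."""
    return noael_mg_per_day / (f1 * f2 * f3 * f4 * f5)


# NOAEL of 1 mg Co/day from the human volunteer study
oral_pde_mg = pde_mg_per_day(1.0)
oral_pde_ug = oral_pde_mg * 1000.0
print(f"Oral PDE = {oral_pde_mg:.2f} mg/day = {oral_pde_ug:.0f} µg/day")
```

Writing the factors as keyword arguments makes it straightforward to examine how sensitive the resulting limit is to the choice of any single factor.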
Previous expert assessments
Limits considerably higher than 50 µg Co/day have been determined by other expert groups. For example, 600 and 700 µg Co/day (Agency for Toxic Substances and Disease Registry 41 and EFSA, 42 respectively) based on polycythemia and 1.4 mg/day (Expert Group on Vitamins and Minerals) 43 based on minor testicular effects in animals. Most recently, an extremely thorough review by Finley et al. 34 recommended an oral reference dose (RfD), based on a composite set of endocrine-related end points, of 0.03 mg/kg/day, equivalent to 1.5 mg Co/day (using the standard patient body weight of 50 kg specified in ICH Q3D).
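To put these figures on a common scale, the limits quoted above can be expressed as multiples of the ICH Q3D oral PDE of 50 µg/day. The sketch below uses only values stated in the text; the dictionary keys and variable names are hypothetical labels.

```python
ICH_Q3D_ORAL_PDE_UG = 50.0  # µg Co/day
BODY_WEIGHT_KG = 50.0       # standard patient body weight specified in ICH Q3D

limits_ug_per_day = {
    "ATSDR": 600.0,
    "EFSA": 700.0,
    "Expert Group on Vitamins and Minerals": 1.4 * 1000.0,
    # Oral reference dose of 0.03 mg/kg/day converted to µg/day
    "Finley et al. RfD": 0.03 * BODY_WEIGHT_KG * 1000.0,
}

# Each limit as a multiple of the ICH Q3D oral PDE
fold_over_pde = {
    source: limit / ICH_Q3D_ORAL_PDE_UG
    for source, limit in limits_ug_per_day.items()
}

for source, limit in limits_ug_per_day.items():
    print(f"{source}: {limit:.0f} µg/day "
          f"({fold_over_pde[source]:.0f}-fold above the ICH Q3D oral PDE)")
```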
Biokinetic studies and blood Co/toxicity relationships
A number of studies have investigated blood concentrations following repeated oral administration of soluble Co salts. Finley et al. 44 reported that, at a dose of 1 mg/day over 31 days, mean blood maximum plasma concentration (Cmax) values were 16/33 µg Co/L for males and females, respectively. (There was a 70% reduction in blood Co following 2 weeks off-treatment.) In a longer study (90 days) using the same oral dose, mean blood Cmax was 20/53 µg Co/L for males and females, respectively. 45
According to Paustenbach et al., 46 with particular relevance to Co levels in hip-implant patients, toxic effects would be suspected only if blood Co exceeds 300 µg/L. A similar threshold level has been cited by Finley et al. 40 (based on dose–response relationships) and Paustenbach et al. 47
Derivation of oral PDE for Co
In the case of the ICH Q3D assessment for Co (see above), the effect level (i.e. LOAEL), from the 1958 human study by Davis and Fields, 48 is considered to be 1 mg/kg/day. In fact, 150 mg CoCl2 is equivalent to 68.08 mg Co, which corresponds to a dose of 0.97, 1.13 and 1.36 mg/kg in a patient weighing 70, 60 and 50 kg, respectively.
If the NOAEL (from a different study, Tvermoes et al. 41 ) is considered to be 1 mg/day (mean daily dose = 0.0125 mg/kg/day), then the dose interval between the LOAEL and NOAEL = 68.08/1.0 = 68.08, that is, approximately 70-fold. A dose interval this large makes the comparison a highly inappropriate basis for determining a NOAEL, and the true NOAEL is considered to lie between 1 and 70 mg/day. The Tvermoes et al. study should therefore be looked upon as biochemical/kinetic confirmation of a safe human exposure. The principal author states that ‘our cobalt supplement studies were not designed to determine a NOAEL as we only used one dose’. 49
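The conversion from CoCl2 dose to elemental Co, and the resulting body weight-related doses and LOAEL/NOAEL interval, can be reproduced as follows. This is a sketch assuming the anhydrous salt (the assumption under which 150 mg CoCl2 corresponds to about 68 mg Co); the atomic masses are standard values and the variable names are hypothetical.

```python
CO_ATOMIC_MASS = 58.933  # g/mol
CL_ATOMIC_MASS = 35.453  # g/mol

# ASSUMPTION: anhydrous CoCl2; a hydrate would contain proportionally less Co
cocl2_molar_mass = CO_ATOMIC_MASS + 2 * CL_ATOMIC_MASS
co_fraction = CO_ATOMIC_MASS / cocl2_molar_mass

loael_co_mg = 150.0 * co_fraction  # elemental Co in 150 mg CoCl2
noael_co_mg = 1.0                  # single dose level used by Tvermoes et al.

# Body weight-related LOAEL for patients of different weights
loael_mg_per_kg = {bw: loael_co_mg / bw for bw in (70, 60, 50)}
dose_interval = loael_co_mg / noael_co_mg

print(f"Co content of 150 mg CoCl2: {loael_co_mg:.2f} mg")
for bw, dose in loael_mg_per_kg.items():
    print(f"  {bw} kg patient: {dose:.2f} mg Co/kg")
print(f"LOAEL/NOAEL dose interval: {dose_interval:.0f}-fold")
```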
An additional anomaly with the current ICH Q3D assessment is the use of an F3 factor (which accounts for toxicity studies of short-term exposure) of only 2. F3 factors range from 1 to 10 and are based on fractions of the lifespan of the test species. In rodents, use of an F3 factor of 2 is associated with a study duration of at least 6 months and in non-rodents of at least 3.5 years. Thus, employing an F3 factor of 2 for a 3-month human study appears inconsistent with the guidance in appendix 1 of ICH Q3D, 40 and no justification for this significant guideline deviation is provided. The correct F3 factor is considered to be 10, leading to an oral PDE of 10 µg.
Overview of ICH Q3D derivation of an oral PDE for Co
There are multiple reasons for considering the ICH Q3D oral PDE of 50 µg unduly conservative mainly in relation to dietary exposure in some populations, use of dietary supplements, previous expert assessments, biokinetic data and case reports from patients with metal-containing joint replacements. Furthermore, the key benchmark study employed in the ICH assessment was not designed to determine a NOAEL and employed only one dose level which was some 70-fold lower than a LOAEL determined previously. It is ironic that the study from which this misused NOAEL is derived was reported by the same expert group (Cardno ChemRisk LLC) that produced, based on a highly detailed assessment of the whole database, a much higher oral RfD of 1.5 mg Co. Moreover, the current ICH Q3D evaluation relies on the use of an inappropriate F3 factor of 2, whereas employing what is considered to be the correct F3 factor yields an even lower oral PDE of 10 µg, well within the normal range of dietary intake.
An additional criticism of the ICH Q3D assessment can be made in relation to the parenteral PDE of 5 µg/day, which is based on an assumed oral bioavailability of 10% for Co. By contrast, the evaluation of Co by the Expert Group on Vitamins and Minerals (EVM; 2003) states that very low oral doses of Co are almost completely absorbed. The LOAEL mentioned above is based on administration of CoCl2, a highly water-soluble compound, and its use would no doubt optimize oral bioavailability. Consequently, it is considered that PDEs for oral/parenteral Co of 1500/750 µg are much more realistic than those indicated in ICH Q3D.
Cancer threshold of toxicological concern
Increasing numbers of substances present (as contaminants or impurities) at low concentrations in foodstuffs and pharmaceuticals are now detectable due to improved analytical methods, but for many such substances, little or no toxicological data are available. The threshold of toxicological concern (TTC) approach has been developed in terms of a variety of structure-based limits to make an initial assessment of a substance to determine whether a comprehensive risk assessment is required. In the regulatory context, any substance of interest whose potential consumer/patient exposure is below the relevant TTC limit is considered to be of little or no concern in relation to human safety. TTCs have been derived for both non-cancer and cancer end points, and this section focuses on the latter.
The TTC for cancer end points (0.15 µg/day at an increased cancer risk of 1 in 10^6) was developed originally in the context of food contaminants, the starting point being the compilation of carcinogenic potency data (median toxic doses (TD50s) in mg/kg/day; the cancer equivalent of the median lethal dose (LD50) corresponding to an increased rodent carcinogenicity risk of 1 in 2) derived from an assembly of ‘carcinogenic’ compounds listed in the CPDB 50 (Carcinogenic Potency Database). Kroes et al. 51 used linear extrapolation of the TD50 values (i.e. dividing the TD50 by 500,000 to accommodate extrapolation of cancer risk from 1 in 2 to 1 in 10^6) for around 730 substances, which produced the default limit noted above in relation to most carcinogens except for those in the ‘cohort of concern’: aflatoxin-like, azoxy- and N-nitroso compounds. In this derivation, structural-alert considerations related to carcinogenicity.
Subsequently, a default TTC of 1.5 µg/day (using a risk adjustment to 1 in 10^5, and an assumed body weight of 50 rather than 60 kg) was adopted by the EMA 52 /FDA 53 as a standard lifetime limit in pharmaceuticals for mutagenic impurities or those with a structural alert for mutagenicity. (Thus, the cancer TTC limits adopted for mutagenic impurities in foods and pharmaceuticals correspond to TD50s of 1.25 and 1.5 mg/kg/day, respectively.) The same default cancer TTC is now part of the ICH M7 guidance 54 (at step 4 at the time of writing), structural alerts being judged in relation to mutagenicity in bacterial reverse mutation assays.
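The correspondence between the two TTC limits and their underlying TD50 values follows directly from linear extrapolation and can be verified with the sketch below. The function name is hypothetical; the inputs are the limits, risk levels and body weights quoted above.

```python
def implied_td50_mg_per_kg(ttc_ug_per_day, target_risk, body_weight_kg):
    """Back-calculate the TD50 (mg/kg/day) that a TTC limit corresponds to.

    Linear extrapolation: the TD50 is the dose at a risk of 1 in 2, so the
    limit at the target risk is the TD50 divided by (0.5 / target_risk).
    """
    extrapolation_factor = 0.5 / target_risk
    td50_ug_per_day = ttc_ug_per_day * extrapolation_factor
    return td50_ug_per_day / body_weight_kg / 1000.0  # µg/day to mg/kg/day


# Food contaminants: 0.15 µg/day, risk of 1 in 10^6, 60 kg body weight
td50_food = implied_td50_mg_per_kg(0.15, 1e-6, 60.0)
# Pharmaceutical impurities: 1.5 µg/day, risk of 1 in 10^5, 50 kg body weight
td50_pharma = implied_td50_mg_per_kg(1.5, 1e-5, 50.0)

print(f"Food TTC corresponds to a TD50 of {td50_food:.2f} mg/kg/day")
print(f"Pharmaceutical TTC corresponds to a TD50 of {td50_pharma:.2f} mg/kg/day")
```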
Critique of TTC derivation
Three key elements contribute to the derivation of the cancer TTC: data set composition; determination of the mutagenicity/carcinogenicity status and carcinogenic potency of compounds in the data set; and linear extrapolation of carcinogenic potency values for such compounds.
Data set composition
The data set employed by Kroes et al., 47 claimed to contain 730 ‘carcinogens’ as noted above, was not included in the publication and seems to have been ‘lost’. A presumably similar data set from an earlier publication (Cheeseman et al. 55 ) is available, however, and contains 706 substances (709 claimed, 706 listed). In the Cheeseman et al. data set, at least 20% of the compounds belong to the ‘cohort of concern’ and around another 30% contain a variety of structural alerts (e.g. polycyclic amines, strained heterocyclic rings and hydrazines). The remaining 50% are tagged as ‘non-structurally alerting’, although in fact a significant number contain classical alerts for mutagenicity such as aromatic amines, aromatic nitro compounds and C-halo compounds. The CPDB is considered to be biased in its makeup since the majority of the compounds included were tested at high doses on a ‘for-cause’ basis and are thus skewed towards expected carcinogens. Fung et al. 56 distinguished between suspect chemicals (mutagenic or containing a structural alert) and those tested on the basis of being produced in high volumes. In their data set, suspect chemicals accounted for 86% of chemicals with at least one positive result in carcinogenicity bioassays, and suspect chemicals also comprised 90% of chemicals testing positive in two species. Delaney 57 has challenged the representativeness of the Cheeseman TTC data set, indicating that it contains ‘many classes of potent carcinogens of historic concern’.
Dewhurst and Renwick 58 claim that the ‘CPDB database is broadly representative of the world of chemicals’ and reference a publication by Bassan et al. 59 demonstrating that the (trimmed) database, containing a homogeneous set of 579 structures, occupied a broadly similar molecular space to a random data set of 502 structures obtained from the EPA DSSTox database. The terminology used by Dewhurst and Renwick can be confusing because the ‘CPDB database’ generally does not refer to the CPDB as a whole, but to a particular subset, for example, a subset of 609 structures taken from an updated version of the Cheeseman et al. data set. 51 Since the CPDB structures evaluated by Bassan et al. were a subset of an all-carcinogen data set (containing both genotoxic and non-genotoxic carcinogens), it seems highly implausible that they could be representative of the ‘world of chemicals’, an extremely vague term that is not defined in any of the relevant publications. Overall, the all-carcinogen TTC data set has been acknowledged to be significantly skewed towards relatively few structural classes of (potent) carcinogens, and the world-of-chemicals argument seems to be a retrospective attempt to defend the data set from accusations of lack of representativeness.
A further confirmation of the skewed nature of the cancer TTC data set is provided by Galloway et al. 60 They surveyed over 100 diverse synthetic routes for pharmaceutical APIs and concluded that ‘The most commonly used classes are alkylating agents and aromatic amines. These cover a wide range of carcinogenic potencies, but most are substantially less potent than the carcinogens from which the TTC was derived’.
Determination of carcinogenicity status and potency of compounds in the data set
The lowest oral TD50 at a statistical significance of p ≤ 0.01 was selected by Kroes et al. for each compound in the data set, a process described by Dewhurst and Renwick as conservative, which on the face of it seems to be a prudent-yet-benign approach. (TD50s cited in the standard CPDB are harmonic-mean values.) However, the profound consequences of taking such an approach, which are not acknowledged or discussed, are twofold:
Creation of many false carcinogens: these are compounds that are reported in the CPDB as producing no positive test as judged by the investigators using normal interpretation criteria, including a dose–response relationship. There are numerous examples of non-carcinogens thus transformed into ‘carcinogens’, often of high potency, and information on two such compounds (sulphisoxazole and sodium chloride) indicates that actual safe doses are 4–6 orders of magnitude higher than those predicted by linear extrapolation.
An increase in the perceived carcinogenic potency for many compounds: for example, for o-toluenesulphonamide and methyl methanesulphonate, there are 10,600- and 180-fold differences, respectively, between the Cheeseman et al. and CPDB TD50 values.
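The effect of choosing the lowest TD50 rather than a harmonic mean can be illustrated numerically. The sketch below uses invented TD50 values for a single hypothetical compound; it does not reproduce the CPDB rules for deciding which experiments enter the harmonic mean, nor the significance filtering applied by Kroes et al.

```python
from statistics import harmonic_mean

# Hypothetical TD50 values (mg/kg/day) from three bioassays of one compound.
# These numbers are invented purely for illustration.
td50_values = [10.0, 40.0, 40.0]

# Selecting the single lowest value (the most potent result).
lowest = min(td50_values)

# Harmonic mean of all values: n / sum(1 / x).
h_mean = harmonic_mean(td50_values)

# How much more potent the compound appears when the lowest value is used.
fold_difference = h_mean / lowest

print(lowest, round(h_mean, 3), round(fold_difference, 3))  # 10.0 20.0 2.0
```

Because the minimum of a set of positive values can never exceed their harmonic mean, selecting the lowest TD50 can only leave the apparent potency unchanged or increase it; the size of the increase depends on how widely the individual results are spread.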
Linear extrapolation of carcinogenic potencies
Model-free linear extrapolation of TD50 values at a cancer risk level of 1 in 10⁶ was applied by Cheeseman et al. 51 and Kroes et al. 47 to all compounds in the data set irrespective of their genotoxicity status. This approach is inappropriate for non-genotoxic carcinogens, for which the normal regulatory approach is to assume a threshold exists. 61 The extrapolation technique is acknowledged by Cheeseman et al. 51 as being highly conservative, in that it is likely to exaggerate risk by up to two orders of magnitude, particularly for non-genotoxic carcinogens. The fact that non-genotoxic carcinogens are considered to be thresholded is also mentioned by EFSA, 62 but there is no further discussion regarding the bias created by using linear extrapolation of carcinogenic potency values for such compounds.
Another EU expert committee (Scientific Committee on Occupational Exposure Limits) has recently updated its guidance, 63 in which it is stated:
There is growing recognition that carcinogenic risk extrapolation to low doses (and standard setting) must consider the mode of action of a given chemical. So far, there is agreement to distinguish between genotoxic and non-genotoxic chemicals, yet further differentiations seem appropriate … four main groups of carcinogens and mutagens in relation to setting OELs:
So it is considered that the application of linear extrapolation to non-genotoxic carcinogens (actually non-mutagenic based on the results of bacterial reverse mutation assays according to the CPDB) is an additional source of bias.
ICH M7 (step 4)
In ICH M7, 50 ‘Assessment and control of DNA-reactive (mutagenic) impurities in pharmaceuticals to limit potential carcinogenic risk’, the use of a default TTC of 1.5 µg/day, applying to lifetime exposure, is justified as follows:
A Threshold of Toxicological Concern (TTC) concept was developed to define an acceptable intake for any unstudied chemical that poses a negligible risk of carcinogenicity or other toxic effects. The methods upon which the TTC is based are generally considered to be very conservative since they involve a simple linear extrapolation from the dose giving a 50% tumor incidence (TD50) to a 1 in 10⁶ incidence, using TD50 data for the most sensitive species and most sensitive site of tumor induction. For application of a TTC in the assessment of acceptable limits of mutagenic impurities in drug substances and drug products, a value of 1.5 μg/day corresponding to a theoretical 10⁻⁵ excess lifetime risk of cancer, can be justified.
Although not explicitly stated in ICH M7 (but mentioned in the prior EMA 48 and FDA 49 guidance documents), the derivation of the default TTC relies on the methodology of Kroes et al. 47 In note 5 of ICH M7, monofunctional alkyl chlorides are discussed as follows with reference to a publication by Brigo and Müller: 64
Compared to multifunctional alkyl chlorides the monofunctional compounds are much less potent carcinogens with TD50 values ranging from 36 to 1810 mg/kg/day (n = 15; epichlorohydrin with two distinctly different functional groups is excluded). A TD50 value of 36 mg/kg/day can thus be used as a still very conservative class-specific potency reference point for calculation of acceptable intakes for monofunctional alkyl chlorides.
[In fact a lifetime limit of 15 µg/day for monofunctional alkyl chlorides is stated in ICH M7.]
Not mentioned is the fact that Brigo and Müller employed methodology significantly different to that of Kroes et al. 51 in that:
Their original data set contained 27 compounds (obtained from the CPDB).
Twelve compounds were eliminated on the basis of being non-mutagenic and/or non-carcinogenic.
Harmonic mean TD50s were employed.
So two markedly different methodologies have been used for the determination of limits in ICH M7, and in an Addendum to ICH M7, 65 various other approaches have been employed in the determination of compound-specific limits for commonly used reagents.
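As a rough arithmetic check on the class-specific reference point quoted from note 5 of ICH M7, the TD50 of 36 mg/kg/day can be extrapolated linearly in the same way as the default TTC. This is a minimal sketch, assuming the 1 in 10⁵ risk level and 50 kg body weight described earlier for pharmaceutical impurities; the function name is hypothetical, and the calculation does not attempt to reproduce how the 15 µg/day limit in ICH M7 was actually set.

```python
# Sketch only: the risk level and body weight are the assumptions stated
# in the lead-in, not values taken from note 5 of ICH M7 itself.

def linear_limit_ug_per_day(td50_mg_per_kg_day, one_in=100_000, body_weight_kg=50):
    """Linearly extrapolate a TD50 (risk of 1 in 2) to a risk of 1 in `one_in`."""
    scaling = one_in / 2  # 50,000 for a risk of 1 in 100,000
    return td50_mg_per_kg_day * 1000.0 / scaling * body_weight_kg


# Class-specific reference TD50 for monofunctional alkyl chlorides.
extrapolated = linear_limit_ug_per_day(36)

# Ratio of the extrapolated value to the 15 ug/day limit stated in ICH M7.
ratio_to_stated = extrapolated / 15

print(round(extrapolated, 2), round(ratio_to_stated, 2))  # 36.0 2.4
```

Under these assumptions, direct extrapolation from the reference TD50 yields a figure somewhat above the stated 15 µg/day, which itself equals ten times the default TTC of 1.5 µg/day.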
Overview of TTC derivation
As described above, multiple sources of bias form an integral part of the Kroes et al. 51 publication on cancer TTC derivation. The main factors are an unrepresentative and non-transparent data set; use of the lowest statistically significant TD50 value, which leads to the phenomenon of false carcinogens and exaggerates carcinogenic potency in many cases; and failure to distinguish between mutagenic and non-mutagenic carcinogens before performing linear extrapolation of potency data. It is perplexing that the highly flawed derivation of Kroes et al. has been allowed to stand given the increasing importance of the TTC in regulatory toxicology. Moreover, no data audit is possible and no independent confirmation has been undertaken using more appropriate methodology (such as that employed by Brigo and Müller).
Discussion
In Table 3, a summary of the various flaws and shortcomings in the three regulatory case studies is provided.
Table 3. Questionable aspects of featured regulatory risk assessments.
✓: applies; ✗: does not apply; GLP: good laboratory practice; TTC: threshold of toxicological concern; PP: propylparaben; BP: n-butylparaben; Co: cobalt; NOAEL: no observed adverse effect level; JECFA: Joint FAO/WHO Expert Committee on Food Additives; EFSA: European Food Safety Authority; FDA: Food and Drug Administration; MP: methylparaben; EP: ethylparaben.
Based on this limited exercise involving three examples, the following provisions are considered likely to contribute to best practice in regulatory risk assessments:
Any study considered relevant to a particular risk assessment should be critically assessed for inconsistencies, flaws and discrepancies, particularly in relation to literature data on key parameters.
The available evidence should be ranked and rated in relation to its robustness, for example, in terms of GLP compliance, numbers of animals and so on, in a similar manner to the evaluation of clinical evidence. 66
Use of a key study that is non-GLP or deficient in any other aspects (e.g. screening study, low numbers of animals and limited critical end points) should be carefully justified.
Before a key study is used to support a regulatory standard, a raw-data audit should be performed, and the findings should be subjected to independent confirmation.
Employing data from a single-dose study should be avoided wherever possible in line with comments by Lewis et al. 67 on adverse and non-adverse effects in relation to NOAEL determination.
Assessments should be consistent with appropriate precedents and applicable regulatory guidance.
Had the above principles been followed, much more reliable evaluations of PP, Co and the cancer TTC would have ensued.
Curiously, the key features listed above are not addressed in a draft EFSA document 68 concerning guidance on dealing with uncertainty in scientific assessments. This document focuses on terminology and statistical issues that are considered to be secondary to such fundamental issues as establishing data integrity and using appropriate toxicity metrics.
Footnotes
Conflict of interest
The author declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author received no financial support for the research, authorship, and/or publication of this article.
