Abstract
Full kinetic modeling of dynamic PET images requires the measurement of radioligand concentrations in the arterial plasma. The unchanged parent radioligand must, however, be separated from its radiometabolites by chromatographic methods. Thus, only a few samples can usually be analyzed and the resulting measurements are often noisy. Therefore, the measurements must be fitted with a mathematical model. This work presents a comprehensive analysis of the different models proposed in the literature to describe the plasma parent fraction (PPf) and of the alternative approaches for radiometabolite correction. Finally, we used a dataset of [11C]PBR28 brain PET data as a case study to guide the reader through the PPf model selection process.
Introduction
The accurate measurement of parent radioligand concentration in plasma is a major challenge of quantitative PET imaging. Radiolabeled compounds injected into the bloodstream are exposed to a complex and unpredictable chemical environment and thus may break down into one or more metabolites. At least one of these metabolites contains the radioisotope and is therefore termed a radiometabolite. To correctly quantify the binding of a radioligand, the amount of radiometabolites should be taken into account. 1
Depending on the chemical characteristics of the radiometabolites and on the transport mechanism between blood and tissue, the radiometabolites may remain confined to the vascular compartment, migrate into the tissue along with the parent radioligand or even be created inside the tissue. Radiometabolites are often less lipophilic than their parent, and therefore are less likely to cross the blood–brain barrier and enter the brain. Thus, if the radiometabolites are confined to the blood compartment, only the concentration of parent radioligand should be used as input for modeling the tissue kinetics. By contrast, radiometabolites that cross the blood–brain barrier or originate directly inside the tissue 2 must be incorporated into the model as a second input or as an additional compartment, respectively.
Serial arterial blood samples are usually drawn during the PET scan, in order to assess the concentration of parent radioligand over time. Blood samples may be drawn manually or with an automated blood sampling system equipped with an online detector or with a fraction collector. The online detector allows the best definition of the peak, by continuously measuring arterial whole-blood concentrations. However, some manual blood samples are still required to obtain the plasma concentration and to separate the parent from its radiometabolites. The fraction collector instead provides discrete blood measurements as in the manual sampling, but with a higher frequency and more precise timing.
The fraction of unchanged radioligand in plasma (the Plasma Parent fraction or PPf) is measured with techniques such as high-performance liquid chromatography (HPLC), thin layer chromatography or other chromatographic methods. The fast decay of radioactivity, especially with 11C-labeled tracers, limits the total number of samples that can be analyzed by chromatography. Therefore, for kinetic modeling, PPf data points are generally fitted with a mathematical function, with the purpose of obtaining a smooth and continuous PPf curve from a series of discrete noisy samples. Although PPf measurements are sometimes linearly interpolated,3,4 the use of a model is preferable to minimize the impact of measurement errors. 5
The choice of the PPf model is a crucial step for kinetic modeling. Indeed, a carefully selected PPf model allowed Parsey and colleagues 6 to nearly halve the retest variability of the total volume of distribution (VT) of [11C]DASB compared to the results obtained with the PPf model commonly used in the literature. Furthermore, Wu et al. 7 showed that different models can lead to significant differences in both binding potential (
The aims of this review are 1) to overview the most common modeling approaches to correct the plasma input function for radiometabolites and 2) to define guidelines for selecting the optimal PPf model. A dataset of 11 brain PET studies done with [11C]PBR28, a radioligand for the translocator protein, is used as a case study. The approaches developed to obtain a full input function from a limited number of blood samples are discussed in detail. Finally, we will outline some alternative modeling strategies for radiometabolite correction.
Plasma parent fraction modeling
Obtaining a radiometabolite-corrected arterial input function is a multistep procedure (Figure 1) that usually involves:
– Measurement of the whole blood activity
– Separation of plasma from whole blood by centrifugation and measurement of the total activity in plasma
– Analysis of plasma samples by chromatographic methods to determine the fraction of activity that is due to the parent tracer

Figure 1. Blood samples processing for the derivation of the input function. (a) Blood samples are drawn at various time points and their radioactivity is measured with a γ-counter. The samples are then centrifuged to separate the blood cells from plasma. (b) The whole blood activity –

Because of their limited number and the presence of noise in the measurements, PPf data points are rarely used directly. A PPf model is generally fitted to the data in order to extrapolate the missing values and to minimize the impact of measurement errors. The parent concentration in plasma is calculated as
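By definition, the PPf is the fraction of the total plasma radioactivity that belongs to the unchanged parent, so the parent concentration at a given time is the total plasma concentration multiplied by the PPf at that time. The short Python sketch below illustrates this scaling. The function name and the example values are hypothetical, and the sketch assumes that both curves have already been evaluated at the same time points.

```python
def parent_plasma_concentration(total_plasma, ppf):
    """Scale total plasma activity by the parent fraction, sample by sample.

    total_plasma: total plasma activity at each time point (e.g. kBq/mL)
    ppf: fitted parent fraction at the same time points, each in [0, 1]
    """
    if len(total_plasma) != len(ppf):
        raise ValueError("curves must be sampled at the same time points")
    for fraction in ppf:
        if not 0.0 <= fraction <= 1.0:
            raise ValueError("PPf values must lie in [0, 1]")
    return [c * f for c, f in zip(total_plasma, ppf)]


# Hypothetical values, for illustration only
total = [120.0, 80.0, 40.0]      # total plasma activity
fraction = [1.0, 0.75, 0.5]      # fitted PPf at the same times
print(parent_plasma_concentration(total, fraction))  # [120.0, 60.0, 20.0]
```

In practice the fitted PPf model would be evaluated at the plasma sampling times before this multiplication, which is what makes a smooth, continuous PPf curve useful.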
Plasma Parent fraction models used in the literature.
Power models
First proposed by Watabe and colleagues 9 for [11C]MDL 100,907 and then extended by both Meyers and colleagues 11 for [18F]CPFPX and by Hinz and colleagues 12 for [11C]MDL 100,907 again, Power models are characterized by the following general expression
Hill models
A Hill function was first used by Gunn and colleagues 13 to describe the radiometabolite fraction of [carbonyl-11C]WAY-100635. This model was subsequently used to fit the PPf kinetics of many different radioligands such as (R)-[11C]Verapamil, 14 [11C]flumazenil 16 or [11C]NOP-1A. 46 A general expression for the Hill model is given by
The parameters vector is still
Compared to equation (3), this formulation presents one extra parameter
Exponential models
These models are characterized by a (multi)exponential decay. With minor variations, they have been used for [11C]NNC 756, 33 [11C]flumazenil 31 and [11C]-(R)-PK11195, 37 among others. A general formulation is
All parameters of the vector
Modeling the appearance of radiometabolites and the parameters of injection
The aforementioned models can be extended to include a delay term, t0, which represents the interval before radiometabolites appear in plasma.8,11,30,31,43 The models
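The effect of a delay term can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the formulation used in the cited studies: it assumes that the PPf stays at 1 until t0 and then follows the chosen model evaluated at t − t0. The Hill-type function `hill_ppf`, the wrapper `with_delay`, and all parameter values are hypothetical choices made only for the example.

```python
def hill_ppf(t, a, b, c):
    """Illustrative Hill-type parent fraction (assumed form), for t >= 0."""
    tb = t ** b
    return 1.0 - a * tb / (c + tb)


def with_delay(model, t0):
    """Return a PPf function that equals 1 until t0, then follows `model`.

    Assumption: the delayed curve is the original model shifted by t0.
    """
    def delayed(t, *params):
        if t <= t0:
            return 1.0          # no radiometabolites have appeared yet
        return model(t - t0, *params)
    return delayed


# Hypothetical 0.5-min delay and hypothetical Hill parameters
delayed_hill = with_delay(hill_ppf, t0=0.5)
for t in (0.0, 0.5, 5.0, 60.0):
    print(t, round(delayed_hill(t, 0.9, 2.0, 100.0), 3))
```

Wrapping the model rather than rewriting it keeps t0 as one extra parameter that can be either estimated or fixed to 0, which is the comparison made during model selection.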
Moreover, the models might start from an initial value (
Finally, the duration of the bolus injection may affect the initial phase of the PPf curve, because a mixture of newly injected and recirculating radioligand might be present in the first blood samples. 8 To account for the length of the bolus injection, Tonietto and colleagues 8 recently proposed to convolve the model
Accounting for measurement errors
When information on measurement error is available, parameters can be estimated by weighting each data point according to the inverse of its variance. 49 The measurement error of the PPf samples is assumed to be additive, uncorrelated, with zero mean and unknown variance. In most studies, the variance is assumed equal for all samples, which is equivalent to not weighting the data.
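Weighting by the inverse of the variance amounts to minimizing a weighted residual sum of squares. The sketch below, with hypothetical function names and made-up numbers, computes the weighted residuals and the corresponding objective. It covers only the cost function and leaves the choice of optimizer open.

```python
def weighted_residuals(measured, predicted, variances):
    """Residuals divided by the standard deviation of each data point."""
    if not (len(measured) == len(predicted) == len(variances)):
        raise ValueError("all inputs must have the same length")
    if any(v <= 0 for v in variances):
        raise ValueError("variances must be strictly positive")
    return [(y - f) / v ** 0.5 for y, f, v in zip(measured, predicted, variances)]


def weighted_rss(measured, predicted, variances):
    """Objective minimized when each point is weighted by 1 / variance."""
    return sum(r * r for r in weighted_residuals(measured, predicted, variances))


# Made-up PPf samples, model predictions and variances
ppf_measured = [0.95, 0.60, 0.30]
ppf_model = [0.93, 0.64, 0.29]
ppf_variance = [0.0004, 0.0016, 0.0009]
print(weighted_rss(ppf_measured, ppf_model, ppf_variance))
```

When all variances are equal, the weighted objective is simply the ordinary residual sum of squares divided by a constant, so the parameter estimates coincide with those of an unweighted fit.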
However, some studies assumed that the PPf variance is based on Poisson statistics of the area-under-the-curve of parent peaks (
Other studies used a full error propagation of HPLC AUCs for both parent
Another formulation of PPf variance was derived by Wu and colleagues 7 for an HPLC equipped with a fraction collection system. The activity (vi, where i stands for the i-th fraction) and the associated standard deviation (
Model selection
The optimal model must be selected among the existing alternatives on the basis of the quality of data description and reliability of results. 50 The standard approach for model selection involves testing the performance of the various models by calculating parsimony indices such as the Akaike Information Criterion (AIC) 51 or the Bayesian Information Criterion (BIC). 52 These indices balance the accuracy of the fit against the complexity of the model (i.e. they statistically penalize models with more parameters). In fact, models with too many parameters tend to fit also the measurement errors (overfitting) and therefore they may poorly interpolate the missing PPf values. The parsimony indices can be compared with a repeated-measures ANOVA, where the information index for the different models represents the repeated measure. In its simplest form, the comparison of just two models reduces to a paired t-test. 53
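As an illustration of this procedure, the sketch below computes AIC and BIC from a weighted residual sum of squares and compares two models across subjects with a paired t statistic. It assumes the common least-squares forms AIC = n·ln(WRSS/n) + 2k and BIC = n·ln(WRSS/n) + k·ln(n), which are valid up to an additive constant for Gaussian errors; the cited studies may have used other variants. Function names and AIC values are hypothetical.

```python
import math
from statistics import mean, stdev


def aic_least_squares(wrss, n_samples, n_params):
    """AIC for a least-squares fit, up to an additive constant (assumed form)."""
    return n_samples * math.log(wrss / n_samples) + 2 * n_params


def bic_least_squares(wrss, n_samples, n_params):
    """BIC for a least-squares fit, up to an additive constant (assumed form)."""
    return n_samples * math.log(wrss / n_samples) + n_params * math.log(n_samples)


def paired_t_statistic(scores_a, scores_b):
    """t statistic of the per-subject differences (n - 1 degrees of freedom)."""
    if len(scores_a) != len(scores_b) or len(scores_a) < 2:
        raise ValueError("need two equally long lists with at least 2 subjects")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))


# Hypothetical per-subject AIC values for two candidate models
aic_model_a = [-41.2, -38.5, -44.0, -39.9]
aic_model_b = [-45.1, -40.2, -47.3, -41.0]
print(paired_t_statistic(aic_model_a, aic_model_b))
```

A positive statistic here means that model B has the lower AIC on average. The corresponding p-value would be read from a t distribution with n − 1 degrees of freedom, for example with a statistics library.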
Parsimony criteria are informative and straightforward to calculate, but one should not base the model choice solely on them. For example, if the coefficient of variation (CV), which represents the precision of the parameter estimates and is calculated as the ratio between the estimated standard deviation and the expected value of the parameter, is too high (e.g.
Furthermore, once the model is fitted, the prior assumptions on measurement errors must be verified by analyzing the weighted residuals. The weighted residuals can be tested for randomness (using for example the runs test), for normality (Anderson-Darling or Kolmogorov–Smirnov test) and, if known, for variance (Chi-square test for the variance). However, given the small number of samples available, these tests have low statistical power and might not detect violations of the error distribution assumptions. It is therefore more convenient to visualize the residuals of all the subjects together, in order to detect possible polarizations, as shown in Wu et al. 7
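A runs test on the signs of the weighted residuals can be sketched as follows. This is a minimal Wald-Wolfowitz implementation using the normal approximation, which is itself unreliable for the small sample sizes typical of PPf data. The function name and the example residuals are hypothetical.

```python
import math


def runs_test(residuals):
    """Wald-Wolfowitz runs test on residual signs (normal approximation).

    Returns (number of runs, z statistic, two-sided p-value).
    Residuals equal to zero are ignored.
    """
    signs = [r > 0 for r in residuals if r != 0]
    n_pos = sum(signs)
    n_neg = len(signs) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need both positive and negative residuals")
    n = n_pos + n_neg
    runs = 1 + sum(1 for a, b in zip(signs, signs[1:]) if a != b)
    expected = 2.0 * n_pos * n_neg / n + 1.0
    variance = 2.0 * n_pos * n_neg * (2.0 * n_pos * n_neg - n) / (n ** 2 * (n - 1))
    if variance <= 0:
        raise ValueError("too few residuals for the normal approximation")
    z = (runs - expected) / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return runs, z, p_value


# Hypothetical weighted residuals with a systematic drift (polarization)
polarized = [0.8, 0.6, 0.9, 0.4, -0.5, -0.7, -0.3, -0.9]
print(runs_test(polarized))
```

Too few runs (negative z) indicates that the residuals stay on one side of zero for long stretches, which is the pattern expected when a model systematically over- or underestimates part of the curve.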
Population approaches
Fitting a model to the PPf data is often hampered by the limited number of available samples. When the measurements are too sparse and noisy, the fit is uncertain and the PPf can be poorly estimated. An input function with an erroneous shape might entail quantification errors. 54 One possible strategy to solve the problem of sparse sampling is the use of population approaches. These methods postulate that different subjects share similar PPf curves after a bolus injection.
An average radiometabolite curve (naïve average data) would be the easiest approach. This method was successfully applied for [11C]raclopride, 55 [11C]flumazenil, 56 2-[18F]F-A-85380 57 and [18F]FLT. 58 However, this type of radiometabolite correction is rarely possible 32 and must be validated for each tracer. 59 Furthermore, the population used to calculate the average curve should be determined from a group that is comparable to the population under study in terms of age, sex, body weight and clinical condition. [18F]FDPN, for example, displays significant gender-related metabolic differences. 60 [18F]FLT is metabolized in the liver via glucuronidation, 61 so that any disease or therapeutic agent that affects hepatic function is likely to affect the amount of radiometabolites, making it impossible to use data across different cohorts of subjects. Moreover, even within groups of similar subjects, inter-subject variability is almost never negligible 12 and occasional outliers should be expected. Notably, clinical PET protocols are usually performed on a limited number of subjects (about a dozen). So, even the presence of one or two outliers in this small population might significantly influence the results.
A more elegant population approach for radiometabolite correction is the Non Linear Mixed Effect Modeling (NLMEM). 62 Unlike the simple average of population data, NLMEM accounts for both intra- and inter-subject variability. Briefly, NLMEM assumes that the model parameters are characterized by some attributes that do not vary within the population of M subjects (fixed effects, i.e. values that are common to all subjects) and some others that do (random effects, i.e. values typical of a specific subject). Mathematically, this can be written as:
Unconventional approaches
Compartmental models
PPf models are empirical functions whose purpose is to describe plasma parent data, without necessarily taking into account the underlying physiological processes. Accounting for the physiology of the radioligand could nevertheless be possible by implementing compartmental models. Huang and colleagues 63 developed a generalized compartmental model to describe the conversion of an injected radiotracer into its radiometabolites. The model uses the total activity in plasma as input and the concentration of each radiometabolite as output. By identifying the model parameters, the full, noise-free time course of the parent concentration in plasma can be estimated. The main limitation of this approach is that the concentration of each radiometabolite in plasma must be measured. In general, HPLC analyses are optimized to separate the parent from the radiometabolites and not to isolate multiple different radiometabolites. Carson and colleagues 44 simplified this approach by lumping all the radiometabolites in a single compartment. Another variation included a compartment for the red cells. 64 However, due to their complexity and the lack of clear advantages over standard PPf models, 7 compartmental models have been used for only a few radioligands, i.e. [18F]FDOPA,64,65,66 [15O]O2, 63 [11C]raclopride 44 and [11C]Thymidine. 67
Radiometabolite correction without metabolite measurements
The measurement of radiometabolites requires adequate facilities, special care and technical expertise. Therefore, alternative modeling approaches to estimate the input function without actually performing any radiometabolite measurement have been proposed.68,69 These approaches work by simultaneously estimating a modeled metabolite-corrected input function and the tissue parameters using both the tissue and the measured whole-plasma concentrations. Although these methods were successfully applied to [11C]iomazenil 68 and [11C]flumazenil 69 datasets, the authors themselves suggested limiting their use to single tissue compartmental analyses 68 or to rescue studies where metabolite measurements are unavailable. 69
Notably, Shields and colleagues 1 corrected the input functions of [18F]FLT for radiometabolites by using a logarithmic interpolation between a single blood sample measured at 60 min and the value 1 at time zero.
Radiometabolite correction with venous sampling
In theory, all PPf models described above can be applied to venous, rather than arterial, samples, provided that a suitable arteriovenous equilibrium for the radioligand under study exists. Venous sampling is easier, less invasive and would promote a more widespread use of fully quantitative PET studies. As a rule, however, arterial concentrations of a given compound, be it radioactive or not, are never fully consistent with venous concentrations (for reviews, see references 70,71).
After a bolus injection, arteriovenous equilibrium is present only during a transient phase when the net uptake in the tissue is zero. The length of this phase varies among compounds; the phase may be absent for the duration of the analysis, and it never lasts for the whole duration of the arterial input function. 59 In particular, the early arterial peak, when the compound distributes in the tissue, cannot possibly be replicated in the venous blood, whose concentrations rather reflect the uptake and extraction ratio of each particular tissue. Consequently, venous concentrations heavily depend on the sampling site. 59 For instance, the radioligand concentrations found in the vein of the arm would change as a function of the uptake and extraction ratio of the tissue of the hand, which may be different from that of the brain. Finally, inter-subject differences in arteriovenous concentrations are commonly found in both animal and human studies.59,72 Therefore, venous samples should not be used to perform studies under non-steady-state conditions.
Case study: [11C]PBR28
In this section, we present an example of PPf model selection on a dataset of [11C]PBR28 brain PET scans. After selecting the optimal model, we evaluated the accuracy of population methods by progressively reducing the number of available blood samples per subject.
Dataset
Eleven healthy subjects, taken from a previous protocol, 73 were injected intravenously over 1 min with an activity of 680 ± 14 MBq of [11C]PBR28. The protocol was approved by the Ethics Committee of the National Institutes of Health and the study was conducted according to the Declaration of Helsinki. Blood samples (1.0 mL each) were drawn from the radial artery at 15 s intervals until 150 s, followed by 3-mL samples at 3, 4, 6, 8, 10, 15, 20, 30, 40, 50, 60, 75, 90, and 120 min. The PPf was measured with an HPLC on almost every plasma sample, as previously described. 74 In summary, the dataset consisted of 11 PPf curves, each composed of
PPf modeling
AIC scores for the model selection of [11C]PBR28.
Since the PPf value of the first sample (taken 15 seconds after injection) was smaller than 1 (0.95 ± 0.03) in each subject, all equations were implemented considering an initial PPf value (
The estimation of the parameters was performed using a maximum-likelihood non-linear estimator with relative weights. These were defined as the inverse of the variance of each data point, and the variance was calculated using equation (8). A nonlinear constraint was imposed on the extended Hill model in order to ensure its positivity at late times.
Model selection
As a first step, we assessed whether the inclusion of the delay and the injection length significantly improved the fitting of the models, by comparing the AIC indices with paired t-tests.
Including the delay term significantly improved the fitting (i.e. lower AIC) of the exponential models (p = 0.002 against the standard exponential model and p = 0.007 for the version convoluted with injection length), but resulted in a significantly higher AIC for the power models (both standard and convoluted: p = 0.023 and p < 0.001, respectively). No significant differences were found for the Hill functions. Adding the injection length significantly reduced the AIC indices only for the exponential model (p = 0.013). Therefore, t0 and T were added only to the exponential model.
Among the four remaining models (exponential with t0 and T, power, Hill and Hill extended), the extended Hill model had the lowest mean AIC (Table 2), although statistical significance (Bonferroni corrected) was reached only against the basic Hill model (p = 0.014). Notably, the exponential model showed unreliable estimates (CV > 100%), while the power model tended to overestimate the PPf curve around the 20th min (Figure 2a), thus producing a polarization of the weighted residuals. Therefore, both these models were rejected. The basic Hill model did not correctly fit the tails of the PPf curves (Figure 2b) because these did not reach a plateau level but decreased slowly and constantly. Among the tested alternatives, the extended Hill model showed the most reliable estimates (CV: a = 6 ± 9%, b = 0.04 ± 0.06%, c = 10 ± 19%, d = 6 ± 6%, PPf0 = 1 ± 1%). The model adequately fitted the data and the weighted residuals were consistent with the measurement error hypotheses (neither the runs test nor the Anderson-Darling test detected violations of the assumptions) (Figure 2c). Therefore, the extended Hill model was selected to fit [11C]PBR28 PPf curves. Figure 3 shows an example of fit and weighted residuals for a representative subject.
Figure 2. Weighted residual comparison of the PPf models. Mean (black line) and between-subject variance (grey area) of the weighted residuals obtained by fitting the different PPf models to [11C]PBR28 data: (a) power model with t0 constrained to 0; (b) Hill model with t0 constrained to 0; (c) extended Hill model with t0 fixed to 0; (d) convoluted exponential model with estimated t0. The extended Hill model provided the best description of the data and was the only one to yield random residuals.

Figure 3. Plasma parent fraction modeling with the extended Hill model. (a) Fitting of the [11C]PBR28 PPf measurements (open circles) with the extended Hill model (black line) in a representative subject and (b) weighted residuals over time for the same subject.

Tissue estimates were obtained as described in Rizzo et al. 75 using as PPf model the four alternatives selected in the first step. The impact on VT varied according to the PPf model and ranged from
Population approaches
In this section, we compared population-based curves to the individual fitting of the PPf performed with the optimal model selected in the previous section. In particular, we implemented the NLMEM approach as described in Veronese et al. 55 and the naïve average method by averaging the PPf samples of all subjects and then by fitting the average curve with the extended Hill model. For each subject, the PPf curve was thus estimated using the individual and the two population approaches and the estimation was repeated by progressively reducing the number of samples available (i.e. frequent, standard and sparse sampling). For each approach and each sampling frequency, we calculated the residual sum of squares (RSS) between the PPf curve estimated and all the available PPf datapoints, regardless of the samples used in the estimation. The individual fit obtained with frequent sampling was taken as reference. Notably, the number of samples obtained with sparse sampling was not sufficient to estimate the model parameters individually.
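The two bookkeeping steps of this comparison, averaging the PPf samples across subjects and scoring an estimated curve against all available data points, can be sketched as follows. The sketch omits the model-fitting step (the source fits the averaged samples with the extended Hill model) and assumes that all subjects share the same nominal sampling times, with `None` marking a missing sample. Function names and values are hypothetical.

```python
def naive_average(curves):
    """Average PPf samples across subjects at each nominal time point.

    curves: one list per subject, all of equal length; None = missing sample.
    Time points with no data at all are returned as None.
    """
    n_times = len(curves[0])
    if any(len(c) != n_times for c in curves):
        raise ValueError("all subjects must share the same sampling grid")
    averaged = []
    for j in range(n_times):
        values = [c[j] for c in curves if c[j] is not None]
        averaged.append(sum(values) / len(values) if values else None)
    return averaged


def rss(estimated, measured):
    """Residual sum of squares over every time point where both values exist."""
    return sum(
        (m - e) ** 2
        for e, m in zip(estimated, measured)
        if e is not None and m is not None
    )


# Hypothetical PPf samples for three subjects at four nominal times
subjects = [
    [0.96, 0.70, 0.35, 0.12],
    [0.94, 0.62, None, 0.08],
    [0.95, 0.75, 0.41, 0.15],
]
population_curve = naive_average(subjects)
for measured in subjects:
    print(round(rss(population_curve, measured), 5))
```

Scoring against all measured points, including those left out of the estimation, is what allows the reduced-sampling fits to be compared on equal terms.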
NLMEM could reliably recover the individual PPf with all sampling frequencies. The fit obtained with frequent sampling was not statistically different from that obtained with standard sampling (p > 0.05, paired t-test) and only borderline different (p = 0.046) from the fit obtained with sparse sampling (Figure 4).
Figure 4. Comparison between individual, NLMEM and average fitting for the estimation of the PPf curve when the number of samples is reduced. Boxplots of the residual sum of squares (RSS) between the PPf curve estimated with the individual, NLMEM and average approach and all the PPf data available, regardless of the samples used in the estimation. Stars indicate statistical differences (p < 0.05) tested with paired t-tests. The robustness of the methods was tested by progressively reducing the number of data points: frequent sampling: with all the available PPf measurements of each subject (
On the other hand, the model descriptions of the individual method were statistically different between standard and frequent sampling (which is expected, because the individual method does not use any additional information other than the data to analyze).
Finally, the PPf profile obtained from the average approach differed significantly from the profile measured individually (Figure 5). In general, the higher the population variability, the higher the error introduced by the average approach.
Figure 5. Fit obtained with the individual, NLMEM and average approaches in a representative [11C]PBR28 subject for different sampling frequencies. The PPf data points (open circles) of a representative [11C]PBR28 subject were fitted using the individual model (black dashed line), NLMEM (black solid line) and the average approach (grey solid line) with frequent (a), standard (b), and sparse (c) samples.
Conclusion
Correcting the input function for radiometabolites is a crucial step to obtain reliable estimates of tissue parameters. This is usually done by measuring the fraction of activity due to the parent tracer in plasma and then by fitting the data points with a mathematical model. In order to select the optimal plasma model among the many that are available in the literature, one should first preselect a suitable subset of PPf models, tailored to the characteristics of the radiotracer under analysis and to the experimental setting, and then carry out comprehensive comparisons.
In this review, we used a dataset of [11C]PBR28 as case study to guide the reader through the process of model selection and we found that an extended Hill model provided the most accurate description of the PPf shape of this radioligand.
Finally, NLMEM can be used to obtain a robust radiometabolite correction when the number of samples available is too small for standard modeling approaches. Population approaches based on averaged PPf curves are not recommended because they can heavily bias arterial input shapes and therefore the tissue parameters.
Footnotes
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported in part by the Intramural Research Program, National Institute of Mental Health (ZIAMH002795) and by UK Medical Research Council programme (grant no. G1100809/1).
Acknowledgements
The authors are grateful to Dr Robert Innis for making the [11C]PBR28 scans available for this review. Portions of this work were presented in preliminary form at Neuroreceptor Mapping (Amsterdam, 2014).
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Authors’ contributions
MT, GR, MV, and AB contributed significantly to the overall work design. PZ-F, MF and SSZ provided the PET data that were then analysed by MT. SSZ contributed to radiochemistry data interpretation. MT prepared the manuscript with input from MV, GR, PZ-F, and AB.
