
Abstract

Current methods using a single PET scan to detect voxel-level transient dopamine release—using F-test (significance) and cluster size thresholding—have limited detection sensitivity for clusters of release small in size and/or having low release levels. Specifically, simulations show that voxels with release near the peripheries of such clusters are often rejected—becoming false negatives and ultimately distorting the F-distribution of rejected voxels. We suggest a Monte Carlo method that incorporates these two observations into a cost function, allowing erroneously rejected voxels to be accepted under specified criteria. In simulations, the proposed method improves detection sensitivity by up to 50% while preserving the cluster size threshold, or up to 180% when optimizing for sensitivity. A further parametric-based voxelwise thresholding is then suggested to better estimate the release dynamics in detected clusters. We apply the Monte Carlo method to a pilot scan from a human gambling study, where additional parametrically unique clusters are detected as compared to the current best methods—results consistent with our simulations.

Keywords

Denoising, dopamine, lp-ntPET, Monte Carlo, PET

Introduction

Detection of stimulus- or task-related dopamine (DA) release is relevant for investigating neurodegeneration or response to reward/learning in both healthy subjects and those with pathological conditions such as addiction. A double [¹¹C]raclopride (RAC—a D2 receptor antagonist) scan protocol, with one scan performed at baseline and the second after intervention, is often used to estimate the intervention-induced DA release.^1,2 This approach, however, requires the dopaminergic system to be in steady-state and is not well-suited to determine transient release. An alternative approach using the linear parametric neurotransmitter PET model (lp-ntPET)³ has been developed for this purpose: here a subject is scanned in a baseline state during tracer uptake, then presented with a task or stimulus approximately mid-scan; a transient increase in synaptic DA concentration corresponds to a transient decrease in the RAC time activity curve (TAC), which is modeled using lp-ntPET. This approach has been applied on a voxel-level,^4,5 where voxel TACs are fit with two kinetic models: the multilinear reference tissue model (MRTM),⁶ which does not incorporate a DA-induced perturbation in its modeling, and lp-ntPET, which does. A voxelwise comparison of the two fits using an F-test followed by significance thresholding yields a binary (not necessarily contiguous) significance mask indicating voxels of significant release. To further control for false positives, a voxel cluster size threshold (CST)^4,5,7 is imposed, such that all clusters containing a number of voxels lower than the threshold size are removed. The CST thus defines the smallest detectable DA release structure.

The resultant significant clusters constitute data-defined regions of interest (ROI). The ROI-level transient release patterns elucidate unique release dynamics in the corresponding subregions of the brain; one may observe release associated with anticipation of a reward in the ventral striatum,^8–10 or in the dorsal striatum from learning stimulus-action-reward associations.^11–13 Previous work verified the efficacy of the voxelwise method in a smoking study, where sex differences in the location, timing, and magnitude of the DA response to smoking were found.¹⁴

As we shall show, this approach is well-suited to detecting relatively large clusters containing relatively high DA release. However, smaller clusters containing lower release are often not detected by either the initial significance thresholding or the subsequent CST; the small signal changes are often obscured by noise and biased by partial volume effects (PVE), resulting in a subset of true release voxels having F-values below the significance threshold.

This motivated the design of a novel Monte Carlo (MC) approach that allows detection of DA release in voxels below the significance threshold—provided that the voxel behavior meets a priori criteria incorporated into a cost function. Through simulating short-lived DA release of various magnitudes, we compare the detection sensitivity (i.e. true positive rate) and parametric accuracy of the standard F-test-based method (denoted “standard” method) and the MC method. We demonstrate that the MC method can be tailored to increase detection sensitivity without increasing the CST. Additionally, since the MC method does not modify the image data, the voxel-level parametric accuracy is preserved by definition.

In such a voxelwise approach, the aim is to detect neurophysiological changes on the smallest possible volume—the voxel—yet the CST reintroduces a regional constraint on detectability. Relatively large regions with low release may also not satisfy the CST requirement, since only a portion of the region affected by DA release may be detected. It has been demonstrated that image denoising can be applied to increase detection sensitivity; however, this further increases the CST through correlating neighbouring voxels, hence extending the size of pre-existing false positive clusters.^5,15 Thus, minimizing the CST is essential for ensuring regions small in size and/or having low release are reliably detected. Parametric accuracy within these clusters is also paramount to accurately compare the release dynamics subregionally, between different study cohorts, or in response to different stimuli.

In this work, we begin by assessing the CST and sensitivity in the standard method through simulation of two RAC scans. We then apply the MC method to the same simulations to demonstrate its improved performance over the standard method. Finally, we verify the efficacy of the MC method beyond our simulations by comparing the results of the standard and MC methods applied to a pilot scan from a human DA release-inducing gambling study. With the ground truth unknown in this case, the results are evaluated in terms of consistency with observations from the simulations.

Theory

Kinetic modelling

In a single-scan protocol to detect transient DA release, the voxel-level RAC TACs are fit with two kinetic models. The first, MRTM, assumes the system is in steady-state, and thus does not model system responses to transient DA release

$$C_{T}(t) = R_{1} C_{R}(t) + k_{2} \int_{0}^{t} C_{R}(u)\, du - k_{2a} \int_{0}^{t} C_{T}(u)\, du \tag{1}$$

where $C_{T}(t)$ is the concentration in the target voxel and $C_{R}(t)$ is the average concentration in the reference region. $R_{1}$ is the ratio between the rate constants describing the tracer transport from plasma to tissue in the target and reference regions, $k_{2}$ [s⁻¹] is the rate constant describing the efflux of tracer from the non-specific binding compartment to the plasma, and $k_{2a}$ [s⁻¹] is the effective rate constant describing the efflux of tracer from the non-specific and specific binding compartments to the plasma. $k_{2}$ and $k_{2a}$ combine to evaluate the non-displaceable binding potential¹⁶

$$BP_{ND} = \frac{k_{2}}{k_{2a}} - 1 \tag{2}$$
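To illustrate how equation (1) reduces to a linear regression, the following minimal sketch builds the three regressors with trapezoidal integration and solves the unweighted normal equations in pure Python. The helper names (`cumtrapz`, `solve_linear`, `fit_mrtm`) are hypothetical and are not taken from the published workflow; the time unit of the input sets the unit of the fitted rate constants.

```python
def cumtrapz(y, t):
    """Cumulative trapezoidal integral of y(t), starting at 0 for t[0]."""
    out = [0.0]
    for i in range(1, len(y)):
        out.append(out[-1] + 0.5 * (y[i] + y[i - 1]) * (t[i] - t[i - 1]))
    return out


def solve_linear(a, b):
    """Solve a small dense system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_mrtm(t, c_t, c_r):
    """Ordinary least-squares fit of equation (1); returns R1, k2, k2a, BP_ND."""
    int_r, int_t = cumtrapz(c_r, t), cumtrapz(c_t, t)
    # Regressors multiplying R1, k2 and k2a (note the minus sign on the last one).
    cols = [c_r, int_r, [-v for v in int_t]]
    ata = [[sum(p * q for p, q in zip(ci, cj)) for cj in cols] for ci in cols]
    atb = [sum(p * q for p, q in zip(ci, c_t)) for ci in cols]
    r1, k2, k2a = solve_linear(ata, atb)
    return r1, k2, k2a, k2 / k2a - 1.0  # equation (2)
```

A weighted fit, as used in the text, would scale each row of the regressors and of the target TAC by the square root of its frame weight before forming the normal equations.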

The second model, lp-ntPET, adds an additional term accounting for transient decreases in the TAC due to DA release

$$C_{T}(t) = R_{1} C_{R}(t) + k_{2} \int_{0}^{t} C_{R}(u)\, du - k_{2a} \int_{0}^{t} C_{T}(u)\, du - γ \int_{0}^{t} h(u)\, C_{T}(u)\, du \tag{3}$$

where $γ$ [s⁻¹] is a proxy for the release strength and $h(t)$ is a gamma variate function describing the release dynamics

$$h(t) = θ(t - t_{D}) \left(\frac{t - t_{D}}{t_{P} - t_{D}}\right)^{α} \exp\left[-α \left(\frac{t - t_{P}}{t_{P} - t_{D}}\right)\right] \tag{4}$$

where $θ(t)$ is the Heaviside function, $t_{D}$ and $t_{P}$ denote the onset and peak of DA release, respectively, and $α$ controls the “sharpness” of the curve.
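The gamma variate of equation (4) can be evaluated directly. The short sketch below is illustrative only; the function name `gamma_variate` is hypothetical. By construction the curve is zero up to $t_{D}$ and equals one at $t = t_{P}$.

```python
import math


def gamma_variate(t, t_d, t_p, alpha):
    """Release time course h(t) of equation (4); t, t_d, t_p share one time unit."""
    if t <= t_d:
        return 0.0  # Heaviside term: no release before onset
    x = (t - t_d) / (t_p - t_d)
    return x ** alpha * math.exp(-alpha * (t - t_p) / (t_p - t_d))
```

For example, with $t_{D} = 35$ min, $t_{P} = 40$ min and $α = 2$, `gamma_variate(40, 35, 40, 2)` evaluates to 1.0, the peak of the curve.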

The decrease in RAC binding (and hence $BP_{ND}$) is transient, and thus we define a time-dependent $BP_{ND}(t)$ and PET-estimated DA release level, $DAR(t)$

$$BP_{ND}(t) = \frac{k_{2}}{k_{2a} + γ h(t)} - 1, \qquad DAR(t) = 100 \times \left(1 - \frac{BP_{ND}(t)}{BP_{ND}(0)}\right) \tag{5}$$

Together, these metrics serve as physiological proxies for the magnitude and temporal characteristics of DA-induced system responses.
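As a worked instance of equation (5), the sketch below evaluates $BP_{ND}(t)$ and $DAR(t)$ for a given value of $h(t)$. The function names and the rate constants used in the usage note are hypothetical and chosen only for illustration; the sketch also assumes $h(0) = 0$, which holds whenever release starts after the beginning of the scan.

```python
def bp_nd_t(k2, k2a, gamma, h):
    """Time-dependent binding potential of equation (5) for a given h(t)."""
    return k2 / (k2a + gamma * h) - 1.0


def dar_percent(k2, k2a, gamma, h):
    """PET-estimated DA release level DAR(t) in percent, equation (5)."""
    bp_0 = bp_nd_t(k2, k2a, gamma, 0.0)  # baseline value, assuming h(0) = 0
    return 100.0 * (1.0 - bp_nd_t(k2, k2a, gamma, h) / bp_0)
```

With the illustrative values $k_{2} = 4 \times 10^{-3}$, $k_{2a} = 1 \times 10^{-3}$ and $γ = 6.6 \times 10^{-4}$ (all in s⁻¹), the baseline $BP_{ND}$ is 3 and the release level at the peak of the curve ($h = 1$) is roughly 53%.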

Significance and cluster size thresholding

After fitting with MRTM and lp-ntPET, a voxelwise F-test is computed using the weighted residual sum of squares for each fit. TACs indicative of DA release are assumed to be significantly better fit by lp-ntPET. The probability that the additional transient term in lp-ntPET provides a better fit is given by the cumulative distribution function evaluated at the F-value. Significance thresholding at some probability, corresponding to some $F_{crit}$ , provides an initial control for false positives. Further control is provided by cluster size thresholding: after simulating many baseline scans and applying the initial significance thresholding, cluster sizes are computed for each scan, and cluster sizes smaller than a given threshold (the CST) are rejected. Following previous work, we define the CST as the cluster size threshold that ensures 90% of baseline simulations are completely free of false positives.^4,5
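The CST definition above can be sketched as follows: find the largest false positive cluster in each baseline simulation, then pick the smallest threshold that leaves at least 90% of the simulations empty. This is a minimal illustration, not the published implementation; the function names are hypothetical, and defining clusters by six-neighbour connectivity is an assumption made here.

```python
import math
from collections import deque

OFFSETS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def cluster_sizes(mask):
    """Sizes of the connected clusters in a set of (x, y, z) voxel coordinates."""
    unseen = set(mask)
    sizes = []
    while unseen:
        queue = deque([unseen.pop()])
        size = 0
        while queue:  # breadth-first flood fill of one cluster
            x, y, z = queue.popleft()
            size += 1
            for dx, dy, dz in OFFSETS:
                nb = (x + dx, y + dy, z + dz)
                if nb in unseen:
                    unseen.remove(nb)
                    queue.append(nb)
        sizes.append(size)
    return sizes


def cluster_size_threshold(baseline_masks, clean_fraction=0.9):
    """Smallest size T such that at least clean_fraction of the baseline masks
    contain no cluster of T or more voxels."""
    largest = sorted(max(cluster_sizes(m), default=0) for m in baseline_masks)
    k = math.ceil(clean_fraction * len(largest))  # scans that must end up empty
    return largest[k - 1] + 1
```

Clusters in a release scan would then be kept only if their size is at least the returned threshold.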

A novel Monte Carlo method for improving detection sensitivity

All voxels not passing the F-test significance threshold or the CST are classified as baseline voxels. The F-test assumes the F-values of all baseline voxels follow an F-distribution. In simulations of scans with both baseline and release voxels where the ground truth was known, the distribution of F-values from true baseline voxels closely follows the theoretical F-distribution, but the distribution of F-values from voxels classified (not necessarily correctly) as baseline by the detection algorithm does not, due to a large number of false negatives (Figure 1(a)). While relatively high voxel-level noise is primarily responsible for these false negatives, the problem is exacerbated by PVE correlating release voxels with nearby baseline voxels. In such voxels, the temporal pattern is contaminated by the pattern from the neighbouring baseline voxels, which diminishes the DA-induced decrease in the TAC and ultimately makes them harder to detect.

Figure 1.

(a) Histogramming the F-values of ground truth baseline (i.e. non-release) voxels from simulations containing clusters of DA release, as well as the F-values of the voxels classified (not necessarily correctly) as baseline voxels by the standard detection algorithm. (b) Two representative coronal slices of the parametric $γ$ images for the cluster release simulation. Striatal boundaries are defined by white lines, and original cluster boundaries are defined by black lines. PSF filtering induces a gradient in the $γ$ values within each cluster, as well as spillover of release beyond the initial cluster boundaries. (c) Upper pane: simulated noiseless TACs of the putamen, with release levels quantified by various $γ$ . Lower pane: the corresponding simulated noisy voxel TACs after reconstruction and denoising. (d) The corresponding noiseless $DAR (t)$ for the TACs of (c).

The problem, then, is to find a binary local clustering of release voxels with the global classification state of baseline voxels obeying an F-distribution. To solve this, we borrow the mathematics of MC approaches used in Ising spin models of ferromagnets.^17–19 Theoretically, such models are similar: in the Ising model, clustering of neighbouring spins is promoted under the constraint of the global states obeying a Boltzmann distribution. Here we replace the spins with voxels and the Boltzmann distribution with an F-distribution. Let us denote a release voxel as having the state $s = 1$, and a baseline voxel as having the state $s = -1$. Also let $\hat{X}(F)$ be the observed pseudo-continuous F-distribution of baseline voxels and $X(F)$ be a true F-distribution. Then we can define a cost function, $E$, composed of a parity term, $P$, that rewards local clustering, and a regularizer term, $R$, that ensures the classified baseline voxels approximate a true F-distribution

$$E = -\sum_{\langle ij \rangle} s_{i} s_{j} + \frac{β}{F_{\max}} \int_{0}^{F_{\max}} \frac{[X(F) - \hat{X}(F)]^{2}}{X(F)}\, dF = -P + βR \tag{6}$$

where $\langle ij \rangle$ indicates each voxel’s six nearest neighbours in 3D, $F_{\max}$ is the maximum observed F-value in the region, and $β$ is a hyperparameter controlling regularization.

We use the following iterative method, leveraging the MC Metropolis algorithm,^18,19 to find the minimum of E. The method begins by initializing each voxel in a brain region randomly with $s = \pm 1$ , forming a random global state. From here, the algorithm proceeds as follows:

1. Choose a voxel at random¹ and set $s \to -s$.

2. Compute $ΔE$ based on this change.

3. If $ΔE < 0$, always accept the change. If $ΔE \geq 0$ then:

   (a) If $s = 1 \to -1$ (i.e. release to baseline), then the acceptance probability² of the change is set to the value of the probability density function for F at the voxel, $PDF(F)$.

   (b) If $s = -1 \to 1$ (i.e. baseline to release), then the acceptance probability of the change is set to $1 - PDF(F)$.

4. Repeat for all other voxels.

This constitutes one iteration of the algorithm. We accept changes according to $PDF (F)$ (instead of $CDF (F)$ , as in the standard method) as per our observations that the global baseline voxel distribution follows the F-distribution; this is encouraged in two ways: (i) changes resulting in $Δ E < 0$ contain explicit F-distribution regularization through $Δ R$ , and (ii) the acceptance frequency of changes resulting in $Δ E > 0$ follows the expected F-distribution, $PDF (F)$ . This means that over several changes, the distribution of F-values for voxels classified according to the described criteria will approximate an F-distribution.

As we shall show, the algorithm reaches a stable minimum cost function value after a few iterations. In subsequent iterations, likely global states—under the constraints of the cost function—are shuffled through probabilistically (introducing small oscillations in the value of the cost function), so that the percentage of iterations a voxel has $s = 1$ can be interpreted as a confidence, $c$ , that DA release exists in that voxel. Unlike F, this new metric $c$ incorporates knowledge of both the global baseline voxel distribution and local spatial neighbourhood.
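The iteration described above can be sketched in pure Python as follows. This is a rough illustration under stated assumptions, not the authors' implementation: the function names, the histogram binning of the regularizer (`n_bins`), the (4, 20) degrees of freedom of the F-distribution, the default `beta`, the three burn-in iterations, and the choice to visit every voxel once per iteration in random order are all assumptions made here.

```python
import math
import random

OFFSETS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def f_pdf(x, d1=4, d2=20):
    """Density of the F-distribution with (d1, d2) degrees of freedom."""
    if x <= 0:
        return 0.0
    log_beta = math.lgamma(d1 / 2) + math.lgamma(d2 / 2) - math.lgamma((d1 + d2) / 2)
    return math.exp(0.5 * d1 * math.log(d1 / d2) + (0.5 * d1 - 1) * math.log(x)
                    - 0.5 * (d1 + d2) * math.log(1 + d1 * x / d2) - log_beta)


def mc_confidence(f_map, beta=10.0, n_iter=100, burn_in=3, n_bins=20, seed=0):
    """f_map: {(x, y, z): F >= 0}. Returns {(x, y, z): c}, the fraction of
    post-burn-in iterations in which the voxel is in the release state s = +1."""
    rng = random.Random(seed)
    voxels = list(f_map)
    f_max = max(f_map.values())
    width = f_max / n_bins
    pdf_bins = [f_pdf((b + 0.5) * width) for b in range(n_bins)]
    bin_of = {v: min(int(f_map[v] / width), n_bins - 1) for v in voxels}
    state = {v: rng.choice((-1, 1)) for v in voxels}  # random initial global state
    counts = [0] * n_bins  # histogram of F over voxels currently labelled baseline
    for v in voxels:
        if state[v] == -1:
            counts[bin_of[v]] += 1

    def regularizer():
        # Discretised R of equation (6): distance between observed and true density.
        n_base = sum(counts)
        total = 0.0
        for c, x in zip(counts, pdf_bins):
            x_hat = c / (n_base * width) if n_base else 0.0
            total += (x - x_hat) ** 2 / x * width
        return total / f_max

    hits = dict.fromkeys(voxels, 0)
    for it in range(burn_in + n_iter):
        rng.shuffle(voxels)
        for v in voxels:
            s_old = state[v]
            nn_sum = sum(state.get((v[0] + dx, v[1] + dy, v[2] + dz), 0)
                         for dx, dy, dz in OFFSETS)
            r_old = regularizer()
            counts[bin_of[v]] += 1 if s_old == 1 else -1  # tentative flip
            delta_e = 2 * s_old * nn_sum + beta * (regularizer() - r_old)
            if delta_e < 0:
                accept = True
            elif s_old == 1:  # release -> baseline
                accept = rng.random() < f_pdf(f_map[v])
            else:  # baseline -> release
                accept = rng.random() < 1 - f_pdf(f_map[v])
            if accept:
                state[v] = -s_old
            else:
                counts[bin_of[v]] -= 1 if s_old == 1 else -1  # undo the flip
        if it >= burn_in:
            for v in voxels:
                hits[v] += state[v] == 1
    return {v: hits[v] / n_iter for v in voxels}
```

Thresholding the returned map at a chosen $c_{crit}$ would give the binary significance mask. The parity change is computed locally, since flipping $s_{i}$ changes $-P$ by $2 s_{i} \sum_{j} s_{j}$ over the six neighbours, while the regularizer is recomputed from the running histogram.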

Since there is no theoretical confidence interval for $c$ , more flexibility exists for choosing a critical value for significance thresholding, $c_{crit}$ . Once selected, the same significance thresholding followed by cluster size thresholding applied in the standard method is applied in the MC method. For instance, $c_{crit}$ can be chosen such that the resulting CST computed is equivalent to the standard method.

Methods and materials

Simulation design

To compare the MC method to the standard method, we devise two RAC simulations (Figure 1(b) to (d)):

Baseline: all voxels are free of DA release; used to compute the CST.

Cluster release: six clusters of release throughout the striatum, modulating low/high release and various cluster sizes (Table 1).

Table 1.

Detection algorithm and cluster naming conventions.

Cluster name	Location	Release level $γ$ [s⁻¹] × 10⁴	Size [voxels]	Size [cm³]
Large size/high release	Left putamen	6.6	500	0.91
Small size/low release	Left putamen	3.3	300	0.54
Medium size/high release	Left caudate	6.6	400	0.72
Medium size/low release	Right caudate	3.3	400	0.72
Small size/high release	Right putamen	6.6	300	0.54
Large size/low release	Right putamen	3.3	500	0.91

Detection Algorithm	Reconstruction	Post-processing (kernel size)	Average voxelwise sensitivity [%]	CST [voxels]
SM-CST	HYPR-AU-OSEM	HYPR (5 mm)	34.9	108
SM-sensitivity	OSEM	Post-filter (5 mm); HYPR (5 mm)	48.5	235
MC-CST ($\{β, c_{crit}\} = \{8 \times 10^{5}, 70\%\}$)	HYPR-AU-OSEM	HYPR (5 mm)	46.0	104
MC-sensitivity ($\{β, c_{crit}\} = \{8 \times 10^{4}, 65\%\}$)	HYPR-AU-OSEM	HYPR (5 mm)	56.0	173

We generate the simulations from measured data of a healthy control RAC scan. The subject was scanned in the baseline state for 75 min after a 16 mCi bolus injection on the Siemens High Resolution Research Tomograph (HRRT).²⁰ An anatomical MRI was acquired on a Philips Achieva 3 T scanner, which was segmented into 41 regions using Freesurfer (surfer.nmr.mgh.harvard.edu).^21–24 The segmented MRI image and the RAC image were co-registered using SPM12 (Wellcome Centre for Human Neuroimaging, https://www.fil.ion.ucl.ac.uk/spm/software/spm12), allowing us to compute regional TACs. Each TAC was fit using SRTM2²⁵ and the obtained kinetic parameters were used to generate noiseless TACs assigned to the voxels of each region. To simulate voxel-level TACs with DA release, we modified the baseline TACs using lp-ntPET with $t_{D} = 35$ min, $t_{P} = 40$ min, $α = 2$ , and various values of $γ$ .

The release and baseline TACs were discretely sampled to 27 frames (1 min × 3, 3 min × 24), forming a noiseless dynamic dataset. These images were then filtered using a 2.5 mm isotropic Gaussian to simulate the point spread function (PSF) of the HRRT. These filtered noiseless images serve as our ground truth, as no PSF-modeling is included in our reconstructions. From here the images are forward-projected while including the effects of attenuation (from the attenuation map of the healthy control), detector normalization, and frame duration. Poisson noise is added to the obtained noiseless sinograms, which are then reconstructed and post-processed; 50 noisy realizations are generated for each simulation.

Previous work generated noisy simulations by adding Gaussian noise scaled to the noise levels observed on the HRRT scanner to noiseless TACs (i.e. directly in image space, thus not simulating the effects of the PSF, attenuation, normalization and reconstruction).^4,5 We also add noise to our noiseless dataset in this manner (hereafter: “Gaussian noise” simulations) to evaluate consistency with published results and the effect of our more accurate data modeling on detection sensitivity outcomes.

Reconstruction and denoising protocols

Previously,¹⁵ we found that the standard method can be optimized for sensitivity (SM-sensitivity) when the input images are reconstructed with Ordinary Poisson OSEM²⁶ (OP-OSEM: 16 subsets, 6 iterations), followed by a 5 mm Gaussian filter and 5 mm HighlY constrained backPRojection (HYPR) post-processing²⁷ for noise reduction. This maximized sensitivity comes with the trade-off of high CST and low parametric accuracy. Conversely, the standard method can be optimized for CST (SM-CST) by using a denoised reconstruction applying the HYPR operator after each update of OSEM, HYPR-AU-OSEM,²⁸ with a 2.5 mm kernel size followed by 5 mm HYPR post-processing, producing images with high parametric accuracy, but low detection sensitivity.

We thus apply the MC method to the HYPR-AU-OSEM images only, in an effort to maintain high parametric accuracy and low CST, while increasing the sensitivity to levels comparable to SM-sensitivity. In summary, two reconstruction methods are used for the standard method, and a single one for the MC method.

Optimizing the MC method

In optimizing $β$, we histogrammed the cost function term contributions and found $β = 8 \times 10^{4}$ results in $P$ being weighted more than $βR$, whereas $β = 8 \times 10^{6}$ results in $βR$ dominating. Thus, we test $β \in \{8 \times 10^{4}, 4 \times 10^{5}, 8 \times 10^{5}, 4 \times 10^{6}, 8 \times 10^{6}\}$ as hyperparameter candidates.

Since the algorithm employed in the MC method is non-deterministic, we want to ensure that the metric of interest (detection sensitivity) has low variability after multiple trials of the same input (parametric F image). This was done by running 100 trials of the MC method applied to a single noisy realization of the cluster release simulation and taking the mean and standard deviation of the sensitivities found.

Analysis workflow

Analysis of the simulated noisy baseline data is first performed to compute the CST for each method. We adopt the previously published workflow^4,5: each striatal voxel TAC is fit with MRTM and lp-ntPET, with 819 basis functions for lp-ntPET constructed from permutations of $t_{D} \in [30, 50]$ min and $t_{P} \in [t_{D} + 1.5, t_{end} - 5]$ in 1.5 min intervals, and $α \in \{0.25, 1, 4\}$. Note that the exact ground truth basis function values are intentionally not provided; the likelihood of guessing them correctly a priori is extremely small.
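The basis function grid can be enumerated as a quick check of the count. The sketch below assumes $t_{end} = 75$ min (the scan length) and a 1.5 min lattice anchored at $t_{D} = 30$ min; the function name `basis_grid` is hypothetical. Under these assumptions the enumeration yields 14 onset times, 273 valid $(t_{D}, t_{P})$ pairs and, with three values of $α$, the 819 basis functions quoted above.

```python
def basis_grid(t_end=75.0, step=1.5, td_min=30.0, td_max=50.0,
               tp_margin=5.0, alphas=(0.25, 1.0, 4.0)):
    """Enumerate (t_D, t_P, alpha) triples on a regular lattice, times in minutes."""
    grid = []
    n_td = int((td_max - td_min) / step) + 1  # onset times within [td_min, td_max]
    for i in range(n_td):
        t_d = td_min + i * step
        j = 1
        # Peak times run from t_D + step up to t_end - tp_margin.
        while t_d + j * step <= t_end - tp_margin:
            for alpha in alphas:
                grid.append((t_d, t_d + j * step, alpha))
            j += 1
    return grid
```

Each triple would parameterize one gamma variate $h(t)$, for which the remaining lp-ntPET parameters are linear.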

Fits are applied using weighted linear least squares. Since HYPR post-processing uses a composite image formed from a sum of the decay-uncorrected dynamic frames to denoise each frame, we assume each decay-uncorrected denoised frame should have roughly equal noise levels. Taking this information into consideration, we use an “uncorrected” weighting scheme which performs ordinary least squares fitting on the uncorrected TACs.

Consistent with the previously published workflow, we continue by computing voxelwise F-values.³ For the standard F-test method, we threshold at $p < 0.05$ ( $F_{crit} = 2.866$ ) to generate a binary significance mask, from which the CST is computed. For the MC method, the parametric F images are fed to the MC algorithm to generate a parametric $c$ image from 100 iterations. Thresholding at a chosen $c_{crit}$ generates the binary significance mask, from which the CST is computed. In other words, the MC method modifies the standard method by replacing F with $c$ in the significance thresholding step.
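The voxelwise F-value for two nested fits follows the usual extra-sum-of-squares form. In the sketch below, the parameter counts (3 for MRTM, 7 for lp-ntPET when $t_{D}$, $t_{P}$ and $α$ are counted) are assumptions rather than values stated in the text; with 27 frames they give (4, 20) degrees of freedom, for which the 5% critical value of the F-distribution is about 2.87, in line with the quoted $F_{crit} = 2.866$. The function name is hypothetical.

```python
def f_statistic(wrss_mrtm, wrss_lp, n_frames=27, p_mrtm=3, p_lp=7):
    """Extra-sum-of-squares F-value comparing MRTM (simpler) with lp-ntPET."""
    df1 = p_lp - p_mrtm    # extra parameters in lp-ntPET (assumed)
    df2 = n_frames - p_lp  # residual degrees of freedom of the lp-ntPET fit
    return ((wrss_mrtm - wrss_lp) / df1) / (wrss_lp / df2)


F_CRIT = 2.866  # threshold quoted in the text for p < 0.05
```

For instance, weighted residual sums of squares of 160 (MRTM) and 100 (lp-ntPET) give an F-value of 3.0, which passes the threshold, whereas 150 and 100 give 2.5, which does not.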

With the CST determined, the same analysis is applied to the cluster release simulations to determine significant clusters of release. When analyzing each of the six clusters, we compute two metrics: the voxelwise sensitivity (the voxel-level true positive rate) and the binary sensitivity (the percentage of realizations in which any portion of the true release cluster is detected).⁵ We then assess parametric accuracy and precision in the detection of the release dynamics. First, the mean absolute error (MAE) of $t_{D}$ and $t_{P}$ is computed within each cluster, reflecting the voxel-level accuracy in the timing parameters. Second, for all detected true release voxels, $BP_{ND}(t)$ and $DAR(t)$ are computed to 1.5 min temporal resolution. At each time point, the median and median absolute deviation are taken over all detected voxels within a cluster. These estimated curves are visually compared to the ground truth.
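The three summary metrics can be written compactly when masks are stored as sets of voxel coordinates. This is a minimal sketch with hypothetical function names and a set-based representation chosen only for illustration.

```python
def voxelwise_sensitivity(detected, truth):
    """Percentage of true release voxels that were detected (true positive rate)."""
    return 100.0 * len(detected & truth) / len(truth)


def binary_sensitivity(detections, truth):
    """Percentage of realizations in which any part of the true cluster is detected."""
    hits = sum(1 for detected in detections if detected & truth)
    return 100.0 * hits / len(detections)


def mean_absolute_error(estimates, true_value):
    """MAE of voxel-level timing estimates (e.g. t_D or t_P) against the ground truth."""
    return sum(abs(e - true_value) for e in estimates) / len(estimates)
```

Binary sensitivity is the more forgiving of the two: a realization counts as a hit even if only a single voxel of the true cluster survives thresholding.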

Further parametric thresholding

Anticipating that voxel-level $BP_{ND}(t)$ and $DAR(t)$ estimates may contain outlying values, we devise a voxelwise parametric thresholding within each detected cluster to include only the voxels with locally similar release dynamics for computing a representative median $BP_{ND}(t)$ and $DAR(t)$ for each cluster. Using a sliding window approach, we take the $i$th detected voxel in the parametric $t_{D}$ image and compute the average absolute deviation (AD) of $t_{D}^{i}$ from its up to 124 neighbouring $t_{D}$ values in 3D (since we assume the timing characteristics will be relatively homogeneous over this volume), which form the set $N_{i}$

$$AD(t_{D}^{i}) = \frac{1}{\|N_{i}\|_{0}} \sum_{j \in N_{i}} |t_{D}^{i} - t_{D}^{j}| \tag{7}$$

The same operation is applied to the parametric $t_{P}$ image to yield $A D (t_{P})$ . The $A D (t_{D})$ and $A D (t_{P})$ images are converted to binary masks by thresholding for $A D (t_{D}) \leq 4$ min and $A D (t_{P}) \leq 6$ min, followed by a voxelwise AND operator being applied to the two binary masks to yield a single binary mask. These thresholds are taken from $MAE (t_{D})$ and $MAE (t_{P})$ computed in our simulations (presented in the Results). In doing this, only voxels performing approximately equal to or better than average in estimating the release dynamics will survive for quantitatively analyzing $B P_{N D} (t)$ and $DAR (t)$ —the other voxels, while still successfully detected, remain informative on the spatial extent of DA release, but do not offer sufficiently high parametric accuracy to warrant their inclusion in a quantitative investigation of the release dynamics.
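The parametric thresholding can be sketched as below, with the parametric images stored as dictionaries over detected voxels. The function names are hypothetical, and two details are assumptions made here: the neighbourhood is taken as the detected voxels inside a 5 × 5 × 5 window (which gives up to 124 neighbours), and a voxel with no detected neighbours is rejected.

```python
from itertools import product


def average_abs_deviation(t_map, half_width=2):
    """AD of equation (7) for every voxel in t_map = {(x, y, z): timing value}."""
    window = range(-half_width, half_width + 1)
    offsets = [o for o in product(window, repeat=3) if o != (0, 0, 0)]
    ad = {}
    for (x, y, z), t_i in t_map.items():
        neighbours = [t_map[(x + dx, y + dy, z + dz)] for dx, dy, dz in offsets
                      if (x + dx, y + dy, z + dz) in t_map]
        if neighbours:
            ad[(x, y, z)] = sum(abs(t_i - t_j) for t_j in neighbours) / len(neighbours)
        else:
            ad[(x, y, z)] = float("inf")  # assumption: isolated voxels are rejected
    return ad


def parametric_mask(td_map, tp_map, td_max=4.0, tp_max=6.0):
    """Voxels passing both AD thresholds (voxelwise AND of the two binary masks)."""
    ad_d = average_abs_deviation(td_map)
    ad_p = average_abs_deviation(tp_map)
    return {v for v in td_map if ad_d[v] <= td_max and ad_p[v] <= tp_max}
```

In a cluster of four voxels with $t_{D} = 35$ min and one outlier at 50 min, each consistent voxel has an AD of 3.75 min and survives, while the outlier has an AD of 15 min and is removed.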

This approach may additionally be able to separate parametrically unique clusters that have been merged by PSF- or post-processing-induced PVE; in the “bridge” connecting the two clusters, there will exist a spatial gradient in timing parameters. Consequently, the average absolute deviation will be higher along the bridge, making these voxels more likely to be rejected by the parametric threshold.

Application to a human gambling study

To verify the performance of these methods on real human data, we analyze a pilot scan for a human study with the Vancouver Gambling Task²⁹ as the mid-scan stimulus. The subject was given a 20 mCi bolus RAC injection and was presented with the task 36 min into a 75 min scan on the HRRT. The same reconstruction methods were used as discussed, and the reconstructed data were corrected for detector normalization and dead-time, attenuation, scatter, randoms, and radioactive decay. Framing was 1 min × 5, 2 min × 5, 3 min × 20. An anatomical MRI was acquired on the Philips Achieva 3 T. A rigid inter-frame realignment was performed on the PET images using SPM12, before resampling the MRI to PET space and segmenting the striatum using Freesurfer.

Results

Determining $β$ and $c_{crit}$ in the MC method

For each tested value of $β$, we plot the average voxelwise sensitivity over all clusters in the cluster release simulation versus CST, for decreasing values of $c_{crit} \in [65\%, 90\%]$ as one moves left to right along a curve (Figure 2(a)). The sensitivity and CST for the standard methods and the “Gaussian noise” simulation are included in the same plot for comparison (single values). The “Gaussian noise” simulation results, compared to those of the more realistic noise simulation, demonstrate that not accounting for the realistic effects of attenuation, normalization, and reconstruction affects the simulation outcomes: it artificially decreases the estimate of CST and increases the estimate of detection sensitivity.

Figure 2.

(a) Voxelwise sensitivity (averaged over all clusters in the cluster release simulation) as a function of CST for various values of $β$ in the MC method. The voxelwise sensitivity and CST for the standard methods and Gaussian noise simulation are provided for reference. Moving along the $β$ curves from left to right, $c_{crit}$ is decreasing from 90% to 65%. $c_{crit} = 70\%$ is denoted by an open circle for each $β$. (b) Spatial distributions of $c$ for a single realization of the cluster release simulation, for various $β$. The $c$ maps are thresholded at a constant $c_{crit} = 70\%$ to derive binary sensitivity masks. (c) Thresholding the $c$ map for $β = 8 \times 10^{5}$ from (b) at various $c_{crit}$ to derive binary sensitivity masks. (d) Upper pane: A typical progression of the cost function in the MC method. Lower pane: histogrammed cost function term contributions over 100 iterations of the MC method for the two $β$ used in optimizing the MC method.

The effect of $β$ alone on sensitivity versus CST is negligible; all curves have near-identical trajectories. This is because $β$ and $c_{crit}$ are coupled: if we wish to use the MC method at a given CST, all values of $β$ can yield similar sensitivity, but the $c_{crit}$ at which this sensitivity is achieved is different for each $β$. For a fixed $c_{crit} = 70\%$, increasing $β$ reduces over-clustering (Figure 2(b)), but at the cost of reduced sensitivity. The same trade-off is found for a fixed $β = 8 \times 10^{5}$ while increasing $c_{crit}$ (Figure 2(c)). As a result, the binary sensitivity masks for $\{β, c_{crit}\} = \{8 \times 10^{4}, 70\%\}$ and $\{β, c_{crit}\} = \{8 \times 10^{5}, 65\%\}$ are nearly identical. We see from the distributions of $-ΔP$ and $βΔR$ in Figure 2(d) that the parity term is dominant for $β = 8 \times 10^{4}$, which will promote over-clustering unless a high $c_{crit}$ is chosen. For larger $β$, the parity term contribution becomes much smaller, and over-clustering is avoided for most $c_{crit}$ tested.

Thus, we proceed by optimizing the MC method in a manner similar to the standard method: by using $\{β, c_{crit}\} = \{8 \times 10^{4}, 65\%\}$, we can obtain maximal sensitivity (56.0%), which occurs at a CST of 173 voxels; thus, this is the MC method optimized for sensitivity (MC-sensitivity). However, if we wish to maximize sensitivity without increasing the low CST of SM-CST (108 voxels), we can use the parameters $\{β, c_{crit}\} = \{8 \times 10^{5}, 70\%\}$, which produces a sensitivity of 46.0% at a CST of 104 voxels. Thus, this is the MC method optimized for CST (MC-CST). The details of the two standard methods and these two MC methods are summarized in Table 1.

Demonstrating cost function equilibrium and reproducibility of the algorithm

The cost function tends to reach a constant minimum value by the third iteration (Figure 2(d)). We thus use 100 subsequent iterations to calculate parametric $c$ images.

After running 100 trials of MC-CST on a single noisy realization of the cluster release simulation, we found a voxelwise sensitivity of $57.2 \pm 0.3\%$ (mean ± standard deviation). Across all noisy realizations, the sensitivity was $46.0 \pm 10.9\%$, demonstrating that the variance due to noise greatly exceeds the variance from the non-deterministic characteristics of the MC algorithm.

Cluster release simulation: Sensitivity

The MC methods outperform the standard methods in terms of voxelwise and binary sensitivity (Figure 3(a) to (c)). The gains are visually (Figure 3(a)) and quantitatively (Figure 3(b)) highest in the low release clusters—regardless of size; an increase of 57% is observed from SM-sensitivity to MC-sensitivity, and 43% from SM-CST to MC-CST. In the high release clusters, these improvements are 5% and 16%, respectively. As well, MC-sensitivity can be compared to SM-CST, since they use the same input images: here the improvements are 149% in low release clusters and 38% in high release clusters (Figure 3(b)), though with the trade-off of a 60% increase in CST. However, this is less than the 118% increase in CST when comparing SM-sensitivity versus SM-CST—both of which perform worse in terms of sensitivity.

Figure 3.

(a) Two representative coronal slices of the voxelwise sensitivity maps for each method in the cluster release simulation, alongside the ground truth maps of $γ$ for reference. (b) Comparing the average voxelwise sensitivity for each cluster between methods. (c) Comparing the binary sensitivity for each cluster between methods. (d) MAE (computed over all voxel-level estimates) in $t_{D}$ (lower bars) and $t_{P}$ (upper bars).

More importantly, as compared to the standard method, the MC method raises binary sensitivity in the lower release clusters to an average of 82% from an average of 60% in the putamen and to 50% from 26% in the caudate when comparing MC-sensitivity with SM-sensitivity (Figure 3(c)), meaning the probability of completely missing a low release cluster in any given scan is markedly decreased. The cluster size or striatal location (i.e. putamen or caudate) does not appear to have an effect on the degree of improvement in the MC methods.

Cluster release simulation: Parametric accuracy

Over all methods, $MAE (t_{D})$ and $MAE (t_{P})$ average to 3.2 min and 5.1 min, respectively, in high release clusters, and 3.9 min and 5.8 min in low release clusters (Figure 3(d)). These values motivate the thresholds of $A D (t_{D}) \leq 4$ min and $A D (t_{P}) \leq 6$ min incorporated into our suggested parametric thresholding. The MC methods have slightly higher $MAE (t_{D})$ and slightly lower $MAE (t_{P})$ in most clusters. Since the MC methods use the same images as the SM-CST, the additionally detected voxels have slightly different biases on average.

$B P_{N D} (t)$ and $DAR (t)$ for representative low and high release clusters are presented in Figure 4 for all methods. Especially in $B P_{N D} (t)$ , the negative bias in SM-sensitivity introduced by the relatively wide post-filter is noticeable, producing severely underestimated median curves and baseline $B P_{N D} (0)$ quantification. In $DAR (t)$ , the effect is less noticeable; the negatively biased pre-intervention and post-intervention values cancel when taking the ratio described in equation (5). The MC methods, using the less parametrically-biased HYPR-AU-OSEM images, maintain similar levels of bias as SM-CST; any further bias is only realized by the additionally detected voxels having poorer parametric accuracy.

Figure 4.

$B P (t)$ estimates in (a) the small size/low release cluster and (b) medium size/high release cluster in the cluster release simulation, for each method tested. The corresponding $DAR (t)$ estimates are shown in (c) and (d), respectively. Solid lines indicate the median value over all detected voxels at each time point (sampled to the temporal resolution of the pre-specified timing parameter values in the set of basis functions, 1.5 min). Shaded regions indicate $\pm$ one median absolute deviation. The ground truth clusters also have a deviation because the PSF filtering generates a distribution of $B P (t)$ and $DAR (t)$ . Baseline $B P$ (i.e. $B P (0)$ ) is compared to the ground truth for each method.
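The summary curves described in this caption rest on a median and a median absolute deviation taken over detected voxels at each time point. A minimal Python sketch of that computation follows; the function name and the three toy voxel curves are assumptions made for illustration, not values from the study.

```python
from statistics import median


def median_and_mad(curves):
    """Per-time-point median and median absolute deviation across voxel curves.

    curves: list of equal-length lists, one curve per detected voxel.
    Returns (medians, mads), each with one value per time point.
    """
    medians, mads = [], []
    for samples in zip(*curves):  # one tuple of voxel values per time point
        m = median(samples)
        medians.append(m)
        mads.append(median(abs(s - m) for s in samples))
    return medians, mads


# Hypothetical BP(t) curves for three detected voxels at three time points.
voxel_curves = [
    [3.0, 2.8, 2.5],
    [3.2, 2.9, 2.7],
    [2.9, 2.6, 2.6],
]
medians, mads = median_and_mad(voxel_curves)
print(medians, mads)
```

The shaded band in the figure would then span `medians[i] - mads[i]` to `medians[i] + mads[i]` at each time point.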

After applying voxelwise parametric thresholding, overall the new $B P_{N D} (t)$ and $DAR (t)$ better represent the ground truth transient behaviour (Figure 5, only $B P_{N D} (0)$ and $DAR (t)$ shown for brevity). In the medium size/high release cluster, the peak release magnitude is better estimated after parametric thresholding. However, in the small size/low release cluster, parametric thresholding causes the peak magnitude to be overestimated. This is likely a selection effect; since a distribution of release levels exists within the cluster, voxels at the higher release end of this distribution are easier to detect, skewing the median estimate. Nevertheless, for both clusters, the magnitude estimates of release level during decay back to baseline are significantly improved (see arrows in Figure 5).

Figure 5.

$DAR (t)$ and $B P (0)$ estimates for a representative low and high release cluster in the cluster release simulation, with and without parametric thresholding. Arrows point out the most notable improvements in using the parametric thresholding approach in one of the detection methods, MC-sensitivity.

Again, the MC methods have similar biases to SM-CST and still outperform SM-sensitivity, in terms of overall accuracy in $B P_{N D} (t)$ and $DAR (t)$ across time, with the latter method tending to extend the decaying tail of $DAR (t)$ .

Application to a human gambling study

In our single human dataset, provided the CST has completely removed all false positive clusters (a 90% probability of being true), the standard and MC methods followed by parametric thresholding detect seven clusters of release in response to the gambling task (Figure 6(a)): four peaking around the time of intervention (clusters 1 (right caudate), 2 (right putamen), 3 (left putamen), and 4 (caudate)); two initiating at 50 min—the end of the task—and peaking ∼5 min later (clusters 5 and 6 (both left putamen)); and one peaking ∼10 min after the intervention (cluster 7 (right ventral striatum)). All methods except SM-CST detect the regions encompassing all seven clusters. However, SM-sensitivity merges clusters 1 and 7, and cluster 5 fails the CST after parametric thresholding. Furthermore, MC-sensitivity merges clusters 5 and 6, and SM-CST does not detect clusters 5-7. Even though the ground truth is unknown, and thus direct verification of the results is not possible, these results are consistent with our simulations, where SM-sensitivity and both MC methods provide higher detection sensitivity.

Figure 6.

(a) A summary of the results from applying all methods with parametric thresholding to a human pilot scan with a gambling task presented at 36 min into a 75 min scan. Identified clusters are colour-coded on representative coronal slices, and baseline-classified voxels are coloured in dark gray. Voxels that were detected but failed parametric thresholding are coloured with light gray. $DAR (t)$ for each method is presented for each cluster, when applicable. (b) Demonstrating the ability of the parametric thresholding to separate two parametrically-unique clusters (3 and 5) that were initially detected as a single cluster by SM-sensitivity and the MC methods.

Parametrically, comparing $DAR (t)$ from SM-CST with those from the MC methods, the shapes are quite similar and the peak magnitudes are slightly lower in the MC methods for some clusters. $DAR (t)$ from SM-sensitivity consistently has lower peak magnitudes and extended tails. Both observations are consistent with the results obtained in the simulations. As well, some clusters were initially merged, only to be identified as parametrically unique after parametric thresholding (cf. Figure 6(b)).

Discussion

MC method

In this work, we proposed a Monte Carlo-based modification to the standard F-test analysis method for voxelwise detection of transient DA release. Through simulation, we designed two optimizations of the MC method, one for sensitivity (MC-sensitivity) and one for CST (MC-CST), and compared them to the standard method optimized for either sensitivity (SM-sensitivity) or CST (SM-CST). MC-sensitivity improves sensitivity by up to 83% over SM-sensitivity with a 26% reduction in the regional constraint on detectability (i.e. the CST). Additionally, MC-CST improves sensitivity by up to 50% over SM-CST with a similar CST. Thus, the MC methods more reliably detect DA release in response to a task or stimulus. SM-sensitivity applies a post-filter to the raw OSEM images to increase sensitivity through noise-reduction, but in a manner that also increases the CST and decreases parametric accuracy. Since the MC method does not modify the original input image, applying it to the denoised reconstruction HYPR-AU-OSEM estimates transient dopamine release levels with higher parametric accuracy than post-filtered OSEM images while achieving similar or better sensitivity. SM-CST also uses the same HYPR-AU-OSEM images, but MC-sensitivity improves sensitivity by up to 180%.

While applying the MC method to the post-filtered OSEM images also improved sensitivity without increasing its relatively large CST, we do not present these results here since similar sensitivities were achieved with lower CST and better parametric accuracy when using the MC-sensitivity. Instead of trying to improve the sensitivity through image blurring and thus sacrificing parametric accuracy, the MC method replaces the standard F-test-based significance thresholding and does not modify the denoised TACs—preserving the initial quality of the images.

The MC method can be likened to applying a spatial filter to the binary significance mask, though the kernel would be unique for each voxel, taking into account both the local parity and the global adherence to a correct F-distribution for all baseline voxels. Simply filtering the binary significance mask would also improve sensitivity, but it would simultaneously increase the CST by isotropically increasing the size of both true clusters and false positive clusters. The MC method, by contrast, will only probabilistically increase cluster size if the global deviation from a true F-distribution suggests a subset of false negative voxels exists in that region. This explains why the MC methods are able to achieve higher sensitivity and lower CST than the standard methods.
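The point about naive mask filtering can be illustrated with a small Python sketch of binary dilation on a toy mask. This is not the MC method; it only shows that an isotropic filter enlarges a true cluster and an isolated false positive alike. The face-adjacent (6-neighbour) connectivity, the function names, and the toy mask are assumptions made for illustration.

```python
from collections import deque

# Offsets to the six face-adjacent neighbours (assumed connectivity).
FACE_OFFSETS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def dilate(mask):
    """Grow a set of voxel coordinates by one face-adjacent step."""
    grown = set(mask)
    for x, y, z in mask:
        for dx, dy, dz in FACE_OFFSETS:
            grown.add((x + dx, y + dy, z + dz))
    return grown


def cluster_sizes(mask):
    """Sizes of face-connected clusters in the mask, largest first."""
    unvisited = set(mask)
    sizes = []
    while unvisited:
        queue = deque([unvisited.pop()])
        size = 0
        while queue:  # breadth-first search over one cluster
            x, y, z = queue.popleft()
            size += 1
            for dx, dy, dz in FACE_OFFSETS:
                neighbour = (x + dx, y + dy, z + dz)
                if neighbour in unvisited:
                    unvisited.remove(neighbour)
                    queue.append(neighbour)
        sizes.append(size)
    return sorted(sizes, reverse=True)


# Toy mask: a 2x2x2 "true" cluster plus one isolated false positive voxel.
true_cluster = {(x, y, z) for x in range(2) for y in range(2) for z in range(2)}
false_positive = {(10, 10, 10)}
mask = true_cluster | false_positive

sizes_before = cluster_sizes(mask)
sizes_after = cluster_sizes(dilate(mask))
print(sizes_before, sizes_after)
```

In this toy case a CST of, say, 5 voxels removes the isolated voxel before dilation but not after, so the threshold would have to rise to keep the same false positive control.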

The assumption that the true baseline voxels should follow an F-distribution—incorporated into the cost function of the MC algorithm—is borne out in our simulations: a histogram of the known baseline voxels in Figure 1(a) follows the theoretical distribution closely. The F-test assumes normality in the weighted residuals. The noise in PET is Poisson in the sinogram domain, but after reconstruction and post-processing, the temporal noise in the TACs is approximately Gaussian. True release voxels will not follow an F-distribution, since they tend to violate the null hypothesis of the F-test. However, when the ground truth is unknown, release voxels with F-values below the significance threshold or part of a cluster with a size below the CST (i.e. false negatives) are added to the set of true baseline voxels, distorting the F-distribution. The MC method then aims to remove as many false negatives as possible, so that the set of remaining baseline-classified voxels follows an F-distribution.
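The distortion described above can be reproduced in a small simulation. The Python sketch below is not the cost function of the MC algorithm; it only shows that mixing sub-threshold release voxels into the baseline-classified set shifts the set away from a pure central F-distribution. The degrees of freedom, the critical value, the size of the release effect (`shift`), and the voxel counts are all hypothetical, and cluster size thresholding is not modelled.

```python
import random
from statistics import mean

D1, D2 = 3, 20   # hypothetical numerator/denominator degrees of freedom
F_CRIT = 3.10    # approximate 95th percentile of F(3, 20)


def f_statistic(rng, shift=0.0):
    """Draw one F-like statistic as a ratio of scaled sums of squared normals.

    A nonzero shift offsets the numerator normals, mimicking a voxel in
    which the null hypothesis (no release) is violated.
    """
    num = sum((rng.gauss(0.0, 1.0) + shift) ** 2 for _ in range(D1)) / D1
    den = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(D2)) / D2
    return num / den


def upper_fraction(values, lower=2.0):
    """Fraction of values falling in the bin just below the threshold."""
    return sum(v >= lower for v in values) / len(values)


rng = random.Random(0)

# True baseline voxels that are classified as baseline (F below threshold).
baseline = [f for f in (f_statistic(rng) for _ in range(20000)) if f < F_CRIT]

# Release voxels whose F-value also falls below threshold: false negatives.
false_negatives = [
    f for f in (f_statistic(rng, shift=1.0) for _ in range(2000)) if f < F_CRIT
]

# What an analysis without ground truth would treat as "baseline".
classified_baseline = baseline + false_negatives

print(f"mean F, true baseline only:    {mean(baseline):.3f}")
print(f"mean F, with false negatives:  {mean(classified_baseline):.3f}")
print(f"upper-bin fraction, baseline:  {upper_fraction(baseline):.3f}")
print(f"upper-bin fraction, mixed set: {upper_fraction(classified_baseline):.3f}")
```

Both summaries move upward once false negatives are included, which is the kind of deviation from the expected distribution that the text refers to.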

False negatives exist primarily due to noise in the voxel TACs; the DA-induced change in the TAC is comparable to the noise level (Figure 1(c)), making detection difficult. This problem is exacerbated by PVE, which mixes the temporal patterns of release and baseline voxels to decrease the observed DA-induced change, ultimately reducing sensitivity. Evidence for this claim and a general discussion of the PVE-sensitivity relationship can be found in the Supplementary Information.

Parametric thresholding

To enhance cluster-level estimates of release dynamics, we also suggested a voxelwise parametric thresholding that excludes detected voxels in a cluster having timing parameters significantly deviating from the local neighbourhood mean. The subset of surviving voxels was shown to be a more parametrically accurate sample to calculate $B P_{N D} (t)$ and $DAR (t)$ —useful metrics for quantifying the system response to a DA-inducing stimulus or intervention. As well, in human data, the parametric threshold was able to separate nearby clusters with different dynamics that were initially detected as a single cluster.
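A minimal Python sketch of this kind of voxelwise parametric thresholding is given below. It uses the thresholds quoted in the Results ($A D (t_{D}) \leq 4$ min and $A D (t_{P}) \leq 6$ min); everything else is an assumption made for illustration, including the 26-voxel neighbourhood, the exclusion of the voxel itself from the neighbourhood mean, the handling of voxels without detected neighbours, and the function and variable names.

```python
from itertools import product
from statistics import mean

MAX_AD_TD = 4.0  # minutes, threshold on absolute deviation in t_D
MAX_AD_TP = 6.0  # minutes, threshold on absolute deviation in t_P

# Offsets to the 26 face-, edge- and corner-adjacent voxels (assumed neighbourhood).
NEIGHBOUR_OFFSETS = [o for o in product((-1, 0, 1), repeat=3) if o != (0, 0, 0)]


def parametric_threshold(timing):
    """Return the detected voxels whose timing agrees with their neighbours.

    timing: dict mapping (x, y, z) -> (t_D, t_P) in minutes, detected voxels only.
    A voxel is kept if both timing parameters lie within the thresholds of
    the mean over its detected neighbours. Voxels with no detected
    neighbour are dropped (an assumption of this sketch).
    """
    kept = set()
    for (x, y, z), (t_d, t_p) in timing.items():
        neighbours = [
            timing[(x + dx, y + dy, z + dz)]
            for dx, dy, dz in NEIGHBOUR_OFFSETS
            if (x + dx, y + dy, z + dz) in timing
        ]
        if not neighbours:
            continue
        ad_td = abs(t_d - mean(n[0] for n in neighbours))
        ad_tp = abs(t_p - mean(n[1] for n in neighbours))
        if ad_td <= MAX_AD_TD and ad_tp <= MAX_AD_TP:
            kept.add((x, y, z))
    return kept


# Toy cluster: a 3x3 in-plane patch with uniform timing and one outlier corner.
timing = {(x, y, 0): (36.0, 42.0) for x in range(3) for y in range(3)}
timing[(2, 2, 0)] = (49.5, 55.5)

kept = parametric_threshold(timing)
print(sorted(kept))
```

In the toy patch only the outlier corner is rejected; its neighbours survive because a single deviating voxel shifts their neighbourhood means by less than the thresholds.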

Improved simulation methodology

When performing this study, we also made an important methodological observation that extends to other simulation studies: while designing our simulations, we found that omitting the effects of attenuation, normalization, and reconstruction, as in previous simulations that just add Gaussian noise,^4,5 resulted in a significantly decreased estimate of the CST and an increased estimate of sensitivity. Therefore, simulations that do not include these effects may overestimate image quality and the predictive power of a given analysis algorithm when applied to real data.

Application to a human dataset

Applying these methods to a human scan with a gambling task as the stimulus, SM-sensitivity and the MC methods are able to detect several clusters of DA release in the striatum, though the former cannot parametrically distinguish some of these clusters; its large spatial kernel homogenizes neighbouring TACs and thus subtle parametric differences in magnitude and timing cannot be resolved. MC-sensitivity also merges two clusters found separately in MC-CST, but because both methods use the same initial images, either: (i) the two clusters are indeed one, and the lower sensitivity of MC-CST results in the single cluster being divided in two; or (ii) the two clusters are indeed unique, and this is an example of over-clustering in MC-sensitivity. However, the two clusters found by MC-CST are similar parametrically to each other and to the single cluster found by MC-sensitivity. Therefore, subsequent analysis using either method would be similar.

Other applications

Within the realm of existing applications of lp-ntPET, we speculate that the proposed method may be more sensitive in detecting task-related differences in DA release dynamics between subject populations, such as the preliminary finding of loci of faster responses to smoking in women than in men in the dorsal putamen.¹⁴

More generally, however, while the theory presented in this work was tailored to detecting DA release, the mathematics of the MC approach can be adapted to any voxelwise binary classification method constrained by a statistical distribution. For instance, the F-test approach here could be extended to other tracers where nested models are compared to determine physiological responses or to fMRI analysis where “activated” voxels are determined through t- or z-tests.

Limitations

In constructing our simulated dataset, we used the lp-ntPET model itself to compute the noiseless release TACs, whereas more sophisticated models exist to describe the relationship between elevations in DA concentration and the resulting perturbations to the RAC TACs.³⁰ We decided to use the simpler lp-ntPET model in order to have direct knowledge of the ground truth $γ$ values to assess the parametric accuracy through $B P_{N D} (t)$ and $DAR (t)$ ; the relationship between $γ$ and the elevated DA concentrations in the more complex models is difficult to robustly quantify.

We assume the CSTs from our optimized $\{β, c_{crit}\}$ pairs, determined through baseline simulations, apply to real datasets, since the noise levels are matched to our centre’s scanner, as was done in previous work using the standard method.^4,5 While this is a reasonable assumption, future work could test this approach on real human baseline RAC datasets to ensure the CST is controlling false positive rates as expected.

Finally, in our cluster release simulations, we did not assess true negative rates; the PSF filtering in our simulation design made it impossible to reliably classify voxels outside the cluster boundaries as release or non-release. Thus, we rely solely on sensitivity in determining the accuracy of each algorithm.

Supplemental Material

Supplemental material, JCB905613 Supplemental Material1 for A Monte Carlo approach for improving transient dopamine release detection sensitivity by Connor WJ Bevington, Ju-Chieh (Kevin) Cheng, Ivan S Klyuzhin, Mariya V Cherkasova, Catharine A Winstanley and Vesna Sossi in Journal of Cerebral Blood Flow & Metabolism

Footnotes

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was funded by an NSERC Discovery grant (240670-13), an NSERC Create grant (448110-2014), and a CIHR grant (PJT-156012).

Acknowledgements

CWJB thanks Richard Carson for suggesting the use of alternative weighting schemes for HYPR post-processed data. TRIUMF’s contribution to radiotracer production is gratefully acknowledged.

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Authors’ contributions

CWJB designed the MC method, constructed the simulations, performed the analysis, and wrote the manuscript. JKC provided code for the reconstructions, guidance throughout the project, and edited the manuscript. ISK aided in simulation design and edited the manuscript. MVC and CAW contributed the pilot subject data. VS conceptualized the project, provided guidance throughout, and edited the manuscript.

ORCID iDs

Connor WJ Bevington

Vesna Sossi

Supplemental material

Supplemental material for this article is available online.

References

1. Floel, Garraux, et al. Levodopa increases memory encoding and dopamine release in the striatum in the elderly. Neurobiol Aging 2008; 29: 267–279.

2. Sossi, Dinelle, Schulzer, et al. Levodopa and pramipexole effects on presynaptic dopamine PET markers and estimated dopamine release. Eur J Nucl Med Mol Imaging 2010; 37: 2364–2370.

3. Normandin, Schiffer, Morris ED. A linear model for estimation of neurotransmitter response profiles from dynamic PET data. Neuroimage 2012; 59: 2689–2699.

4. Kim, Sullivan, Wang, et al. Voxelwise lp-ntPET for detecting localized, transient dopamine release of unknown timing: sensitivity analysis and application to cigarette smoking in the PET scanner. Hum Brain Mapp 2014; 35: 4876–4891.

5. Wang, Kim, Cosgrove, et al. A framework for designing dynamic lp-ntPET studies to maximize the sensitivity to transient neurotransmitter responses to drugs: application to dopamine and smoking. Neuroimage 2017; 146: 701–714.

6. Ichise, Liow, et al. Linearized reference tissue parametric imaging methods: application to [11C]DASB positron emission tomography studies of the serotonin transporter in human brain. J Cereb Blood Flow Metab 2003; 23: 1096–1112.

7. Holmes, Blair, Watson, et al. Nonparametric analysis of statistic images from functional mapping experiments. J Cereb Blood Flow Metab 1996; 16: 7–22.

8. Steeves, Miyasaki, Zurowski, et al. Increased striatal dopamine release in Parkinsonian patients with pathological gambling: a [11C] raclopride PET study. Brain 2009; 132: 1376–85.

9. de la Fuente-Fernández, Phillips, Zamburlini, et al. Dopamine release in human ventral striatum and expectation of reward. Behav Brain Res 2002; 136: 359–363.

10. Pappata, Dehaene, Poline, et al. In vivo detection of striatal dopamine release during reward: a PET study with [(11)C]raclopride and a single dynamic scan approach. Neuroimage 2002; 16: 1015–1027.

11. Haruno, Kawato. Different neural correlates of reward expectation and reward expectation error in the putamen and caudate nucleus during stimulus-action-reward association learning. J Neurophysiol 2006; 948–955.

12. Monchi, Strafella AP. Striatal dopamine release during performance of executive functions: a [(11)C] raclopride PET study. Neuroimage 2006; 33: 907–912.

13. Bäckman, Nyberg, Soveri, et al. Effects of working-memory training on striatal dopamine release. Science 2011; 333: 718.

14. Cosgrove, Wang, Kim, et al. Sex differences in the brain’s dopamine signature of cigarette smoking. J Neurosci 2014; 34: 16851–16855.

15. Bevington CWJ, Klyuzhin, Aceves, et al. Denoising and DA release: effect of denoising on the ability to identify voxel-level neurophysiological response. In: IEEE medical imaging conference record, Sydney, Australia, 13–18 November 2018, pp.1–3. Piscataway, NJ: IEEE.

16. Innis, Cunningham, Delforge, et al. Consensus nomenclature for in vivo imaging of reversibly binding radioligands. J Cereb Blood Flow Metab 2007; 27: 1533–1539.

17. Lenz. Beiträge zum Verständnis der magnetischen Eigenschaften in festen Körpern. Physikalische Zeitschrift 1920; 21: 613–615.

18. Metropolis, Rosenbluth, et al. Equation of state calculations by fast computing machines. J Chem Phys 1953; 21: 1087–1092.

19. Newman MEJ and Barkema GT. Ising model and the metropolis algorithm. In: Monte Carlo methods in statistical physics. Oxford: Oxford University Press, 1999, pp.43–84.

20. de Jong, van Velden, Kloet, et al. Performance evaluation of the ECAT HRRT: an LSO-LYSO double layer high resolution, high sensitivity scanner. Phys Med Biol 2007; 52: 1505–1526.

21. Fischl, Salat, Busa, et al. Whole brain segmentation: automated labeling of neuroanatomical structures in the human brain. Neuron 2002; 33: 341–355.

22. Fischl, Salat, van der Kouwe AJW, et al. Sequence-independent segmentation of magnetic resonance images. Neuroimage 2004; 23: S69–S84.

23. Han, Fischl. Atlas renormalization for improved brain MR image segmentation across scanner platforms. IEEE Transac Med Imag 2007; 26: 479–486.

24. Ségonne, Dale, Busa, et al. A hybrid approach to the skull-stripping problem in MRI. Neuroimage 2004; 22: 1160–1175.

25. Carson RE. Noise reduction in the simplified reference tissue model for neuroreceptor functional imaging. J Cereb Blood Flow Metab 2002; 22: 1440–1452.

26. Comtat, Bataille, Michel, et al. OSEM-3D Reconstruction strategies for the ECAT HRRT. In: IEEE nuclear science symposium conference record, Rome, Italy, 16–22 October 2004, pp. 3492–3496. Piscataway, NJ: IEEE.

27. Floberg, Mistretta, Weichert, et al. Improved kinetic analysis of dynamic PET data with optimized HYPR-LR. Med Phys 2012; 39: 3319–3331.

28. Cheng, Matthews, Sossi, et al. Incorporating HYPR de-noising within iterative PET reconstruction (HYPR-OSEM). Phys Med Biol 2017; 62: 6666–6687.

29. Cherkasova, Clark, Barton JJS, et al. Win-concurrent sensory cues can promote riskier choice. J Neurosci 2018; 38: 10362–10370.

30. Morris, Yoder, Wang, et al. ntPET: a new application of PET imaging for characterizing the kinetics of endogenous neurotransmitter release. Mol Imaging 2005; 4: 473–489.

Determining $β$ and $c_{crit}$ in the MC method