Abstract
Mantid and Scipp, two data-reduction packages used at neutron-scattering facilities around the world, ignore correlations when propagating uncertainties in arithmetic operations. Normalization terms applied during data reduction frequently have a lower dimensionality than the quantities being normalized. We show how this lower dimensionality introduces correlations, which the software fails to take into account in subsequent data-reduction steps such as histogramming, summation, or fitting. As a consequence, any uncertainties in the normalization terms are strongly suppressed and thus effectively ignored. This can lead to the erroneous attribution of significance to deviations that are in fact pure noise, or to an overestimation of the significance of final data-reduction results that are used for further data analysis. We analyze this flaw for a number of different cases as they occur in practice. For the two concrete experiments covered by these case studies the underestimation turns out to be negligible. There is, however, no reason to assume that this generalizes to other measurements at the same or at different neutron-scattering beamlines. We describe and implement a potential solution that yields not only corrected error estimates but also the full variance-covariance matrix of the reduced result, at minor additional computational cost.
Keywords
Introduction
The number of neutron counts measured in a detector element during a neutron-scattering experiment follows a Poisson distribution. In practice this is typically handled with a Gaussian approximation, setting the standard deviation
Disclaimer: The authors of the present paper are the principal authors of Scipp and have previously contributed to Mantid for several years.
One caveat in both Mantid and Scipp is that correlations are not taken into account in the propagation of uncertainties. This is generally not considered a problem since the raw data are assumed to be uncorrelated. However, as we will show, ubiquitous operations in neutron-scattering data-reduction introduce such correlations in intermediate computation steps but subsequently fail to take these into account. This leads to a systematic underestimation of uncertainties in final results such as
The normalization term could be, e.g., a neutron monitor spectrum or a reduced vanadium spectrum.
The remainder of this work is organized as follows. We describe the problem in Section 2. In Section 3 we analyze a couple of concrete examples, in particular for small-angle neutron scattering (SANS) and powder diffraction. Furthermore, we derive full expressions for the variance-covariance matrices of the reduced data, which turn out to be simple and computationally tractable. We confirm the correctness of the improved estimates for the uncertainties using a bootstrapping [9] approach. The results are discussed in Section 4 and we draw conclusions in Section 5.
For neutron-scattering data-reduction the most relevant operations that require handling of uncertainties are addition, multiplication, and division. Normalization operations can be implemented as division or as multiplication with the inverse. The Taylor series expansion (truncated after first order) for uncertainty propagation [3] gives
Whether this assumption is always correct is not the topic of this manuscript.
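For reference, the resulting first-order rules for uncorrelated operands can be sketched as follows. This is a minimal illustrative helper using the standard textbook expressions, not the actual Mantid or Scipp implementation:

```python
# First-order propagation for uncorrelated operands a and b,
# with variances va and vb.

def add(a, va, b, vb):
    # var(a + b) = var(a) + var(b)
    return a + b, va + vb

def mul(a, va, b, vb):
    # var(a * b) = b^2 var(a) + a^2 var(b)
    return a * b, b**2 * va + a**2 * vb

def div(a, va, b, vb):
    # var(a / b) = var(a) / b^2 + a^2 var(b) / b^4
    return a / b, va / b**2 + a**2 * vb / b**4
```

For division, the relative variances add: var(a/b)/(a/b)² = var(a)/a² + var(b)/b².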
The most straightforward example of the incorrect treatment of correlations is the computation of a histogram of event data after normalization. As described in [19], CS event-mode normalization first histograms the normalization term. Then, for every event of the data to normalize, the corresponding bin in the normalization histogram is identified and the event weight is scaled with the inverse of the value of that bin. We consider a single bin with n events, their weights
Here we chose to compute the same result in two steps using correlations of summands, since this highlights the difference from the CS approach. A direct and likely simpler way to compute this is using
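The contrast between the CS event-mode scheme and a correlation-aware treatment can be sketched with a NumPy toy model. The synthetic event data and bin counts below are made up for illustration; this is not the actual Mantid/Scipp code. Raw Poisson inputs have variance equal to their counts:

```python
import numpy as np

rng = np.random.default_rng(0)
edges = np.linspace(0.0, 1.0, 11)

events = rng.uniform(size=10_000)     # events to normalize, weight 1, variance 1
norm, _ = np.histogram(rng.uniform(size=50_000), bins=edges)
norm = norm.astype(float)             # normalization histogram N_k
norm_var = norm.copy()                # Poisson: variance equals counts

idx = np.digitize(events, edges) - 1  # bin index of each event

# CS: scale every event weight by 1/N_k and propagate var(N_k) per event,
# as if the scaled events were independent of each other.
w = 1.0 / norm[idx]
v = 1.0 / norm[idx] ** 2 + norm_var[idx] / norm[idx] ** 4
cs_hist = np.bincount(idx, weights=w, minlength=len(norm))
cs_var = np.bincount(idx, weights=v, minlength=len(norm))

# Correct: all n_k events in bin k share the same N_k, so their
# normalization contributions are fully correlated and add coherently.
n = np.bincount(idx, minlength=len(norm)).astype(float)
correct_var = n / norm**2 + n**2 * norm_var / norm**4

# The normalization contribution in CS scales with n_k instead of n_k^2,
# hence the underestimation.
assert np.all(cs_var <= correct_var)
```

Dropping the norm_var term entirely reproduces the V0 approximation, which is why CS and V0 give near-identical results.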
Multiplication
Concretely, for the example of histogramming event data, all neutrons typically have the same initial weight
To illustrate this effect, we show in Fig. 1 how the relative uncertainty for a histogram of normalized event data scales with the number of events. Both the CS implementation (×) and the V0 approximation (+) of event-mode normalization lead to a relative error of the histogrammed result that goes to 0 with increasing event count. Importantly, CS and V0 yield near-identical results for all values of C. In contrast, normalization after histogramming (∙) or correct handling of correlations yields a relative error limited by the relative error of the normalization factor.

In summary, the CS approach is thus either wasteful (since it is equivalent to V0 with additional compute overhead), or incorrect. In the next section we analyze more complex cases, namely normalization in SANS and powder diffraction workflows. These involve a transformation of coordinates.
In Mantid such coordinate transformations are referred to as unit conversion.
SANS
In a time-of-flight SANS experiment, the differential scattering cross section
The direct-beam function makes it possible to cross-normalize the incident spectrum to that of the empty beam (without sample) as seen on the main detector.
In the SANS case the CS calculation of the normalization term itself (rather than the normalization of the detector counts) in Eq. (17) suffers from the issue described in Section 2. In contrast to the example from Fig. 1, the problem here is not caused by event-mode normalization, but by a different type of broadcast operation. Since the λ-dependent factors in the denominator do not depend on i and j, and the solid angle
Here we assumed quasi-elastic scattering. A correction factor for inelastic-scattering [14] would depend on i, j, and λ.
Note that there are a number of other contributions to the uncertainties with similar issues, which we ignored in this analysis. Our goal here is to provide an example for illustration of the problems in CS, not a complete analysis of the entire SANS data reduction. See the discussion in Section 4 for more background.
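The effect of such a broadcast can be sketched in isolation. The following NumPy toy model (made-up shapes and counts, not the actual SANS reduction) divides pixel-resolved counts by a purely λ-dependent term; the division correlates all pixels, which a subsequent sum over pixels must account for:

```python
import numpy as np

rng = np.random.default_rng(5)
npix, nlam = 100, 16
counts = rng.poisson(50.0, size=(npix, nlam)).astype(float)   # per-pixel data
mon = rng.poisson(5000.0, size=nlam).astype(float)            # lambda-dependent only
mon_var = mon.copy()                                          # Poisson: var = counts

# Broadcast normalization: every pixel is divided by the same monitor value,
# so after the division the pixels are correlated through mon.
ratio = counts / mon
cs_var = counts / mon**2 + counts**2 * mon_var / mon**4       # per-element, as in CS

# Summing over pixels, CS adds the per-element variances ...
cs_sum_var = cs_var.sum(axis=0)
# ... but the monitor contribution is fully correlated across pixels:
s = counts.sum(axis=0)
correct_sum_var = s / mon**2 + s**2 * mon_var / mon**4

# CS replaces the coherent (sum c)^2 term by sum(c^2), suppressing the
# monitor's contribution by roughly a factor of npix.
assert np.all(cs_sum_var < correct_sum_var)
```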
To illustrate the issue we study the data-reduction workflow for an experiment on non-ionic surfactants that was carried out at the Sans2d instrument [22] at the ISIS facility at the STFC Rutherford Appleton Laboratory.
The data has been published in [17], where all the details on the experimental setup can be found.
The monitors from the direct run,

(a) The scattering differential cross-section

Comparison of the relative standard deviation for
As a verification of these results, we independently computed the uncertainties by using a bootstrap [9] method – a sampling technique that can be used to estimate the distribution of a statistic.11
For related methods such as Markov-chain Monte-Carlo uncertainty estimates see, e.g., Ref. [2] and [18].
Including background subtraction in our bootstrap calculations, i.e., computing
The correct (first order) expression for the uncertainties for this reduction workflow can be derived in a straightforward manner. We adopt a simpler and more generic notation equivalent to Eq. (17) as follows. We define
We use bins of constant width to simplify the notation, but the following results are equally valid for other choices of bin widths.
Note that in software histogramming is not actually implemented via a matrix multiplication as this would be highly inefficient.
The diagonal component of this result is included in Fig. 2. In addition, we show the resulting Pearson correlation matrix
For example, scipy.optimize.curve_fit accepts the covariance matrix as input via its sigma parameter (a two-dimensional sigma is interpreted as the covariance matrix of the data).

Correlation matrices for the time-of-flight SANS-reduction workflow using
We consider a simplified model of a time-of-flight powder-diffraction reduction workflow based on the workflow used at ISIS [15]. Our model workflow consists of the following steps: (1) compute wavelength coordinates for detector data and monitor data, (2) compute wavelength histograms, (3) normalize the detector data by the monitor, (4) compute d-spacing coordinate for detector data, and (5) compute d-spacing histogram, combining data from all (or a user-defined subset of) detector pixels. This yields the normalized d-spacing spectrum
The sample is NaCaAlF measured in Wish run number 52889 and the vanadium was measured in run number 52920.
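The five steps can be sketched as follows. This NumPy toy model uses hypothetical flight paths, a uniform time-of-flight spectrum, and a single pixel at a fixed scattering angle; it is not the actual Wish workflow:

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical flight paths (m) and synthetic time-of-flight events (us).
L_det, L_mon = 10.0, 9.0
tof_det = rng.uniform(1.5e4, 4.5e4, 20_000)   # detector events
tof_mon = rng.uniform(1.5e4, 4.5e4, 100_000)  # monitor events

# (1) wavelength coordinates: lambda[Angstrom] = (h/m_n) t/L
#     ~= 3.956e-3 * t[us] / L[m]
wav_det = 3.956e-3 * tof_det / L_det
wav_mon = 3.956e-3 * tof_mon / L_mon

# (2) wavelength histograms on a common binning
edges = np.linspace(6.6, 17.8, 41)
det, _ = np.histogram(wav_det, bins=edges)
mon, _ = np.histogram(wav_mon, bins=edges)

# (3) normalize detector by monitor (CS-style per-bin propagation shown;
#     var = counts for the raw Poisson inputs)
ratio = det / mon
ratio_var = det / mon**2 + det**2 / mon**3

# (4)+(5) convert the wavelength bins to d-spacing via Bragg's law,
#     d = lambda / (2 sin theta), for one pixel at fixed angle; the full
#     workflow then combines all pixels into one d-spacing histogram.
theta = np.deg2rad(45.0)
d_edges = edges / (2.0 * np.sin(theta))
```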
In Fig. 5 we compare the uncertainties computed by CS and by an improved analytical approach which will be described below. Source code for reproducing these results can be found in the supplementary material [13]. As described in Section 3.1, we study the influence of the ratio of total detector to monitor counts, α (defined equivalently to Eq. (19)), influenced, e.g., by the sample volume. For


Comparison of the relative standard deviation for
For the simplified model workflow we can derive the correct (first order) expression for the uncertainties in a straightforward manner. This also provides the covariances or correlations between different d-spacing values. We adopt the following notation. Let
We use bins of constant width to simplify the notation, but the following results are also valid for other choices of bin widths.
For histogrammed data the expression may need to be adapted depending on the rebinning approach that is being used.
In Fig. 6 we compare the relative errors computed using the analytical approach and bootstrap (see Appendix A, in particular Fig. 9 which shows the same result for

Correlation matrices for a powder-reduction workflow. We show the result for the sample S (left,
For scattering off a vanadium sample, we define
The effect of smoothing
Smoothing is routinely used to reduce the effect of noise in normalization terms such as a neutron monitor spectrum or a reduced vanadium spectrum. This reduces the size of the variances of the normalization term,
Note that, e.g., Mantid’s
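The trade-off can be sketched with a three-point moving average. This is a generic NumPy example, not Mantid's actual smoothing algorithm; because smoothing is a linear map S, the full covariance of the smoothed term follows as S diag(var) Sᵀ:

```python
import numpy as np

rng = np.random.default_rng(3)
m = rng.poisson(1000.0, size=100).astype(float)   # noisy normalization term
var_m = m.copy()                                  # Poisson: var = counts

# Three-point moving average: s_i = (m_{i-1} + m_i + m_{i+1}) / 3.
kernel = np.ones(3) / 3
s = np.convolve(m, kernel, mode="same")

# The same smoothing written as an explicit linear map S.
S = np.zeros((len(m), len(m)))
for i in range(len(m)):
    lo, hi = max(0, i - 1), min(len(m), i + 2)
    S[i, lo:hi] = 1.0 / 3.0
assert np.allclose(s, S @ m)

# cov(s) = S diag(var_m) S^T: the diagonal shrinks (less noise), but
# neighboring bins become correlated.
cov_s = S @ np.diag(var_m) @ S.T
assert np.all(np.diag(cov_s)[1:-1] < var_m[1:-1])
assert np.all(cov_s[np.arange(99), np.arange(1, 100)] > 0)
```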
Current software: Scope and impact of incorrect uncertainty handling
We have described and demonstrated a systematic problem with the treatment of the statistical uncertainties of normalization terms in Mantid and Scipp. This constitutes the most important result of this work.
Affected software includes Mantid (all versions at the time of writing, at least up to Mantid v6.5) and Scipp (all versions up to Scipp v22.11, inclusively) but the issue may also be present in other similar data-reduction software. Mantid is used at the ISIS Neutron and Muon Source at the STFC Rutherford Appleton Laboratory (
To our knowledge, little or no impact of the issue described in this manuscript has been reported in practice. There are at least two likely explanations. Firstly, variations of Eq. (13) may hold in many cases. That is, in practice the normalization terms often have very good statistics – as we have seen for the concrete SANS and powder diffraction examples in Section 3.1 and Section 3.2, respectively. This can be both due to practicality (it is relatively simple to measure many vanadium counts, or to have a monitor with a high count rate) and due to experience (instrument scientists know how long to measure normalization terms for good results). Secondly, coordinate transformations, as well as preprocessing steps of the normalization terms such as smoothing, lead to correlations that spread over a wider range. These correlations might affect mostly non-local aspects, which could be harder to discern when, e.g., fitting a physical model to the reduced result.
We do not have enough background knowledge about concrete subsequent data analysis methods to tell when and where such non-local properties might be irrelevant.
In our opinion, however, the above cannot justify dismissing the issue. The current state-of-the-art software gives the scientist a false sense of security (of having handled uncertainties correctly). While we managed to compute corrected analytical results for variances and covariances in the model workflows used in this manuscript, these cannot directly replace the operation-by-operation error-propagation mechanism in CS implementations. At the very least our software must thus avoid giving a false sense of security and should therefore reject normalization terms that have uncertainties. This forces the scientist to consider whether the normalization uncertainties fulfill Eq. (13). Based on this they can then either (A) confidently drop this term or (B) improve the statistics by measuring the normalization terms for longer.
A complete analysis for any particular technique such as SANS or powder diffraction is beyond the scope of this manuscript. The workflows discussed in Section 3 therefore do not necessarily capture all potential sources of uncertainty, but are meant to illustrate the key aspects. To give an idea of what a more complete analysis might involve, consider the SANS case. Our analysis in Section 3.1 covered the effects of the variances of the monitors. There are, however, a number of other contributions: (1) Before the computation of wavelengths and Q, the beam center is determined. The uncertainty of the beam center leads to a correlation of all computed wavelength and Q values. (2) For the computation of the transmission fraction, each monitor is background-subtracted: the monitor background is computed as the mean over a user-defined wavelength range and then subtracted from the monitor. This broadcasts the mean value to all monitor bins, leading to correlations. (3) The direct-beam function D itself comes with uncertainties, which were computed using CS and may thus themselves be underestimated.
Intriguingly, the SANS data reduction implemented in Scipp prior to this work (and used for this analysis) actually ignored the uncertainties of D. This provides a worrisome glimpse of the potentially cascading negative effects of the CS implementation: An interpolation is required to map D to the wavelength binning of the monitors, and it was unclear how uncertainties should be handled during this interpolation. During development of the workflow it was determined empirically that setting D’s uncertainties to zero had no influence on the uncertainties of the final result. Looking back, we can now tell that this empirical evidence was based on the incorrect handling of uncertainties – since the broadcast of the pixel-independent D to all pixels suppresses its contribution to the final uncertainties. Thus the decision to ignore them needs to be revisited. Preliminary results indicate that correct handling of the uncertainties of D may noticeably impact the final uncertainties.
Aside from highlighting the systematic problem, we have succeeded in Section 3 in deriving expressions for variance-covariance matrices of the data-reduction results. As we will show in the discussion in Section 4.3, there is likely no major difference in terms of computational cost compared to the flawed approach currently implemented in Mantid and Scipp, even though it provides access to not just the variances but also the covariances of the final result. We conjecture that our approach may be feasible also for complete and realistic data-reduction workflows. We plan to investigate this in more detail in future work, in particular to extend this to other neutron-scattering techniques and to include corrections such as absorption corrections and background subtractions.
A fundamental difference – and possibly the main challenge – is that, unlike in the CS implementation, the handling and computation of variances is difficult to automate. While Python packages such as JAX [4] and Uncertainties [16] implement automatic differentiation approaches, it is unlikely that this is directly applicable to the components of a neutron-scattering data-reduction workflow.
Even if automatic differentiation could handle operations such as histogramming, the Uncertainties Python package stores arrays with uncertainties as arrays of values with uncertainties. The very high memory and compute overhead makes this unsuitable for our needs.
On the positive side, our new approach can produce data-reduction results including covariances. This may be beneficial for subsequent analysis and model fitting procedures. Furthermore, Scipp has proven flexible enough to allow for computation of the expressions for the variance-covariance matrix with very concise code and adequate performance. This further increases our confidence that the described approach may be a viable solution for handling data from day-to-day instrument operations.
In Sections 3.1 and 3.2 we described an improved analytical approach for computing uncertainties and correlations for the result of the data-reduction. The computation of the variance-covariance matrices turns out to be not significantly more expensive than the computation of the cross sections. Despite yielding the full correlation matrix the overall cost is thus comparable to that of the CS implementation. This holds true even for event-mode data-reduction. To understand this, one important aspect to realize is that – in contrast to CS – it is no longer necessary to carry the uncertainties of the detector counts in intermediate steps, since for the raw input data the variances are equal to the measured counts. A full analysis of the performance characteristics is beyond the scope of this paper. We briefly touch on the main aspects and proceed to report concrete timings for an example below.
Firstly, considering the normalization operation we observe the following. In addition to the normalized counts,
Timings measured for NumPy [10] backed by a multi-threaded linear-algebra library.
We illustrate this on the example of the Wish powder-diffraction data-reduction. The input data has a file size of
Timings for the relevant part of the Wish histogram-mode data-reduction.
In this paper we have shown a systematic problem with the treatment of the statistical uncertainties of normalization terms in Mantid and Scipp. We have shared our findings with the Mantid development team. In the Scipp project, we have disabled the propagation of uncertainties if one of the operands is broadcast in an operation, starting with the release of Scipp v23.01. Support for event-mode normalization operations with normalization terms that have variances is also disabled. In both cases an exception is raised instead, i.e., the operation fails, forcing the user to address the problem correctly. We had initially considered simply displaying a warning to the user when an operation introduces correlations. However, experience shows that warnings are almost always ignored, and we no longer consider such an approach a solution.
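The guard can be sketched generically. The following is a NumPy mock-up of the behaviour, not Scipp's actual implementation or API; the function name and error message are our own:

```python
import numpy as np

def checked_divide(a, var_a, b, var_b):
    # Refuse to propagate uncertainties when the operand carrying them is
    # broadcast to a larger shape: the broadcast would introduce
    # correlations that per-element variances cannot represent.
    if var_b is not None and np.broadcast_shapes(a.shape, b.shape) != b.shape:
        raise ValueError(
            "operand with variances is broadcast; drop its variances "
            "explicitly or use a correlation-aware method"
        )
    out = a / b
    var = var_a / b**2 if var_a is not None else np.zeros_like(out)
    if var_b is not None:
        var = var + a**2 * var_b / b**4
    return out, var
```

With a pixel-resolved numerator and a λ-only denominator carrying variances, this raises instead of silently suppressing the denominator's contribution.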
Aside from highlighting the systematic problem, we have succeeded in deriving expressions for variance-covariance matrices of the data-reduction results. Computationally the new approach would be feasible, but whether it is applicable and useful in practice remains to be investigated.
Footnotes
Acknowledgements
We kindly thank the authors of Ref. [17] for allowing us to use their data for the SANS example. We thank Pascal Manuel for sharing Wish data, used in our powder-diffraction example. We thank Evan Berkowitz and Thomas Luu for valuable discussions about bootstrap and correlations. We thank our coworkers Celine Durniak, Torben Nielsen, and Wojciech Potrzebowski for helpful discussions and comments that helped to improve the clarity of the manuscript.
Bootstrap
This section gives a brief overview of bootstrapping as used in this work. For a more complete explanation we refer to Ref. [9]. Given a measured distribution of random variables
In this work,
We assume all
This assumption also underlies the regular propagation of uncertainties.
As an example, Figs 8 and 9 show the relative standard deviation of
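A minimal parametric-bootstrap sketch for a toy normalize-and-sum reduction (synthetic Poisson data, not one of the paper's actual workflows) looks as follows; since each bin here has its own normalization value, no correlations arise and the bootstrap simply reproduces the first-order propagation:

```python
import numpy as np

rng = np.random.default_rng(2)

counts = rng.poisson(100.0, size=20).astype(float)   # measured spectrum
norm = rng.poisson(1000.0, size=20).astype(float)    # normalization term

def reduced(c, m):
    # toy reduction: normalize bin-by-bin, then sum over all bins
    return np.sum(c / m)

# Parametric bootstrap: resample each bin from a Poisson distribution with
# the measured counts as mean, recompute the reduced result, and take the
# spread of the resampled results as the uncertainty estimate.
samples = np.array([
    reduced(rng.poisson(counts), rng.poisson(norm)) for _ in range(2000)
])
boot_std = samples.std()

# First-order propagation (var = counts for Poisson inputs) for comparison:
var = np.sum(counts / norm**2 + counts**2 / norm**3)
assert abs(boot_std - np.sqrt(var)) / np.sqrt(var) < 0.15
```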
Figures for correlation matrices
For completeness, we show in this section the figures of the correlation matrices from the SANS and powder diffraction workflows using a logarithmic color scale.
