Abstract
Geochemical data from soils in mineralised areas commonly have skewed, non-normal distributions. As such, raw soil geochemical data cannot be used directly for geostatistical analysis of spatial variability and interpolation without introducing additional uncertainty into any interpretation. Non-normal distributions affect the robustness and fitting of variograms and reduce the accuracy of any interpolations produced from these data. Therefore, prior to assessment, the dataset must be transformed so that it approximates a normal distribution. Three transformations, namely the logarithmic, Box–Cox and Johnson transformations, were applied to As, Cd, Hg, Pb and Zn soil geochemical data from the Tongling metallogenic district, part of the Yangtze metallogenic belt, Anhui Province, China. The transformed data were analysed to determine their skewness and, using a Kolmogorov–Smirnov test, how closely they approximate a normal distribution. A comparison of the normalisation approaches indicates that the logarithmic transformation could not transform the data to approximate a normal distribution; the Box–Cox transformation removed the skewness of the data, but the results were still non-normally distributed; and the Johnson transformation proved to be the optimal method, with the results, including outliers, passing the Kolmogorov–Smirnov test. Both the Johnson and Box–Cox transformations also improved the shape of variograms produced from the data. However, compared to Box–Cox, more of the Johnson-transformed data fit within 95% confidence intervals for the Kolmogorov–Smirnov test; this improved data distribution means that the Johnson transformation should be considered the preferred geostatistical normalisation tool for soil geochemical data.
The application of the Johnson transformation to soil geochemical data may improve the robustness of predictive targeting and mineral exploration in areas of known mineralisation that have non-normal spatially variable data.
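The comparison described above can be illustrated with a minimal Python sketch. It is not the authors' workflow: the data are synthetic (not the Tongling measurements), SciPy is assumed to be available, and the Johnson step fits only the unbounded S_U family by maximum likelihood, whereas the full Johnson system also includes the S_B and S_L families. The helper name `summarise` is hypothetical.

```python
import numpy as np
from scipy import stats

# Synthetic, positively skewed "concentrations" (illustrative only).
rng = np.random.default_rng(0)
z_true = rng.standard_normal(500)
x = 100.0 + 10.0 * np.sinh((z_true + 1.5) / 1.2)  # strictly positive here


def summarise(values):
    """Return (skewness, K-S p-value against a standard normal)."""
    standardised = (values - values.mean()) / values.std(ddof=1)
    # Mean and standard deviation are estimated from the same data,
    # so the K-S p-value is only approximate.
    return stats.skew(values), stats.kstest(standardised, "norm").pvalue


# Logarithmic transformation.
log_x = np.log(x)

# Box-Cox transformation; lambda is chosen by maximum likelihood.
boxcox_x, lam = stats.boxcox(x)

# Johnson S_U transformation (assumption: S_U family only).
# If X follows the fitted S_U law, a + b * asinh((X - loc) / scale) is N(0, 1).
a, b, loc, scale = stats.johnsonsu.fit(x)
johnson_x = a + b * np.arcsinh((x - loc) / scale)

results = {
    "raw": summarise(x),
    "log": summarise(log_x),
    "box-cox": summarise(boxcox_x),
    "johnson-su": summarise(johnson_x),
}

for name, (skewness, p_value) in results.items():
    print(f"{name:>10}: skewness = {skewness:6.3f}, K-S p-value = {p_value:.3f}")
```

A p-value above 0.05 means the Kolmogorov–Smirnov test does not reject normality at the 95% level. Which transformation performs best depends on the dataset, so the synthetic output here should not be read as reproducing the paper's results.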
