Abstract
Knowledge- and data-driven approaches are the two major families of methods used to integrate evidential maps for mineral prospectivity mapping (MPM). Geological maps, geochemical samples and data from known gold deposits were collected in the western Junggar area, Xinjiang, China. A spatial database of geological features and mineral occurrences was constructed for the study region. A weights-of-evidence model and a fuzzy logic model were employed for MPM, and the results were compared. The results indicate that favorable sedimentary rocks, fault density, fault distance and Au concentration were the primary factors affecting Au mineralization. Arsenic (As), antimony (Sb), fault direction, quartz veins and intrusive rocks were secondary factors. Conditional independence exerted a major influence on the weights-of-evidence model: when the assumption was disregarded, the posterior probability became very high, which impaired the results. Mineral prospectivity mapping by the fuzzy logic method, which combined the quantitative results of weights-of-evidence with fuzzy membership values determined by expert knowledge, proved to be valid. For a study area with a large number of known deposits, data-driven approaches to MPM are generally considered appropriate. However, if sufficient data are not available, knowledge-driven approaches, such as the fuzzy logic method used in the present study, usually achieve a better result.
Introduction
Mineral exploration is a sophisticated process that seeks to discover new mineral deposits in a region of interest [28]. Mineral prospectivity mapping (MPM) is used as a tool to delineate target areas that most likely contain mineral deposits of a particular type [26]. In order to conduct MPM, multiple data sets, or layers (e.g., geological, geophysical, geochemical, and remote sensing data) must be collected, analyzed and integrated [17]. The integration of different digital geoscientific data sets is a key component of MPM. Typically, data integration is performed using geographic information system (GIS) applications.
Knowledge-driven and data-driven approaches are the two major families of methods that assign evidential weights and integrate evidential maps for MPM and further exploration [10]. In data-driven techniques, the known mineral deposits in a region of interest are used as “training points” to recognize and establish the spatial relationships between deposits and particular evidential features [3]; these techniques are therefore appropriate in well-explored areas [3]. Examples of data-driven methods include weights-of-evidence [7], fuzzy weights-of-evidence [24], logistic regression [9], neural networks [27], Bayesian networks [4], and support vector machines [25]. Knowledge-driven techniques are applied where few mineral deposits are known in the area of interest, so expert experience and judgment are required. Analysts apply expert opinion to assess the relative importance of each layer of spatial evidence as meaningful decision support [19]. Several mineral potential mapping methods are classified as knowledge-driven techniques, including Boolean logic, index overlay, fuzzy analytical hierarchy process [3], and fuzzy logic [29].
In this study, after a brief introduction to the weights-of-evidence and fuzzy logic methods, nine layers of geological and geochemical data are integrated for MPM. The two methods are tested in the western Junggar area of Xinjiang, China. The primary goal of this study was to compare the performance of knowledge- and data-driven techniques for MPM in a study area where mineral deposits are known but other data are lacking.
Geological setting of the study area
Regional geological background
The western Junggar area is located on the western margin of the Junggar Basin in the northern part of Xinjiang, China. In terms of administrative division, the western Junggar area belongs to Tacheng prefecture, Ili Kazakh Autonomous Prefecture. It extends to the Sawuer Mountain in the north, Ebinur Lake in the south, Buerkesidai-Karamay in the east, and Omin-Tuoli-Alataw pass in the west (Fig. 1). The western Junggar area is the region known for the largest number of discovered gold deposits in Xinjiang, with over 200 gold deposits or mineralization occurrences. There is one proven large-scale gold deposit, two medium-scale gold deposits, five small-scale gold deposits, and over 190 mineralization occurrences [30].
The western Junggar area is situated at the convergence belt between the Siberian and Tarim Plates and in the Hercynian back-arc basin. Due to the influence of regional tectonic stress, the study area has relatively well-developed folds and fault structures. The strata that crop out in the western Junggar area primarily include Ordovician-Silurian epimetamorphic rock series in the lower Paleozoic, Devonian-Carboniferous marine volcanic rock-turbidite formations in the upper Paleozoic, and Permian-Triassic terrestrial volcanic-molasse formations. The Tailegula, Baogutu and Xibeikula Formations are the primary gold-bearing strata in the metallogenic districts; the Devonian Kulumudi Formation is also gold-bearing. Neopaleozoic post-collisional plutonic rocks composed of intermediate-acid intrusive rocks are extensively developed in the metallogenic belt. They are divided into two categories. One is the huge acidic batholith dominated by alkali-feldspar granite, which is found on both sides of the Darabut fault and constitutes the Darabut alkali-rich igneous rock belt [31]. Examples falling into this category include the plutons at Miao’ergou, Akbastau, Karamay, and Hongshan. The other category consists of granodiorite-quartz diorite, which is primarily found in the form of small stocks distributed to the southeast of the Darabut fault [5].
Geological model
The gold deposits in different parts of the western Junggar area form a group of deposits that are closely connected in terms of formation time, space and genesis. Deposited under a certain geological environment, they belong to the same metallogenic series and can be attributed to the metallogenic events in the upper and middle Variscan [13]. Thus, a conceptual model for gold deposit formation in the western Junggar area was proposed, as shown in Table 1.
Methodology
Weights-of-evidence
Weights-of-evidence (W-of-E) is a data-driven method for mineral potential mapping. It was first applied to the prediction of mineral deposits by Bonham-Carter et al. [11]. The method estimates the relative importance of individual layers of evidence by statistical means, and requires a mineral deposit dataset and a series of geological features in order to generate a mineral potential map [26]. The weights-of-evidence modeling technique comprises five steps: (1) estimation of the prior probability; (2) determination of the weighting coefficients (the positive and negative weights, W+ and W−); (3) calculation of the posterior probability; (4) testing for conditional independence; and (5) validation [18].
The difference between the two weights is known as the weights contrast (C), which reflects the overall spatial association between the evidential layer and the mineral deposits. The studentized contrast, S(C), is the ratio of the contrast to its standard deviation and reflects the significance level of the C value. In the present study, the influence of each evidential layer on mineral deposits was measured quantitatively by C and S(C). One common criticism of the weights-of-evidence method for mineral potential mapping is the problem of conditional independence [8]. An important assumption of the model is that all evidential layers are conditionally independent of one another with respect to mineralization; if this assumption is violated, the posterior probability is biased and the predictions become unreliable. To satisfy the conditional independence assumption, evidential layers that might violate it were excluded from the model.
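The quantities described above can be illustrated with a minimal sketch. It assumes the standard textbook formulation of weights-of-evidence for a single binary evidential pattern, with large-sample variance approximations for the weights; the function name, argument names and cell counts are hypothetical and are not taken from Table 2.

```python
import math

def weights_of_evidence(n_total, n_deposit, n_pattern, n_both):
    """Weights for one binary evidential pattern B against deposits D.

    n_total   : number of unit cells in the study area
    n_deposit : unit cells containing a deposit, N(D)
    n_pattern : unit cells where the pattern is present, N(B)
    n_both    : unit cells with both pattern and deposit
    """
    # Cell counts for the four combinations of B / not-B and D / not-D.
    b_d = n_both
    b_nd = n_pattern - n_both
    nb_d = n_deposit - n_both
    nb_nd = n_total - n_pattern - n_deposit + n_both
    if min(b_d, b_nd, nb_d, nb_nd) <= 0:
        raise ValueError("every B/D combination needs at least one cell")

    n_nd = n_total - n_deposit
    # W+ = ln[P(B|D) / P(B|not D)];  W- = ln[P(not B|D) / P(not B|not D)]
    w_plus = math.log((b_d / n_deposit) / (b_nd / n_nd))
    w_minus = math.log((nb_d / n_deposit) / (nb_nd / n_nd))
    contrast = w_plus - w_minus
    # Standard deviation of C from the approximate variances of W+ and W-.
    s_contrast = math.sqrt(1 / b_d + 1 / b_nd + 1 / nb_d + 1 / nb_nd)
    return {"W+": w_plus, "W-": w_minus, "C": contrast,
            "S(C)": contrast / s_contrast}

# Hypothetical counts for illustration only.
result = weights_of_evidence(n_total=1000, n_deposit=10, n_pattern=200, n_both=8)
print(result)
```

A pattern that is spatially unrelated to the deposits gives W+ = W− = 0 and hence C = 0; a positive C with a large S(C) indicates a significant positive association.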
Fuzzy logic
Fuzzy logic is based on the fuzzy-set theory proposed by Zadeh [14]. It allows geologists to use their knowledge to build the models that generate mineral potential maps, and to select the evidential layers that they believe are most critical for the particular style of mineralization. Additionally, fuzzy logic allows weights to be assigned to each layer on the basis of expert opinion [16]. Boolean set theory defines a membership that is either 1 or 0 (true or false), whereas fuzzy-set theory defines a degree of membership in a set, represented by a value between 0 and 1 without a crisp boundary [1].
The fuzzy model for mineral prediction is defined as a generic model: if
Using a fuzzy set operator, n fuzzy sets Ai are integrated to form a comprehensive fuzzy set F, expressed by Equation (7):
The fuzzy model in mineral prediction consists of two steps: (1) fuzzification of data; (2) fuzzy synthesis of fuzzified data. Fuzzification can be realized by determining the fuzzy function. Fuzzy synthesis is executed by using the operator. The most basic fuzzy operators are: (1) fuzzy AND; (2) fuzzy OR; (3) fuzzy algebraic product; (4) fuzzy algebraic sum; and (5) fuzzy gamma [23]. Note that
Fuzzy OR values are the maximum membership values from each evidential layer. Thus, fuzziness or the membership value of each grid unit is controlled by the maximum membership value in each grid.
Membership values from each evidential layer at each location are multiplied to calculate the fuzzy algebraic product. Thus, the fuzzy membership value of each evidential layer has an influence on the calculation result.
The gamma operator achieves a synthesis result within the interval between the maximum and the minimum membership value. This value range is affected by the fuzzy membership value of the input evidence (Fig. 2). The value of
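The five operators listed above can be sketched as follows for the membership values of a single grid cell. This is a minimal illustration assuming the commonly used definitions, in which the gamma operator combines the algebraic sum and the algebraic product and is therefore bounded by the product (gamma = 0) and the sum (gamma = 1); the function names and the sample membership values are hypothetical.

```python
import math

def fuzzy_and(values):
    # Minimum membership value among the evidential layers.
    return min(values)

def fuzzy_or(values):
    # Maximum membership value among the evidential layers.
    return max(values)

def fuzzy_product(values):
    # Product of all membership values; never larger than the smallest input.
    return math.prod(values)

def fuzzy_sum(values):
    # Complement of the product of complements; never smaller than the largest input.
    return 1.0 - math.prod(1.0 - v for v in values)

def fuzzy_gamma(values, gamma):
    # Compromise between algebraic sum (gamma = 1) and algebraic product (gamma = 0).
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must lie in [0, 1]")
    return fuzzy_sum(values) ** gamma * fuzzy_product(values) ** (1.0 - gamma)

# Hypothetical membership values of three evidential layers at one grid cell.
memberships = [0.2, 0.5, 0.8]
for name, value in [("AND", fuzzy_and(memberships)),
                    ("OR", fuzzy_or(memberships)),
                    ("product", fuzzy_product(memberships)),
                    ("sum", fuzzy_sum(memberships)),
                    ("gamma=0.9", fuzzy_gamma(memberships, 0.9))]:
    print(name, round(value, 4))
```

In a raster setting the same operations would be applied cell by cell across the evidential layers.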
In this study, geological and geochemical datasets were used as sources of evidence for mineral prospectivity mapping. The 13 geological maps, compiled from field surveys and mapping at a scale of 1:200,000, were obtained from the Xinjiang Bureau of Geology and Mineral Resources. The geochemical data comprised 39 major and trace elements measured in 8104 samples. A spatial database was developed in ArcGIS 10.1 to manage the geological and geochemical data. The Beijing 1954 coordinate system was used (6-degree Gauss-Kruger zone 14, central meridian 81°, units in m). The database comprises planar, linear and point features: (1) point features representing the Au mineral deposits; (2) linear features representing faults, geological boundaries and attitudes; (3) planar features representing intrusive rocks, sedimentary (volcanic) strata and quartz veins.
Application of mineral prospectivity mapping techniques
Weights of evidence (W-of-E)
In this study, on the basis of expert opinion, the geological model and the collected data, nine data layers are used for MPM: fault density, fault distance, fault direction, intrusive rocks, quartz veins, sedimentary rocks and the concentrations of Au, As and Sb. The study area covers 50,356 km² and is divided into unit cells of 0.25 km². There are 240 gold deposits (including mineralization occurrences), giving a prior probability of 0.001192.
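The prior probability quoted above can be reproduced from the stated area, unit cell size and deposit count. The short sketch below assumes that each deposit or occurrence occupies exactly one unit cell; the variable names are illustrative.

```python
# Figures taken from the text; the one-deposit-per-cell assumption is ours.
study_area_km2 = 50356.0
unit_cell_km2 = 0.25
n_deposits = 240

n_unit_cells = study_area_km2 / unit_cell_km2      # total number of unit cells
prior_probability = n_deposits / n_unit_cells      # deposits per unit cell

print(int(n_unit_cells), round(prior_probability, 6))  # 201424 0.001192
```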
Spatial analysis
The associations between these nine evidential layers and gold deposits (mineralization occurrences) are analyzed quantitatively (Table 2). A studentized contrast S(C) greater than 1.5 implies a true, strong positive correlation, and a value greater than 0.5 but less than 1.5 implies a true but weak positive correlation [21]. Therefore, a weights-of-evidence layer with S(C) above 1.0 is considered to be closely associated with gold deposits. The analysis proceeds as follows.
Spatial analysis of fault data
Faults provide pathways for fluids during mineralization [26]. The objective of fault density analysis is to determine the distribution of faults over the entire region and the degree of fault convergence. On this basis, the spatial association between fault convergence and the known deposits can be analyzed. The results are shown in Fig. 3 and indicate that the faults are concentrated in the middle and northwest of the study area; the area of high fault density in the middle corresponds to the locations of known deposits. As shown in Table 2, fault density has a controlling effect on gold deposits: over the fault density interval of 0.572–1.288, from the 5th class to the 9th class, the S(C) reaches a maximum of 14.7403, indicating an extremely large influence on mineralization. For fault distance analysis, Euclidean distance is used to measure the shortest distance from each pixel to the center of the designated target. By this means, the spatial association between distance to faults and deposits is determined. The maximum fault distance is 5000 m, as shown in Fig. 3. When the fault distance is 0.0–0.5 km, the S(C) is 9.4142. Fault direction analysis expresses, in numerical form, the direction of each pixel with respect to the nearest fault line, reflecting the spatial distribution characteristics of faults. Nine direction classes are considered: north (–1°–22.5°), northeast (22.5°–67.5°), east (67.5°–112.5°), southeast (112.5°–157.5°), south (157.5°–202.5°), southwest (202.5°–247.5°), west (247.5°–292.5°), northwest (292.5°–337.5°), and north (337.5°–360°). As shown in Fig. 3, most faults are northeast- and southeast-trending. As shown in Table 2, the influence of fault direction on gold deposits or mineralization occurrences is primarily manifested in the first (–1°–22.5°) and the ninth (337.5°–360°) classes, with S(C) reaching a maximum of 3.9844.
Therefore, fault density, fault distance, and fault direction are considered to be factors which influence the quantitative evaluation of favorability for gold mineralization.
Spatial analysis of intrusive rocks
Granite is the most extensively distributed intrusive rock in the study area, followed by ultramafic rock, diorite and intermediate-acid dykes, which are dated to the middle-late Hercynian. One intermediate-acid magmatic event occurred in the Hercynian belt. As a result, multi-stage intrusive rocks of varying scale were formed. With the exception of ultramafic rocks, the gold content is higher in intermediate-basic rocks.
Magmatic activities play a crucial role in gold enrichment [2]. Therefore, intrusive rocks are selected from the database for buffer analysis. The maximum influence range of intrusive rocks is 8 km, as shown in Fig. 3.
Intrusive rocks have a relatively small controlling effect on gold deposits, manifested primarily in the 6th class (Table 2): when the distance from intrusive rocks is 5.0–6.0 km, the S(C) is only 1.0031. From the data-driven point of view, intrusive rocks therefore have a minor influence on mineralization. The mineralization conceptual model, however, indicates that intrusive rocks, especially small intrusions, are closely associated with gold deposits. Therefore, an intrusive rock buffer distance of 5.0–6.0 km is considered an influence factor favorable for gold mineralization.
Spatial analysis of quartz veins
The mineralization conceptual model indicates that gold deposits are closely related to quartz veins, which can generally be found at the sites of gold deposits. Therefore, buffer analysis and reclassification are performed for quartz veins. On the basis of expert knowledge, the maximum influence range of quartz veins is set to 2.5 km, as shown in Fig. 3. The calculation results in Table 2 indicate that the S(C) within 0–500 m is 2.9646. Thus, the occurrence of quartz veins within this distance is an influence factor favorable for gold mineralization in the quantitative evaluation.
Spatial analysis of sedimentary rocks
The study area contains a great variety of sedimentary rocks, totaling 154 types. Spatial overlay with the known deposits shows that 230 deposits are located in sedimentary formations, accounting for 95.8% of the total. As shown in Table 2, sedimentary rocks have a controlling effect on gold deposits. The maximum S(C) reaches 26.9836, indicating the extremely large influence of sedimentary rocks on mineralization. Therefore, the presence of these sedimentary rocks can be considered an influence factor favorable for gold mineralization.
Spatial analysis of geochemical data
Geochemical anomalies are not controlling factors, but are responses to specific mineralization processes [26]. Au, As and Sb, which are closely related to the formation of gold deposits, are rasterized by Kriging interpolation, and reclassification maps of the three elements are obtained (Fig. 3). Reclassification and spatial overlay analysis show that most deposits fall within the high-value areas of Au content, and some within the high-value areas of As and Sb content. As shown in Table 2, the S(C) of Au content increases from 1.8, and the increase becomes more rapid after 2.7, until it reaches the maximum of 4.4602 at 6.3–6.7 ppb. This indicates that Au content has an extremely large influence on mineralization. When the As content is 12–28 ppm, the S(C) is 2.4197; when the Sb content is 0.6–1.0 ppm, the S(C) is 4.0265. Therefore, Au, As and Sb contents within these intervals are influence factors favorable for gold mineralization.
CI test and calculation of posterior probability
Conducting the W-of-E modeling requires evidence map patterns to satisfy the pairwise conditional independence (CI) assumption [28]. An
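To illustrate why the conditional independence assumption matters for the posterior probability, the following minimal sketch uses the standard log-odds form of weights-of-evidence, in which the posterior logit is the prior logit plus the sum of the weights of the patterns present in a cell. The weights are hypothetical and are not taken from the tables of this study; only the prior probability comes from the text.

```python
import math

def logit(p):
    # Natural logarithm of the odds p / (1 - p).
    return math.log(p / (1.0 - p))

def posterior_probability(prior, weights):
    # Posterior logit = prior logit + sum of the weights of the evidence present.
    post_logit = logit(prior) + sum(weights)
    return 1.0 / (1.0 + math.exp(-post_logit))

prior = 0.001192            # prior probability reported for the study area
weights = [1.2, 0.8]        # hypothetical weights of two independent layers

p_independent = posterior_probability(prior, weights)
# Adding a layer that merely duplicates the first one counts the same
# evidence twice and inflates the posterior probability.
p_duplicated = posterior_probability(prior, weights + [1.2])

print(p_independent, p_duplicated)
```

The duplicated layer is the extreme case of conditional dependence; partially dependent layers inflate the posterior probability in the same direction, which is consistent with the higher maximum posterior probability reported in this study for the model that does not pass the test.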
Fuzzy logic
The fuzzy logic method is essentially knowledge-driven. Here, scores are assigned to the metallogenic data sets by combining the quantitative weights-of-evidence results with expert knowledge. A value of 10 is assigned to metallogenic data sets of sedimentary rocks; a value of 9 is assigned to the geochemical anomaly of Au; a value of e is assigned to metallogenic data sets of faults; a value of 8 is assigned to the metallogenic data sets of quartz veins; a value of 7 is assigned to intrusive rocks; and a value of 7 is assigned to the geochemical anomalies of As and Sb. The membership values of the metallogenic data sets in the study area are then calculated; the fuzzy membership values of faults are shown in Table 4, and the membership values of the other layers are calculated in the same way. Next, fuzzy synthesis is performed for all evidential layers according to four schemes (Fig. 6). On the basis of the results in Table 5, scheme D is selected for subsequent prediction. The mineral prospectivity map based on the fuzzy logic model is shown in Fig. 7.
Validation
Generally, the discovery of new deposits is the best validation of metallogenic prediction. However, such validation requires a large investment of time and capital and can hardly be achieved in a study region as extensive as the western Junggar area. Kemp presented two methods of validation [15]: (1) test whether the metallogenic districts have a higher probability; and (2) compare with the results from other methods. Bonham-Carter also proposed that an ore-forming potential map can be used to predict the deposit distribution and hence to validate the established model [12]. In this paper, the first method was used for validation. As shown in Table 6, the fuzzy logic mineral prospectivity map places 75% of the known deposits in high- and medium-favorability metallogenic districts that cover 14% of the study area; the posterior probability reaches 0.9739, and only 25% of the deposits are outside the predicted districts. As shown by the predictions in Fig. 7, most deposits are distributed within the prediction regions; only a few known deposits lie outside them, in a scattered distribution pattern. This may be due to the lack of some metallogenic information (e.g., gravity data) for this district. On the basis of this evaluation, it is concluded that the fuzzy logic method can be applied to the mapping of potential metallogenic districts with high accuracy.
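The first validation method can be expressed as a simple comparison between the share of known deposits captured and the share of the study area flagged as favorable. The sketch below is a hypothetical illustration on a toy grid, not the procedure used to produce Table 6; the function and variable names are assumptions.

```python
def capture_summary(favorable, has_deposit):
    """Compare deposits captured with area flagged as favorable.

    favorable, has_deposit: equal-length sequences of booleans, one per unit cell.
    Returns (area fraction, deposit fraction, deposit fraction / area fraction).
    """
    if len(favorable) != len(has_deposit) or not favorable:
        raise ValueError("inputs must be non-empty and of equal length")
    n_favorable = sum(favorable)
    n_deposits = sum(has_deposit)
    if n_favorable == 0 or n_deposits == 0:
        raise ValueError("need at least one favorable cell and one deposit")
    n_captured = sum(1 for f, d in zip(favorable, has_deposit) if f and d)
    area_fraction = n_favorable / len(favorable)
    deposit_fraction = n_captured / n_deposits
    return area_fraction, deposit_fraction, deposit_fraction / area_fraction

# Toy grid of 10 unit cells: 2 flagged favorable, 4 deposits, 2 of them captured.
favorable = [True, True] + [False] * 8
has_deposit = [True, True, True, False, False, True, False, False, False, False]
print(capture_summary(favorable, has_deposit))
```

Applied to the figures reported above (75% of the deposits within 14% of the area), the ratio is about 5.4, meaning that deposits are roughly five times more concentrated in the predicted districts than they would be under a uniform distribution.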
Discussion
Through the application of the two models, two predictions are obtained for the study area. These two models are compared in terms of process and results.
Process
The weights-of-evidence model is a data-driven model. It quantifies the relationship between each type of metallogenic evidence and the known deposits, and a metallogenic prediction is then made from the evidence input into the model. The entire process is distinctly quantitative. However, because of the constraint imposed by the conditional independence assumption, some metallogenic evidence is excluded from the model. For the model that passes the conditional independence test, the posterior probability ranges from 0.0000267 to 0.4350819; for the model that does not pass the test, it ranges from 0.0000032 to 0.90977197. The maximum posterior probability of gold mineralization is thus higher when the test is not passed, which agrees with results reported by other researchers [8, 18]. It is inferred that a prediction that does not pass the conditional independence test will affect the judgement of metallogenic prospectivity. Since the metallogenic factors are binarized by the weights-of-evidence model, some information is lost. As a result, the final prediction has a non-negligible error due to the lack of some important metallogenic evidence. The fuzzy logic model was built on the basis of the weights-of-evidence model. By combining the data-driven results with expert knowledge, all considered influence factors of gold mineralization were included in the model, thus avoiding the information loss due to binarization. The weights-of-evidence results gave small values for some factors, but experts familiar with the regional geological conditions judged that important metallogenic factors should be assigned higher values. For example, during analysis of intrusive rock buffer distances of 6 km and 8 km using the weights-of-evidence model, the results are as follows: C = 0.0638 and S(C) = 0.243.
However, given that 25 deposits are located within this distance, the layer was finally assigned a value of 7 according to expert opinion. The metallogenic prediction model therefore combined a data-driven approach with expert knowledge to produce more reliable results.
Results
The results of mineral prospectivity mapping achieved by the weights-of-evidence and fuzzy logic techniques are shown in the maps in Figs. 5 and 7. The degree of favorability of a particular location to host Au deposits is displayed in the legend: areas in red have been allocated high favorability values, areas in green medium favorability values, and areas in blue low favorability values. Because some important metallogenic evidence is missing from the weights-of-evidence model, the number of deposits captured and the posterior probability of gold mineralization are far lower than the values obtained by the fuzzy logic model, as shown in Figs. 5 and 7 and Table 7. Therefore, the fuzzy logic model built on the basis of a data-driven approach and expert knowledge is superior to the weights-of-evidence model.
Conclusions
In this study, weights-of-evidence and fuzzy logic methods were used to produce an Au prospectivity map of the western Junggar metallogenic belt. The results of this work lead to the following conclusions:
(1) Significant geological controls on gold mineralization are evident from the spatial analysis. According to the weights contrast and studentized contrast, favorable sedimentary rock types, fault density, fault distance and Au concentration were the primary factors influencing Au mineralization. Arsenic, Sb, fault direction, quartz veins and intrusive rocks were secondary factors. This suggests that sedimentary rocks, faults and Au geochemical anomalies are priorities for detailed mapping in future exploration.
(2) Conditional independence exerts a great influence on the weights-of-evidence model. This study demonstrates that the posterior probability is high if the conditional independence assumption is disregarded, which affects the accuracy of prediction. However, the conditional independence assumption is difficult to meet in reality. The conditional independence test calculates the probability that the model is not conditionally independent; values above 95% or 99% indicate that the assumption of conditional independence should be rejected. A concern for future study is therefore to find better ways to satisfy the conditional independence assumption, for example by changing the method of conditional independence testing, reducing the number of weight layers or changing the number of grid units. In this way, the weights-of-evidence model can be better applied to metallogenic prediction.
(3) The prospectivity map obtained by the fuzzy logic model shows a strong correlation between areas of high posterior probability and known Au deposits, indicating that the nine evidential layers used in this study area are valid. Based on the quantification provided by weights-of-evidence and the fuzzy membership values determined by experts, the fuzzy logic method achieved mineral prospectivity mapping with high accuracy. It was determined that 25% of the deposits are outside the predicted districts, likely due to the lack of important metallogenic evidence. The prediction would be more accurate if geophysical data could be input into the prediction model.
(4) The results from the weights-of-evidence and fuzzy logic methods were compared. For a study area with a large number of deposits, the data-driven approach is believed to be more suitable for mineral prospectivity mapping. However, if the data are insufficient (e.g., no geophysical data), the knowledge-driven approach (e.g., the fuzzy logic method) may achieve a better prediction. A weights-of-evidence model that relies completely on a data-driven approach may not be the ideal prediction model.
Acknowledgments
This work was jointly supported by the Xinjiang Uygur Autonomous Major Project (201330121-3), the National Basic Research Program of China (973 Program; 2014CB440803), the Natural Science Foundation of China (U1129302), and the Young Technical Talent Cultivation Program of Xinjiang Uygur Autonomous Region (2013731014).
