Abstract
Curcuma longa L. has long been used as a food, cosmetic, traditional medicine, and natural dye in tropical and subtropical regions such as India, China, and Vietnam. Curcuminoids are considered the main bioactive compounds in this plant. This study focuses on metabolite profiling of the rhizome methanolic extracts of C longa samples collected in 6 different provinces of Vietnam using liquid chromatography coupled with high-resolution mass spectrometry. A partial least-squares discriminant analysis model was then established to discriminate the metabolomes and identify the chemomarkers that help to distinguish C longa from the 6 geographical locations. The collected samples were segregated into 3 main groups: northern (Lang Son, with a distinctive profile of 3 curcuminoids), central (Nghe An), and southern highland (Lam Dong, with typical contents of 2 terpenoids). The absolute amounts of curcuminoids were also measured using calibration curves of reference standards. Curcumin, demethoxycurcumin, and bisdemethoxycurcumin were found at the highest levels in samples from Lang Son, indicating the excellent quality of turmeric cultivated in this area.
Introduction
Curcuma longa L. (Zingiberaceae) has long been used as a coloring agent in foods, cosmetics, and textiles, and in traditional medicine for its multiple pharmacological activities, particularly in India, China, and Vietnam. The rhizome of C longa contains curcuminoids (curcumin, demethoxycurcumin, and bisdemethoxycurcumin) as major bioactive compounds. 1 Several previous studies have reported attractive pharmacological properties of turmeric, including antitumor, anticancer, antimutagenic, antiulcer, anti-inflammatory, antioxidant, antibacterial, antifungal, and antiviral activities.2-6 Because of this widespread multipurpose use, various governmental projects have been established, not only for effective conservation and exploitation, but also for developing growing areas with suitable high-yielding accessions of this medicinal plant. C longa is now one of the most important crops in many provinces of Vietnam, such as Lam Dong, Nghe An, Bac Giang, Ha Noi, Hoa Binh, and Lang Son. The cultivated areas of C longa in these regions have increased dramatically in recent years due to the growing demand for curcumin from the pharmaceutical industry. Although Vietnamese C longa has been shown to have a high curcumin content compared to other varieties in Asia, breeding programs have received little attention in Vietnam. 7 Farmers mainly rely on domestic accessions handed down through generations, which are not stable because of the nonavailability of the requisite high-yielding genotypes, slow multiplication rates, low curcumin content in available cultivars, and losses during cultivation and storage. Investigations of bioactive compounds from C longa by isolation and evaluation of these components have been reported extensively; however, no data on the complex metabolite profiles of C longa accessions collected from different provinces of Vietnam are available to date.
The concept of geo-authentic herbs, which refers to medicinal materials of better quality produced in a specific region, can be applied in developing the best turmeric cultivation areas. 8 To do so, the geographical variation of curcuminoid content in C longa of Vietnam must be studied carefully. In addition to the major curcuminoids (curcumin, demethoxycurcumin, and bisdemethoxycurcumin), other active metabolites, for example, terpenoids and minor curcuminoids, have attracted increasing attention in recent years due to their significant bioactivities.9-12 Therefore, discrimination of the comprehensive metabolite profiles of turmeric samples collected from different regions of Vietnam could provide valuable information.
In recent years, the metabolomics approach has been developed as a powerful tool to meet the above-mentioned demand. Metabolomics is generally defined as both the qualitative and quantitative analysis of all metabolites in an organism. Thus, this approach might facilitate the identification of patterns or metabolite markers that are typical for a species, a cultivar, or certain stages of development of an organism. 13 Using this approach, many previous studies revealed a wide range of metabolites in C longa extracts collected in different countries all over the world. Multivariate analysis was subsequently used to discriminate metabolic profiles among samples of different geographical locations, thus identifying the authentic production area.14-18 The present research has implications both for the development of analytical (metabolomics) methods and for the use of such techniques in determining the chemical variability of turmeric samples collected from different geographical locations. Unsupervised and supervised models are applied to find the differential metabolites that might be used for predicting the geographical origin of a new sample for quality control or traceability.
Results and Discussion
Metabolite Profiling of C longa
A total of 76 peaks were annotated in 120 samples of C longa collected from different locations of Vietnam by using UPLC-LTQ Orbitrap MS analysis (Table 1S, Text 1S, Supplemental Material). Compounds 2, 6, and 10 gave mass ions at m/z 369.13287, 339.12204, and 309.11148, corresponding to the molecular formulas C21H21O6, C20H19O5, and C19H17O4, respectively. The retention times and mass spectra of these peaks were similar to those of the curcumin, demethoxycurcumin, and bisdemethoxycurcumin standards injected with all samples. These peaks were thus identified as curcumin, demethoxycurcumin, and bisdemethoxycurcumin at the “identified compound” level according to the classification system of Sumner et al. 19 The remaining metabolites were annotated based on their exact masses and mass spectral properties and classified as either “putatively annotated compound” (32 compounds) or “unknown” (41 compounds) in the structural elucidation level (Table 1S, Supplemental Material). Detailed information on the structural characterization of these compounds is given in the supporting structural elucidation of mass spectra (Supplemental Material).
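The assignment of the 3 reported m/z values to the stated ion formulas can be checked with a short mass-accuracy calculation. The sketch below is illustrative only: it assumes singly charged cations (consistent with positive-mode ionization), uses standard monoisotopic masses, and the function names are hypothetical rather than part of the authors' workflow.

```python
# Monoisotopic masses (u) of the most abundant isotopes and the electron mass.
MONO = {"C": 12.0, "H": 1.00782503207, "O": 15.99491461956}
ELECTRON = 0.000548579909


def cation_mz(counts):
    """Theoretical m/z of a singly charged cation with the given elemental composition."""
    neutral = sum(MONO[element] * n for element, n in counts.items())
    return neutral - ELECTRON  # one electron is missing in a +1 ion


def ppm_error(observed_mz, theoretical_mz):
    """Relative mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


# Ion formulas and observed m/z values as reported in the text.
reported = {
    "curcumin": ({"C": 21, "H": 21, "O": 6}, 369.13287),
    "demethoxycurcumin": ({"C": 20, "H": 19, "O": 5}, 339.12204),
    "bisdemethoxycurcumin": ({"C": 19, "H": 17, "O": 4}, 309.11148),
}

errors = {
    name: ppm_error(mz, cation_mz(counts)) for name, (counts, mz) in reported.items()
}
for name, err in errors.items():
    print(f"{name}: {err:+.2f} ppm")
```

Under these assumptions, all 3 reported values lie within a few ppm of the theoretical m/z, which is in line with the mass accuracy expected from an Orbitrap analyzer.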
All these peaks were semiquantified by taking the peak area, normalized by the weight of each sample. Peak integration was automatically processed with the XCMS R-package (getEIC function). The data matrix was then subjected to multivariate analysis (Table 2S, Supplemental Material). To view the group clustering and outliers, a preliminary unsupervised principal component analysis (PCA) was conducted on these data (Figure 1S, Supplemental Material). As displayed in the PCA plot, samples from different regions were separated into groups. Components 1 and 2 explained 68.7% of the variance in the data. However, the discrimination was not clear, especially between the Bac Giang and Hoa Binh samples and between the Lang Son and Ha Noi samples. Therefore, further analyses were subsequently applied. Firstly, a hierarchical cluster analysis (HCA) model was built to discriminate the 120 samples from 6 locations based on the Euclidean distances among them. As shown in Figure 1, samples from Lam Dong clustered separately and could be discriminated sharply from the others. Lam Dong is in the Central Highlands, a mountainous region in the south of Vietnam, which differs considerably in soil, altitude, and weather conditions from the northern and central provinces. Thus, turmeric from Lam Dong may have a composition that can be distinguished from that of samples from other regions.

Figure 1. Hierarchical cluster analysis (HCA) dendrogram of 120 samples. The red box shows the area of samples from Lam Dong province, while no clear separation was observed for the other provinces.
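The clustering principle behind the dendrogram can be illustrated with a minimal agglomerative procedure on Euclidean distances. This is a sketch under stated assumptions: the text specifies Euclidean distance but not the linkage rule, so average linkage is assumed here, and the sample names and values are hypothetical, not measured data.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two metabolite profiles of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(points, n_clusters):
    """Repeatedly merge the two closest clusters until n_clusters remain."""
    clusters = [[label] for label in points]
    while len(clusters) > n_clusters:
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            # Average pairwise distance between members of the two clusters.
            total = sum(
                euclidean(points[a], points[b])
                for a in clusters[i]
                for b in clusters[j]
            )
            d = total / (len(clusters[i]) * len(clusters[j]))
            if best is None or d < best[0]:
                best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return [sorted(c) for c in clusters]


# Hypothetical scaled peak areas of three metabolites per sample.
samples = {
    "LamDong_1": [0.90, 0.85, 0.10],
    "LamDong_2": [0.95, 0.80, 0.15],
    "HaNoi_1": [0.20, 0.25, 0.60],
    "HoaBinh_1": [0.25, 0.20, 0.65],
    "BacGiang_1": [0.30, 0.30, 0.55],
}

groups = average_linkage(samples, 2)
print(groups)
```

With these made-up profiles, the two samples with a distinct composition form their own cluster while the remaining samples merge, which mirrors the pattern described for Lam Dong in the dendrogram.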
For better group clustering and chemomarker identification, a partial least-squares discriminant analysis (PLS-DA) model was subsequently built on these data. From the eigenvalue score plot (Figure 2), we observed that the first 2 components (PC1 and PC2) together carried most of the information and explained more than 60% of the total data variance. In the score plot of PC1 and PC2 (Figure 3a), the samples from Lam Dong formed a clearly separated cluster, which was consistent with the HCA dendrogram. In addition, samples from Nghe An and Lang Son were well distinguished from those of other locations. Interestingly, Lang Son is a far northern location, and Nghe An is in the central region of Vietnam, while the other provinces are located between these 2 areas. This observation suggests that turmeric grown in different provinces from the north to the south of Vietnam might produce unique compositions, despite having a partly common metabolite profile.

Figure 2. The eigenvalue score plot.

Figure 3. PLS-DA: (a) score plot and (b) loading plot of the 2 major components of metabolites in Vietnam turmeric samples.
The loading plot (Figure 3b) shows how strongly each metabolite contributes to the definition of the components. In this figure, PC1 was negatively influenced by M10 (bisdemethoxycurcumin), M25 (turmeronol A or B), and M19 (cyclocurcumin), and strongly positively influenced by M27 and M8 (both annotated as dehydrocurdione or curmenol or 13-hydroxygermacrone or bisabolone-4-one or curcumenone). The contents of M20 (cedrenol), M35, and M43 (2 unknowns with the same formula C15H25) had the strongest negative correlation with PC2, while M70 showed the highest positive effect. Moreover, the categorical dependent variables of the northern provinces Hoa Binh and Ha Noi had positive values on both PCs, while those of Nghe An and Lang Son had nearly zero and negative values on PC2, respectively. Both of these categorical variables showed negative values on PC1, whereas only the variable for the Lam Dong samples showed simultaneously positive PC1 and negative PC2 values. The positions of samples from different provinces in Figure 3a, the categorical responses in Figure 3b, and the significant influences of the above variables suggest that the samples from Lam Dong had characteristic contents of M27 and M8, while the samples from Lang Son could be characterized by M10, M19, and M25; M70 could be a marker to discriminate samples from Ha Noi and Hoa Binh. These results were confirmed by the moving range charts, which display the range of each variable across the cases (Figure 4a to f).

Figure 4. Moving range charts of the 6 metabolites M8, M27, M10, M19, M25, and M70.
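For readers unfamiliar with moving range charts, the following sketch shows how such a chart is commonly computed for one variable. It assumes the conventional individuals/moving-range construction (absolute differences between consecutive cases, with an upper control limit of 3.267 times the mean moving range); the settings actually used in STATISTICA are not described in the text, and the values below are hypothetical.

```python
def moving_ranges(values):
    """Absolute differences between consecutive observations."""
    return [abs(b - a) for a, b in zip(values, values[1:])]


def moving_range_limits(values):
    """Mean moving range and the conventional upper control limit (D4 = 3.267 for n = 2)."""
    mr = moving_ranges(values)
    mr_bar = sum(mr) / len(mr)
    return mr_bar, 3.267 * mr_bar


# Hypothetical scaled peak areas of one metabolite, with cases ordered by province.
m8 = [0.10, 0.12, 0.11, 0.13, 0.80, 0.82, 0.79]

mr = moving_ranges(m8)
mr_bar, ucl = moving_range_limits(m8)

# Positions (in the original series) where the jump from the previous case exceeds the limit.
flagged = [i + 1 for i, r in enumerate(mr) if r > ucl]
print(flagged)
```

In this made-up series the only flagged position is the first case of the second block, that is, the point where the metabolite level shifts between two groups of samples.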
Chemical profiling with multivariate analysis has been proven to be effective for the authentication of herbal medicines and food condiments from different origins.20,21 In the present study, chemical profiling and multivariate analysis were combined to discriminate turmeric samples from different regions and to identify the differential compounds. Previously, several researchers found that curcuminoid content was correlated with geographical conditions. Indeed, Poudel et al 22 revealed that the curcumin content was higher in turmeric cultivated in the southern region compared to that from the northern part of Nepal and South Korea. Gad and Bouzabata 23 reported the differences and similarities in metabolic profiles among turmeric samples from Algeria and Egypt. A targeted analysis performed for C longa from different provinces of China allowed discrimination of these samples based on not only major but also minor curcuminoids. 18 In this study, an untargeted metabolomics analysis was conducted to integrate all metabolites detected in the samples, even unknown compounds, in the multivariate analysis to unravel the contribution of the whole metabolome to the discrimination of turmeric samples from Vietnam. The established model can be used to discriminate geo-authentic from nonauthentic herbs or to facilitate traceability. In addition, the selection of chemomarkers responsible for the discrimination brings some benefits for the strategic development of the turmeric crop in Vietnam. As shown above, the metabolic variability between samples from the southern highland area (Lam Dong) and the other regions was attributed to the contents of 2 terpenoids, M8 and M27. Previous studies reported many bioactivities of turmeric essential oils (characterized by terpenoids), such as antioxidant, anti-inflammatory, and antinociceptive properties. 24 Therefore, the samples from Lam Dong could be valuable material for essential oil extraction.
Moreover, samples from a far northern area (Lang Son) formed a distinct pattern, and 3 curcuminoids, M10, M19, and M25, were considered markers for this segregation. This result suggests that either the Lang Son turmeric accession or the soil quality of this site facilitates curcuminoid production. The metabolome variability between samples from Ha Noi and Hoa Binh was most evident in the content of an unknown compound (M70), which could be a new target for identification in further experiments. The fuzzy clustering of samples from Nghe An and Bac Giang may indicate similar contents of secondary metabolites in turmeric from these areas.
The whole metabolome changes depending on the genetic resource and many geographical factors, for example, weather, altitude, pH of the soil, temperature, and humidity. As all collected samples were identified as C longa, the subspecies biodiversity of these turmeric accessions should be assessed, followed by harmonization of culture parameters, to determine whether the metabolic variability comes from phylogenetic or environmental factors. Further investigation of the genetic diversity of these samples (mainly focusing on the biosynthetic pathways of the terpenoids or curcuminoids identified as chemomarkers in this study) will guide cultivar selection and breeding programs if the metabolic discrimination originates from the genetic profile. If the metabolomes are mainly affected by environmental parameters, the optimization of culture conditions or the selection of growing areas can be monitored by quantifying the selected markers to achieve the best yield of the turmeric crop from Vietnam.
Curcuminoids Quantification of C longa Collected in Different Locations of Vietnam
To quantify the absolute content of curcumin, demethoxycurcumin, and bisdemethoxycurcumin, a mixture of these standards was injected alone and spiked into the 120 turmeric samples in the UPLC–Orbitrap–MS system. The intensity of each curcuminoid in the standard mixtures, that is, the peak area, was used to construct the calibration curve. The high R² values (>0.998) for all curves indicated the linearity of the quantification. The contents of curcumin, demethoxycurcumin, and bisdemethoxycurcumin in each of the 120 samples were calculated based on the linear regression equations. Table 1 shows the average concentrations (µg/mg) of curcuminoids in turmeric samples collected from 6 locations in Vietnam. In terms of the relative distribution of the 3 curcuminoids in the metabolite profile, curcumin was the most abundant, followed by demethoxycurcumin and bisdemethoxycurcumin. The ratio of curcumin to its 2 derivatives was found to be 5:3:2 (C:DC:BDC) in almost all samples, which is consistent with other studies.22,25,26 In addition, the curcuminoid profiles varied strongly according to geographical location. Curcuminoid contents were highest in the samples cultivated in Lang Son, followed by Ha Noi and Nghe An. Samples from Lam Dong, Hoa Binh, and Bac Giang showed lower curcuminoid contents. For example, the average curcumin, demethoxycurcumin, and bisdemethoxycurcumin contents of samples from Lang Son were 110.3, 46.7, and 29.1 µg/mg dry weight, respectively; the curcumin content was higher than that of Korean samples and approximately equivalent to the average content of turmeric samples collected in Nepal. 22 A similar trend was observed for demethoxycurcumin and bisdemethoxycurcumin, indicating that the turmeric cultivar and soil from Lang Son deserve further investigation as a potential accession.
This result was consistent with the above observation from PLS-DA, which showed that a range of the metabolites contributing most to the model originated from samples collected in Lang Son. The analysis of samples from different locations within Vietnam showed considerable regional variation within the country, although our study did not show a north-to-south trend in curcuminoid content such as that reported by Poudel et al. 22 This indicates that the curcuminoid profiles of Vietnamese turmeric samples could be under the control of other factors. Among different parameters, temperature could be critical for the curcuminoid content of turmeric samples from different geographical locations. Polyketide synthase type enzymes were found to be temperature sensitive; curcuminoid synthase catalyzes the formation of curcuminoids by condensing p-coumaroyl-CoA and malonyl-CoA.27-29 Further studies harmonizing the cultivation conditions should be performed to identify the key factors that determine the major curcuminoid content in C longa of Vietnam.
Table 1. Contents of Curcuminoids From C longa Methanolic Extract Samples From Different Geographical Locations of Vietnam. Curcuminoids Content is Expressed as Mean ± SD With the Reported Number of Replicates.
Significant difference from other locations.
Not significantly different from the locations represented by the same letter.
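The comparison between locations rests on a one-way ANOVA. The sketch below shows how the F statistic of such a test is obtained from between-group and within-group sums of squares; the site names and values are hypothetical placeholders (not the data of Table 1), and computing the associated P value would additionally require the F distribution, for example from a statistics package.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of observations."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the observations around their own group mean.
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical curcumin contents (µg/mg) at three made-up sites.
contents = {
    "site_A": [101.0, 102.0, 103.0],
    "site_B": [102.0, 103.0, 104.0],
    "site_C": [105.0, 106.0, 107.0],
}

f_stat, df_b, df_w = one_way_anova_f(list(contents.values()))
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A large F value relative to the critical value for the given degrees of freedom indicates that at least one location differs from the others; identifying which ones requires a post hoc test, which is not specified in the text.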
Conclusions
The metabolite profiles of 120 turmeric samples from 6 provinces in 3 geographical regions of Vietnam were analyzed by an LC-HRMS-based metabolomics approach. The data were then successfully explored to discriminate between the locations. Turmeric from Lam Dong formed a clearly separated cluster in the HCA dendrogram, and samples from Nghe An and Lang Son were additionally distinguished from the other provinces in the PLS-DA score plot. Using PLS-DA multivariate models, 2 terpenoids and 3 curcuminoids were proposed as chemomarkers for the geographical discrimination of samples from Lam Dong and Lang Son, respectively. In addition, we revealed a difference in major curcuminoid content among samples from the 6 provinces. Of these, turmeric from Lang Son contained the highest amounts of curcumin, demethoxycurcumin, and bisdemethoxycurcumin. This study presents a promising method for untargeted analysis using a multivariate model and chemomarker selection to predict the geographical origin of C longa and assess the quality of turmeric samples in Vietnam.
Experimental
Plant material. The rhizomes of 120 C longa individuals were collected at different locations in Vietnam, including Hoa Binh, Ha Noi, Lang Son, and Bac Giang in the northern region, Nghe An in the central region, and Lam Dong in the Central Highlands, a mountainous area between the central and southern regions. The sampling information is shown in Table 2. The C longa samples were identified morphologically by Dr Nguyen The Cuong, Institute of Ecology and Biological Resources, Vietnam Academy of Science and Technology. Voucher specimens of the plant materials were deposited at the Laboratory of Life Sciences, University of Science and Technology of Hanoi.
Table 2. Information on Turmeric Samples.
Rhizomes were cleaned with distilled water to remove dust and soil, then rinsed with ultra-pure water (Millipore GmbH, Schwalbach, Germany, 18.2 MΩ.cm, TOC < 2 µg/L). These materials were chopped into small pieces, immediately put in liquid nitrogen, then freeze-dried to constant weight, and ground into a homogeneous powder. All processed materials were stored at −80°C until required for LC-MS analysis.
Sample preparation. Freeze-dried powdered Curcuma rhizomes (50 mg) were extracted in 1 mL of MeOH, vortexed for 15 s, and then put in a sonicator bath for 1 h at 70°C. These extracts were cooled for 30 min, centrifuged at 10 000 rpm for 15 min, then diluted 100 times before injecting into the UPLC-LTQ Orbitrap MS.
Metabolite profiling. The analyses of metabolites in the methanolic extracts of C longa rhizomes were conducted on an Acquity UPLC system (Waters Corporation), hyphenated to an LTQ Orbitrap XL hybrid FTMS (Thermo Electron) via an atmospheric-pressure chemical ionization (APCI) source. Metabolites were separated on a BEH phenyl column (1.7 µm, 2.1 × 100 mm, Waters Corporation) coupled to a guard column. The mobile phase was as follows: A (ultra-pure water + MeOH 5% + formic acid 0.1%) and B (MeOH + formic acid 0.1%). Gradient elution was applied, as shown in Table 3S (Supplemental Material).
Analytes were positively ionized with an APCI source using the following parameter values: spray voltage 4 kV, capillary temperature 275°C, sheath gas 20 (arb), aux gas 5 (arb). In all these LC-MS analyses, full Fourier transform MS spectra were recorded between m/z 100 and 1000. The oven temperature was set at 60°C, and all samples were kept at 5°C.
Metabolomics measurement. Peak finding, peak integration, and retention time correction of the LCMS chromatograms were performed with the XCMS R-package. 30 The isotopes, adducts, and pc groups of peaks and candidate formulae were generated using the CAMERA R-package. 31 Integrated peaks of the mass (m/z) fragments were normalized across all samples by expressing the peak areas relative to the exact dry weight of tissue used in each extraction. The XCMS output of integrated peaks was tested for robust integration and representative m/z signals were selected for quantification of the absolute content of curcuminoids in C longa and multivariate analysis.
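The normalization step described above amounts to dividing each integrated peak area by the dry weight of the extracted tissue. A minimal sketch follows, with hypothetical feature names, areas, and weights; it is not the XCMS-based code used in the study.

```python
def normalize_by_weight(peak_areas, dry_weight_mg):
    """Express each integrated peak area per mg of dry tissue extracted."""
    if dry_weight_mg <= 0:
        raise ValueError("dry weight must be positive")
    return {feature: area / dry_weight_mg for feature, area in peak_areas.items()}


# Hypothetical integrated peak areas for one sample and its exact dry weight.
raw_areas = {"M2": 5.0e6, "M6": 2.5e6, "M10": 1.0e6}
normalized = normalize_by_weight(raw_areas, dry_weight_mg=50.0)
print(normalized)
```

Normalizing in this way makes the peak areas comparable between samples even when the weighed amounts deviate slightly from the nominal 50 mg.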
Simultaneous quantification of curcuminoids. The stock solutions were prepared in methanol at a concentration of 1 mg/mL for each curcuminoid. A serial dilution of the stock solution (5, 10, 25, 50, and 100 µg/mL) was prepared to establish the calibration curve. As previously mentioned in the sample preparation part, all C longa samples diluted to a final concentration of 0.5 mg/mL were subjected to LCMS analysis. The absolute quantities of curcumin, demethoxycurcumin, and bisdemethoxycurcumin were calculated according to the calibration curves. The contents of curcuminoids were expressed as mean ± SD.
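The quantification chain described here (calibration levels, linear regression, and back-calculation to content per mg of dry powder) can be sketched as follows. The calibration levels and the extract concentration (50 mg in 1 mL, diluted 100 times, giving 0.5 mg/mL) are taken from the text; the peak areas are hypothetical and assume a perfectly linear detector response, so the fitted numbers are illustrative only.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept, with R^2."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


# Calibration levels from the text (µg/mL) and hypothetical peak areas.
levels = [5.0, 10.0, 25.0, 50.0, 100.0]
areas = [2000.0 * c + 500.0 for c in levels]
slope, intercept, r2 = fit_line(levels, areas)

# 50 mg powder in 1 mL MeOH, then diluted 100 times -> 0.5 mg/mL injected.
EXTRACT_MG_PER_ML = 50.0 / 1.0 / 100.0


def content_ug_per_mg(peak_area):
    """Convert a sample peak area to curcuminoid content per mg of dry powder."""
    conc_ug_per_ml = (peak_area - intercept) / slope
    return conc_ug_per_ml / EXTRACT_MG_PER_ML


example_content = content_ug_per_mg(110800.0)
print(round(example_content, 1), round(r2, 4))
```

As a consistency check on the design, a content of 110.3 µg/mg (the mean curcumin content reported for Lang Son) corresponds to about 55 µg/mL in the injected solution, which falls inside the 5 to 100 µg/mL calibration range.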
Data analysis. Each sample was analyzed in triplicate and the mean value was calculated. Multivariate statistical analysis was performed in R version 3.5.3 (http://www.R-project.org/) using the mixOmics package and STATISTICA 12 (Dell Inc.). 32 Peak area data were logarithmically transformed and normalized using the maximum and minimum values. A PCA model was used to obtain an overview of these data. Next, HCA was performed. Subsequently, a supervised PLS-DA was adopted to unravel the geographic discrimination in the metabolomes of turmeric and identify the peaks that contributed to the discrimination. The difference in the absolute concentration of curcuminoids among the 6 locations was evaluated by a one-way ANOVA test.
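The pretreatment of the peak areas can be illustrated with a short sketch. Two assumptions are made that the text does not state explicitly: the logarithm is taken to base 10, and the normalization based on maximum and minimum values is interpreted as min-max scaling of each variable to the range 0 to 1. The input values are hypothetical.

```python
import math


def log_minmax(values):
    """Log10-transform positive peak areas, then rescale to the range [0, 1]."""
    logged = [math.log10(v) for v in values]
    low, high = min(logged), max(logged)
    if high == low:
        # A constant variable carries no information; map it to zeros.
        return [0.0 for _ in logged]
    return [(v - low) / (high - low) for v in logged]


# Hypothetical weight-normalized peak areas of one metabolite across samples.
peak_areas = [1.0e3, 1.0e4, 1.0e5]
scaled = log_minmax(peak_areas)
print(scaled)
```

The log transformation reduces the dominance of very intense peaks, and the rescaling puts all metabolites on a common scale before PCA, HCA, and PLS-DA. Peak areas of zero would need separate handling (for example, a small offset), which is not covered here.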
Supplemental Material
Supplemental material, sj-docx-1-npx-10.1177_1934578X211045479 for Geographical Discrimination of Curcuma longa L. in Vietnam Based on LC-HRMS Metabolomics by Kieu-Oanh Nguyen Thi, Hoang-Giang Do, Ngoc-Tu Duong, Tien Dat Nguyen and Quang-Trung Nguyen in Natural Product Communications
Footnotes
Acknowledgments
The corresponding author thanks Newton Fund for a travel grant to Dr Kieu-Oanh Nguyen T. to visit the Centre for Novel Agricultural Products (CNAP), Department of Biology, University of York, UK. We specially thank Dr Tony Larson from CNAP for his help in the analytical and data treatment part. We also gratefully acknowledge Laboratoire Mixte International—“Drug Resistance in South East Asia” (LMI-DRISA) for supporting the repair fee of the LCMS system in an urgent situation.
Declaration of Conflicting Interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
This research was supported by the International Foundation for Science (IFS) under grant number I-1-F-6275-1, and the Vietnam Academy of Science and Technology (grant no. TDNDTP.02/19-21).
Ethical Issues
Ethical Approval: Ethical Approval is not applicable for this article.
Statement of Human and Animal Rights
This article does not contain any studies with human or animal subjects.
Statement of Informed Consent
There are no human subjects in this article and informed consent is not applicable.
Supplemental Material
Supplemental material for this article is available online.
References
