Abstract
Chlorogenic acid (CGA) is an important phenylpropanoid metabolite widely present in plants. It exhibits a range of pharmacological activities in humans, including anticancer, antibacterial, hypoglycemic, hypolipidemic and antiviral activities, and therefore shows promising potential for biomedical applications. However, its low content in natural plants limits its resource development and clinical application. Consequently, enhancing CGA biosynthesis in planta has emerged as a critical biotechnological objective. In recent years, advances in omics technologies and co-expression techniques have enabled the identification of six key enzymes involved in the biosynthesis of CGA: PAL, C4H, 4CL, C3H, HCT and HQT. Significant progress has also been made in synthetic biology. These efforts have not only improved CGA production efficiency but also refined the molecular regulatory network governing its metabolic pathways. This review systematically summarizes the core pharmacological characteristics of CGA and describes the three-dimensional structures of the key enzymes involved in its biosynthetic pathways, along with their catalytic and gene regulatory mechanisms. It aims to deepen the understanding of CGA biosynthesis and facilitate its application in the biomedical field.
Introduction
Chlorogenic acid (CGA) (Figure 1), chemically defined as 3-O-caffeoylquinic acid, is a phenolic ester compound formed through the condensation of caffeic acid and quinic acid, with a molecular formula of C₁₆H₁₈O₉ and a molecular weight of 354.31 g/mol. 1 Also referred to as caffeotannic acid, it belongs to the phenylpropanoid class of metabolites biosynthesized via the shikimate pathway in plants, where cinnamic acid (derived from phenylalanine deamination) and quinic acid serve as direct precursors. 2 CGA is distributed in detectable quantities across diverse plant species, including Eucommia ulmoides, Lonicera japonica, unroasted seeds of Coffea arabica, tubers of Solanum tuberosum, fruits of Malus domestica, and leaves of Camellia sinensis.3,4 As a common polyphenolic compound, CGA plays significant roles in plant physiology. It helps plants mitigate abiotic stressors such as heavy metals, 5 frost, 6 drought, 7 and salinity, 8 as well as defend against biotic threats including insects 9 and fungi. 10

Chemical structure of CGA.
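The molecular weight quoted above can be checked directly from the molecular formula. The following is a minimal Python sketch, assuming standard average atomic masses (C 12.011, H 1.008, O 15.999 g/mol); the function name `molecular_weight` is illustrative and is not taken from any library.

```python
import re

# Standard average atomic masses in g/mol (assumed values, rounded to three decimals).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999}


def molecular_weight(formula: str) -> float:
    """Sum atomic masses for a simple formula such as 'C16H18O9' (no brackets)."""
    total = 0.0
    # Each match is (element symbol, optional count); a missing count means 1.
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += ATOMIC_MASS[element] * (int(count) if count else 1)
    return total


cga_mw = molecular_weight("C16H18O9")
print(f"CGA molecular weight: {cga_mw:.2f} g/mol")
```

The result agrees with the stated 354.31 g/mol to two decimal places, which also makes the value convenient to reuse when converting between mass and molar concentrations.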
Beyond its roles in plant and food systems, CGA demonstrates considerable potential for the treatment of human diseases. It exhibits multifaceted pharmacological effects, including: anticancer, antimicrobial, hypoglycemic, hypolipidemic, antiviral, anti-inflammatory, antioxidant and immunomodulatory effects.11–13 CGA can inhibit the growth of periodontal pathogenic bacteria, 14 as well as suppress the transfer of virulence and resistance genes in outer membrane vesicles of carbapenem-resistant Klebsiella pneumoniae. 15 CGA contributes to the modulation of glucose and lipid metabolism, offering promising avenues for preventing and alleviating obesity and supporting diabetes management. 16 Its notable antitumor properties are also evident; CGA can inhibit the proliferation of various cancer cell lines and induce apoptosis through multiple molecular pathways.17–19 Despite these compelling pharmacological activities, further rigorous clinical studies remain imperative to fully elucidate its mechanisms of action and therapeutic potential. These broad and promising medicinal properties continue to attract significant interest within the scientific community.
In recent years, researchers have made continuous efforts to identify key genes associated with the biosynthesis of CGA. To date, four distinct biosynthetic pathways of CGA, along with their corresponding enzymes, have been characterized in various plant species (Figure 2). Moreover, the critical regulatory roles of several transcription factor families (including MYB, WRKY, ERF, and bHLH) in modulating the activity of these enzymes have been elucidated. Hence, a comprehensive review is warranted to synthesize these recent advances and provide a roadmap for future research.

Major biosynthetic pathways of CGA. (Schematic of the four biosynthetic pathways of chlorogenic acid; each route is described in the text.)
This review systematically summarizes and compares the pharmacological properties of CGA, while also providing an in-depth analysis of recent advances in understanding its biosynthetic regulation in plants. It further addresses a notable gap in previous reviews by incorporating the three-dimensional structures and catalytic mechanisms of key enzymes, and establishes a conceptual link between biosynthetic regulation and pharmacological effects. The synthesis presented herein is intended to provide a theoretical foundation and strategic direction for future basic research and novel drug development.
Pharmacological Activities
Anticancer Activity
CGA, a naturally occurring polyphenolic compound, has attracted growing attention due to its favorable biosafety profile and broad-spectrum anticancer properties. 20 The following section will elaborate on the antitumor effects of CGA, with a focus on several cancer types that have been extensively studied.
Activity against hepatocellular carcinoma
As a compound with potential anticancer properties, CGA has been investigated in the context of hepatocellular carcinoma (HCC). Early research primarily focused on its role as a chemosensitizer in HCC treatment. When combined with 5-fluorouracil (5-FU), CGA was shown to induce a reactive oxygen species (ROS) burst, suppress activation of the extracellular signal-regulated kinase (ERK) pathway, and attenuate cancer cell proliferation, thereby significantly enhancing the inhibitory effect of 5-FU on HCC growth. 21 This finding suggested a promising strategy to overcome chemoresistance and improve therapeutic efficacy.
Yan et al. 22 identified a novel mechanism underlying the anti-HCC effects of CGA. They demonstrated that CGA inhibits tumor invasion and metastasis by modulating the balance between MMP-2/MMP-9 and TIMP-2, thereby influencing extracellular matrix degradation. Their study also reaffirmed that CGA suppresses ERK phosphorylation. Zhang et al. 23 reported that CGA exerts anti-HCC effects by modulating the gut microbiota in a rat model, though the study did not rule out the possibility of direct antitumor actions.
Subsequent research revealed that CGA can inhibit the non-canonical NF-κB pathway and DNMT1, while activating mitochondrial apoptosis, collectively contributing to suppressed tumor growth and promoted cell death.24,25 More recently, Zhang et al. 26 found that CGA directly binds to proteins such as PTGS2, AKR1C3, GPX4, and xCT, resulting in upregulation of PTGS2, ACSL4, and LPCAT3, and downregulation of AKR1C3, GPX4, and xCT. These changes lead to lipid peroxidation and ferroptosis in HCC, providing new insights into the mechanisms of CGA.
However, some studies indicate that CGA may also have context-dependent effects. During radiotherapy, CGA was shown to activate the Nrf2 pathway in Huh7 and Hep3B cells, promoting Nrf2 nuclear translocation and upregulating antioxidant gene expression. This attenuates the growth-inhibitory effects of radiotherapy on tumor cells—an effect confirmed by Nrf2 knockdown experiments to be dependent on Nrf2 rather than CGA's inherent antioxidant properties. 27 These findings caution against the use of CGA-containing supplements during radiotherapy.
Activity against colorectal cancer
Hou et al. 28 reported that CGA inhibits the viability of HCT116 and HT-29 cells in a dose-dependent manner: within the concentration range of 125–1000 μM, higher doses resulted in lower cell viability, with a more pronounced effect observed in HCT116 cells. This inhibitory effect was attributed to CGA-induced pro-oxidant activity, which elevates ROS and subsequently causes DNA damage. Although excessive ROS triggers activation of the Nrf2/HO-1 antioxidant pathway, this response is insufficient to counteract the pro-oxidant and anti-proliferative effects of CGA. Notably, the concentration of CGA required to induce a ROS burst is significantly higher than that needed to activate the Nrf2-mediated antioxidant pathway. 27 This suggests a concentration-dependent, dual function of CGA, whereby it may scavenge ROS at lower levels but induce ROS at higher concentrations. Before CGA can be applied in patients, it is imperative for both basic scientists and clinicians to delineate its appropriate context of use and determine the optimal dosage. Similarly, Vélez-Vargas 29 and Villota et al. 30 identified strong resistance to CGA in HT-29 cells. Specifically, by comparing two distinct colorectal cancer cell lines, SW480 (with high Wnt activation) and HT-29 (with low Wnt activation), Villota et al. elucidated the reason behind HT-29's pronounced resistance: CGA modulates the Wnt/β-catenin signaling pathway, thereby influencing cancer cell proliferation, migration, and invasion. Because Wnt activation is low in HT-29 cells, CGA exhibits limited efficacy.
On a more promising note, CGA may yield enhanced antitumor effects when used in combination with other agents. Co-treatment with lactoferrin (LF) significantly increased the proportion of late apoptotic SW480 cells (33.99% vs 16.03% in the control group). Nevertheless, the mechanistic details underlying such synergistic effects and the translational feasibility of these combinations require further experimental investigation. 31
Activity against breast cancer
Changizi et al. 32 demonstrated that CGA induces apoptosis in 4T1 breast cancer cells via modulation of the p53–Bax/Bcl-2–caspase-3 pathway. A concentration of 200 μM CGA exhibited the highest cytotoxicity toward 4T1 cells. The group further validated the relevance of this pathway in BALB/c mice. Their results indicated that daily administration of 40 mg/kg CGA for 14 days reduced tumor mass and volume in the treatment group, and in some cases even led to complete tumor eradication, supporting the potential of CGA in breast cancer therapy. 33
Moreover, in other breast cancer cell models, CGA was found to downregulate the expression of low-density lipoprotein receptor-related protein 6 (LRP6) in MCF-7 cells and inhibit the Wnt/β-catenin signaling pathway through direct binding to LRP6, as confirmed by microscale thermophoresis (MST). Interestingly, LRP6 expression was not affected in MDA-MB-231 cells, where CGA suppressed epithelial–mesenchymal transition (EMT) without altering LRP6 levels—a phenomenon not thoroughly explained by the authors. 34 Subsequently, Zeng et al. 35 provided insight into this mechanism by showing that CGA inhibits TNF-α-induced nuclear translocation of p65, thereby affecting the NF-κB/EMT axis. This led to reduced migration and invasion in MDA-MB-231, MDA-MB-453, and 4T1 cells. In a lung metastasis model, CGA treatment decreased metastatic nodule formation and increased the proportion of CD4⁺ and CD8⁺ T cells in the spleen. When combined with cinnamaldehyde, CGA exhibited enhanced inhibitory effects, including suppression of Akt signaling to prevent metastasis, 36 a metabolic shift from oxidative phosphorylation to glycolysis, 37 targeted inhibition of mitochondrial metabolism, and selective elevation of superoxide levels in cancer cells without affecting normal cells.
Activity against gliomas
Using molecular docking techniques, Wang et al. 38 demonstrated that CGA exhibits high binding affinity with STAT1, with a binding energy of −41.74 kJ/mol. This interaction modulates the JAK–STAT signaling pathway, promotes microglial activation, and facilitates their polarization toward an anti-tumor phenotype. Clinical evidence from a case report showed that a patient with recurrent high-grade glioma achieved partial response (PR) after nine months of CGA treatment, with an overall survival of 5 years and 6 months. The treatment demonstrated a favorable safety profile, with a maximum tolerated dose (MTD) of 5.5 mg/kg. The most common adverse events included injection site induration (92%) and pain (12%), with no severe systemic toxicities reported. However, a major limitation is its short half-life (t₁/₂ ≈ 1 h), which necessitates daily intramuscular injections over several months to maintain effective blood concentrations, leading to poor patient compliance.39,40
To address these limitations, researchers have endeavored to develop strategies to prolong the half-life and enhance the efficacy of CGA. Ye et al. 41 developed a mannose-modified PEGylated liposomal system (Man-PEG-Lipo) for targeted delivery of CGA to tumor-associated macrophages (TAMs). This system proved to be efficient and safe, promoting the repolarization of M2 macrophages toward the M1 phenotype and enhancing anti-tumor immune responses. In a subcutaneous allograft model (G422 mice), continuous administration of Man-PEG-Lipo resulted in a tumor growth inhibition (TGI) rate of 60.3%, significantly higher than that achieved with free CGA. However, this study utilized a subcutaneous rather than an orthotopic glioma model, leaving the ability of the formulation to cross the blood–brain barrier (BBB) insufficiently validated.
The same research group 42 developed a novel self-microemulsifying drug delivery system (SMEDDS) for oral administration of CGA (CGA-SME). This formulation bypasses hepatic first-pass metabolism via the intestinal lymphatic transport pathway. In beagle dogs, the AUC and Cmax of CGA-SME were 2.5 and 2.9-fold higher than those of free CGA, respectively. The improved pharmacokinetics promoted the accumulation of CGA in mesenteric lymph nodes (MLNs), increased the infiltration of CD3⁺/CD4⁺/CD8⁺ T cells into tumors, activated anti-tumor immunity, and induced long-term immune memory—evidenced by an increased proportion of effector and central memory T cells (Tem and Tcm). CGA-SME demonstrated superior anti-tumor efficacy over free CGA in both subcutaneous and orthotopic glioma models, with activity comparable to that of temozolomide (TMZ). Although the study lacked long-term observation of immune memory effects and rechallenge experiments to validate sustained immunoprotection, the development of CGA-SME represents a significant advancement with considerable potential for clinical translation.
Other Anticancer Activities
Beyond the cancers previously discussed, CGA also exhibits inhibitory effects against pancreatic cancer (PANC), renal cell carcinoma, and osteosarcoma. In both renal and pancreatic cancers, CGA downregulates the expression of AKT. However, due to tumor heterogeneity, the affected signaling pathways differ. In renal carcinoma, 43 downregulation of AKT influences the PI3K/Akt/mTOR signaling pathway, thereby inducing apoptosis and inhibiting proliferation in A498 renal cancer cells (IC₅₀ = 40 μM, 48 h). In PANC, 44 CGA suppresses proliferation, migration, and invasion, and promotes apoptosis in PANC-28 (IC₅₀ = 283.1 μM, 48 h) and PANC-1 (IC₅₀ = 432.8 μM, 48 h) cells via inhibition of the AKT/GSK-3β/β-catenin pathway. Based on these IC₅₀ values, CGA demonstrates stronger cytotoxicity in renal cancer cells than in pancreatic cancer models. However, both studies lack in vivo validation and did not employ genetic approaches such as CRISPR/Cas9 or siRNA-mediated AKT knockdown to confirm the necessity of the pathway. Further investigation is required to elucidate the molecular mechanisms of CGA in treating these malignancies.
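The potency gap between the renal and pancreatic models can be made explicit as a simple ratio of the reported IC₅₀ values. The sketch below uses only the 48 h values quoted above; the variable names are illustrative. Because the values come from different studies and assay conditions, the ratios are a rough comparison rather than a formal potency ranking.

```python
# IC50 values (µM, 48 h) as reported in the text for each cell line.
ic50_um = {"A498": 40.0, "PANC-28": 283.1, "PANC-1": 432.8}

reference = "A498"  # renal carcinoma line used as the comparison baseline

# Fold difference: how many times more CGA is needed relative to A498.
fold_vs_reference = {
    line: value / ic50_um[reference] for line, value in ic50_um.items()
}

for line, fold in fold_vs_reference.items():
    print(f"{line}: IC50 = {ic50_um[line]} µM, {fold:.1f}x the A498 value")
```

On these figures, the pancreatic lines require roughly 7 to 11 times the CGA concentration needed for A498 cells to reach half-maximal inhibition.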
In studies on pancreatic ductal adenocarcinoma (PDAC), transferrin receptor 1 (TFR1) has been identified as a critical protein sustaining mitochondrial metabolism in PDAC cells. 45 Yang et al. 46 found that CGA downregulates c-Myc expression, leading to reduced TFR1 levels, significantly impairing mitochondrial respiration (measured as oxygen consumption rate, OCR), and inducing G1 phase arrest. Knockdown of TFR1 mimicked the inhibitory effects of CGA on cell growth and mitochondrial respiration, while c-Myc overexpression reversed the suppression of TFR1 and cell growth induced by CGA. Nevertheless, it remains unconfirmed whether c-Myc is a direct target of CGA, and since TFR1 is not a direct transcriptional target of c-Myc, intermediate regulators may be involved, warranting further investigation.
In osteosarcoma, CGA concentration-dependently inhibits STAT3 phosphorylation and Snail protein expression, thereby suppressing cell proliferation and inducing apoptosis. 47 Knockdown of STAT3 phenocopied the antitumor effects of CGA, whereas STAT3 overexpression reversed its actions, confirming the critical role of the STAT3/Snail pathway—a mechanism distinct from those observed in other cancers (eg, the ERK/MMP pathway). Interestingly, beyond its antitumor functions, CGA also promotes new bone formation. Yang et al. 48 successfully developed a one-step self-assembled nanohybrid material based on natural CGA and gold nanorods (AuNR@CGA), which under mild laser irradiation significantly repaired cranial defects in BALB/c nude mice. This material demonstrated excellent antitumor and bone regeneration capabilities both in vitro and in vivo, offering a novel strategy for comprehensive postoperative treatment of osteosarcoma.
As illustrated in Table 1 and discussed above, CGA demonstrates broad-spectrum anticancer activity across multiple cancer models, with encouraging reproducibility observed in various studies. Comprehensive analysis indicates that the sensitivity to CGA and the consistency of its mechanisms of action vary among cancer types. Among these, the most robust and consistent evidence has been documented in HCC and glioblastoma.
The Anticancer Activities of CGA and Its Underlying Mechanisms.
In HCC, the core mechanism by which CGA exerts its antitumor effect is the suppression of tumor cell proliferation through inhibition of the MAPK/ERK signaling pathway.22,25 In glioblastoma, rather than following the conventional approach of direct tumor cell killing, CGA exhibits potent immunomodulatory functions by regulating the JAK-STAT signaling pathway.38,41,42
Hence, the following two translational paths for CGA in cancer therapeutics warrant prioritization: first, the in-depth development of CGA as an adjuvant therapeutic or chemopreventive agent for HCC; and second, leveraging its unique capacity to reprogram the tumor microenvironment to pioneer novel immune-combination therapies for glioblastoma.
Antibacterial activity
Amid growing concerns over antibiotic resistance, attention has increasingly turned to natural herbal medicines as potential sources of antimicrobial agents. CGA, one of the most abundant phenolic acids, has been the subject of numerous studies investigating its antibacterial properties. 51 Wang et al. 52 used molecular docking to demonstrate that CGA can form hydrogen bonds with LasR, RhlR, and PqsR, thereby downregulating quorum sensing-related genes such as lasI, lasR, rhlI, rhlR, pqsA, and pqsR. This interaction suppresses biofilm formation, swarming motility, and the expression of virulence factors in Chromobacterium violaceum. Additionally, CGA enhances innate immune responses in hosts by modulating the mitochondrial unfolded protein response, inhibiting lipopolysaccharide (LPS) formation, and reducing the production of pyocyanin (PYO), a key toxin in Pseudomonas aeruginosa 53,54 (Table 2).
Antibacterial activities of CGA on Pseudomonas aeruginosa.
Despite its promising effects against bacterial infections in humans, the primary applications of CGA in the field of antimicrobial resistance remain focused on its use as a natural food additive 55 or as an antibacterial agent in animal husbandry. 56 CGA has been shown to inhibit Yersinia species in milk 57 and E. coli in acidic food systems 58 through ROS-mediated mechanisms and membrane disruption. These properties suggest that CGA could offer novel strategies for food preservation and help mitigate the spread of antibiotic resistance.
Hypoglycemic Activity
Over the past several decades, the prevalence of diabetes across all age groups has continued to increase.59,60 Type 2 diabetes mellitus, in particular, is considered to result from the interplay between genetic and environmental factors. Evidence suggests that dietary habits and lifestyle play significant roles in either promoting or preventing the development of diabetes. 61 Studies have indicated that moderate coffee consumption may reduce the risk of type 2 diabetes, an effect attributed primarily to phenolic compounds in coffee, with CGA being the most prominent. 62 Clinical studies have demonstrated that CGA,63,64 either administered alone or in combination with green tea catechins (GTC), contributes to the prevention of type 2 diabetes in both men and women with impaired glucose tolerance (IGT). However, the underlying mechanisms remain incompletely elucidated and are largely inferred based on indicators such as blood glucose, insulin, and GLP-1 levels.
With further investigation, scientists have discovered that CGA not only aids in the prevention of type 2 diabetes but also mitigates various diabetic complications, including reproductive dysfunction, 65 hearing impairment, 66 memory deficits, 67 and osteoporosis, 68 through antioxidant mechanisms that are independent of glucose-lowering effects. Additionally, it reduces liver and kidney injury and inhibits fibrosis by suppressing pathways such as Notch1 and Stat3, making it suitable as an adjunct therapy alongside conventional hypoglycemic agents (eg, metformin, glimepiride) to enhance efficacy and reduce dosage.69,70 Regarding its direct glucose-lowering effects, Rehman et al. 71 found that CGA treatment significantly restored serum insulin levels and markedly reduced fasting blood glucose in db/db diabetic mice, an effect comparable to that of glibenclamide, suggesting potential β-cell functional recovery or protection. Moreover, the efficacy was further enhanced when CGA was formulated into myofibrillar protein–chlorogenic acid complexes compared to CGA alone. 72 Table 3 summarizes the role of CGA in diabetes and its complications.
The Role of CGA in Diabetes and Its Complications.
In contrast, one of its isomers, neochlorogenic acid (nCGA), although capable of improving renal function, exhibits negligible hypoglycemic effects, implying it may possess inferior druggability compared to CGA. 73 Nonetheless, some researchers have sought to improve upon the efficacy of CGA. Cardullo et al. 74 synthesized 11 amide derivatives of CGA. Among them, compound 8 (featuring a tertiary amine alkyl chain) and compound 11 (containing a benzothiazole moiety) exhibited the strongest inhibitory activity against α-glucosidase (α-Glu), with IC₅₀ values of approximately 13–14 μM, significantly surpassing that of acarbose (268 μM). Furthermore, compound 11 demonstrated mixed-type inhibition against both α-glucosidase and α-amylase (α-Amy), displaying higher binding affinity to the enzymes than CGA. When combined with acarbose, it showed synergistic effects at low concentrations, highlighting its potential for further development as a candidate compound for type 2 diabetes treatment. However, it should be noted that all data from that study were derived from in vitro experiments, providing no insight into pharmacokinetics, toxicity, or actual therapeutic efficacy. Consequently, the translation of these findings into clinical applications remains a distant prospect.
Hypolipidemic Activity
Worldwide, complications arising from disorders related to lipid metabolism impose a substantial burden on healthcare systems. Obesity resulting from dysregulated lipid metabolism has attracted increasing public attention. 75 There is therefore an urgent need to identify novel therapeutic strategies with distinct mechanisms of action to combat obesity.
Interestingly, CGA demonstrates potential as an adjunct therapy for obesity and its metabolic complications (Table 4). It acts through multiple pathways, including central appetite regulation, modulation of the miR-146a–IRAK1–TRAF6 inflammatory axis, and restoration of gut microbial homeostasis. 76 In a study by Gao et al., the effects of eight polyphenolic compounds on Akkermansia muciniphila abundance and body weight were investigated in high-fat diet (HFD)-induced obese mice. The results revealed that CGA was the most effective among the eight polyphenols in upregulating A. muciniphila, while also significantly reducing body weight, fat accumulation, blood lipid levels, hepatic steatosis, and improving intestinal barrier function. 77 These findings suggest that CGA may indirectly promote the growth of A. muciniphila by inhibiting competing bacterial strains and modulating metabolic environments, thereby ameliorating HFD-induced obesity and related metabolic disorders.
The Mechanism of CGA in Preventing and Treating Obesity.
Furthermore, He et al. 78 proposed that CGA not only reduces weight gain by decreasing food intake but also enhances energy expenditure through elevated core body temperature, thermal dissipation, and increased brown adipose tissue (BAT) activity, contributing to its anti-obesity effects. Intriguingly, CGA intervention has also been shown to significantly mitigate perfluorooctanoic acid (PFOA)-induced obesity in male offspring. 79 Although the underlying mechanisms remain incompletely understood and current evidence is insufficient to support the development of CGA as a stand-alone pharmacological agent for obesity treatment, it is a naturally abundant polyphenolic compound. Increased dietary intake of CGA-rich foods—such as coffee and apples—may offer a viable preventive strategy against obesity.
Antiviral Activity
Viral infections can lead to severe diseases and are sometimes fatal. Lonicerae Japonicae Flos, a traditional Chinese herb, has been used to treat influenza virus infections, with CGA as its primary active component responsible for antiviral effects. 80 CGA exhibits broad-spectrum antiviral activity.81,82 In molecular docking models, CGA demonstrates strong binding affinities to monkeypox virus protein 4QWO, Marburg virus protein 4OR8, and H5N1 neuraminidase (NA), outperforming several marketed drugs such as cidofovir and oseltamivir.83,84 In vitro studies have confirmed that CGA significantly inhibits NA activity at low concentrations (1.56–3.13 µg/mL) and reduces the cytopathic effect (CPE), indicating its potency as an NA inhibitor. However, it also suffers from drawbacks such as low oral bioavailability.85,86
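The NA-inhibitory concentrations above are given in µg/mL, whereas most cell-based studies in this review report µM. A minimal conversion sketch follows, assuming the molecular weight of 354.31 g/mol stated in the Introduction; the helper name `ug_per_ml_to_um` is hypothetical.

```python
CGA_MW_G_PER_MOL = 354.31  # molecular weight of CGA stated in the Introduction


def ug_per_ml_to_um(conc_ug_per_ml: float, mw: float = CGA_MW_G_PER_MOL) -> float:
    """Convert a mass concentration (µg/mL) to a molar concentration (µM)."""
    # µg/mL equals mg/L; dividing by g/mol gives mmol/L, and x1000 gives µmol/L.
    return conc_ug_per_ml / mw * 1000.0


low_um = ug_per_ml_to_um(1.56)
high_um = ug_per_ml_to_um(3.13)
print(f"NA inhibition range: {low_um:.1f} to {high_um:.1f} µM")
```

Expressed this way, the range corresponds to roughly 4 to 9 µM, well below the hundreds of µM used in several of the cancer cell studies discussed earlier.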
In a duck model of HBV infection, CGA inhibited HBV-DNA replication in a dose-dependent manner, reduced HBsAg secretion, and markedly decreased serum DHBV levels, showing superior efficacy compared to the positive control drug lamivudine. 87 Furthermore, Sinisi et al. 88 modified the core structure of CGA to synthesize 3,4-O-dicaffeoyl-1,5-γ-quinic acid, which exhibited potent submicromolar inhibition against both RSV A and B subtypes (EC₅₀≈0.19 µM), outperforming ribavirin. Nevertheless, its specific molecular target remains unidentified, and conclusive evidence—such as enzyme inhibition assays, binding studies, or screening of drug-resistant mutants—is still lacking to validate the mechanism.
Among CGA derivatives, dicaffeoylquinic acids (particularly the 3,4- and 4,5-disubstituted forms) demonstrate higher binding affinity and inhibitory activity against the PA subunit of H5N1 viral RNA polymerase compared to monosubstituted derivatives or the free acid, highlighting their potential as promising antiviral lead compounds. 89 Table 5 summarizes the antiviral effects and mechanisms of CGA.
The Antiviral Effects of CGA and Its Underlying Mechanisms.
Although these studies highlight the multifaceted potential of CGA and its combination therapies across the aforementioned fields, the concentrations used in current in vitro experiments far exceed those tolerable in humans. While such high doses help identify CGA's molecular targets, they considerably limit its immediate clinical applicability. A fundamental obstacle lies in its unfavorable pharmacokinetic properties: only approximately 29% of CGA is absorbed and metabolized in the small intestine within 0.5 h. The remainder transits to the colon, where it undergoes microbial metabolism into more complex metabolites, such as dihydro derivatives, before being absorbed, which is a key reason for the low bioavailability of CGA. After entering the systemic circulation, CGAs and their metabolites are distributed primarily in organs and tissues including the liver, cecum, small intestine, skin, and adipose tissue. However, CGAs exhibit a very short half-life in humans, ranging from approximately 0.2 to 0.6 h, indicating that they are rapidly metabolized and cleared from the body.90–92
Thus, it is necessary to develop targeted delivery systems or structurally modified analogs of CGA capable of reaching therapeutically relevant concentrations in vivo, and to conduct rigorous preclinical and clinical trials to translate these findings into practical treatment strategies.
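To illustrate why such a short half-life is limiting, the sketch below estimates the fraction of circulating CGA remaining over time. It assumes simple first-order (single-exponential) elimination, which is an assumption made here for illustration and is not stated in the source; the function name is hypothetical.

```python
def fraction_remaining(hours: float, half_life_h: float) -> float:
    """Fraction of the initial level left after `hours`, assuming first-order decay."""
    return 0.5 ** (hours / half_life_h)


# Half-life range reported in the text (hours).
for half_life_h in (0.2, 0.6):
    left = fraction_remaining(2.0, half_life_h)
    print(f"t1/2 = {half_life_h} h: {left:.2%} of the initial level remains after 2 h")
```

Under this assumption, less than 10% of the initial level remains 2 h after dosing even at the upper end of the reported half-life range, which is consistent with the case for sustained-release or targeted delivery systems.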
Structure-Activity Relationship (SAR)
Current research consensus indicates that the core pharmacophore of CGA is primarily constituted by the caffeoyl moiety, the number and position of hydroxyl groups on the quinic acid moiety, and the ester bond linking the two. 93 Specifically, the catechol structure (ortho-dihydroxy group) within the caffeoyl moiety is considered the main contributor to its potent antioxidant activity, as it effectively stabilizes radical intermediates and donates hydrogen atoms. 94 The quinic acid moiety acts as a hydrophilic carrier, whose multiple chiral centers and hydroxyl groups influence the overall polarity, spatial conformation, and interactions with biomembranes or enzymes, thereby modulating its bioavailability and specificity for target binding. The type and linkage position of the ester bond further affect the molecular stability. Table 6 provides a concise summary of the structure–activity relationships of CGA.
Structure-Activity Relationship (SAR) Summary Table of CGA Derivatives.
Biosynthetic Pathway and Key Enzyme Genes
As previously noted in the context of structure–activity relationships, the pharmacological activities of CGA fundamentally stem from its specific chemical structure—a structure meticulously assembled via rigorously regulated biosynthetic pathways in plants. Furthermore, the expression levels and catalytic activities of key enzymes involved in CGA synthesis directly dictate the absolute content of CGA in plant tissues. This content is critically linked to the dosage available from dietary or medicinal intake, and dosage constitutes the foundation of its pharmacological effects. It is therefore imperative to elaborate on the regulatory mechanisms underlying CGA biosynthesis in plants and other relevant microorganisms.
The Biosynthetic Pathway of CGA
There are four biosynthetic pathways of CGA. Each pathway starts with phenylalanine, which undergoes a deamination reaction catalyzed by phenylalanine ammonia-lyase (PAL) to produce cinnamic acid.102,103 CGA is then generated via four different routes, each catalyzed by a different set of enzymes.
Route I: Cinnamic acid is hydroxylated by cinnamate-4-hydroxylase (C4H) to form p-coumaric acid, which is then converted to p-coumaroyl-CoA by 4-coumarate-CoA ligase (4CL). Next, p-coumaroyl-CoA undergoes acylation with quinic acid, catalyzed by hydroxycinnamoyl-CoA shikimate hydroxycinnamoyl transferase (HCT), to form p-coumaroylquinic acid. Finally, p-coumaroylquinic acid is hydroxylated at the 3-position by p-coumaric acid 3-hydroxylase (C3H) to yield CGA 104 (Figure 3).

Route I of the CGA biosynthetic pathways.
Route II: p-Coumaroyl-CoA undergoes acylation with shikimic acid to form p-coumaroyl shikimic acid. This is then hydroxylated by C3H to produce caffeoyl shikimic acid, which is converted to caffeoyl-CoA via HCT. Finally, CGA is formed through an acylation reaction with quinic acid, catalyzed by hydroxycinnamate-CoA quinate hydroxycinnamoyl transferase (HQT) 105 (Figure 4).

Route II of the CGA biosynthetic pathways.
Route III: Starting from p-coumaric acid, caffeic acid is formed through the catalysis of C3H and C4H. It is then converted to caffeoyl-CoA via 4CL catalysis. Finally, CGA is synthesized through an acylation reaction with quinic acid, catalyzed by HQT 103,106 (Figure 5).

Route III of the CGA biosynthetic pathways.
Route IV: Cinnamic acid is first converted to cinnamoyl glucose, which is then oxidized to form caffeoyl glucose. Finally, CGA is formed through the action of hydroxycinnamoyl-D-glucose quinate hydroxycinnamoyl transferase (HCGQT), which converts caffeoyl glucose to CGA 107 (Figure 6).

Route IV of the CGA biosynthetic pathways.
Among the four routes, Route III is considered the primary pathway. 108 CGA biosynthesis has been elucidated in a number of plant species, as summarized in Table 7.
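As an illustrative aid, the four routes described above can be encoded as ordered lists of (substrate, enzyme, product) steps, which makes it easy to ask which enzymes each route depends on. This is a minimal sketch rather than a curated pathway model. Two points are assumptions and are marked in the comments: Route II is taken to reach p-coumaroyl-CoA through the same C4H and 4CL steps as Route I, and the first step of Route IV is assigned to UGCT on the basis of the UGCT/HCGQT bypass discussed later in this review. The enzyme for the oxidation step of Route IV is not named in the text and is left as None.

```python
# Each route is an ordered list of (substrate, enzyme, product) steps,
# transcribed from the route descriptions in the text.
PAL_STEP = ("phenylalanine", "PAL", "cinnamic acid")
TO_COUMAROYL_COA = [
    ("cinnamic acid", "C4H", "p-coumaric acid"),
    ("p-coumaric acid", "4CL", "p-coumaroyl-CoA"),
]

ROUTES = {
    "I": [PAL_STEP, *TO_COUMAROYL_COA,
          ("p-coumaroyl-CoA", "HCT", "p-coumaroylquinic acid"),
          ("p-coumaroylquinic acid", "C3H", "CGA")],
    # Assumption: Route II enters via the same C4H/4CL steps as Route I.
    "II": [PAL_STEP, *TO_COUMAROYL_COA,
           ("p-coumaroyl-CoA", "HCT", "p-coumaroyl shikimic acid"),
           ("p-coumaroyl shikimic acid", "C3H", "caffeoyl shikimic acid"),
           ("caffeoyl shikimic acid", "HCT", "caffeoyl-CoA"),
           ("caffeoyl-CoA", "HQT", "CGA")],
    "III": [PAL_STEP,
            ("cinnamic acid", "C4H", "p-coumaric acid"),
            ("p-coumaric acid", "C3H", "caffeic acid"),
            ("caffeic acid", "4CL", "caffeoyl-CoA"),
            ("caffeoyl-CoA", "HQT", "CGA")],
    # Assumption: UGCT catalyzes the first step; the oxidation enzyme is unnamed.
    "IV": [PAL_STEP,
           ("cinnamic acid", "UGCT", "cinnamoyl glucose"),
           ("cinnamoyl glucose", None, "caffeoyl glucose"),
           ("caffeoyl glucose", "HCGQT", "CGA")],
}


def enzymes(route):
    """Set of named enzymes used by a route."""
    return {enzyme for _, enzyme, _ in ROUTES[route] if enzyme is not None}


def is_connected(route):
    """True if every step's product is the next step's substrate."""
    steps = ROUTES[route]
    return all(a[2] == b[0] for a, b in zip(steps, steps[1:]))


def routes_without(enzyme):
    """Routes that can, in this encoding, reach CGA without the given enzyme."""
    return sorted(r for r in ROUTES if enzyme not in enzymes(r))


if __name__ == "__main__":
    for name in ROUTES:
        print(name, sorted(enzymes(name)), is_connected(name))
    print("Routes not requiring C4H:", routes_without("C4H"))
```

In this encoding PAL is the only enzyme shared by all four routes, and Route IV is the only route that does not pass through C4H.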
Plants with Elucidated CGA Biosynthetic Pathways.
Key Enzymes and Gene Regulation in CGA Biosynthesis
Six key enzymes are involved in the four biosynthetic pathways of CGA: phenylalanine ammonia-lyase (PAL), 116 cinnamate 4-hydroxylase (C4H), 117 hydroxycinnamoyl-CoA shikimate hydroxycinnamoyl transferase (HCT), 118 p-coumarate 3-hydroxylase (C3H), 119 4-coumarate:CoA ligase (4CL), 120 and hydroxycinnamate-CoA quinate hydroxycinnamoyl transferase (HQT). 121 The activity and gene expression levels of these enzymes can significantly affect CGA synthesis.
Phenylalanine ammonia-lyase (PAL)
PAL catalyzes the deamination of L-phenylalanine to trans-cinnamic acid and ammonia. It is the key, rate-limiting enzyme in the first step of phenylpropanoid metabolism and the most studied enzyme in this pathway. 122 Calabrese et al. 123 determined the three-dimensional structure of PAL using single-crystal X-ray diffraction combined with energetic modeling (Figure 7). They revealed that a multi-helix dipole network provides a low-energy-barrier pathway for deprotonation at the C3 position. This finding refuted the widely accepted Friedel-Crafts-type carbocation mechanism and established the E1cb elimination mechanism as the dominant one (Figure 8). However, the active-site loop region in this structure was disordered. It was not until Wang et al. 124 determined the high-resolution structure of the Anabaena variabilis PAL C503S/C565S double mutant that a PAL structure with a fully ordered active site was obtained. Using AutoDock, they identified the binding site for the substrate L-phenylalanine (L-Phe) within PAL. Using a more direct approach, covalent modification of Tyr78 with NHS-biotin, they further confirmed that the enzymatic reaction proceeds via a carbanion intermediate rather than a carbocation intermediate.

Quaternary structure of PAL. (The four individual monomers are color-coded in red, green, blue, and yellow, displaying the approximate 222 symmetry of the PAL tetramer. (a) Stereoview of PAL from a side perspective. (b) Top perspective of the PAL tetramer (90° offset from that of panel a). The structure was retrieved from the PDB database and visualized using PyMOL).

Catalytic mechanism of PAL. (The E1cb elimination reaction mechanism).
PAL is widely found in green plants such as Cornus officinalis, 125 Agave americana 126 and Brassica oleracea var. capitata. 127 It plays a key role in plant growth, development, and defense against pests and mechanical damage. 128 PAL has also been isolated from fungi, 129 bacteria, 130 and algae. 131
Studies have shown that PAL is correlated with CGA biosynthesis. 132 Chang et al. 133 constructed recombinant genes via Gateway vectors and introduced them into Agrobacterium tumefaciens LBA4404. Using the leaf disc transformation method, they transferred the recombinant genes into tobacco (Nicotiana tabacum cv. Samsun NN) leaves to achieve overexpression of AtPAL2, yielding favorable results: PAL activity significantly increased in AtPAL2-overexpressing lines, with CGA content elevated by 2.1-fold.
The expression level of PAL genes exhibits strong correlations with plant growth and development. In Vaccinium dunalianum, PAL gene expression progressively declines in developing leaves and flowers with advancing maturation, whereas it shows an increasing trend in the fruit stems. 134 This is attributed to the substantial accumulation of anthocyanins and other compounds required during fruit ripening. Elevated PAL enzymatic activity facilitates the production of these secondary metabolites, and CGA is concomitantly consumed as a precursor for anthocyanin synthesis. Additionally, among VdPAL1–7, only VdPAL3 expression negatively correlated with CGA content. Integrating Chen et al.'s 135 finding that enhanced expression of the GuPAL1 gene in Glycyrrhiza uralensis markedly promotes flavonoid biosynthesis, it is postulated that VdPAL3 may preferentially direct carbon flux toward the flavonoid or other phenolic pathways, thereby reducing CGA accumulation. To elucidate the functional divergence of PAL isoforms, more in-depth investigations in the field of metabolomics are warranted.
Cinnamate 4-hydroxylase (C4H)
C4H, the first identified plant P450 monooxygenase 136 and a member of the CYP73A subfamily, is localized on the outer surface of the endoplasmic reticulum membrane, with electrons required for its catalytic reaction supplied by NADPH-cytochrome P450 reductase (CPR). 137 Although extensive research has been conducted on C4H, it was not until 2020 that Zhang et al. 138 successfully determined the crystal structure of the Sorghum bicolor SbC4H1 protein, enabling visualization of the three-dimensional architecture of this enzyme class (Figure 9). Hydrophobic residues in C4H (Phe-107, Val-118, Phe-119, etc) engage in van der Waals interactions with the phenyl ring of cinnamic acid, while the carboxylate group of cinnamic acid forms electrostatic interactions and hydrogen bonds with side chains of Arg-213, Ser-214, Ser-217, and Gln-218, thereby facilitating stable substrate binding. This complex utilizes electrons derived from NADPH to cleave molecular oxygen, inserting a hydroxyl group at the C4 position of the phenyl ring to generate p-coumaric acid.

Structure of SbC4H1. (The heme group is shown in red. It serves as the catalytic center of the enzyme where it activates molecular oxygen. The structure was retrieved from the PDB database and visualized using PyMOL.).
C4H expression is tissue-specific. In Vanilla planifolia, the VplC4H1 and VplC4H2 genes are highly expressed in flowers but less so in other tissues. 139 Han et al. 140 conducted transcriptomic and metabolomic analyses comparing two Iris species (Iris germanica and Iris pallida), revealing differential gene expression patterns. They observed that I. germanica exhibited upregulation of C4H and 4CL genes, resulting in high CGA accumulation but low flavonoid content. In contrast, I. pallida showed elevated expression of key flavonoid biosynthetic genes (eg, CHS, DFR, ANS, ANR, LAR, and 3GT), leading to substantial flavonoid accumulation. These findings suggest that C4H acts as a positive regulator of CGA biosynthesis by supplying p-coumaric acid as a precursor.
Conversely, Karlson et al. 115 reported contradictory results using CRISPRi-mediated silencing of the C4H gene in tobacco. C4H suppression elevated CGA levels by 6-fold. The authors proposed a putative mechanism: C4H knockdown may cause cinnamic acid accumulation, thereby activating a bypass pathway involving UGCT/HCGQT to synthesize CGA directly from cinnamic acid, thus circumventing C4H.
This apparent contradiction points to species-specific adaptive evolution of plant secondary metabolic networks. However, neither study analyzed the expression of the UGCT and HCGQT genes or the activities of the corresponding enzymes. Consequently, two critical questions remain unresolved: whether Iridaceae possess a functional UGCT/HCGQT bypass pathway, and what mechanism underlies the elevated CGA accumulation observed upon C4H silencing in N. tabacum.
Hydroxycinnamoyl-CoA shikimate hydroxycinnamoyl transferase (HCT)
HCT belongs to the large family of acyltransferases, which are primarily involved in the synthesis of various secondary metabolites 141 (Table 8). The catalytic mechanism of HCT is now understood in considerable detail. The τ-nitrogen of the universally conserved catalytic His153, provided by the N-terminal domain, forms a hydrogen bond with the shikimate 5-oxygen of p-coumaroylshikimate, deprotonating the 5-hydroxyl group of the acyl acceptor substrate shikimate, which then attacks the carbonyl carbon of the acyl donor p-coumaroyl-CoA. Furthermore, the indolic nitrogen of Trp371 from the C-terminal domain is within hydrogen-bonding distance of the carbonyl oxygen of p-coumaroyl-CoA, stabilizing the negative charge on the tetrahedral intermediate formed after nucleophilic attack. In the final step of the catalytic cycle, coenzyme A is released from the tetrahedral intermediate as a leaving group to produce the ester product p-coumaroylshikimate (Figures 10 and 11). 142 Interestingly, unlike many other essential metabolic enzymes, HCT exhibits relatively weak substrate specificity, which arises from the sub-microsecond conformational flexibility of its arginine handle. As a result, it can also catalyze the condensation of p-coumaroyl-CoA with quinic acid, which is structurally similar to shikimate, to produce p-coumaroylquinic acid, and can even accept some non-natural substrates. 143

Structure of AtHCT. (The two quasi-symmetric N-terminal (blue) and C-terminal (green) domains are linked by a long loop (red). The structure was retrieved from the AlphaFold database and visualized using PyMOL.).

Catalytic mechanism of AtHCT.
Impact of HCT Gene Silencing on the Phenylpropanoid Pathway Metabolites in Nicotiana benthamiana and Arabidopsis thaliana.
Although HCT is multifunctional, this section focuses on its effects on CGA synthesis. In Camellia sinensis, Shen et al. 144 found that, like PAL, HCT may play an important role in CGA biosynthesis. HCT participates at two points: it produces p-coumaroylshikimic/quinic acid as substrates for C3H, and it converts the caffeoylshikimic acid generated by the C3H-catalyzed reaction into caffeoyl-CoA. Wang et al. 145 found that in Lonicera japonica, tetraploid flower buds produce more CGA and luteolin than diploid ones. However, this study did not fully account for the complexities of polyploid biology, such as subgenome effects and multi-layered regulation, nor did it explore the correlation between HCT expression and the increased CGA content in tetraploid flower buds. Consequently, the molecular mechanisms behind enhanced CGA accumulation in tetraploid L. japonica remain to be clarified.
p-Coumarate 3-hydroxylase (C3H)
C3H is a cytochrome P450 monooxygenase that catalyzes the hydroxylation of p-coumaroylshikimic and p-coumaroylquinic acids at the C3 position (Figure 12) and is a key enzyme in CGA biosynthesis. 119 However, the affinity of C3H for these two substrates varies among species. RgCYP98A22 from Ruta graveolens metabolizes p-coumaroyl quinate to CGA more efficiently than p-coumaroyl shikimate. 146 Conversely, ObCYP98A13 from Ocimum basilicum exhibits a meta-hydroxylation capacity for p-coumaroyl shikimate that is 5 to 10 times greater than that for p-coumaroyl quinate. 147 Because no crystal structures have been reported for C3H from these plants, the molecular basis for the differences between these two enzymes remains unclear.

C3H substrates and products.
In apple fruit, MdC3H1/2/3 expression levels are positively correlated with CGA content. 148 When the C3H/APX gene is defective, enzymes related to phenylpropanoid metabolism, including C3H, can be affected, potentially reducing CGA synthesis. 149 Conversely, treating Zea mays with 50 mg/L of nano-silicon boosts the expression of core CGA biosynthesis genes such as C3H, increasing CGA content by 44.2% and significantly enhancing pest resistance. 150 Normally, plants carry no more than three C3H genes. However, through genomic analysis, Hu et al. 99 identified eight members of the C3H gene family in Liriodendron chinense, indicating a notable expansion. The authors proposed that this expansion likely occurred via tandem and segmental duplications, which may have provided the genetic basis for the efficient synthesis of CGA. Nevertheless, because of limitations in sampling, the possibility that the expansion reflects genetic drift, a stochastic event, rather than adaptive selection in response to environmental pressures cannot be ruled out. To investigate whether this gene family expansion resulted from positive environmental selection, it would be necessary to increase the sample size of L. chinense, particularly by including specimens from similar habitats but at different developmental stages. Furthermore, functional characterization through knockout of individual C3H genes could help clarify the role of each subtype in the biosynthesis of secondary metabolites.
4-Coumarate-CoA ligase (4CL)
4CL is the third enzyme in the phenylpropanoid metabolic pathway. It catalyzes the formation of CoA esters from cinnamic acid derivatives; these esters serve as precursors for secondary metabolites such as CGA and lignin. 151 The crystal structure (Figure 13) and enzymatic mechanism of 4CL1 from Populus tomentosa have been determined. 152 The catalytic cycle begins with the binding of ATP and a hydroxycinnamate substrate to 4CL1. This binding induces a conformational change in the enzyme, activating the adenylate-forming partial reaction. In this state, the side chain of Lys-523 coordinates the carboxylate group of the hydroxycinnamate substrate, positioning it for nucleophilic attack on the α-phosphate of ATP. This reaction yields a hydroxycinnamoyl-AMP intermediate and inorganic pyrophosphate (PPi). Release of PPi triggers a second major conformational shift, transitioning 4CL1 into the thioester-forming state.

Overall structure of 4CL1. (The C-domain is colored red. The N-domain is colored green. The structure was retrieved from the AlphaFold database and visualized using PyMOL.).
In this closed conformation, a pantetheine-binding tunnel forms at the interface between the N- and C-domains. Concurrently, the side chain of His-234 reorients to create an access path for CoA to approach the acyl-adenylate intermediate. Catalytic residues Lys-438 and Gln-443 then facilitate nucleophilic attack by the thiol group of CoA on the intermediate, resulting in the formation of the hydroxycinnamoyl-CoA thioester product. Finally, the C-domain rotates back to an open conformation, releasing the thioester product and AMP and resetting the enzyme for another catalytic cycle.
Hu's study provides strong support for Gulick's “domain alternation catalysis” theory 153 and further confirms the structural and functional conservation within the ANL superfamily. However, the research does not elucidate the driving force behind product release—specifically, whether it occurs via spontaneous dissociation driven by thermal motion or is coupled with the binding of a new ATP molecule. These questions remain to be explored in future studies.
Because 4CL is a key enzyme in CGA biosynthesis, the relationship between the two has attracted considerable attention. In Eucommia ulmoides, 11 Eu4CL family members were identified and named Eu4CL1 to Eu4CL11. Notably, Eu4CL4 is predominantly expressed in fruits, with expression levels 35.04 times higher than in leaves. 154 Subsequently, Zhong et al. 155 identified 35 4CL genes in the E. ulmoides genome using HMMER and BLAST, considerably more than reported in previous studies based on EST or transcriptome data. The authors also found that members of the Eu4CL gene family exhibit pronounced tissue specificity and developmental stage specificity in E. ulmoides, suggesting functional diversification within this gene family and involvement in specific biological processes at different growth and developmental stages. The expression levels of Eu4CL15, Eu4CL16, and Eu4CL18 were significantly down-regulated during the development of bark, fruit, and leaves, whereas the expression of Eu4CL28 and Eu4CL9 was markedly up-regulated. In leaves, the expression of Eu4CL5 and Eu4CL13 showed a significant positive correlation with CGA accumulation (r > 0.8). In bark, the expression of Eu4CL34 showed an extremely strong positive correlation with CGA content (r > 0.999). Although these genes were not functionally validated through approaches such as gene silencing or overexpression, the findings provide valuable candidate targets for molecular breeding strategies aimed at selectively enhancing CGA levels in specific tissues (leaves or bark) of E. ulmoides.
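For readers less familiar with the statistic behind such expression–metabolite associations, the following minimal sketch shows how a Pearson correlation coefficient r is computed from paired measurements. The numbers are hypothetical and are not data from the cited studies. The sketch also illustrates a general caveat: with only two paired samples, r is always exactly +1 or -1, so the weight of a very high r depends on how many samples underlie it.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / sqrt(sxx * syy)


# Hypothetical values for illustration only (not data from the cited studies).
expression = [1.0, 2.1, 2.9, 4.2, 5.1]   # relative transcript level
cga_content = [0.8, 1.9, 3.2, 3.9, 5.3]  # CGA content, arbitrary units

if __name__ == "__main__":
    print("r (five samples):", round(pearson_r(expression, cga_content), 3))
    # Two samples always give |r| = 1, regardless of any biological link.
    print("r (two samples):", pearson_r([1, 2], [5, 9]))
```

A correlation of this kind identifies candidate genes but does not establish function, which is why validation by silencing or overexpression remains necessary.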
Hydroxycinnamate-CoA quinate hydroxycinnamoyl transferase (HQT)
The HQT gene, part of the plant acyl-CoA-dependent BAHD superfamily, 156 plays an important role in phenylpropanoid metabolism and is crucial for CGA accumulation. 157 However, no experimentally determined three-dimensional structure of any HQT has been reported to date. Moglia et al. 158 therefore used the crystal structure of SbHCT as a template for homology modeling, predicting key residues and proposing a catalytic mechanism for HQT (Figure 14): His-276 activates the hydroxyl group of quinic acid, which attacks the carbonyl carbon of caffeoyl-CoA, leading to the formation of CGA. Additionally, by mutating His-276 to Tyr, Moglia et al. showed that HQT also exhibits chlorogenate:chlorogenate transferase (CCT) activity. Under acidic conditions in the vacuole, this activity allows one molecule of CGA to serve as an acyl donor, facilitating nucleophilic attack by the alcoholic hydroxyl group of another CGA molecule. This reaction ultimately results in the release of one molecule of quinic acid and the formation of 3,5-di-O-caffeoylquinate.

The catalytic mechanisms of CCT and HQT. (CCT: chlorogenate:chlorogenate transferase. Proposed reaction scheme for diCGA synthesis by CCT using CGA as the acyl donor as well as the acyl acceptor compared with the BAHD activity of HQT and HCT using caffeoyl-CoA as the acyl donor and quinate as the acyl acceptor, respectively.).
The HQT gene has three isoforms (HQT1, HQT2, and HQT3). 107 In Bambusoideae, treating plants with the histone deacetylase (HDAC) inhibitor SBHA significantly enhances BmHQT1 activity, boosting the accumulation of 3-O-p-coumaroylquinic acid, a CGA precursor. 159 This suggests HQT1 may have higher catalytic activity than the other isoforms. 160 In N. tabacum plants, silencing the NtHQT gene results in a relatively normal plant phenotype but significantly reduces CGA levels. 161 Similarly, Medison et al. 121 isolated 58 IbHQT genes from vegetable sweet potato leaves. They achieved transient overexpression and silencing of IbHQT-g47130 in sweet potato callus tissue, revealing a positive correlation between IbHQT genes and CGA biosynthesis and accumulation in the plant. Consistent with these findings, transgenic eggplants overexpressing SmHQT have CGA levels more than twice those of non-transgenic eggplants. 162
While the manipulation of HQT gene expression effectively enhances CGA production, the subsequent subcellular trafficking and storage of this metabolite remain less elucidated. Addressing this gap, Li et al. 163 were the first to employ immunogold labelling (anti-HQT) and laccase-gold complex labelling techniques in Lonicera japonica to determine the subcellular localization of both the HQT protein and CGA, respectively. Their work localized the synthesis of CGA to both the cytoplasm and chloroplasts. Furthermore, they proposed two distinct pathways for transporting CGA from these sites to the vacuole for storage: CGA synthesized in the cytoplasm is suggested to enter the vacuole through vesicle fusion, while CGA synthesized in chloroplasts is hypothesized to be directly released into the vacuole via membrane fusion between the chloroplast and vacuolar membranes. However, the authors caution that these transport mechanisms are speculative inferences based solely on static electron microscopy observations. They lack direct experimental validation, such as demonstrating a blockage in CGA transport upon the inhibition of membrane fusion. Additionally, the specificity of the laccase-gold technique is limited, as laccase oxidizes a broad range of phenolic compounds; thus, the observed gold particle signals likely represent a general class of phenolics rather than CGA specifically. Therefore, the precise mechanisms governing the intracellular transport of CGA await further confirmation.
Transcription Factors Involved in Regulating the Synthesis of CGA
Transcription factors (TFs) are sequence-specific DNA-binding proteins that modulate spatiotemporal gene expression at the transcriptional level, thereby enabling organisms to adapt to unfavorable external conditions. 164 TFs perform two primary functions: binding to specific cis-regulatory elements (CREs) in the genome, and recruiting the transcriptional machinery to either activate or repress the expression of corresponding target genes (TGs). 165 Common families of transcription factors—such as WRKY, MYB (v-myb avian myeloblastosis viral oncogene homolog), and bHLH (basic helix-loop-helix)—play distinct roles in the regulation of CGA biosynthesis. 166
MYB transcription factors
MYB transcription factors constitute one of the largest families of transcriptional regulators in plants, typically containing one to four DNA-binding repeats, with the majority possessing two repeats and belonging to the R2R3-MYB subfamily. They play crucial roles in regulating the biosynthesis of secondary metabolites, including CGA. 167 In poplar, MYB165 and MYB194 function as broad-spectrum repressors of the flavonoid and phenylpropanoid metabolic pathways. This represents the first discovery in woody plants that MYB repressors may indirectly influence the shikimate pathway, thereby downregulating polyphenol synthesis. 168 Similarly, in tobacco, NtMYB59 targets and represses the positive regulator NtMYB12, significantly reducing the levels of CGA, flavonols, and anthocyanins. However, the double mutant exhibited abnormally elevated CGA content, suggesting that MYB12 might also act as a negative regulator of CGA biosynthesis, although the underlying mechanism remains unclear. 169
Beyond their repressive functions, MYB TFs can also act as activators to promote CGA accumulation. In Lonicera japonica, both LmMYB15 114 and LmMYB111 170 can directly bind to the promoters of early biosynthetic genes such as MYB3/MYB4 and structural genes including PAL1, C4H, and 4CL2, thereby regulating CGA synthesis. Heterologous overexpression of LmMYB111 in tobacco and homologous overexpression in Lonicera macranthoides both led to increased CGA accumulation. To facilitate global research efforts in the programming and optimization of L. japonica, Xiao et al. 171 developed the first comprehensive platform, the LjaFGD database, utilizing multi-omics technologies. Similar to other databases such as MCENet (maize) and croFGD (Catharanthus roseus), LjaFGD offers innovative features in co-expression network construction, functional module identification, and tool integration. In L. japonica, transcription factors including MYB, WRKY, and ERF regulate the biosynthesis of active compounds such as CGA and luteolin by binding to the promoter regions of key enzyme genes like PAL and HCT, thereby modulating their expression.
WRKY transcription factors
Similar to MYB transcription factors, WRKY TFs represent one of the largest families of transcriptional regulators in plants, with the distinction that WRKY proteins have so far been identified exclusively in plants. 172 WRKY TFs play crucial roles in plant growth, development, secondary metabolism and responses to biotic and abiotic stresses. 173 They function by binding to the (T)TGAC(C/T) W-box cis-element in the promoters of target genes, thereby inducing gene expression to maintain cellular homeostasis. 174 Studies have demonstrated that WRKY TFs are involved in the regulation of various phenolic compounds. 175
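The W-box consensus given above, (T)TGAC(C/T), can be located in a promoter sequence with a simple pattern search on both DNA strands. The following is a minimal sketch under stated assumptions: the function names and the example sequences are hypothetical, only the unambiguous bases A, C, G and T are handled, and a sequence match merely flags a candidate site. Actual binding still has to be shown experimentally, for example by EMSA or ChIP-qPCR.

```python
import re

# W-box core as written in the text: (T)TGAC(C/T). The pattern matches the
# obligatory core TGAC followed by C or T; the optional leading T is reported
# separately. The lookahead allows overlapping cores to be found.
W_BOX_CORE = re.compile(r"(?=(TGAC[CT]))")

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_w_boxes(promoter):
    """Return (strand, start, has_leading_T) for each W-box core.

    `start` is the 0-based position of the core on the strand being read
    5'->3'; for the "-" strand this is a position on the reverse complement.
    """
    hits = []
    strands = (("+", promoter.upper()), ("-", reverse_complement(promoter)))
    for strand, seq in strands:
        for match in W_BOX_CORE.finditer(seq):
            start = match.start()
            has_leading_t = start > 0 and seq[start - 1] == "T"
            hits.append((strand, start, has_leading_t))
    return hits


if __name__ == "__main__":
    # Hypothetical promoter fragment, for illustration only.
    print(find_w_boxes("AAATTGACCGGGTCAAAA"))
```

Scanning both strands matters because a W-box on the antisense strand appears in the sense sequence as its reverse complement, for example GGTCAA for TTGACC.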
Ji et al. 176 used EMSA to demonstrate that PpWRKY70 directly binds to W-box elements in the promoters of PpPAL and Pp4CL, activating their transcription and thereby inducing the phenylpropanoid metabolic pathway. In cotton, 177 GhWRKY41 acts as a third-layer regulatory factor that directly activates fourth-layer structural genes (GhC4H and Gh4CL), leading to enhanced phenylpropanoid biosynthesis. To further investigate the phenylpropanoid pathway regulated by WRKY TFs, Zhang et al. 178 employed a Populus protoplast system to confirm that WRKY proteins bind to W-box elements to regulate the transcription of downstream HCT2, consequently affecting CGA biosynthesis. These findings highlight the important role of WRKY TFs in regulating CGA synthesis.
Unexpectedly, Wang et al. 179 used ChIP-qPCR and Dual-LUC assays to reveal that a negative feedback mechanism triggered by CGA may lead to a reduction in total polyphenol content. Nevertheless, since that study did not assess key enzyme activities, it cannot be ruled out that NtWRKY33a activates the transcriptional repressor NtMYB4, which suppresses alternative branches of phenylpropanoid metabolism, thereby redirecting carbon flux toward CGA biosynthesis.
Other transcription factors
In addition to the two transcription factor families mentioned above, other transcription factors are also capable of regulating CGA biosynthesis. Liu et al. 180 demonstrated through dual-luciferase assays, yeast one-hybrid assays, and electrophoretic mobility shift assays (EMSA) that TabHLH1 directly binds to the bHLH-binding motifs in the TaHQT2 and Ta4CL promoters (proTaHQT2 and proTa4CL). Although CGA content was not directly measured in their study, the expression levels of two key biosynthetic genes, TaHQT2 and Ta4CL, were upregulated, suggesting a potential enhancement in CGA accumulation. More direct evidence supporting the role of bHLH family genes in promoting CGA synthesis comes from the study by Wang et al. 181 in Cucumis sativus. They found that CsMYC2 upregulates the expression of CsPAL, thereby promoting the biosynthesis of phenylpropanoid compounds, including CGA. The central role of CsMYC2 in this regulatory network was further validated using virus-induced gene silencing (VIGS).
Although the AP2/ERF family of transcription factors is hypothesized to have originated through horizontal transfer from bacterial or viral HNH-AP2 endonucleases via transposition and homing processes, 182 it also plays a crucial role in the biosynthesis of CGA. In N. tabacum, the transcription factor NtERF4a binds to the GCC boxes in the promoters of NtPAL1 and NtPAL2, activating their transcription. Overexpressing NtERF4a significantly boosts CGA accumulation in tobacco leaves, whereas silencing it inhibits CGA biosynthesis. 183
Similar to most enzymes, transcription factors also exhibit tissue specificity. He et al. 184 revealed the expression pattern of NtWIN1 in tobacco: it is highly expressed in stems, sepals, and pistils, while its expression increases in leaves during senescence. Overexpression of NtWIN1 led to a 25%–50% increase in CGA content and a 30%–67% decrease in scopoletin levels. Conversely, knockout of NtWIN1 produced the opposite effects. However, the study did not clearly elucidate the molecular mechanism by which NtWIN1 promotes CGA biosynthesis—specifically, whether it enhances the expression of key genes involved in CGA synthesis, such as PAL and HCT, or indirectly increases CGA accumulation by suppressing alternative branches of the phenylpropanoid pathway.
Although numerous studies have been conducted on transcription factors, their phylogenetic relationships and functional specificity across species, as well as the precise mechanisms through which they bind to target genes and modulate transcriptional activity, remain incompletely understood. The roles of transcription factors in plants continue to be a highly active area of research. Table 9 lists the transcription factors found in plants that are involved in CGA biosynthesis.
A List of the Transcription Factors Found in Plants That are Involved in CGA Biosynthesis.
Environmental Factors Involved in Regulating the Synthesis of CGA
The environment plays a significant role in influencing CGA synthesis. Studies have shown that light quality, photoperiod, CO₂ concentration, and altitude affect CGA content by regulating key enzyme genes in the biosynthetic pathway, such as PAL, C4H, and C3H. 188 Light in the environment is perceived by plant photoreceptors. The light signal inhibits COP1, allowing the HY5 protein to stabilize and accumulate before entering the nucleus, where it directly binds to the G-box (CACGTG) in the promoters of various genes such as CmBBX20, CmMYB3/6/16, or downstream structural genes (eg, CmCHS, CmFNS, CmHQT), activating their transcription and thereby promoting the accumulation of CGA and flavonoids. 112 Meanwhile, under the combined effect of blue light and high CO₂, the CGA content in lettuce can be rapidly increased. 189 These studies demonstrate that light is a necessary condition for CGA synthesis; in the absence of light, CGA synthesis is inevitably inhibited. 190 In addition, altitude is another important factor influencing CGA synthesis. Dong et al. 191 found in their research on pigmented potatoes from different altitudes that the expression of the C3H gene and its alleles is regulated by MYB/bHLH transcription factors. Potatoes grown at 2800 meters exhibit higher C3H gene expression and significantly higher CGA levels compared to those from other altitudes. However, it remains unclear how high altitude affects MYB and bHLH transcription factors and their interaction with CGA biosynthesis genes—whether it is due to day length, UV intensity, atmospheric pressure, or other factors.
Abiotic stress
In their natural habitats, plants are constantly exposed to constraints such as heavy metals, cold, heat, drought, ultraviolet radiation, and salinity, which are detrimental to their productivity. 192 To survive, plants develop tolerance to these stresses, and this adaptation results in the accumulation of phenylpropanoids, including CGA, in different tissues. 193 In a study of Zea mays, Cao et al. 194 identified six drought-stress-related core genes (ZmPAL3, ZmPAL5, ZmPAL6, ZmPAL8, ZmPAL11, and ZmPAL13) within the ZmPAL family. Among these, the expression level of ZmPAL5 was positively correlated with drought resistance. ZmPAL5 enhances drought tolerance and recovery ability by influencing the content of osmoregulatory substances and the activity of antioxidant enzymes.
In addition to affecting the PAL gene, abiotic stress can also influence other genes. In Sonchus arvensis, the expression of PAL, C4H, 4CL, and C3H genes was highest in the middle leaves under 150 mM NaCl treatment, with C4H expression increasing up to 8-fold. In contrast, the expression of HCT and HQT decreased under all salt treatments. 195 Although the literature does not specify how CGA is synthesized when HQT is downregulated or absent (HQT was not even detected in peach fruits 196), it is most likely produced via the alternative UGCT/HCGQT bypass pathway (Route IV) described above.
Biotic stress
Biotic stress triggers immune responses in plants, promoting the accumulation of antimicrobial compounds such as CGA, whose concentrations shift dynamically upon pathogen attack. For example, coffee trees infected by Meloidogyne paranaensis exhibit elevated CGA levels as a defensive strategy. 197 This aligns with the work of Baker et al., who reported comparable CGA accumulation in tobacco leaves following inoculation with an incompatible antigen. 198 In the study by Fan et al., 199 infection by Phytophthora capsici was shown to enhance CGA content in Piper nigrum via upregulation of the Pn4CL gene. However, the mechanism through which the pathogen activates Pn4CL expression remains a major knowledge gap. Critical aspects of the signal transduction pathway—such as the recognition events, upstream regulators, and potential involvement of phytohormones—are still poorly characterized. The apparent rapid induction of Pn4CL, peaking within 24 h, suggests a targeted pathogen manipulation of host biosynthesis pathways, yet the exact molecular triggers and their origins remain unidentified. Without elucidating these signaling components, the broader understanding of how Piper nigrum mounts its defense against Phytophthora capsici remains incomplete.
Phytohormones
Phytohormones serve as crucial small molecules regulating the biosynthesis of CGA (Table 10). In sweetpotato stem tips, treatment with SA, ABA, and GA for 72 h significantly promoted CGA accumulation, whereas JA markedly suppressed CGA levels at 24–48 h, with a return to baseline by 72 h. 200 However, in potato tubers, application of the same concentration (100 μM) of ABA, SA, MeJA, along with sucrose (200 μM), produced opposing effects—significantly reducing CGA content, particularly with SA showing the most pronounced suppression. 201 Furthermore, in a study on Carthamus tinctorius, Liu et al. 202 first elucidated the regulatory mechanism of CGA biosynthesis in safflower cells under MeJA treatment. They also demonstrated that the expression of the MeJA-responsive gene CtHCT is positively correlated with CGA accumulation—a finding that again contrasts with observations in sweetpotato.
Table 10. Effects of Phytohormones on CGA Accumulation and Gene Expression in Sweetpotato and Potato.
Note. CSE: caffeoyl shikimate esterase gene.
These discrepancies highlight the species-specific nature of phytohormonal regulatory networks in plants. There remains a critical need to delve deeper into the molecular mechanisms underlying hormone–plant interactions. Merely establishing correlations between the transcription of key genes and CGA accumulation is insufficient to explain these divergent phenomena.
Heterologous Production of CGA
In the twenty-first century, the rapid advancement of omics technologies has revolutionized the field of biosynthesis, moving beyond traditional reductionist approaches. These omics-based methods allow comprehensive analysis of the dynamic changes in genes, proteins, and metabolites within biosynthetic pathways, enabling systematic elucidation of regulatory mechanisms. This has significantly accelerated the discovery of biosynthetic routes and played an important role in unraveling the regulatory network of CGA biosynthesis. 203
Coupled with synthetic biology tools such as CRISPR-Cas9, these advances make it feasible to design specific sgRNAs targeting key enzymatic or regulatory genes in the CGA biosynthetic pathway for precise genome editing in plants or microbial hosts. In a related transgenic approach, Tuan et al. 186 overexpressed AtPAP1 in Platycodon grandiflorus hairy roots and observed via qRT-PCR that the expression levels of seven CGA biosynthetic genes (PgPAL1, PgPAL2, PgC4H, Pg4CL, PgC3H, PgHCT, PgHQT) were significantly upregulated, by up to 4.95-fold, accompanied by an increase in CGA content from 42.60 µg/100 mg DW to 421.31 µg/100 mg DW. However, such plant systems are hampered by long cultivation cycles, difficulties in scaling up, high costs, and susceptibility to environmental fluctuations, which limit their industrial applicability.
Microbial engineering offers a promising alternative to overcome these constraints. In Saccharomyces cerevisiae, systems metabolic engineering strategies—such as balancing the supply of phosphoenolpyruvate (PEP) and erythrose-4-phosphate (E4P), and employing enzyme fusion techniques—have enabled de novo production of CGA from glucose, achieving a titer of 1.62 g/L in a 5-L bioreactor. 204 Another study extended CGA production to Yarrowia lipolytica, 205 reaching 4.84 g/L in a 5-L reactor, substantially higher than outputs in E. coli systems. 206
Despite their advantages in yield and biosafety, yeast platforms still face common challenges including unbalanced precursor supply, poor enzyme substrate specificity, and active degradation pathways.
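The quantitative comparisons in this section reduce to simple ratios, which can be checked with a short calculation. The sketch below is illustrative only: the helper name `fold_change` is hypothetical, and the inputs are the values quoted above (CGA content in Platycodon grandiflorus hairy roots before and after AtPAP1 overexpression, and the reported 5-L bioreactor titers in Saccharomyces cerevisiae and Yarrowia lipolytica). Comparing the two titers directly assumes they are comparable, although they come from different studies.

```python
def fold_change(baseline: float, treated: float) -> float:
    """Return treated / baseline; both values must share the same unit."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return treated / baseline


# CGA content in P. grandiflorus hairy roots (µg/100 mg DW), as quoted in the text
hairy_root_fold = fold_change(42.60, 421.31)

# CGA titers in 5-L bioreactors (g/L): S. cerevisiae vs. Y. lipolytica, as quoted in the text
yeast_titer_ratio = fold_change(1.62, 4.84)

print(f"Hairy-root CGA increase: {hairy_root_fold:.2f}-fold")
print(f"Y. lipolytica vs. S. cerevisiae titer: {yeast_titer_ratio:.2f}-fold")
```

Because both ratios are dimensionless, the same helper applies to either comparison as long as the two inputs share a unit.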
Conclusions
In summary, CGA is a natural small molecule with a broad spectrum of pharmacological activities, including antitumor, antimicrobial, hypoglycemic, hypolipidemic, and antiviral effects. It shows particular promise in combating hepatocellular carcinoma and glioma. Concurrently, enhancing the ADME (Absorption, Distribution, Metabolism and Excretion) profile of CGA through pharmaceutical or chemical strategies represents a major focus of current research.
With regard to biosynthesis, the accumulation of CGA in plants is determined not only by the plant's intrinsic characteristics but also by environmental conditions. The environment, transcription factors, and gene expression form an interconnected system rather than three isolated components. Environmental factors such as light and drought activate the corresponding cis-acting elements and upregulate the expression of transcription factors including MYB, WRKY, and bHLH, which in turn enhance or suppress the expression of key genes involved in CGA biosynthesis, such as PAL, C4H, 4CL, C3H, HQT, and HCT. This regulatory cascade modulates both the abundance and activity of enzymes, ultimately influencing the synthesis of CGA (Figure 15). Furthermore, through synthetic biology techniques, CGA can now be produced heterologously, overcoming the limitations of traditional extraction methods and ensuring a reliable, sustainable supply.

Figure 15. Regulatory network influencing CGA biosynthesis.
Future Perspectives
Future research on CGA should prioritize the following interconnected directions: (1) employing advanced tools such as CRISPR/Cas to engineer its metabolic pathways; (2) optimizing the structure and function of key biosynthetic enzymes through protein engineering strategies such as site-directed mutagenesis and directed evolution; (3) systematically elucidating the critical factors governing CGA absorption and bioavailability, including its stability in the gastrointestinal tract, intestinal permeability, interactions with efflux transporters, and presystemic metabolism; (4) developing and optimizing novel nanocarrier-based delivery systems such as liposomes, exosomes, and polymeric nanoparticles; (5) conducting structure-activity relationship studies to guide the rational design and synthesis of more stable and potent CGA analogs or derivatives; (6) identifying and characterizing the true active metabolites (eg, dihydrocaffeic acid) generated by gut microbiota and host metabolism; and (7) ultimately, promoting the clinical translation of CGA, its optimized formulations, or novel analogs for specific diseases such as cancer, diabetes, and obesity through well-designed randomized, double-blind, placebo-controlled clinical trials that scientifically evaluate their efficacy, optimal dosage, long-term safety, and overall clinical potential.
Footnotes
Abbreviations
Acknowledgments
We wish to thank Mr Li Yuyuan for his expert guidance in PyMOL. We are also grateful to the reviewers for their insightful comments and constructive suggestions.
Author Contributions
Z-Q.G. wrote the pharmacology section. W-C.H. wrote the biosynthesis section. Y.Z., J.C. and W-C.H. revised and approved the final version of the paper.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was funded by Shandong Administration of Traditional Chinese Medicine (2019-0989).
Declaration of Conflicting Interests
The authors declared the following potential conflicts of interest with respect to the research, authorship, and/or publication of this article: Figures 1, 2, 3, 4, 5, 6, 8, 11, 12, and 14 were drawn using ChemDraw based on data from the literature. Figures were created by the authors under the Declarations section.
