Abstract
Background:
Nasopharyngeal carcinoma (NPC) is a form of head and neck cancer that is endemic in most parts of Asia. Etiological factors for NPC include genetic disposition, consumption of a nitrosamine-rich diet, and Epstein-Barr virus (EBV) infection. One of the important EBV oncoproteins responsible for NPC pathogenesis is the Latent Membrane Protein 1 (LMP1). LMP1 interacts with nuclear factor-kappa B (NF-κB) signalling pathway, which regulates genes involved in inflammation, immunity, and cell survival. The dysregulation of NF-kB pathway gives rise to inflammation-related diseases and cancer. Hence, targeting inhibition and activation of NF-kB pathway could be a potential therapeutic strategy against NPC.
Results:
Using a variety of bioinformatics tools such as eVITTA and Pathview, we analysed the NF-kB signalling pathway using four NPC datasets available on the GEO database. Our analyses showed that the four datasets had a total of 991 differentially expressed genes. Pathway analysis showed the genes in the NF-kB signalling pathway were mostly upregulated and had a positive enrichment score. In addition, proteins from NF-kB family, namely IKBKB, NFKB1, NFKB2, and RELA and EBV proteins such as BSLF1, BZLF1, and BRRF1 could be explored further as biomarkers and drug targets towards developing chemotherapeutics against NPC and other EBV-associated diseases.
Conclusion:
To the best of our knowledge, this paper serves as the first NF-kB signalling pathway analysis and potential biomarker identification in NPC. Effective targeted approaches must be developed to address the unique features of the NF-kB signalling pathway for future NPC clinical research.
Introduction
Nasopharyngeal carcinoma (NPC) is the malignancy of squamous cells from the epithelial cells of the nasopharynx. This particular type of cancer has a high incidence rate and is almost exclusively found in East Asia, Southern China, the Middle East, and North Africa.1 For context, NPC is the 23rd most common cancer worldwide and is ranked the fifth most common cancer in Malaysia.2 According to the Surveillance, Epidemiology and EndResults registry, the 5-year survival rate of NPC patients is 61%, and recurrence may happen within the 3 years of treatment accompanied by serious clinical complications.3 Several factors could lead to NPC oncogenesis, namely, patients’ genetic susceptibility, consumption of food containing carcinogenic volatile nitrosamines, such as salted fish, or infection from Epstein–Barr virus (EBV).4
EBV often leads to NPC pathogenesis and other cancers such as Hodgkin Lymphoma and Non-Hodgkin Lymphoma.5 The latent infection of EBV plays an important role in the early stages of NPC. This is because EBV latency often leads to multiple chromosomal abnormalities from specific oncogenes and tumor suppressor genes that result in dysregulation of cellular pathways in NPC.6–8 Viral proteins that downregulate proapoptotic genes and upregulate pro-proliferative genes are produced during this latent period. Such proteins include the Epstein–Barr nuclear antigen (EBNA1; important for EBV episomal maintenance) and the latent membrane proteins 1 (LMP1).9 LMP1 is an important EBV oncoprotein that confers stem-cell-like self-renewal properties to NPC.10 LMP1 also contributes to NPC pathogenesis by interfering with the NF-κB (nuclear factor-kappa B) signaling pathway.11,12
The nuclear factor kappa B (NF-kB) signaling pathway is a family of transcription factors involved in the signal transduction pathway. The pathway consists of protein dimers that regulate genes involved in inflammation, immunity, and cell survival.13 Under normal conditions, the inactive NF-kB dimers are bound to the IkB proteins. When the NF-kB dimers are phosphorylated by the IkB-inhibitor of NF-kB-kinase (IKK) complex, the NF-kB dimer translocates into the nucleus inducing transcription of target genes. The NF-kB pathway can be activated through various stimuli such as inflammatory cytokines, DNA damage, bacterial, or viral products.14 The activation of the signal pathway can be done canonically, noncanonically, or atypically. Canonical pathway (also known as the classical pathway) is due to the degradation or phosphorylation of IKK complex. This results in the p50/p65 NF-kB dimer migrating to the nucleus and activating gene transcription.15 This pathway is mainly triggered by tumor necrosis factor (TNF)-alpha, interleukin 1, or by-products of bacterial and viral infections. The noncanonical pathway is induced by a TNF-receptor family such as lymphotoxin beta or B-cell activating factor. This results in the degradation of NF-kB-inducing kinase (NIK) and IKK-alpha mediated p100 degradation and consequentially to p52/RelB heterodimers translocating to the nucleus.16,17
Studies show that targeting inhibition and activation of the NF-kB signaling pathway could be a potential therapy for NPC as well as other types of cancer.18–20 This is because the dysregulation of the NF-kb signaling pathway gives rise to inflammation-related diseases and cancer due to its crucial role in immune response modulation and tumor-cell proliferation.21,22 Currently, NPC clinical trials focusing on the NF-kB signaling pathway are limited, as proteins in other pathways such as the vascular endothelial growth factor pathways or epidermal growth factor receptor pathways remain popular as drug targets for NPC.23
In this present study, we aim to highlight the critical role of the NF-kB signaling pathway and identify potential biomarkers and targets for future NPC therapeutic strategies.
Method and Experiment
Gene expression datasets
The expression profile datasets consisting of GSE12452,24 GSE13597,25 GSE34573,26 and GSE6463427 were obtained from the Gene Expression Omnibus (GEO) database (https://www.ncbi.nlm.nih.gov/geo/). The GSE34573 data were constructed using the GPL96 [HG-U133A] Affymetrix Human Genome U133A Array. All other databases were constructed using the GPL570 [HG-U133 Plus 2] Affymetrix Human Genome U133 Plus 2.0 Array platform. In total, the NPC and normal nasopharyngeal cell samplecounts were as follows: GSE12452: 41 samples, GSE13597: 28 samples, GSE34573: 20 samples, and GSE64634: 16 samples. The tissue biopsies were all confirmed to be EBV-positive NPC except for one NPC sample in GSE34573. The samples used in GSE64634 did not disclose whether their NPC samples are EBV-positive.
GEO2R
The differentially expressed genes (DEGs) between the respective control and NPC samples were screened and compared using the interactive web tool GEO2R (https://www.ncbi.nlm.nih.gov/geo/geo2r/). GEO2R uses the GEOquery and Linear Models for Microarray Analysis (LIMMA) R packages from the bioconductor project for analysis of high-throughput genomic data. The p-values obtained were adjusted using the Benjamini & Hochberg (false discovery rate) method. DEGs were classified with an adjusted p-value significance level of less than 0.05. Other parameters were used as default. Results were ranked by p-value, with the most significant genes having the smallest p-value.
Gene set enrichment, pathway, and network analyses
We used the web-based visualization and inference toolbox for transcriptome analysis, web-based visualization and inference toolbox for transcriptome analysis (eVITTA) version 1.3.128 (https://tau.cmmt.ubc.ca/eVITTA/), to identify the gene sets of the ranked DEGs from various gene databases. These databases include Kyoto Encyclopedia of Genes and Genomes (KEGG), Reactome Pathway, WikiPathways, Biological Process, Cancer Gene Neighborhoods, Cancer Modules, Oncogenic signatures, and Human Cancer Metastasis Database. The gene set sizes were filtered at min = 15 and max = 200 with a p-value of <0.005 and the adjusted p-value of <0.01. The enrichment network looked into the similarities between leading-edge genes using Jaccard Coefficient at threshold (0 ≤ x ≤ 1): 0.25, p-value < 0.01 and adjusted p-value < 0.025. The mapping and integration of data for pathway analysis were done using Pathview.29,30 Multiple dataset comparison was completed using the following default options: p-value < 0.05, log2fold change (FC) = |−0.1|, false discovery rate (FDR) <1.1, and in both positive and negative directions. The NF-kB and EBV networks were then visualized using Cytoscape software version 3.9.1.31
Protein–protein interaction analysis
The Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) (https://string-db.org) and Human-Virus Interaction Database (HVIDB) (http://zzdlab.com/hvidb/)32 were used for the exploration and prediction of NF-kB gene set with EBV gene set based on their UniProt ID (https://www.uniprot.org). The Biocarta NF-kB family gene setlist consisted of RELA (Q04206), RELB (Q01201), REL (Q04864), NFKB2 (Q00653), NFKB1 (P19838), NFKBIA (P25963), NFKBIB (Q15653), NFKBIE (O00221), NFKBIz (Q9BYH8), p100 (Q9Y6K5), p105 (P55290), CHUK (O15111), inhibitor of NF-kB kinase subunit beta (IKBKB) (O14920), NEMO (Q9Y6K9), and BCL3 (P20749). All the protein–protein interaction (PPI) network data obtained from the databases were derived from validated experiments. The PPI analysis was categorized into four methods, namely, machine learning, domain–domain interactions (DDI) or structural feature methods, data/literature mining, and comparing interologs. A PPI combined score of >0.4 was applied as the reliability threshold for interaction. The Cytoscape software version 3.9.1 was also used to visualize the PPI networks.
Results
Datasets profiling
In this study, a total of 105 samples were normalized using LIMMA to remove gene expression variation among the chosen datasets. The GSE12452 transcriptome dataset consisted of extracted ribonucleic acid (RNA) isolated from laser-captured epithelium NPC tissues (31 samples) and normal nasopharyngeal tissue (10 samples). The GSE13597 dataset had 25 snap-frozen biopsies with histologically confirmed undifferentiated NPC and three nasopharyngeal tissues with no evidence of malignancy. A total of 20 samples containing gene expression from the GSE34573 dataset had 16 NPC and four normal controls. Meanwhile, the GSE64634 dataset consisted of RNA isolated from laser-captured NPC tissues (12 samples) and normal nasopharyngeal tissue (four samples). All the respective datasets’ sample values were normalized prior to gene set analysis, as shown by the median-centered boxplot (Fig.1).

Expression values distribution of (A) GSE12452, (B) GSE13597, (C) GSE34573, and (D) GSE64634. Samples are colored according to the group control or NPC. The horizontal axis represents the sample IDs, and the vertical axis represents the Log2 read counts values.
Differentially expressed genes (DEGs) identification
To identify genes that were differentially expressed between NPC and normal nasopharyngeal cells, we used the statistical tool LIMMA. LIMMA is useful when dealing with various data types and experimental designs while minimizing the occurrence of false positives by applying multiple-testing corrections on p-values. A total of 1,398 DEGs (1,333 upregulated and 65 downregulated) were found in GSE12452, 1,011 DEGs (982 upregulated and 29 downregulated) in GSE13597, 1,770 DEGs (1,754 upregulated and 16 downregulated) in GSE34573, and 1,089 DEGs (1,029 upregulated and 60 downregulated) in GSE64634. These identified DEGs were then used for NF-kB pathway analyses and multiple datasets comparison.
Enrichment and pathway analyses
After identifying the DEGs, we analyzed the NF-kB signaling pathway using eVITTA. The density plot of NF-kB gene sets based on the KEGG NF-kB signaling pathway showed the upregulation of NF-kB signaling pathway genes in the respective datasets (Fig.2). Gene set analysis calculates the enrichment score (ES) based on the correlation of the gene to the phenotype. The ES is the maximum deviation from zero based on the entire ranked gene list. A positive ES means the gene set is ranked highly in correlation to the observed phenotype, while a negative ES indicates otherwise. ES values range from −1 to 1, whereby 0 is a good fit and ±1 is otherwise. Based on the analyses, the ES for the NF-kB signaling pathway was found to be 0.462 (GSE12452), 0.389 (GSE13597), 0.319 (GSE34573), and 0.446 (GSE64634). From our integrated pathway data, we observed that most NF-kB genes were upregulated in all datasets except those from GSE64634 (Fig.3). The downregulation observed in GSE64634 pathway analysis could be due to either the small sample size, cell sensitivity, or decreasing cell quantity. We also investigated the EBV and LMP infection pathways in all the datasets, and their ES was found to be positive. The ES forEBVinfection pathway was 0.544 (GSE12452), 0.481 (GSE13597), 0.447 (GSE34573), and 0.371 (GSE64634), while the ES of LMP1 infection pathway was 0.551 (GSE12452), 0.57 (GSE13597), 0.503 (GSE34573), and 0.464 (GSE64634).

Density plot of NF-kB gene sets for (A) GSE12452, (B) GSE13597, (C) GSE34573, and (D) GSE64634 based on KEGG NF-kB signaling pathway at adjusted p-value < 0.05. The horizontal axis represents the rank score, and the vertical axis represents the counts.

Integrated pathway data of NF-kB signaling pathway for (A) GSE12452, (B) GSE13597, (C) GSE34573, and (D) GSE64634 based on KEGG NF-kB signaling pathway at the scale between −1 and 1. The red represents upregulation, and the blue represents downregulation.
Multiple dataset comparison of NF-kB pathway
Next, we compared the expression levels of the genes in the NF-kB signaling pathway between the NPC and normal nasopharyngeal cells (Table1). We used the Log2 FC metric to describe the magnitude of the expression change of the case/control ratio. The FDR metric used is the adjusted p-value employed for this multiple comparison datasets measurement. As aforementioned, a total of 21 NF-kB genes are involved in the pathway, namely, CHUK, FADD, IKBKB, IKBKG, IL1A, IL1R1, MAP3K1, MAP3K14, MAP3K7, MYD88, NFKB1, NFKBIA, RELA, RIPK1, TAB1, TNF, TNFAIP3, TNFRSF1A, TNFRSF1B, TRADD, and TRAF6 (Biocarta NF-kB pathway). Among these, 11 were identified as differentially expressed in GSE12452, four in the GSE34573, and two each in the GSE13597 and GSE64634. The presence of tumor necrosis factor alpha-induced protein 3 (TNFAIP3) and TNF receptor type 1-associated DEATH domain (TRADD) genes was seen in all except the GSE64634 dataset. Another noteworthy result was that the RELA gene was differentially expressed only in the GSE34573 dataset.
Comparison analysis of NF-kB signaling pathway gene sets in the four datasets
PPI assessment
Previous studies showed that LMP1 can activate the NF-kB signaling pathway. We were interested to understand potential PPIs between LMP1 and proteins in the NF-kB signaling pathway. To do this, we collected evidence from STRING and HVIDB for potential PPI by comparing interologs, predicting DDI, machine learning-based prediction, and data/literature mining. Physical or chemical reactions between the protein molecules and amino acid residues mediated by structural domains are known as DDI. There needs to be at least one known DDI pair presence to suggest DDI evidence. Meanwhile, an interolog refers to a pair of interacting proteins that are identified or present in another species. We found five PPIs between NF-kB and EBV family gene product (Table2). The interacting pairs were IKBKB with LMP1, NFKB1 with BSLF1, NFKB1 with BZLF1, NFKB2 with BRRF1, and RELA with BZLF1. In the case of the BZLF1 pair with NFKB1 and RELA, the interologs and machine learning predictions suggested that they were statistically significant, although the DDI suggested otherwise. In addition, the DEGs post-infection collected from data mining show that PPIs of the interest proteins have more DEGs upregulation compared to downregulation, except for the O14920 (IKBKB) and P03230 (LMP1) PPI.
PPI evidence of NF-kB signaling pathway and EBV family
Discussion
In this study, we analyzed and identified biomarker targets in the NF-kB signaling pathway from four NPC datasets. We used bioinformatics tools to identify genes that were differentially expressed between NPC and normal nasopharyngeal cells. It is always a challenge to discover the biological meaning and interpretation of expression profile data. The current gene set analysis method comes with several limitations in terms of specificity, sensitivity, and reproducibility due to the lack of gold standard datasets.33 In addition, the knowledge to understand exactly how these genes communicate or interact with each other about the phenotype as a group may not be known since they arbitrarily relied on defined gene sets. However, to overcome these disadvantages, we also presented pathway analysis using additional bioinformatic tools.
From our NPC datasets analyses, we found that there were more upregulated DEGs compared to downregulated ones. Specifically, in the context of the NF-kB signaling pathway, all datasets showed upregulation of genes associated with the pathway. The NF-kB signaling pathway is mainly upregulated during cancer development when bioinformatical analyses were performed to identify DEGs and potential pathways in NPC.34 The pathway is also among the top significant ranked genes according to our GSEA findings. Gene set analysis constructed on whole-exome sequencing of patient-derived xenografts and cell lines detected that gene upregulation in EBV-positive cohorts has significantly enriched in NF-κB and phosphatidylinositol 3-kinase (PI3K) signaling pathways.35 In their study, the NF-kB signaling pathway was within the top ranked gene list and had an ES of approximately 0.4, indicating that the pathway may have a functional role in NPC pathogenesis. However, there are also studies that analyzed different GEO datasets using the same bioinformatic tools and found no candidate markers to the ones reported here.36,37 Moving forward, it will be interesting to identify each of the markers found, particularly as a drug target in both invitro and animal studies.
Our multiple dataset comparison for the gene set involving the NF-kB signaling pathway showed consistent upregulation of TNFAIP3 and TRADD. TNFAIP3 is a ubiquitin-editing enzyme that encodes the cytoplasmic zinc finger protein A20. The protein modulates the immune and inflammatory response by inhibiting the activation of NF-kB and TNF-mediated apoptosis pathways.38 The protein was identified as a tumor suppressor protein that ubiquitinates the interferon regulatory factor 7 (IRF7), which regulates innate and adaptive immune responses including virus-inducible genes. Additionally, endogenous IRF7 activity and its inducer LMP1 were markedly increased because of TNFAIP3 expression knockdown in Burkitt Lymphoma P3HR1 cells.39 TNFAIP3 has also been suggested to be a potential target in developing personalized treatment for NPC patients.40 Their analysis revealed that TNFAIP3 had a negative correlation with distant metastasis of the patients, in which low TNFAIP3 level was related to the worse overall survival of NPC patients.
TRADD is an adapter molecule that is connected to the cytoplasmic domain of the TNFRSF1A/TNF receptor 1. TRADD also inhibits the adaptor protein TNF receptor-associated factor 2 (TRAF2), thus suppressing TRAF2-mediated apoptosis signaling. In addition, TRADD facilitates cell death signaling and NF-kB activation.41 Eliopoulos etal.42 provide further evidence for TRADD and TRAF2 roles in the LMP1-activated pathway. They examined the extreme C-terminus of LMP1 protein and identified the protein region essential for associating LMP1 with TRADD and C-terminal activation region 2-facilitated NF-kB pathway. In a recent study, Xu etal.43 successfully showed that targeting TRADD is a promising strategy to treat human diseases. They treated the mouse model with a novel TRADD inhibitor called Apostatin-1 that helps to restore protein homeostasis and blocks apoptosis by activating autophagy in cells with accumulated mutant tau, α-synuclein, or huntingtin.
Our PPI assessment showed that IKBKB, NFKB1, NFKB2, and RELA of NF-kB family protein could potentially interact directly with the EBV family (LMP1, BSLF1, BZLF1, and BRRF1). IKBKB is a vital IKK complex catalytic subunit and has a key role in the canonical NF-kB signaling pathway.44 This protein phosphorylates the inhibitor of the NF-kB complex causing dissociation of the inhibitor and consequently activating the NF-kB. A study done by Guo etal.45 reveals that IKBKB has the potential to predict NPC patients’ clinical responses to radiotherapy. They evaluated single nucleotide polymorphisms associated with the efficacy and acute toxic reactions after radiotherapy in NPC patients and found that IKBKB was related to grades 3–4 acute radiation-induced myelosuppression. Zhou etal.46 further analyzed the NF-κB (p65) and IKBKB clinicopathological significance in NPC patients from Guangdong Province, China. They observed that high mRNA expression of NF-κB/p65 and IKBKB was associated with significantly shorter disease-free survival rates. Therefore, NF-κB/p65 and IKBKB could be potential therapeutic targets.
Apart from IKBKB, NFKB1 could also potentially interact with the EBV family. NFKB1 or NF-kB p105 subunit consists of a heterodimeric complex mediated by a proteasome. This is to conserve their independent function by ensuring the production of p50 and p105 as well as the post-translation process of NFKB1/p105. NFKB1 dual function is also involved in the immune response and acute phase reaction through the generation of p50.47 Homozygous mutations in NFKB1 can lead to a severe clinical presentation of combined immunodeficiency.48 Consequently, this is accompanied by disruption of the canonical NF-κB signaling pathway, irregularities in T and B cell maturation and function as well as in cytokine secretion, specific antibody formation, and cell proliferation.
On the other hand, the NFKB2 or NF-kB p100 subunit is one of the many dimer complexes of NF-kB involved in the noncanonical pathway. The inactive NF-kB complex family is held together in the cytoplasm with the help of NF-kB inhibitors. NFKB2 processes p100 to p52 and also forms the RELB-p52 heterodimeric complex that acts as a transcriptional activator. However, NFKB2 can also produce a p52-p52 homodimer, which performs as a transcriptional repressor.49 NFKB2 heterozygous mutations can implicate the NF-kB pathway signaling in immunoregulation, which causes conditions such as common variable immunodeficiency and DAVID syndrome.50
Meanwhile, RELA is a transcription factor p65 that can form into NF-kB heterodimeric complexes with NFKB1 and REL. RELA interacts with members of the NF-kB inhibitor (IkB) family to hold the inactive NF-kB complex in the cytoplasm. Upon the degradation of NF-kB inhibitors due to different activator’s responses, the active NF-kB complex translocates into the nucleus, resulting in the activation of the canonical pathway. Bettelli etal.51 showed that repressing the transcriptional activity of RELA in nuclear factor activated T-cells and NF-κB proteins can affect cytokine gene expression in T-cells. A recent study found that RELA along with IRF3 and IRF5 is among the key transcription factors that regulate the interferon (IFN) response during severe acute respiratory syndrome coronavirus 2 infection.52
Our PPI findings also predicted that four EBV family proteins, namely, LMP1, BSLF1, BZLF1, and BRRF1, could form interactions with proteins in the NF-kB family. As mentioned earlier, LMP1 is a crucial and established EBV gene that interferes with the NF-kB signaling pathway. However, LMP1 can also interfere with the metabolic signaling pathway to promote NPC tumor cells by enhancing insulin-like growth factor 1 resulting in activation of the mammalian target of rapamycin complex 2 (mTOR-C2)/protein kinase B signaling involved in glucose metabolism.53 Similarly, Mai etal.54 found that LMP1 also has a role in the mTOR signaling pathway when it was found to induce the expression of microRNA and PI3K in NPC cells.
Another important finding to note is the potential interaction between BZLF1 pairing with NFKB1 and RELA. There was interolog and machine learning evidence (Table2) for both NFKB1 and RELA pairing with BZLF1, but their DDI evidence suggested otherwise. PPI annotation data are correlation based on recorded evidence. Hence, it may have a high false-positive rate and varying evidence quality.55 On the other hand, DDI has a lower false-positive rate than PPI, although it lacks coverage, and it is evaluated based on the domain profile correlation.56 Looking into the role of the trans-activator protein BZLF1, it has an important function as an immediate-early protein to induce early lytic cycle genes from EBV latent infection. It activates another EBV gene promoter (BSLF2/BMLF1) and acts as a binding protein for genome replication.57 BZLF1 was demonstrated invitro and invivo to have a physical interaction with NFKB/p65 and inhibits p65 transcriptional function. The p65 transcriptional inhibition induces NF-kB nuclear translocation, which leads to viral latency while negatively regulating BZLF1.58 However, various factors need to be considered between BZLF1 and the host cell environment since the activation of the BZLF1 promoter requires both trans-activating factors and the loss of repressors participation.59
In addition, there was also a statistically significant PPI score between the EBV family of BSLF1 and BRRF1 with the family of NF-kB. BSLF1 is a DNA primase in EBV, while BRRF1 is a transcriptional activator of BZLF1 promoter by enhancing the ability of BRLF1 to induce EBV lytic infection.60 To date not much information can be gathered for their role during EBV infection in NPC except that BRRF1 induces EBV lytic gene expression, subsequently activating the c-Jun N-terminal kinase signaling pathway via TRAF2 and p53.61 Hence, these EBV viral proteins will be of interest for future studies.
Conclusion
To the best of our knowledge, this paper serves as the first NF-kB signaling pathway analysis and potential biomarker identification in NPC. Effective targeted approaches must be developed to address the unique features of the NF-kB signaling pathway for future NPC clinical research. From this study, we observed the NF-kB signaling pathway was mostly upregulated and had a positive ES. In addition, the IKBKB, NFKB1, NFKB2, and RELA of NF-kB family proteins and the EBV family of BSLF1, BZLF1, and BRRF1 could serve as potential targets toward developing more effective chemotherapeutics against NPC.
Footnotes
Acknowledgments
The authors thank the anonymous reviewers of this publication and greatly appreciate the contributions from all the researchers involved in the research study datasets used as material in this article.
