Abstract
Drug-resistant tuberculosis (TB), which results mainly from the selection of naturally resistant strains of Mycobacterium tuberculosis (MTB) due to mismanaged treatment, poses a severe challenge to the global control of TB. Therefore, screening for novel and unique drug targets against this pathogen is urgently needed. The metabolic pathways of Homo sapiens and MTB were compared using the Kyoto Encyclopedia of Genes and Genomes tool, and the proteins involved in the metabolic pathways of MTB were filtered by subtraction and then subjected to protein-protein interaction network analysis, subcellular localization, druggability testing, and gene ontology analysis. The study aims to identify enzymes in the unique pathways for further screening to determine their feasibility as therapeutic targets. The qualitative characteristics of 28 proteins identified as drug target candidates were studied. The results showed that 12 were cytoplasmic, 2 were extracellular, 12 were transmembrane, and 3 were unknown. Furthermore, druggability analysis revealed 14 druggable proteins, of which 12 were novel and responsible for MTB peptidoglycan and lysine biosynthesis. The novel targets obtained in this study could be used to develop antimicrobial treatments against this pathogen. Future studies should clarify how these targets can be translated into clinical antimicrobial therapies against MTB.
Introduction
Tuberculosis (TB), which ranges from asymptomatic infection to fatal disease, is an airborne infectious disease caused by Mycobacterium tuberculosis (MTB). It is estimated that this pathogen has infected one-third of the world’s population, and more than 250 people die of TB daily. 1 The emergence of multidrug-resistant tuberculosis (MDR-TB) has exacerbated the situation, making the disease a global priority.
The conventional drug discovery and development approach relies on methods that are expensive, time-consuming, and complex, and that uncover only a small number of potential targets. 2 In contrast, computational approaches leveraging omics data analyses have been widely used in the pharmaceutical field to accelerate drug discovery with lower failure rates in clinical trials.3,4 The search for potential pharmacological targets has become easier now that sequenced human and pathogen genomes are available in public databases.
The current methods for discovering therapeutic targets are largely based on non-homologous enzymes, chokepoint enzymes, critical genes unique to a pathogen, and genes linked to resistance and virulence. 5 Previous reports have described the metabolic pathways and protein-protein interactions (PPI) of MTB.6-8 However, a deep gene ontology (GO) analysis to identify the putative targets has not been performed. Such an analysis is essential because GO investigation paves the way for identifying significant features, such as molecular function (MF) and cellular processes. 9
In this study, the metabolic pathways of the pathogen and the host were compared to discover enzymes critical to the survival of MTB. The procedure started with retrieving host and pathogen metabolic pathways from the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. Subsequently, the pathways specific to the pathogen were identified, and the enzymes extracted from these distinctive pathways were submitted to an online platform to identify those that are essential. The essential enzymes were then evaluated as therapeutic targets on the basis of their similarity to known drug targets (DT) in DrugBank, their cellular localization, and GO analysis.
Materials and Methods
Comparative analysis of host and pathogen metabolic pathways
The metabolic pathways for the host (Homo sapiens, KEGG ID: T01001) and the pathogen (MTB H37Rv, a strain sensitive to anti-tuberculosis drugs, KEGG ID: T00015) were collected from the KEGG database (accessed on April 8, 2022) (https://www.genome.jp/kegg/). 10 The pathways were compared to identify those present only in MTB. Furthermore, the enzymes in the shared and unique pathways of MTB were extracted from KEGG, and the protein sequences were retrieved in FASTA format from the NCBI (National Center for Biotechnology Information) database (Figure 1).

Figure 1. A schematic representation of the methodology.
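The pathway comparison described above amounts to a set difference on KEGG pathway map numbers. The following is a minimal sketch, assuming pathway identifiers carry an organism prefix followed by a five-digit map number (for example `hsa00010` and `mtu00010`); the function names and the short input lists are hypothetical and serve only as an illustration, not as the retrieval procedure used in this study.

```python
def pathway_map_id(kegg_id: str) -> str:
    """Strip the organism prefix, e.g. 'mtu00550' -> '00550'."""
    return kegg_id[-5:]


def unique_to_pathogen(pathogen_ids, host_ids):
    """Return pathogen pathways whose map number is absent from the host."""
    host_maps = {pathway_map_id(p) for p in host_ids}
    return sorted(p for p in pathogen_ids if pathway_map_id(p) not in host_maps)


# Toy input (assumption): a handful of pathway IDs, not the full KEGG lists.
host = ["hsa00010", "hsa00020", "hsa00310"]
pathogen = ["mtu00010", "mtu00020", "mtu00300", "mtu00550"]

print(unique_to_pathogen(pathogen, host))
```

With the full pathway lists as input, the same comparison would yield the set of pathogen-specific pathways from which enzymes are then extracted.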
Identification of non-homologous essential proteins
The protein sequences were submitted to Geptop 2.0 (http://guolab.whu.edu.cn/geptop/) to assess their essentiality in MTB. 11 The web server predicts essential genes of MTB by comparing their orthology and phylogeny with an essential gene database and with differentially expressed genes. The predicted essential proteins were then searched against the human RefSeq protein database using NCBI-BLASTP (https://blast.ncbi.nlm.nih.gov/) to assess non-homology. 12 Proteins with sequence identity below 35% at an E-value cut-off of 0.005 were selected as non-homologous to the host. 13
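The non-homology filter can be sketched as a pass over tabular BLASTP output. The example below assumes the standard 12-column tabular format (query, subject, percent identity, ..., E-value, bit score) and assumes one particular reading of the thresholds: a protein is treated as homologous only if it has at least one human hit with identity of 35% or more and an E-value of 0.005 or less. The function name and the toy records are hypothetical.

```python
import csv
import io

IDENTITY_CUTOFF = 35.0   # percent identity threshold from the text
EVALUE_CUTOFF = 0.005    # E-value threshold from the text


def non_homologous(query_ids, blast_tsv):
    """Return queries without a significant, similar hit in the host proteome."""
    homologous = set()
    for row in csv.reader(io.StringIO(blast_tsv), delimiter="\t"):
        if not row:
            continue
        # Tabular columns: 0 = query ID, 2 = percent identity, 10 = E-value.
        query, pident, evalue = row[0], float(row[2]), float(row[10])
        if evalue <= EVALUE_CUTOFF and pident >= IDENTITY_CUTOFF:
            homologous.add(query)
    # Queries with no hits at all never enter the set and are kept.
    return [q for q in query_ids if q not in homologous]


# Toy BLASTP records (assumption): protA has a strong human hit, protB a weak one.
toy_hits = (
    "protA\thumX\t62.1\t300\t100\t2\t1\t300\t5\t304\t1e-80\t250\n"
    "protB\thumY\t28.4\t120\t80\t3\t10\t130\t20\t140\t0.002\t40\n"
)

print(non_homologous(["protA", "protB", "protC"], toy_hits))
```

Treating proteins with no hit at all as non-homologous is a design choice of this sketch; a stricter pipeline might handle that case separately.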
PPI network analysis
The PPI network of non-homologous proteins was built using STRING (https://string-db.org/) in Cytoscape v.3.9.0 (https://cytoscape.org/). 14 The network data were examined with the NetworkAnalyzer module. 15 Furthermore, the functional modules of non-homologous proteins were detected using the Cytoscape plugin MCODE. The MCODE parameters were a degree cut-off of 2, maximum depth of 100, k-core of 2, and node score cut-off of 0.2. 16 The top-ranked module was selected as the possible metabolic functional association of the interacting proteins and was carried forward for further analysis.
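To illustrate what the k-core setting means, the sketch below prunes a network down to its k-core, that is, the largest subgraph in which every node keeps at least k neighbors. This is only the graph-theoretic idea behind the parameter, written in plain Python under the assumption of an undirected edge list; it is not the MCODE algorithm itself, which additionally scores and expands clusters. The node names are hypothetical.

```python
from collections import defaultdict


def k_core(edges, k=2):
    """Return the node set of the k-core of an undirected graph."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)
    adj = dict(adj)

    # Repeatedly remove nodes with fewer than k remaining neighbors.
    changed = True
    while changed:
        changed = False
        low_degree = [n for n, nbrs in adj.items() if len(nbrs) < k]
        for node in low_degree:
            for nbr in adj.pop(node):
                adj[nbr].discard(node)
            changed = True
    return set(adj)


# Toy network (assumption): a triangle p1-p2-p3 with a pendant node p4.
toy_edges = [("p1", "p2"), ("p2", "p3"), ("p1", "p3"), ("p3", "p4")]
print(sorted(k_core(toy_edges, k=2)))
```

With k = 2, singly connected proteins such as `p4` are dropped, which is why a k-core of 2 favors densely interconnected modules.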
Subcellular localization and identification of novel drug targets through PPI network analysis
Subcellular localization of the essential non-human proteins selected from network analysis was predicted by PSORTb v3.0.2 (https://www.psort.org/psortb/) 17 and CELLO v2.5 (http://cello.life.nctu.edu.tw/cello2go/). 18 Transmembrane proteins were identified by TMHMM-2.0 (https://services.healthtech.dtu.dk/service.php?TMHMM-2.0), which is based on a hidden Markov model. 19 The most probable topology of a membrane protein was determined using the N-best algorithm. Proteins whose predicted transmembrane helices lay within the first 50 amino acid residues from the N terminus were flagged as possible signal peptides. Furthermore, when the predicted cleavage site probability was >0.5, the signal peptide was cleaved off, and the prediction was repeated. 19 The proteins selected as novel drug targets were cytoplasmic and transmembrane. 20 The DrugBank database (https://go.drugbank.com/) was used to identify novel targets with an E-value of less than 10⁻⁵, sequence identity greater than 35%, and a score greater than 100. 21
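The DrugBank similarity thresholds can be expressed as a simple classification rule. The sketch below assumes that a protein counts as similar to a known drug target when at least one DrugBank hit satisfies all three thresholds (E-value below 10⁻⁵, identity above 35%, score above 100), and that a protein with no such hit is treated as a novel candidate. The `Hit` record, the function names, and the example values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    evalue: float    # BLAST E-value of the DrugBank match
    identity: float  # percent sequence identity
    score: float     # alignment score


def matches_known_target(hits, max_evalue=1e-5, min_identity=35.0, min_score=100.0):
    """True if any hit passes all three thresholds simultaneously."""
    return any(
        h.evalue < max_evalue and h.identity > min_identity and h.score > min_score
        for h in hits
    )


def classify(hits):
    """Label a protein by its similarity to known drug targets."""
    return "druggable" if matches_known_target(hits) else "novel"


# Toy hits (assumption): one strong match, one weak match, and no match.
print(classify([Hit(1e-30, 48.0, 210.0)]))
print(classify([Hit(1e-3, 30.0, 45.0)]))
print(classify([]))
```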
Functional enrichment analysis
The Database for Annotation, Visualization, and Integrated Discovery (DAVID) v.6.8 (https://david.ncifcrf.gov/) 22 was used to perform GO and KEGG enrichment analysis to investigate the functional annotation and pathways of the novel drug targets. The complete list of selected proteins was submitted to GO enrichment analysis under the categories of cellular component (CC), biological process (BP), and MF, together with KEGG pathway analysis. The significance threshold was set at a P value <.05.
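The statistical idea behind term enrichment can be sketched with a one-sided hypergeometric test: given a background of N genes of which K carry a term, it gives the probability of seeing at least k annotated genes in a list of n. This is a simplified illustration using only the standard library; DAVID reports a modified Fisher exact P value (the EASE score), so the numbers from this sketch would not match DAVID output exactly. All counts in the usage example are hypothetical.

```python
from math import comb


def hypergeom_pvalue(k, n, K, N):
    """P(X >= k) for X annotated genes in a list of n drawn from N, K annotated."""
    upper = min(n, K)
    favorable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favorable / comb(N, n)


# Hypothetical counts: 8 of 28 listed proteins carry a term that annotates
# 40 of 4000 background genes.
p_value = hypergeom_pvalue(k=8, n=28, K=40, N=4000)
print(f"enrichment P value: {p_value:.3e}")
print("significant at P < .05:", p_value < 0.05)
```

Because many terms are tested at once, a multiple-testing correction such as the false discovery rate is normally applied on top of the raw P values.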
Results
Metabolic pathway analysis and identification of essential proteins
A total of 345 and 131 metabolic pathways of H sapiens (S1) and MTB (S2), respectively, were extracted from the KEGG pathway database. Of the 131 MTB pathways, 43 were unique and comprised 548 enzymes. The essentiality of these enzymes for the pathogen was analyzed using Geptop 2.0, and 313 of them (S3) were predicted to be essential. NCBI-BLASTP was then performed to assess the homology of these enzymes with human proteins. Of the 313 essential proteins, 197 (S4) were identified as non-homologous to human proteins.
PPI analysis
According to STRING analysis, the network built from the 197 essential proteins of MTB comprised 192 nodes and 1154 edges (S5). The PPI data imported from STRING into Cytoscape were further analyzed with the MCODE plugin to explore the significance of proteins in the network and to identify the primary cluster (S6). This step was taken because of the complexity of the network. The top-ranked cluster, which had the lowest P value, comprised 29 nodes and 194 edges and was selected as the possible metabolic functional association between the identified proteins (Figure 2).

Figure 2. Protein-protein interaction of 29 proteins from non-host essential proteins from Mycobacterium tuberculosis.
GO enrichment analysis
Because 1 protein-coding gene (uppP) could not be mapped in the MCODE, 28 proteins were used for further analysis. Gene ontology enrichment analysis was performed on these 28 proteins (S7) to explore their underlying mechanisms in MTB using the DAVID tool (Figure 3). There were 13 GO enrichment terms for BP. The top 5 enriched terms were cell cycle, cell division, regulation of cell shape, peptidoglycan biosynthetic process, and cell wall organization (false discovery rate [FDR] = 0.0001). Only 2 CC terms were obtained from GO enrichment: cytoplasm and integral component of membrane. There were 5 GO terms for MF enrichment, and the most enriched (FDR < 0.0001) were "transferase activity, transferring glycosyl groups" and "arabinosyltransferase activity." Moreover, KEGG analysis revealed 3 pathways associated with the respective proteins: vancomycin resistance, lysine biosynthesis, and peptidoglycan biosynthesis.

Figure 3. Gene ontology analysis of 28 proteins from non-host essential proteins from MTB. Analysis of the proteins under the headings of biological process, molecular function, cellular component, and pathways. The x-axis shows significantly enriched categories of the proteins, and the y-axis shows the terms (P < .0001).
Prediction of subcellular localization and identification of novel drug targets
The subcellular localization of the 28 proteins revealed that 12 were cytoplasmic, 2 were extracellular, and 13 were transmembrane; 1 protein was excluded because it did not fulfill the requirements (number of predicted transmembrane helix <50 amino acid residues and total probability of peptide cleavage >0.5). 19 Subsequently, the candidate targets were queried against the DrugBank database. Proteins showing no matching hits against the DrugBank database at the threshold were nominated as novel drug targets. The results showed that 12 proteins were involved exclusively in pathogen-specific pathways, including peptidoglycan and lysine biosynthesis (Table 1).
Table 1. List of proteins selected as novel drug targets.
Abbreviation: GlmU, N-acetylglucosamine-1-phosphate uridyltransferase.
Discussion
This study focused on subtractive genome analysis, which identified proteins that could serve as prospective drug targets against the pathogenicity of MTB. A drug directed at a non-homologous target is expected to act specifically on the pathogen without affecting the host’s biology. Twelve unique proteins from MTB were proposed, of which 4 are cytoplasmic and 8 are transmembrane. These 12 non-homologous proteins belong to different pathways, such as arabinogalactan biosynthesis, lipoarabinomannan (LAM) biosynthesis, lysine biosynthesis pathway, and O-antigen nucleotide sugar biosynthesis.
Kushwaha et al used a similar strategy to identify therapeutic targets in MTB and identified 18 prospective drug targets using metabolic pathway and chokepoint analysis. 7 However, the present study is more refined because GO and non-homology analysis, as well as druggability, functionality, essentiality, and cellular localization, were included. These analyses provided detailed information about the drug targets, such as their location in the cell.
Phosphatidylinositol mannoside acyltransferase (patA) is an essential enzyme involved in the biosynthesis of phosphatidyl-myo-inositol mannosides (PIMs), which are vital components of the glycolipids/lipoglycans of the mycobacterial cell envelope. 23 Phosphatidyl-myo-inositol mannosides are an important virulence factor during MTB infection, and their biosynthesis has been shown to be important for both in vitro and in vivo growth.24,25 Defects in these proteins do not directly affect the life of the pathogenic bacteria but reduce the integrity of the cell wall. Therefore, these results suggest that patA is a promising drug target candidate.25,26
Galactan 5-O-arabinofuranosyltransferase, terminal beta-(1→2) arabinofuranosyltransferase, and decaprenylphosphate N-acetylglucosamine phosphotransferase are involved in the biosynthesis of arabinogalactan and LAM, which are essential components of the MTB cell wall. The mycobacterial cell wall is the most frequently adopted target for anti-TB drugs due to the fundamental nature of its synthesis and assembly. 26 This intricate structure, which consists of 3 separate layers of peptidoglycan, arabinogalactan, and mycolic acid, enhances cell proliferation, virulence, and antibiotic resistance. 27 Targeting the enzymes that synthesize and assemble the arabinan domains of arabinogalactan and LAM presents opportunities for new therapies.
Diaminopimelate epimerase is an important enzyme for the biosynthesis of lysine, a significant component of the bacterial peptidoglycan cell wall. 28 It plays an essential role in the bacterial lysine biosynthesis pathway by converting LL-DAP into meso-DAP. The products of this pathway, namely, meso-DAP and
N-acetylglucosamine-1-phosphate uridyltransferase (GlmU) is a bifunctional enzyme with uridyltransferase and acetyltransferase activities catalyzed by the N-terminal and C-terminal domains, respectively. 30 It plays a crucial role in synthesizing UDP-N-acetylglucosamine, a fundamental precursor of the cell wall peptidoglycan of MTB.31,32 A study by Soni et al showed that GlmU depletion led to decreased MTB survival. 30 TPSA, a GlmU inhibitor, was reported to impair the cell wall and membrane integrity of MTB. 33 Therefore, this bifunctional enzyme could be a promising target for new TB drugs. 31
This study has some limitations. Several drug targets were identified, but not all of them have demonstrated pharmacological activity; hence, some may prove to be undruggable. In addition, functional studies and clinical trials are still required to confirm the safety and efficacy of drugs directed at these targets.
Conclusions
The use of complete genome sequences and computer-aided analysis to discover potential anti-TB drug targets has become a new trend. This study performed a comparative metabolic pathway analysis to identify probable anti-TB targets. The results highlight an innovative method to discover therapeutic targets for treating MTB infection. Twelve novel drug targets were reported in this research; they are involved in different pathways, including arabinogalactan biosynthesis, LAM biosynthesis, lysine biosynthesis pathway, and O-antigen nucleotide sugar biosynthesis. These results can be further exploited for rational drug design against MTB.
Supplemental Material
Supplemental material, sj-xlsx-1-bbi-10.1177_11779322231171774 for Bioinformatics Analysis to Uncover the Potential Drug Targets Responsible for Mycobacterium tuberculosis Peptidoglycan and Lysine Biosynthesis by Dian Ayu Eka Pitaloka, Afifah Izzati, Siti Rafa Amirah, Luqman Abdan Syakuran, Lalu Muhammad Irham, Athika Darumas Putri and Wirawan Adikusuma in Bioinformatics and Biology Insights
Footnotes
Acknowledgements
The authors are grateful to Lidya Chaidir for the helpful discussion and to Rizky Abdulah for the remarkable support.
Author Contributions
D.A.E.P. was involved in conceptualization, methodology, writing the original draft, and supervision. A.I., S.R.A., and L.A.S. collected data, analyzed, and wrote the original draft. L.M.I., A.D.P., and W.A. wrote, revised, and edited the manuscript. All authors contributed to manuscript revision, and read and approved the submitted version.
Declaration of conflicting interests:
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding:
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This study was funded by Directorate of Research and Community Service Universitas Padjadjaran (grant number 2203/UN6.3.1/PT.00/2022).
Supplemental Material
Supplemental material for this article is available online.
References