Abstract
Schistosomiasis, otherwise known as bilharzia or snail fever, is a disease that usually affects poor people and people exposed to poor sanitation. The disease affects over 200 million people worldwide annually. Schistosomiasis has been treated using a single drug, praziquantel, since the 1970s and this is resulting in schistosomes becoming resistant. Therefore, there is an urgent need to develop new antischistosoma drugs and vaccines. This study focuses on identifying potential antischistosomal compounds from the plant
Introduction
Schistosomiasis (also called bilharzia) is a neglected tropical disease with a significant socioeconomic impact and a global public health concern.
1
According to conservative estimates, schistosomiasis affects over 230 million people worldwide.2,3 The disease is caused by trematode flukes of the
The complexity and diversity of natural products offer a wide range of potential compounds for drug discovery.
1
Since ancient times, medicinal plants have been used to treat a variety of diseases and, 80% of the population in developing countries still rely on herbal remedies to meet their primary health care requirements.17,18 To date, several plants are being used in folklore medicine for treatment of schistosomiasis and these include
Identification of drug candidates through computational screening approaches is a promising and cost-effective strategy that plays a vital role in drug design and discovery. 24 To discover new and affordable treatments for schistosomiasis, computer-aided drug design (CADD) approaches have been employed to develop novel schistosomicidal agents.16,25-27
In this study, molecular docking, drug-likeness prediction, toxicity prediction, binding free energy calculations, and molecular dynamics (MD) simulations were applied to identify potential lead candidates from a library of
We employed molecular docking to find potential
Materials and Methods
The complete workflow for the identification of

Schematic representation of the steps of the methodology.
Computational tools
All software were run on an HP 2170p Intel(R) Core(TM) i5-3427U CPU @ 1.80 GHz, 4 GB RAM, and 464 GB hard disk space. Windows 10 Professional was used as the Computer Operating System.
Ligand preparation
Protein preparation
The crystal structure of sulfotransferase in complex with oxamniquine (PDB ID: 4MUB, Resolution: 1.75 Å) was retrieved from Protein Data Bank (PDB) (https://www.rcsb.org/). All ligands were deleted before the protein was prepared using the Dock Prep module of UCSF Chimera v1.17.1 34 (https://www.cgl.ucsf.edu/chimera/). The Dock Prep steps included the deletion of water molecules, adding missing hydrogen atoms, adding polar charges, and completing missing side chains using the Dunbrack 2010 rotamer library.
Molecular docking
Autodock Vina
35
integrated within the PyRx software was used for molecular docking simulations.
Validation of the molecular docking protocol
To validate the molecular docking protocol, we used the re-docking method. Briefly, the pose of the ligand co-crystallised with the receptor protein was compared with the pose of the ligand docked in the active site of this same protein. Then, the RMSD was used to compare the average distance between the ligand subjected to molecular docking and the co-crystallised ligand. In this method of validation, the docking protocol is valid if the RMSD is less than 2.0 Å. 38 The RMSD calculation was done using the BIOVIA Discovery Studio Visualiser (http://accelrys.com/products/collaborative-science/biovia-discovery-studio/) software.
Drug-likeness analysis
The drug-likeness of compounds was evaluated by DataWarrior v5.5.0. 39 The Lipinski Rule of Five parameters (RO5) 40 (molecular weight ⩽ 500, lipophilicity/clogP ⩽ 5, H-Donors ⩽ 5 and H-Acceptors ⩽ 10), which is an empirical rule of thumb was used to filter the compounds for their drug-likeness properties. Compounds that satisfy the RO5 are considered to have good solubility and permeability properties. 41 To remove false positives from the screened library, the Pan Assay Interference Compounds (PAINS) were filtered out using the PAINS Remover 42 tool (https://www.cbligand.org/PAINS).
Toxicity prediction
The ORIS Property Explorer (https://www.organic-chemistry.org/prog/peo/) was used to predict the toxicity (mutagenic, tumorigenic and reproductive effects) and drug score of the compounds that had passed the drug-likeness assessment.
Molecular mechanics/Poisson-Boltzmann (generalised Born) surface area calculations
The webserver, fastDRH 43 (http://cadd.zju.edu.cn/fastdrh/), was used to calculate the free energies of protein-ligand complexes using the molecular mechanics/Poisson-Boltzmann (generalised Born) surface area, MM/PB(GB)SA, methods. The force fields for the receptor and ligands were ff19SB (with OPC water model) 44 and GAFF2, 45 respectively. Binding free energy calculations are considered more accurate than molecular docking as it considers the solvation effect. 46
Molecular dynamics simulation
CABS-flex 2.0 47 was used to evaluate the conformational stability of the top-ranked protein-ligand complexes. CABS-flex offers rapid and accurate protein flexibility simulations (10 ns), which correlate with protein flexibility data generated by the nuclear magnetic resonance.47,48 Simulations were executed with default parameters.
Results and Discussion
Molecular docking
Molecular docking is an established method used in structure-based computer-aided drug discovery. Docking is employed in virtual screening to identify compounds likely to bind to a target protein. This can, therefore, reduce the number of compounds that need to be experimentally tested. In this study, 163
Summary of molecular docking between screened hits and
Validation of molecular docking accuracy
The molecular docking protocol was validated by cross-docking the co-crystallised ligand (sulfotransferase) using the same parameters as those applied for the

Validation of molecular docking by superimposition of
Drug-likeness analysis
Drug-likeness analysis is the calculation of the chances of a molecule becoming an oral drug regarding bioavailabity.
49
Some compounds have good binding affinity with target receptors but, however, fail in clinical trials at advanced stages of drug discovery due to their lack of drug-likeness properties.
50
The desirable properties are derived from the Lipinski RO5 which states that drug-like compounds have Log
The physiochemical properties of screened hits and the reference drug.
Besides the RO5, we also evaluated the screened compounds for PAINS. Compounds that belong to PAINS have promiscuous behaviour as they react nonspecifically with many biological targets rather than specifically affecting 1 desired target.42,52 Thus, compounds with a substructure of PAINS are unsuitable to be lead compounds for drug discovery.53-55 As shown in Table 2, of the 14 compounds that passed the RO5 screen, only 2 compounds (CID_5280445 and CID_5317284) were filtered out. This means that the remaining 12 ligands are not likely to produce false positives in high-throughput screen tests.
Toxicity risk prediction
In silico toxicity analysis is an important step as it helps identify potential toxicities early in the drug discovery process, saving time and resources. 56 In this study, in silico toxicity analysis was carried out for the compounds that passed both the drug-likeness criterion and the PAINS filter (Table 3). Toxicity risk was calculated to determine the mutagenicity, tumorigenicity reproductive effects and dermatological effects (irritants) of the screened compounds. As shown in Table 3, the reference drug and 6 other compounds had no predicted toxic properties. In addition to toxicity assessments, OSIRIS calculated the drug scores of the screened hits. The drug score shows the probability of a compound to be a drug and it considers both the physiochemical and toxicity evaluation. The score value of 1 shows that the compound is a good drug candidate and shows that the compound has no possibility of being a drug. The toxicity and drug score results are shown in Table 3.
Toxicity prediction and drug scores of the screened hits and the reference drug.
Shortlisted compounds
After our rigorous virtual screening, drug-likeness assessments, toxicity tests and PAINS analysis, we remained with 5 possible potential inhibitors of
Shortlist of screened hit compounds that passed the RO5, PAINS filter and the toxicity screen.
Interaction analysis of shortlisted best binders and the reference drug with Sm SULT
LigPlot+ and PyMOL were used to visualise the 2-dimenisional (2D) and 3D interactions of the protein-ligand complexes (Figures 5 to 8) produced from docking with Autodock Vina. The interaction results are summarised in Table 5. Rosmarinic acid (CID5281792) showed the highest binding affinity of −8.8 kcal/mol, forming 4 hydrogen bonds with Lys 137, Val 128, Asn 228, Arg 228, and 11 hydrophobic interactions with SmSULT as calculated by Autodock Vina. This was followed by hispidulin (CID5281628) which gave a binding energy of −8.6 kcal/mol, forming 1 hydrogen bond with Asp 91 and 13 hydrophobic bonds. The other 3 top compounds – gamma-cadinene (CID15094) (−7.8 kcal/mol), (+)-delta-cadinene (CID441005) (−7.7 kcal/mol), (+)-gamma-cadinene (CID6432404) (−7.7 kcal/mol) – did not form any hydrogen bond interactions with side chains. However, the compounds formed 10, 13 and 8 hydrophobic interactions, respectively.

Hydrogen bonding and hydrophobic interactions of rosmarinic acid (CID5281792) with

Hydrogen bonding and hydrophobic interactions of hispidulin (CID5281628) with

Hydrophobic interactions of gamma-cadinene (CID15094) with

Hydrophobic interactions of (+)-delta-cadinene (CID441005) with

Hydrophobic interactions of (+)-gamma-cadinene CID6432404 with

Hydrogen bonding and hydrophobic interactions of the reference bond with
Protein-ligand binding energy and interactions generated from docking with PyRx.
The intermolecular hydrogen bonding between the protein and ligand plays an important role in stabilising the protein-inhibitor complex.
57
Therefore, hydrogen bonds are an important parameter to consider in computer-aided drug development. Ultimately, a combination of hydrogen bonds and many hydrophobic interactions are likely to result in a more desirable drug candidate. Our results showed that the top 2 high scoring compounds and the reference drug had extensive hydrophobic interactions with the
Based on the interaction analysis results, we selected rosmarinic acid (CID5281792) and hispidulin (CID5281628) for further in silico analysis to calculate the binding free energy scores and stability of their respective docked complexes.
Molecular mechanics/Poisson-Boltzmann (generalised Born) surface area calculations
Molecular mechanics/Poisson-Boltzmann (generalised Born) surface area using fastDRH was used to calculate the binding free energy of selected docked complexes as relying only on the docking score does not adequately predict the binding affinity between the protein and the ligand. 58 As shown in Table 6, we found all complexes to have significant binding energy scores. In all the procedures used, rosmarinic acid had the highest negative binding free energy, followed by the reference ligand and then finally hispidulin. The negative binding energy scores shown for all the docked complexes suggest that the complexes are stable and warrant further investigations.
Binding free energy for the hit compounds and the reference drug in complex with 4MUB.
Molecular dynamics simulations
Unlike molecular docking that treats the receptor and the ligand as rigid molecules, MD simulations take into account the conformational changes of the receptor and the ligand, mimicking the real physiological conditions in the cell.
59
We, therefore, used CABS-flex 2.0 to evaluation of the root mean square fluctuation (RMSF) of the top-ranked protein-ligand complexes. Thus, the RMSF values of rosmarinic acid (CID5281792) and hispidulin (CID5281628) were compared with the reference drug. The RMSF values calculate the fluctuation of each amino acid residue as it interacts with the ligand throughout the simulation. The higher the RMSF value, the more flexible are the residues whereas low RMSF value indicates limited movements. For all the complexes studied, we observed that the RMSF variation for all the complexes was minimal, indicating less variability in the protein structure (Figure 9). For the reference drug, the minimum, maximum and average RMSF values of the

RMSF plots of rosmarinic acid (CID5281792), hispidulin (CID5281628) and the reference drug with the
More importantly, the active pocket amino acid residue that interacted with the ligands during docking remained highly stable in the presence of both the screened compounds and the reference drug. Based on these results, we conclude that the protein-ligand complexes evaluated were stable in the physiological condition. Therefore, both rosmarinic acid and hispidulin are potentially potent inhibitors for

2D structures of rosmarinic acid and hispidulin.
Conclusions
Praziquantel has been the only drug used for treating schistosomiasis, and it is slowly becoming less effective because of resistance of some schistosomes. Therefore, there is a need for immediate development of schistosomiasis drugs to prevent a potential health crisis. In this study, we employed various computational methods to identify novel
Footnotes
Funding:
The author(s) received no financial support for the research, authorship and/or publication of this article.
Declaration of conflicting interests:
The author(s) declared no potential conflicts of interest with respect to the research, authorship and/or publication of this article.
Author Contributions
RS conceived and designed this study. All authors contributed to the primary investigation, formal analysis, data interpretation and to the writing of the manuscript. RS supervised the study. All authors have read and agreed to submit the final version of the manuscript.
Ethics approval
Ethics approval was not required for this study.
