Abstract
Background:
Machine learning models that predict phage–host interactions and guide phage design require comprehensive training data on phage genomes and their host ranges.
Materials and Methods:
This study characterizes phage NRG-P0074 (viral sample RU1), an unclassified member of the genus Mosigvirus originally isolated by Betty Kutter. The complete genome of NRG-P0074 was sequenced, annotated, and analyzed using various bioinformatic tools. Host range analysis was conducted using the Escherichia coli Reference (ECOR) Library and nine E. coli K12 strains from the Keio Knockout Collection, each carrying a single nonessential gene deletion.
Results:
The genome of NRG-P0074 spans 168,357 base pairs with a guanine-cytosine (GC) content of 37.5%. Of the ECOR isolates, 15.28% were permissive to NRG-P0074, as were all 9 Keio knockout strains. Comparative genomic analysis revealed that NRG-P0074 is closely related to E. coli phage a20. Its genome comprises 270 coding sequences (153 known genes and 117 hypothetical proteins), 16 terminators, 3 ribosomal-binding sites, and no tRNAs.
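The summary statistics above can be checked with a few lines of arithmetic. The following is a minimal sketch, not the authors' pipeline: the helper names `gc_content` and `percent` are hypothetical, the sequence is a toy string rather than the NRG-P0074 genome, and the figure of 11 permissive isolates out of 72 is an assumption inferred from the reported 15.28% and the standard size of the ECOR collection.

```python
def gc_content(seq: str) -> float:
    """Return the percentage of G and C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    total = gc + seq.count("A") + seq.count("T")
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * gc / total


def percent(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Toy sequence for illustration only (not the phage genome).
toy_gc = gc_content("ATGCATAT")

# Annotation tallies as reported: known genes plus hypothetical
# proteins should account for every coding sequence.
known_genes, hypothetical_proteins, coding_sequences = 153, 117, 270
tallies_consistent = known_genes + hypothetical_proteins == coding_sequences

# Assumption: 11 permissive isolates out of the 72 in the ECOR collection,
# inferred from the reported percentage rather than stated in the abstract.
ecor_permissive_pct = round(percent(11, 72), 2)
```

Under these assumptions the tallies are internally consistent, and 11 of 72 isolates reproduces the reported percentage to two decimal places.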
Conclusions:
This research provides valuable data for developing machine learning models to predict phage–host interactions, aiding the development of targeted phage therapies against antibiotic-resistant bacteria.
