Abstract
Machine learning (ML) is poised to accelerate antibiotic discovery by rapidly identifying and generating compounds with desirable properties. Despite focused effort, algorithmic advances alone have yielded only modest improvements in real-world performance. Greater gains will likely come from improved data acquisition, data representation, and model output interpretation by domain experts. Field-wide efforts in more standardized data curation, benchmarking, and publication practices are also essential to ensure that ML methods reach their full potential to help us efficiently discover new antibiotics to address unmet clinical needs. This review focuses on the data-centric choices necessary to build ML pipelines for antibiotic discovery that are robust, reliable, efficient, and biologically grounded.
