Article

Deep Learning for Virtual Screening: Five Reasons to Use ROC Cost Functions

V. Golkov, A. Becker, D. Plop, D. Čuturilo, N. Davoudi, J. Mendenhall, R. Moretti, J. Meiler, and D. Cremers.
(2020)
DOI: 10.48550/ARXIV.2007.07029

Metadata

BibTeX key: golkov_becker_plop_čuturilo_davoudi_mendenhall_moretti_meiler_cremers_2020
Entry type: article
Year: 2020
Publisher: arXiv
abstractnote: Computer-aided drug discovery is an essential component of modern drug development. Therein, deep learning has become an important tool for rapid screening of billions of molecules in silico for potential hits containing desired chemical features. Despite its importance, substantial challenges persist in training these models, such as severe class imbalance, high decision thresholds, and lack of ground truth labels in some datasets. In this work we argue in favor of directly optimizing the receiver operating characteristic (ROC) in such cases, due to its robustness to class imbalance, its ability to compromise over different decision thresholds, certain freedom to influence the relative weights in this compromise, fidelity to typical benchmarking measures, and equivalence to positive/unlabeled learning. We also propose new training schemes (coherent mini-batch arrangement, and usage of out-of-batch samples) for cost functions based on the ROC, as well as a cost function based on the logAUC metric that facilitates early enrichment (i.e. improves performance at high decision thresholds, as often desired when synthesizing predicted hit compounds). We demonstrate that these approaches outperform standard deep learning approaches on a series of PubChem high-throughput screening datasets that represent realistic and diverse drug discovery campaigns on major drug target families.
DOI: 10.48550/ARXIV.2007.07029
URL: https://arxiv.org/abs/2007.07029

Cite this publication

@article{golkov_becker_plop_čuturilo_davoudi_mendenhall_moretti_meiler_cremers_2020,
  abstractnote = {Computer-aided drug discovery is an essential component of modern drug development. Therein, deep learning has become an important tool for rapid screening of billions of molecules in silico for potential hits containing desired chemical features. Despite its importance, substantial challenges persist in training these models, such as severe class imbalance, high decision thresholds, and lack of ground truth labels in some datasets. In this work we argue in favor of directly optimizing the receiver operating characteristic (ROC) in such cases, due to its robustness to class imbalance, its ability to compromise over different decision thresholds, certain freedom to influence the relative weights in this compromise, fidelity to typical benchmarking measures, and equivalence to positive/unlabeled learning. We also propose new training schemes (coherent mini-batch arrangement, and usage of out-of-batch samples) for cost functions based on the ROC, as well as a cost function based on the logAUC metric that facilitates early enrichment (i.e. improves performance at high decision thresholds, as often desired when synthesizing predicted hit compounds). We demonstrate that these approaches outperform standard deep learning approaches on a series of PubChem high-throughput screening datasets that represent realistic and diverse drug discovery campaigns on major drug target families.},
  added-at = {2024-11-22T16:40:08.000+0100},
  author = {Golkov, Vladimir and Becker, Alexander and Plop, Daniel T. and Čuturilo, Daniel and Davoudi, Neda and Mendenhall, Jeffrey and Moretti, Rocco and Meiler, Jens and Cremers, Daniel},
  biburl = {https://puma.scadsai.uni-leipzig.de/bibtex/23f683b0027c90044ae68e84b4644533a/scadsfct},
  doi = {10.48550/ARXIV.2007.07029},
  interhash = {ffb2748dbab7558d61c61b9d1b49c5b1},
  intrahash = {3f683b0027c90044ae68e84b4644533a},
  keywords = {imported topic_lifescience zno},
  publisher = {arXiv},
  timestamp = {2025-07-29T10:29:15.000+0200},
  title = {Deep Learning for Virtual Screening: Five Reasons to Use ROC Cost Functions},
  url = {https://arxiv.org/abs/2007.07029},
  year = 2020
}
