Machine Learning - driven insights for predicting the impact of nanoparticles on the functionality of biomolecules, Illustrated by the case of DNA Damage-Inducible Transcript 3 (CHOP) inhibitors

Ivanova, Mariya, Russo, Nicola, Mihaylov, Gueorgui and Konstantin, Nikolic ORCID: https://orcid.org/0000-0002-6551-2977 (2025) Machine Learning - driven insights for predicting the impact of nanoparticles on the functionality of biomolecules, Illustrated by the case of DNA Damage-Inducible Transcript 3 (CHOP) inhibitors. IEEE Transactions on Pattern Analysis and Machine Intelligence. ISSN 0162-8828 (Submitted)

Machine Learning - driven insights for predicting the impact of nanoparticles on the functionality of biomolecules_IvanovaM_accessible.pdf - Submitted Version
Restricted to Repository staff only

Appendices_ Machine Learning - driven insights for predicting _Appendices_IvanovaM.pdf - Supplemental Material
Restricted to Repository staff only


Abstract

This study contributes to ongoing research on overcoming the challenges of predicting the bio-applicability of nanoparticles (NPs). The approach explored various combinations of nuclear magnetic resonance (NMR) spectroscopy data, derived from simplified molecular-input line-entry system (SMILES) notations, with small-biomolecule features. The resulting datasets were used for machine learning (ML) with scikit-learn and for deep neural networks (DNN) with PyTorch. Despite the obstacles in predicting how NPs influence biomolecule functionality, the methodology was argued to apply to compounds both with and without NPs. It was illustrated through a quantitative high-throughput screening (qHTS) aimed at finding DNA Damage-Inducible Transcript 3 (CHOP) inhibitors. On these data, the best ML performance was achieved by the Random Forest Classifier, which was trained on 19,184 samples and tested on 4,000 samples, yielding 81.1% accuracy, 83.4% precision, 77.7% recall, 80.4% F1-score, 81.1% ROC, and a five-fold cross-validation score of 0.821. Complementing the main study, two computational approaches were developed to enhance CHOP inhibitor prediction. The first identifies the functional groups that are most and least desirable for CHOP inhibition. The second, a CID_SID ML model, achieved 90.1% accuracy in predicting whether compounds designed for other purposes have CHOP inhibition potential.
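
As an illustration of the kind of evaluation the abstract reports (accuracy, precision, recall, F1-score, ROC and five-fold cross-validation for a Random Forest Classifier in scikit-learn), the following is a minimal sketch only. It runs on synthetic data from `make_classification`, not on the qHTS CHOP dataset; the function name `evaluate_random_forest`, the hyperparameters, the split ratio, the use of probability-based ROC AUC, and the choice to cross-validate on the training split are all assumptions, not details taken from the paper.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import (
    accuracy_score,
    f1_score,
    precision_score,
    recall_score,
    roc_auc_score,
)
from sklearn.model_selection import cross_val_score, train_test_split


def evaluate_random_forest(X, y, test_size=0.2, seed=0):
    """Hypothetical helper: train a random forest and report common metrics."""
    # Stratified hold-out split (the ratio is an assumption, not the paper's).
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, stratify=y, random_state=seed
    )

    clf = RandomForestClassifier(n_estimators=100, random_state=seed)
    clf.fit(X_train, y_train)

    y_pred = clf.predict(X_test)
    # Probability of the positive class, used here for ROC AUC (an assumption).
    y_score = clf.predict_proba(X_test)[:, 1]

    # Five-fold cross-validation accuracy on the training split (an assumption).
    cv_scores = cross_val_score(
        RandomForestClassifier(n_estimators=100, random_state=seed),
        X_train,
        y_train,
        cv=5,
        scoring="accuracy",
    )

    return {
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision_score(y_test, y_pred),
        "recall": recall_score(y_test, y_pred),
        "f1": f1_score(y_test, y_pred),
        "roc_auc": roc_auc_score(y_test, y_score),
        "cv_mean": cv_scores.mean(),
    }


# Synthetic binary classification data standing in for the real feature table.
X, y = make_classification(
    n_samples=1000, n_features=20, n_informative=8, random_state=0
)
metrics = evaluate_random_forest(X, y)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With real data, `X` would hold the NMR-derived and biomolecule features and `y` the active/inactive labels; the numbers printed for the synthetic data have no relation to the figures quoted in the abstract.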

Item Type: Article
Additional Information: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.
Keywords: scikit-learn, PyTorch, SMILES, NMR, CID_SID ML model
Subjects: Computing
Depositing User: Mariya Ivanova
Date Deposited: 16 Sep 2025 14:13
Last Modified: 16 Sep 2025 14:15
URI: https://repository.uwl.ac.uk/id/eprint/14069
