Machine Learning - driven insights for predicting the impact of nanoparticles on the functionality of biomolecules, Illustrated by the case of DNA Damage-Inducible Transcript 3 (CHOP) inhibitors

Ivanova, Mariya; Russo, Nicola; Mihaylov, Gueorgui; Nikolic, Konstantin

Ivanova, Mariya, Russo, Nicola, Mihaylov, Gueorgui and Nikolic, Konstantin ORCID: https://orcid.org/0000-0002-6551-2977 (2025) Machine Learning - driven insights for predicting the impact of nanoparticles on the functionality of biomolecules, Illustrated by the case of DNA Damage-Inducible Transcript 3 (CHOP) inhibitors. IEEE Transactions on Pattern Analysis and Machine Intelligence. ISSN 0162-8828 (Submitted)

Abstract

This study contributes to ongoing research on predicting the bio-applicability of nanoparticles (NPs). The approach explored combinations of nuclear magnetic resonance (NMR) spectroscopy data, derived from Simplified Molecular Input Line Entry System (SMILES) notations, with small-biomolecule features. The resulting datasets were used for machine learning (ML) with scikit-learn and for deep neural networks (DNNs) with PyTorch. Although predicting how NPs influence biomolecule functionality remains difficult, the methodology is argued to be applicable to compounds both with and without NPs. It was illustrated on a quantitative high-throughput screening (qHTS) assay aimed at finding DNA Damage-Inducible Transcript 3 (CHOP) inhibitors. On these data, the best ML performance was achieved by the Random Forest Classifier, which was trained on 19,184 samples and tested on 4,000, giving 81.1% accuracy, 83.4% precision, 77.7% recall, an 80.4% F1-score, 81.1% ROC, and a five-fold cross-validation score of 0.821. Complementing the main study, two computational approaches were developed to enhance CHOP inhibitor prediction. The first identifies the most desirable and the most undesirable functional groups for CHOP inhibition. The second, a CID_SID ML model, achieved 90.1% accuracy in predicting whether compounds designed for other purposes have CHOP inhibition potential.
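The evaluation reported in the abstract (a held-out test split, five classification metrics, and five-fold cross-validation) can be sketched with scikit-learn as follows. This is only an illustrative sketch and not the authors' pipeline: the data are synthetic stand-ins generated with make_classification, the helper names evaluate_random_forest and f1_from_precision_recall are hypothetical, the hyperparameters are library defaults, "ROC" is assumed to mean the area under the ROC curve computed from predicted probabilities, and the cross-validation score is assumed to be mean accuracy.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score, roc_auc_score)
from sklearn.model_selection import cross_val_score, train_test_split


def f1_from_precision_recall(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def evaluate_random_forest(X, y, test_size=0.2, seed=0):
    """Train a Random Forest and report metrics like those in the abstract.

    Hypothetical helper: split sizes and hyperparameters are assumptions,
    not values taken from the study.
    """
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, stratify=y, random_state=seed)

    model = RandomForestClassifier(n_estimators=100, random_state=seed)
    model.fit(X_train, y_train)

    y_pred = model.predict(X_test)               # hard class labels
    y_score = model.predict_proba(X_test)[:, 1]  # probability of class 1

    # Five-fold cross-validation on the training portion (mean accuracy).
    cv_scores = cross_val_score(
        RandomForestClassifier(n_estimators=100, random_state=seed),
        X_train, y_train, cv=5)

    return {
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision_score(y_test, y_pred),
        "recall": recall_score(y_test, y_pred),
        "f1": f1_score(y_test, y_pred),
        "roc_auc": roc_auc_score(y_test, y_score),
        "cv_accuracy": cv_scores.mean(),
    }


# Synthetic binary data standing in for the NMR-derived feature table.
X, y = make_classification(n_samples=1000, n_features=20,
                           n_informative=8, random_state=0)
metrics = evaluate_random_forest(X, y)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

As a consistency check on the reported figures, the harmonic mean of 83.4% precision and 77.7% recall, f1_from_precision_recall(0.834, 0.777), rounds to 0.804, which matches the 80.4% F1-score stated above.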

Item Type:	Article
Additional Information:	This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.
Keywords:	scikit-learn, PyTorch, SMILES, NMR, CID_SID ML model
Subjects:	Computing
Date Deposited:	16 Sep 2025
URI:	https://repository.uwl.ac.uk/id/eprint/14069
