Akbarighatar, P. (2025) 'Operationalizing responsible AI principles through responsible AI capabilities', AI and ethics, 5(2), pp. 1787–1801. Available at: https://doi.org/10.1007/s43681-024-00524-4
Barkur, S.K., Schacht, S. and Scholl, J. (2025) 'Deception in LLMs: Self-preservation and autonomous goals in large language models', arXiv, pp. 34. Available at: https://doi.org/10.48550/arxiv.2501.16513
Batool, A., Zowghi, D. and Bano, M. (2023) 'Responsible AI governance: A systematic literature review', arXiv. Available at: https://doi.org/10.48550/arxiv.2401.10896
Bostrom, N. (2014) Superintelligence: Paths, Dangers, Strategies. 1st edn. Oxford: Oxford University Press.
Bostrom, N. (2019) 'The vulnerable world hypothesis', Global Policy, 10(4), pp. 455–476. Available at: https://doi.org/10.1111/1758-5899.12718
BSI (2025) 'Design Principles for LLM-based Systems with Zero Trust'. Available at: https://www.bsi.bund.de/SharedDocs/Downloads/EN/BSI/Publications/ANSSI-BSI-joint-releases/LLM-based_Systems_Zero_Trust.html
Bughin, J. (2025) 'Doing versus saying: Responsible AI among large firms', AI & society, 40(4), pp. 2751–2763. Available at: https://doi.org/10.1007/s00146-024-02014-x
Carlsmith, J. (2023) 'Scheming AIs: Will AIs fake alignment during training in order to get power?', arXiv. Available at: https://doi.org/10.48550/arxiv.2311.08379
Cummings, M.L. (2025) 'Identifying AI hazards and responsibility gaps', IEEE access, 13, pp. 54338–54349. Available at: https://doi.org/10.1109/ACCESS.2025.3552200
Diffchecker. Available at: https://diffcheck.io/
Gensim library. Available at: https://pypi.org/project/gensim/
Goellner, S., Tropmann-Frick, M. and Brumen, B. (2024) 'Responsible artificial intelligence: A structured literature review', arXiv. Available at: https://doi.org/10.48550/arxiv.2403.06910
Grassini, S. and Koivisto, M. (2025) 'Artificial creativity? Evaluating AI against human performance in creative interpretation of visual stimuli', International journal of human-computer interaction, 41(7), pp. 4037–4048. Available at: https://doi.org/10.1080/10447318.2024.2345430
Greenblatt, R., et al. (2023) 'AI control: Improving safety despite intentional subversion', arXiv. Available at: https://doi.org/10.48550/arxiv.2312.06942
He, Y., et al. (2025) 'Evaluating the paperclip maximizer: Are RL-based language models more likely to pursue instrumental goals?', arXiv, pp. 15. Available at: https://doi.org/10.48550/arxiv.2502.12206
Huang, L., et al. (2025) 'A survey on hallucination in large language models: Principles, taxonomy, challenges, and open questions', ACM transactions on information systems, 43(2), pp. 1–55. Available at: https://doi.org/10.1145/3703155
Hugging Face models. Available at: https://huggingface.co/
Jedličková, A. (2025) 'Ethical approaches in designing autonomous and intelligent systems: A comprehensive survey towards responsible development', AI & society, 40(4), pp. 2703–2716. Available at: https://doi.org/10.1007/s00146-024-02040-9
Karran, A.J., et al. (2025) 'Multi-stakeholder perspective on responsible artificial intelligence and acceptability in education', NPJ science of learning, 10(1), p. 44. Available at: https://doi.org/10.1038/s41539-025-00333-2
Korbak, T., et al. (2025) 'How to evaluate control measures for LLM agents? A trajectory from today to superintelligence', arXiv. Available at: https://doi.org/10.48550/arxiv.2504.05259
Marri, R., Dabbara, L.N. and Karampuri, S. (2024) 'AI security in different industries: A comprehensive review of vulnerabilities and mitigation strategies', International Journal of Science and Research Archive, 13(1), pp. 2375–2393. Available at: https://doi.org/10.30574/ijsra.2024.13.1.1923
Meduri, K., et al. (2025) 'Accountability and transparency: Ensuring responsible AI development', in Bhattacharya, P., Hassan, A., Liu, H. and Bhushan, B. (eds.) Ethical Dimensions of AI Development. IGI Global Scientific Publishing, pp. 83–102. Available at: https://doi.org/10.4018/979-8-3693-4147-6.ch004
McCarthy, J., et al. (2006) 'A proposal for the Dartmouth summer research project on artificial intelligence: August 31, 1955', The AI magazine, 27(4), pp. 12–14. Available at: https://doi.org/10.1609/aimag.v27i4.1904
Mohamed, N. (2023) 'Current trends in AI and ML for cybersecurity: A state-of-the-art survey', Cogent engineering, 10(2). Available at: https://doi.org/10.1080/23311916.2023.2272358
NLTK. Available at: https://www.nltk.org/
Nye, M., et al. (2021) 'Show your work: Scratchpads for intermediate computation with language models', arXiv. Available at: https://doi.org/10.48550/arxiv.2112.00114
Obbu, S. (2025) 'Zero trust architecture for AI-powered cloud systems: Securing the future of automated workloads', World Journal of Advanced Research and Reviews, 26(1), pp. 1315–1339. Available at: https://doi.org/10.30574/wjarr.2025.26.1.1173
Ofusori, L., Bokaba, T. and Mhlongo, S. (2024) 'Artificial intelligence in cybersecurity: A comprehensive review and future direction', Applied artificial intelligence, 38(1). Available at: https://doi.org/10.1080/08839514.2024.2439609
O'Keefe, C., et al. (2025) 'Law-following AI: Designing AI agents to obey human laws', Fordham Law Review, 94, p. 57. Available at: http://dx.doi.org/10.2139/ssrn.5242643
Papagiannidis, E., Mikalef, P. and Conboy, K. (2025) 'Responsible artificial intelligence governance: A review and research framework', The journal of strategic information systems, 34(2), pp. 101885. Available at: https://doi.org/10.1016/j.jsis.2024.101885
Park, P.S., et al. (2024) 'AI deception: A survey of examples, risks, and potential solutions', Patterns (New York, N.Y.), 5(5), pp. 100988. Available at: https://doi.org/10.1016/j.patter.2024.100988
Radanliev, P., et al. (2024) 'Ethics and responsible AI deployment', Frontiers in artificial intelligence, 7, pp. 1377011. Available at: https://doi.org/10.3389/frai.2024.1377011
Raza, S., et al. (2025) 'Who is responsible? The data, models, users or regulations? A comprehensive survey on responsible generative AI for a sustainable future', arXiv. Available at: https://doi.org/10.48550/arxiv.2502.08650
Sadek, M., et al. (2025) 'Challenges of responsible AI in practice: Scoping review and recommended actions', AI & society, 40(1), pp. 199–215. Available at: https://doi.org/10.1007/s00146-024-01880-9
Shamsuddin, R., Tabrizi, H.B. and Gottimukkula, P.R. (2025) 'Towards responsible AI: An implementable blueprint for integrating explainability and social-cognitive frameworks in AI systems', AI Perspectives & Advances, 7(1), pp. 1. Available at: https://doi.org/10.1186/s42467-024-00016-5
Shetty, P. (2024) 'AI and security, from an information security and risk manager standpoint', IEEE access, 12, pp. 77468–77474. Available at: https://doi.org/10.1109/ACCESS.2024.3408144
Shi, B., et al. (2017) 'Relationship between divergent thinking and intelligence: An empirical study of the threshold hypothesis with Chinese children', Frontiers in psychology, 8, pp. 254. Available at: https://doi.org/10.3389/fpsyg.2017.00254
Smith, S.M., et al. (2025) 'A university framework for the responsible use of generative AI in research', Journal of higher education policy and management, pp. 1–20. Available at: https://doi.org/10.1080/1360080X.2025.2509187
spaCy. Available at: https://spacy.io/
Stillwell, H. and Harrington, S. (2025) 'Michael Scott is not a juror: The limits of AI in simulating human judgment', SSRN, pp. 54. Available at: http://dx.doi.org/10.2139/ssrn.5400737
Taylor, I. (2025) 'Is explainable AI responsible AI?', AI & society, 40(3), pp. 1695–1704. Available at: https://doi.org/10.1007/s00146-024-01939-7
Vaswani, A., et al. (2017) 'Attention is all you need', arXiv, pp. 15. Available at: https://doi.org/10.48550/arXiv.1706.03762
Vilas, M.J. (2024) 'Position: An inner interpretability framework for AI inspired by lessons from cognitive neuroscience', arXiv, pp. 17. Available at: https://doi.org/10.48550/arXiv.2406.01352
Vulpe, S., et al. (2024) 'AI and cybersecurity: A risk society perspective', Frontiers in computer science (Lausanne), 6. Available at: https://doi.org/10.3389/fcomp.2024.1462250
Walker, P.B., et al. (2025) 'Harnessing metacognition for safe and responsible AI', Technologies (Basel), 13(3), pp. 107. Available at: https://doi.org/10.3390/technologies13030107
WinMerge. Available at: https://winmerge.org/?lang=en
Wen, J., et al. (2024) 'Adaptive deployment of untrusted LLMs reduces distributed threats', arXiv. Available at: https://doi.org/10.48550/arxiv.2411.17693
Zhang, A.Q., et al. (2025) 'AURA: Amplifying understanding, resilience, and awareness for responsible AI content work', Proceedings of the ACM on human-computer interaction, 9(2), pp. 1–45. Available at: https://doi.org/10.1145/3710931