A taxonomy of risks, the normative framework and a controlled-trust architecture for the trustworthy deployment of large language models in public administration and enterprises.

Abstract

This paper addresses the security of artificial intelligence (AI) systems, in particular applications built on large language models (LLMs), as they are deployed across public administration, mid-sized organisations and large enterprises. Drawing on current threat taxonomies and empirical findings on so-called shadow AI, it delineates the specific attack surface that arises from the probabilistic and opaque nature of generative models and from the absence of any separation between instructions and data. The paper further synthesises the binding regulatory and voluntary normative framework (EU AI Act, NIS2, GDPR, ISO/IEC 42001, NIST AI RMF) and proposes a controlled-trust architecture grounded in a multi-level heuristic gate and in risk management spanning the entire model lifecycle. In conclusion, it formulates implications differentiated by type of entity and argues that security built into the design is not an obstacle but a precondition for the sustainable adoption of AI.

Keywords: artificial intelligence; large language models; cybersecurity; risk management; prompt injection; shadow AI; EU AI Act; governance.

1. Introduction and Problem Definition

Over a short span of time, artificial intelligence has moved from an experimental stage into production use. Generative models are now employed to process official submissions, triage requests, summarise contracts, support software development and operate customer channels. The pace of adoption has, however, outstripped the build-out of corresponding security and governance mechanisms, giving rise to a new attack surface and a new class of operational risk. Whereas the earlier phase of development was characterised by the question of model capabilities, the present phase raises the question of the trustworthiness of their outputs — that is, under what conditions and to what extent they may be trusted in decisions with legal or economic effect.

In this paper we do not treat AI security as a simple subset of classical information security. A traditional information system is largely deterministic and its behaviour is auditable at the level of individual operations. A system built on an LLM is, by contrast, probabilistic, its decision-making process is only partly interpretable and — crucially — instructions and data enter the model through a shared channel without any explicit separation[1]. The aim of the paper is to (i) define the specific threat taxonomy of such systems, (ii) summarise the relevant normative framework and (iii) propose a control architecture applicable across organisation types.

2. The Specific Threat Surface of LLM Systems

The reference taxonomy is provided by the OWASP Top 10 for LLM Applications in its 2025 edition[1]. For the second year in a row, the top position is occupied by prompt injection — a class of attacks in which an adversary inserts into the input an instruction that the model interprets as a command rather than as content to be processed. Because the model cannot reliably distinguish data from instructions, it complies with the injected instruction. The attack may be direct (from the user) or indirect, concealed in a document, an e-mail or a web page that the model processes as part of retrieval-augmented generation (RAG).

Second place is now held by sensitive information disclosure. Models may memorise and reproduce fragments of training data, including personal data and trade secrets; leakage also occurs through the prompt, a connected system or inadequate access management[1]. The taxonomy further encompasses data and model poisoning, supply-chain vulnerabilities, excessive permissions granted to autonomous agents and the generation of misinformation. A common feature of these threats is that they arise in the semantic layer and are generally not caught by traditional perimeter tools (firewall, antivirus); they therefore require a dedicated control layer.

3. Shadow AI as an Organisational Risk

Empirical findings suggest that a significant portion of the risk is generated not by external attackers but by employees acting in good faith. The phenomenon of shadow AI — the use of unapproved tools beyond the oversight of IT — has become one of the fastest-growing vectors of data leakage[2][3]. Available surveys indicate that most employees use generative tools at work, a substantial share access them through private accounts outside the organisation’s control, and a non-negligible proportion admit to entering sensitive data[3]. IBM’s Cost of a Data Breach Report 2025 states that roughly one fifth of breached organisations were compromised through shadow AI, increasing the average cost of an incident by hundreds of thousands of dollars[2].

Indicator Value
Organisations that recorded unapproved use of AI 98%
Share of data breaches (2025) via shadow AI ~20%
Average increase in incident cost +$670k
Organisations with a policy to detect shadow AI 37%

Figure 1. Selected indicators of the scale and consequences of shadow AI. Source: [2], [3].

The consequence is twofold. First, data leave the organisation’s perimeter at the moment they are entered into a public service; second, there is no audit record of which data were disclosed and to whom. In public administration the impact is more severe, since it concerns citizens’ data under a special protection regime. A restrictive approach (a blanket ban) appears counterproductive, as it pushes usage even deeper into the shadows; a more appropriate solution is to provide a secure, approved and monitored alternative.

“AI security does not begin with the model, but with the question of which data may enter it and who may trust what comes out of it.”

— the principle of data minimisation and governed trust

4. The Normative and Regulatory Framework

The framework for deploying AI is formed by an overlapping system of binding rules and voluntary standards. In the EU, the binding layer comprises the EU AI Act, NIS2 and GDPR; the voluntary — yet increasingly required in practice — layer is formed by ISO/IEC 42001 and the NIST AI RMF, which provide a methodically validated means of achieving and demonstrating compliance (Table 1).

Framework Nature Main requirements for the organisation
EU AI Act[6] Binding (EU) Classification of systems by risk, risk management, technical documentation, human oversight and transparency. Most provisions effective from August 2026, high-risk systems under Annex III from December 2027.
NIS2[7] Binding (EU) Proportionate cyber-risk management measures, encryption, multi-factor authentication, access management and incident reporting; accountability at management level.
GDPR[8] Binding (EU) Restriction of fully automated decision-making (Art. 22), mandatory data protection impact assessment (DPIA, Art. 35), data minimisation.
ISO/IEC 42001[4] Standard / certification Establishment of an AI management system (AIMS) — governance, risk management, transparency and ethics across the lifecycle; independent audit, certificate typically valid for 3 years.
NIST AI RMF[5] Voluntary framework Four functions — Govern, Map, Measure, Manage — from a risk-management culture through identification and measurement to ongoing risk handling over the system’s lifetime.

Table 1. Overview of the relevant regulatory and normative frameworks for deploying AI.

For large enterprises, ISO/IEC 42001 acquires a function analogous to that of ISO/IEC 27001 in information security — it becomes a standard of trust and a criterion in supplier selection[4]. For public administration, by contrast, the EU AI Act is decisive, since several of its uses (social benefits, border protection, justice, law enforcement) are placed in the high-risk category[6].

This article is part of IOAS paid content.

Secure payment via Stripe · access restorable by e-mail, no sign-up