Security of Artificial Intelligence Systems in Public-Sector and Enterprise Deployment

A taxonomy of risks, the normative framework and a controlled-trust architecture for the trustworthy deployment of large language models in public administration and enterprises.

Abstract

This paper addresses the security of artificial intelligence (AI) systems, in particular applications built on large language models (LLMs), as they are deployed across public administration, mid-sized organisations and large enterprises. Drawing on current threat taxonomies and empirical findings on so-called shadow AI, it delineates the specific attack surface that arises from the probabilistic and opaque nature of generative models and from the absence of any separation between instructions and data. The paper further synthesises the binding regulatory and voluntary normative framework (EU AI Act, NIS2, GDPR, ISO/IEC 42001, NIST AI RMF) and proposes a controlled-trust architecture grounded in a multi-level heuristic gate and in risk management spanning the entire model lifecycle. In conclusion, it formulates implications differentiated by type of entity and argues that security built into the design is not an obstacle but a precondition for the sustainable adoption of AI.

Keywords: artificial intelligence; large language models; cybersecurity; risk management; prompt injection; shadow AI; EU AI Act; governance.

1. Introduction and Problem Definition

Over a short span of time, artificial intelligence has moved from an experimental stage into production use. Generative models are now employed to process official submissions, triage requests, summarise contracts, support software development and operate customer channels. The pace of adoption has, however, outstripped the build-out of corresponding security and governance mechanisms, giving rise to a new attack surface and a new class of operational risk. Whereas the earlier phase of development was characterised by the question of model capabilities, the present phase raises the question of the trustworthiness of their outputs — that is, under what conditions and to what extent they may be trusted in decisions with legal or economic effect.

In this paper we do not treat AI security as a simple subset of classical information security. A traditional information system is largely deterministic and its behaviour is auditable at the level of individual operations. A system built on an LLM is, by contrast, probabilistic, its decision-making process is only partly interpretable and — crucially — instructions and data enter the model through a shared channel without any explicit separation^[1]. The aim of the paper is to (i) define the specific threat taxonomy of such systems, (ii) summarise the relevant normative framework and (iii) propose a control architecture applicable across organisation types.

2. The Specific Threat Surface of LLM Systems

The reference taxonomy is provided by the OWASP Top 10 for LLM Applications in its 2025 edition^[1]. For the second year in a row, the top position is occupied by prompt injection — a class of attacks in which an adversary inserts into the input an instruction that the model interprets as a command rather than as content to be processed. Because the model cannot reliably distinguish data from instructions, it complies with the injected instruction. The attack may be direct (from the user) or indirect, concealed in a document, an e-mail or a web page that the model processes as part of retrieval-augmented generation (RAG).

Second place is now held by sensitive information disclosure. Models may memorise and reproduce fragments of training data, including personal data and trade secrets; leakage also occurs through the prompt, a connected system or inadequate access management^[1]. The taxonomy further encompasses data and model poisoning, supply-chain vulnerabilities, excessive permissions granted to autonomous agents and the generation of misinformation. A common feature of these threats is that they arise in the semantic layer and are generally not caught by traditional perimeter tools (firewall, antivirus); they therefore require a dedicated control layer.

3. Shadow AI as an Organisational Risk

Empirical findings suggest that a significant portion of the risk is generated not by external attackers but by employees acting in good faith. The phenomenon of shadow AI — the use of unapproved tools beyond the oversight of IT — has become one of the fastest-growing vectors of data leakage^[2][3]. Available surveys indicate that most employees use generative tools at work, a substantial share access them through private accounts outside the organisation’s control, and a non-negligible proportion admit to entering sensitive data^[3]. IBM’s Cost of a Data Breach Report 2025 states that roughly one fifth of breached organisations were compromised through shadow AI, increasing the average cost of an incident by hundreds of thousands of dollars^[2].

Indicator	Value
Organisations that recorded unapproved use of AI	98%
Share of data breaches (2025) via shadow AI	~20%
Average increase in incident cost	+$670k
Organisations with a policy to detect shadow AI	37%

Figure 1. Selected indicators of the scale and consequences of shadow AI. Source: [2], [3].

The consequence is twofold. First, data leave the organisation’s perimeter at the moment they are entered into a public service; second, there is no audit record of which data were disclosed and to whom. In public administration the impact is more severe, since it concerns citizens’ data under a special protection regime. A restrictive approach (a blanket ban) appears counterproductive, as it pushes usage even deeper into the shadows; a more appropriate solution is to provide a secure, approved and monitored alternative.

“AI security does not begin with the model, but with the question of which data may enter it and who may trust what comes out of it.”

— the principle of data minimisation and governed trust

4. The Normative and Regulatory Framework

The framework for deploying AI is formed by an overlapping system of binding rules and voluntary standards. In the EU, the binding layer comprises the EU AI Act, NIS2 and GDPR; the voluntary — yet increasingly required in practice — layer is formed by ISO/IEC 42001 and the NIST AI RMF, which provide a methodically validated means of achieving and demonstrating compliance (Table 1).

Framework	Nature	Main requirements for the organisation
EU AI Act^[6]	Binding (EU)	Classification of systems by risk, risk management, technical documentation, human oversight and transparency. Most provisions effective from August 2026, high-risk systems under Annex III from December 2027.
NIS2^[7]	Binding (EU)	Proportionate cyber-risk management measures, encryption, multi-factor authentication, access management and incident reporting; accountability at management level.
GDPR^[8]	Binding (EU)	Restriction of fully automated decision-making (Art. 22), mandatory data protection impact assessment (DPIA, Art. 35), data minimisation.
ISO/IEC 42001^[4]	Standard / certification	Establishment of an AI management system (AIMS) — governance, risk management, transparency and ethics across the lifecycle; independent audit, certificate typically valid for 3 years.
NIST AI RMF^[5]	Voluntary framework	Four functions — Govern, Map, Measure, Manage — from a risk-management culture through identification and measurement to ongoing risk handling over the system’s lifetime.

Table 1. Overview of the relevant regulatory and normative frameworks for deploying AI.

For large enterprises, ISO/IEC 42001 acquires a function analogous to that of ISO/IEC 27001 in information security — it becomes a standard of trust and a criterion in supplier selection^[4]. For public administration, by contrast, the EU AI Act is decisive, since several of its uses (social benefits, border protection, justice, law enforcement) are placed in the high-risk category^[6].

5. Controlled-Trust Architecture: The Heuristic Gate

Regulatory and normative requirements must be translated into a concrete technical control. At the core of the proposed approach is a heuristic gate — a control layer through which every input (prompt), every intermediate operation and every output passes before it reaches an authorised user or a downstream system. Instead of implicit trust in the model, each interaction is verified at six mutually independent levels (Figure 2).

Interaction flow: input / model → 6 levels of control → authorised user or system.

Level 1 — Input control. Analysis of the prompt, detection of prompt injection, jailbreaks and prohibited requests before submission to the model.
Level 2 — Permission management. Verification of role and access rights; both the user and an autonomous agent act only within the scope of the permissions granted.
Level 3 — Data protection. Detection and masking of personal and sensitive data in both input and output, data minimisation in line with the GDPR.
Level 4 — Factual correctness. Heuristic checking of the output, detection of hallucinations and cross-verification of claims and sources.
Level 5 — Content safety. Filtering of toxic, biased and manipulative content and blocking of undesirable actions.
Level 6 — Human oversight. Final approval for decisions with legal or significant effect (GDPR Art. 22) — a human in the decision-making loop.

Figure 2. The six-level heuristic-gate model for controlling interactions with an LLM.

For organisations and large enterprises the gate has one further essential function: every interaction is logged, creating an auditable trail of who used the system, for what purpose and with what outcome, and which controls let the output through. This systematically addresses the blind spot of shadow AI — instead of an uncontrolled leak of data into public services, the security team gains full visibility and a forensic trail.

6. Lifecycle and Risk Management

Security is not a one-off audit before going into operation, but a property of the entire lifecycle, in line with the functional model of the NIST AI RMF (Govern, Map, Measure, Manage)^[5]. In practice it can be operationalised into five recurring phases.

6.1 Assessment and Mapping

Before implementation, the system’s risk class is determined, a DPIA and a legal analysis of the use case are produced, and the set of affected data subjects is mapped. Without this step it is impossible to correctly dimension the controls or the scope of human oversight.

6.2 Design and Validation

Security controls, data minimisation and oversight points are built into the design. Before deployment, the system is tested for bias, accuracy and robustness, including targeted red teaming that attempts to bypass the gate and manipulate the model.

6.3 Deployment and Monitoring

Going into operation is a controlled process, with human oversight and full logging. It is followed by continuous measurement: monitoring of model drift, output quality and anomalies, with incident response in place and 24-hour reporting under NIS2^[7]. A system that ceases to meet the defined criteria is withdrawn or retrained in a controlled manner.

Note on the ordering of steps. From the standpoint of cost-effectiveness it is decisive that controls be part of the design rather than a subsequent retrofit. The cost of remediation rises non-linearly with the lifecycle phase — an incident in operation is orders of magnitude more costly than a control built into the architecture, and this does not even include reputational and regulatory consequences.

7. Implications by Type of Entity

7.1 Public Administration

Public-administration entities process citizens’ data, and several of their AI uses are high-risk by law. The priority is compliance with the EU AI Act, mandatory human oversight in decisions on rights and entitlements, data sovereignty (hosting within the EU) and readiness to demonstrate documentation to supervisory authorities. Given the length of procurement cycles, it is advisable to begin preparation well ahead of the 2026–2030 deadlines.

7.2 Mid-sized Organisations

In this segment the most acute problem is shadow AI and the absence of a usage policy. The most effective first measure is to provide employees with an approved and monitored tool equipped with a heuristic gate, complemented by clear rules and training, thereby eliminating the leakage of data into public services and creating basic audit visibility.

7.3 Large Enterprises

For large enterprises, AI security becomes a matter of comprehensive governance: a register of AI systems, certification under ISO/IEC 42001, management of the model supply chain, control of autonomous agents and integration of AI into the existing cybersecurity strategy and incident-management process. Certification and an auditable architecture also become a competitive advantage in winning customers and public contracts.

8. Discussion and Conclusion

The analysis presented suggests that AI security is by its nature a cross-cutting problem that cannot be reduced to a purely technical or a purely regulatory measure. The proposed controlled-trust architecture — a combination of a multi-level heuristic gate and lifecycle risk management — represents a practical framework that translates the requirements of rules and standards into verifiable controls. A limitation of the approach is that the effectiveness of the individual gate levels depends on the quality of the detection heuristics and on their continuous updating against new attacks; future research should therefore focus on quantitative metrics for the effectiveness of the individual controls and on their standardisation.

It may be concluded that artificial intelligence delivers value only when its outputs can be trusted, while trust cannot be assumed — it must be built and continuously demonstrated. Organisations that approach AI security as an integral part of their architecture, regulatory compliance and risk-management culture gain not only protection against attacks and sanctions, but also the ability to deploy AI faster and at greater scale, because every scenario is auditable and defensible from the outset. Security is thus not a brake on innovation, but a precondition for its sustainable growth.

References

OWASP. OWASP Top 10 for LLM Applications 2025. OWASP GenAI Security Project, 2025.
IBM Security. Cost of a Data Breach Report 2025. IBM / Ponemon Institute, 2025.
Menlo Security et al. State of Shadow AI / surveys on the use of generative AI in enterprises, 2025–2026.
ISO/IEC. ISO/IEC 42001:2023 — Information technology — Artificial intelligence — Management system. Geneva, 2023.
NIST. Artificial Intelligence Risk Management Framework (AI RMF 1.0). National Institute of Standards and Technology, 2023.
European Parliament and Council. Regulation (EU) 2024/1689 (Artificial Intelligence Act). Official Journal of the EU, 2024.
European Parliament and Council. Directive (EU) 2022/2555 (NIS2). Official Journal of the EU, 2022.
European Parliament and Council. Regulation (EU) 2016/679 (GDPR), Art. 22 and 35. Official Journal of the EU, 2016.