Prompt Injection: the SQL Injection of the AI era

Prompt Injection represents an evolution of SQL Injection within the context of generative artificial intelligence. In both cases the underlying issue arises from combining untrusted information with mechanisms capable of altering system behaviour, although in language models this boundary is far less explicit.

In large language models there is no strict separation between data and instructions. Everything is processed as tokens within a shared context, which means that seemingly harmless content such as documents, emails or web pages can directly influence system execution. An attacker can embed hidden instructions within these sources and modify model behaviour without exploiting any traditional vulnerability.

This risk becomes more pronounced in architectures based on retrieval augmented generation or in agent systems connected to external tools such as APIs, databases or automation workflows. In these scenarios the model does not only generate text but can also trigger real actions, which turns Prompt Injection into an operational security concern.

A particularly important case is indirect prompt injection, where malicious instructions are embedded in external content that the system ingests automatically. In such cases the external content effectively behaves like code from a functional perspective.

Mitigation is challenging because these models are not deterministic and tend to prioritise semantic coherence over strict rule enforcement. This reduces the effectiveness of traditional approaches such as filters or blacklist based controls, as they can often be bypassed through contextual manipulation or obfuscation.

For this reason current security strategies are moving towards Zero Trust models where agents operate with minimal permissions, access to tools is tightly controlled and external content is treated as untrusted until it has been validated. Alongside this there is a growing need for advanced observability to record prompts, tool usage and model decisions in order to ensure full traceability.

In parallel, AI based monitoring systems are emerging that supervise other models in order to detect anomalous patterns and block actions before execution.

Within this new context Prompt Injection redefines the attack surface of artificial intelligence systems. The prompt is no longer a simple text input but becomes a potential control vector for the system.

Technical References

Share the Post: