There’s a lot of anxiety around AI security right now. Most of it is unfocused.
People talk about runaway models, AI doing something unexpected, or the model getting tricked.
Those concerns aren’t entirely wrong.
But they’re usually aimed at the wrong place.
The real security risk in AI systems isn’t that models are powerful.
It’s that they’re trusted when they shouldn’t be.
The Real Security Risk: Prompt Injection Is Inevitable
Prompt injection is not a hypothetical risk.
It’s not an edge case.
It’s not something you’ll engineer away with better prompts.
Any system that takes untrusted input and feeds it into a model is vulnerable to manipulation.
The question isn’t whether a model will be influenced in unintended ways.
The question is what happens when it is.
That’s where system design matters.
The Three Components of AI Systems
Most AI-powered systems have three fundamental parts:
- The model
- The prompt
- The tools
The model generates text.
The prompt provides context and instructions.
The tools are how the system interacts with the real world.
Only one of these should ever be trusted.
Why the Model Is the Weak Link
Models are:
- Easy to manipulate
- Optimized to comply
- Designed to follow instructions, even bad ones
This isn’t a flaw. It’s the point.
Models don’t understand security boundaries. They don’t know which instructions are legitimate. They will confidently produce plausible output even when the request is wrong.
Treating the model as a trusted decision-maker is a category error.
It’s also the fastest way to create a security incident.
The Browser Analogy
In web engineering, we don’t trust the browser.
Even if:
- The UI validates inputs
- The frontend enforces rules
- The client should behave correctly
The backend still revalidates everything, enforces authorization, checks identity, and assumes the client may be compromised.
Why?
Because the browser runs on someone else’s machine.
AI models are in the same position: they operate on untrusted input and can be steered by whoever controls that input.
So treat the model like a frontend surface, not a control plane.
Tools Are the Security Boundary
If the model is untrusted, then tools become the security boundary.
That means tools must:
- Validate context independently
- Enforce permissions server-side
- Verify identity outside the model
- Reject requests that don’t make sense, even if the model asks nicely
Never assume:
- The model passed the right parameters
- The model understood user intent correctly
- The model respected boundaries described in a prompt
Prompts are guidance.
Tools are enforcement.
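Here’s a minimal sketch of that enforcement in Python, assuming a hypothetical `export_report` tool and a server-side `Session` object populated by your auth layer. None of the names come from a specific framework; the point is where the checks live.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Session:
    user_id: str      # established by your auth layer, never by the model
    roles: frozenset  # permissions resolved server-side


# Toy in-memory store keyed by (owner, report_id), standing in for a real database.
REPORTS = {("alice", "r-1"): "Q3 revenue report"}


def export_report(session: Session, report_id: str, fmt: str) -> str:
    """Tool entry point. Every model-supplied argument is re-checked here."""
    # Enforce permissions server-side, outside the model.
    if "reports:read" not in session.roles:
        return "ERROR: not authorized to read reports"
    # Validate model-supplied parameters independently.
    if fmt not in {"csv", "pdf"}:
        return f"ERROR: unsupported format {fmt!r}"
    # Scope data access to the authenticated user, whatever the model asked for.
    report = REPORTS.get((session.user_id, report_id))
    if report is None:
        return "ERROR: no such report for this user"
    return f"{report} ({fmt})"


# The model supplies only report_id and fmt; identity comes from the session.
session = Session(user_id="alice", roles=frozenset({"reports:read"}))
print(export_report(session, "r-1", "pdf"))   # allowed: alice's own report
print(export_report(session, "r-9", "csv"))   # rejected: not alice's data
```

The prompt can describe these rules all it wants; the tool is what actually enforces them.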
Concrete Example: Data Access
Here’s a common mistake.
A tool called “Retrieve user data” is exposed to the model, and the model is expected to pass the correct user ID and request only the current user’s data.
This is a broken design.
A model can be convinced, accidentally or maliciously, to request someone else’s data.
The correct approach is simple:
- The tool already knows which user is authenticated
- The tool ignores any user ID supplied by the model
- Authorization is enforced independently of the request
The model never decides who.
The tool already knows.
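A sketch of the difference in tool shape, using an illustrative in-memory `DATABASE` and a hypothetical `get_user_data` tool. The correct version is bound to the authenticated identity before the model ever gets to call it.

```python
import functools

# Illustrative in-memory store standing in for your real user database.
DATABASE = {"alice": {"plan": "pro"}, "bob": {"plan": "free"}}


def get_user_data_broken(user_id: str) -> dict:
    # Broken: the model chooses whose data comes back.
    return DATABASE.get(user_id, {})


def get_user_data(authenticated_user_id: str) -> dict:
    # Correct: this parameter is bound by the server, not passed by the model.
    return DATABASE.get(authenticated_user_id, {})


# Pre-bind the authenticated identity before exposing the tool to the model,
# so the model-facing callable has no user ID parameter at all.
model_facing_tool = functools.partial(get_user_data, "alice")
print(model_facing_tool())   # -> {'plan': 'pro'}; there is nothing to inject
```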
This is the same pattern we’ve used for decades in secure backend systems.
AI doesn’t change it.
It just makes ignoring it more dangerous.
The Core Design Principle
If there’s one rule to internalize, it’s this:
Assume the model will misbehave.
Not because it’s malicious, but because it’s manipulable.
Design systems where:
- Model errors are contained
- Tool misuse is impossible
- Authorization lives outside the model
- The worst-case outcome is safe
When you do this, prompt injection becomes an annoyance, not an existential threat.
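What containment can look like at the dispatch layer, sketched with hypothetical tool names: unknown tools are refused, failures are caught, and destructive actions require a confirmation the model cannot supply.

```python
# Hypothetical tool registry: one harmless tool, one destructive tool name.
SAFE_TOOLS = {"search_docs": lambda query: f"results for {query!r}"}
DESTRUCTIVE_TOOLS = {"delete_account"}


def dispatch(tool_name: str, args: dict, human_confirmed: bool = False) -> str:
    """Run a tool call on the model's behalf, with the worst case kept safe."""
    if tool_name in DESTRUCTIVE_TOOLS and not human_confirmed:
        return "REFUSED: destructive action requires human confirmation"
    tool = SAFE_TOOLS.get(tool_name)
    if tool is None:
        return f"REFUSED: unknown tool {tool_name!r}"
    try:
        return tool(**args)                # worst case: one failed, scoped call
    except Exception as exc:               # model errors are contained here
        return f"ERROR: {exc}"


print(dispatch("search_docs", {"query": "refund policy"}))
print(dispatch("delete_account", {"user": "bob"}))   # refused without a human
```

The model can still ask for bad things; the dispatcher just makes asking the only thing it can do.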
Secure AI Is Just Good Engineering
AI security doesn’t require exotic frameworks or philosophical debates.
It requires clear trust boundaries and familiar engineering discipline.
Treat models like browsers.
Treat tools like backends.
Validate everything.
Trust nothing implicitly.
Do that and AI becomes just another powerful component in a well-architected system.
And once fear goes down, adoption gets a lot easier.
FAQs
What is prompt injection in AI systems?
It’s when untrusted input influences the model to ignore instructions, leak data, or request unsafe actions. You should assume it will happen.
Can better prompts prevent prompt injection?
No. Prompts help, but they are not a security boundary. Tools and server-side enforcement are the boundary.
What should be trusted in an AI system?
Only your backend tools and policy enforcement. Treat the model like an untrusted client.
How do we prevent data leaks through AI tools?
Bind tools to authenticated identity, enforce authorization server-side, and ignore model-supplied identifiers for sensitive access.
Do we need this level of rigor for internal agents?
Yes. Internal agents still take untrusted inputs and still need strict authorization and auditability.