Why AI Broke Our Security Playbook

For thirty years, software security rested on a quiet set of assumptions. Code was static. Intent was fixed at compile time. We drew a perimeter, sorted the world into "inside" and "outside," matched traffic against signatures of known-bad behavior, and gated access with identities we issued to humans. The model was deterministic, and so were the defenses.

AI does not honor any of those assumptions. A firewall still works exactly as designed — the problem is that the thing it was designed to stop is no longer how attacks happen.

What actually changed

Instructions and data now share one channel. A large language model reads instructions and content through the same pipe and cannot reliably tell which is which. That single design fact is why prompt injection sits atop the OWASP LLM risk list.
Behavior is non-deterministic. The same input can produce different outputs. A model can be coaxed, at runtime, into doing something it was never coded to do — without a line of code changing.
Agency turned text into action. This generation of AI sends emails, queries databases, calls APIs, and runs code. The moment a model can act, the blast radius of a single manipulated instruction explodes.

This is not theoretical anymore

Incident — EchoLeak · CVE-2025-32711

A zero-click vulnerability in Microsoft 365 Copilot: a single email with hidden instructions caused Copilot, during ordinary summarization, to pull data from OneDrive, SharePoint, and Teams and exfiltrate it through a trusted Microsoft domain. No user clicked anything.

Incident — GitHub Copilot · CVE-2025-53773

A prompt injection hidden in a pull-request description achieved remote code execution through GitHub Copilot — instructing the assistant to rewrite editor settings into an auto-approve mode, after which arbitrary commands ran without consent.

Why each old control falls short — but isn't worthless

The honest framing is that each legacy control is necessary but no longer sufficient. Perimeter security assumed threats cross a boundary you can watch — but the AI itself is the exfiltration tool, and its destinations are routine, allow-listed API calls. Signature detection assumed malice looks like known-bad patterns — a natural-language attack has no signature. Input validation assumed you can define valid input — the valid input to an AI is free-form human language. You keep these controls; you just cannot rely on them to stop the new class.

The reframing

AI security risks are, fundamentally, the gap between what you instructed a system to do and what it actually did. Closing that gap requires controls at the intent and action layer — not just the network or application layer underneath it.

What invention actually looks like

1 Containment over detection

Stop trying only to spot a compromised agent and start limiting what one can do. Purpose binding, capability scoping, network isolation, and real kill switches are the new primitives.

2 Enforcement outside the model

Treat every model output as untrusted input to the next system. Validation, authorization, and policy live in deterministic code around the model, never inside its prompt.

3 Provenance and an AI bill of materials

Every AI interaction is a data-movement event. You need lineage: what model, what version, what data went in, what came out, which tools it could touch.

4 Continuous adversarial testing

Red-teaming as a standing function — because behavior is non-deterministic and attacks are linguistic, you cannot certify a model once and walk away.

5 Tamper-evident audit logging

An immutable, verifiable record of every decision and action — the evidence that your enforcement is real.

Securing AI is not a patching exercise. It is an architectural one — and the next part shows exactly where, in the delivery pipeline, that lesson lands.

The Ground Has Shifted: Why AI Broke Our Security Playbook