The problem extends deep into the application logic itself through prompt injection and insecure output handling. When developers integrate LLMs for chatbots or summarizers, they often create paths where untrusted user input flows directly into the model's context—a new attack class that traditional tools don't understand.
Furthermore, because developers often treat AI-generated content as "trusted" internal data, they render it directly onto the page without proper sanitization. This leads to stored XSS and markdown injection that bypasses standard filters. We are also seeing a dangerous rise in "excessive agency," where agent frameworks like LangChain or CrewAI grant LLMs the power to execute code or modify databases with permissive defaults like allow_dangerous_requests=True. Existing audits aren't designed to check if an AI agent has the permissions to delete a database simply because a framework template suggested it.
Even more subtle is the threat of indirect prompt injection, where an AI processes external data—like a webpage or an email—that contains hidden instructions meant to hijack the model's behavior. The traditional security stack was built to catch human-written errors, not "data that becomes instructions" when processed by a machine. To combat this, a new layer of the security stack is emerging.
Developers are starting to use tools like Lakera Guard or NVIDIA’s Garak to provide real-time guardrails and red-teaming for LLMs. Others are adopting Zero Trust principles for AI agents, ensuring they operate with the absolute minimum permissions necessary. The shift is clear: the old tools catch old bugs, but as machines take over the coding process, we need a defense strategy that understands how those machines actually think.