AI Defenses

Overview

Defending AI systems requires controls at every layer — input validation before the model, guardrails within the model pipeline, output filtering after generation, and monitoring across the full lifecycle. Traditional application security (AuthN, AuthZ, rate limiting) still applies, but AI introduces new attack surfaces that require purpose-built defenses.

MITRE ATLAS publishes mitigations alongside its technique database. OWASP provides implementation guidance through the LLM Top 10 and AI Security guidelines.

Topics in This Section

General Approach

Threat model the AI system — map data flows, trust boundaries, and model capabilities using ATLAS tactics
Apply defense in depth — input sanitization, system prompt hardening, output validation, tool/action restrictions
Monitor for adversarial behavior — anomalous queries, prompt patterns, unexpected model outputs, data drift
Test continuously — red team the AI pipeline with adversarial inputs, jailbreak attempts, and supply chain checks