MITRE ATLAS is a knowledge base of adversarial tactics and techniques against
AI systems, maintained by MITRE Corporation. ThornGuard references ATLAS
technique identifiers to contextualize its detections — this does not imply
certification, endorsement, or formal compliance with any MITRE program.
## Summary Mapping
| ATLAS Technique | ID | ThornGuard Controls |
|---|---|---|
| LLM Prompt Injection | AML.T0051 | Tool poisoning detection, hidden Unicode stripping, ANSI/VT escape removal, instruction override scanning |
| ML Supply Chain Compromise | AML.T0010 | TOFU schema pinning (SHA-256 drift detection), tool inventory with risk scoring, approval workflows |
| Exfiltration via ML Inference API | AML.T0024 | PII/secret redaction (10+ pattern types), cross-server data flow governance, namespace isolation, taint tracking |
| Data Poisoning | AML.T0020 | Output scanning on inbound SSE streams, behavioral anomaly detection (EWMA, Page-Hinkley drift, Markov chains) |
| Evade ML Model | AML.T0015 | ANSI/VT control character stripping, hidden Unicode detection, sanitization of tool definitions and responses |
| Denial of ML Service | AML.T0029 | Per-license rate limiting (Durable Object-backed), custom policy engine with block/audit modes |
| ML-Enabled Product or Service Discovery | AML.T0014 | SSRF blocking (localhost, metadata IPs, DNS rebinding via DoH), IP whitelisting, origin validation |
| Credential Access via ML Service | AML.T0037 | OAuth 2.1 proxy with PKCE and token isolation, PII/secret redaction (AWS keys, GCP keys, GitHub tokens, JWTs), encrypted upstream credentials |
| Privilege Escalation via ML Service | AML.T0038 | RBAC team tokens, approval workflows for high-risk tools, structured policy rules with scope/mode/conditions |
| AI Agent Context Poisoning: Memory | AML.T0080 | Memory persistence pattern detection (7 regex patterns), hidden HTML content sanitization (7 stripping categories), AI share URL inspection, output scanning in JSON and SSE streams |
## Detailed Technique Coverage
### AML.T0051 — LLM Prompt Injection
Prompt injection in the MCP context means an upstream tool server embeds hidden instructions in tool definitions or responses that override the AI client’s intended behavior. ThornGuard addresses this at the transport layer, before payloads reach the LLM. Detections:

- Tool poisoning scanner — Inspects tool definitions for instruction overrides, system prompt manipulation attempts, and embedded directives that could hijack agent behavior.
- Hidden Unicode detection — Flags invisible or homoglyph characters (zero-width joiners, right-to-left overrides, lookalike substitutions) used to smuggle instructions past human review.
- ANSI/VT escape stripping — Removes terminal control sequences from tool definitions and responses that could manipulate terminal rendering or inject invisible content.
- Output scanning — Analyzes inbound SSE stream data for patterns consistent with indirect prompt injection in tool responses.
Event codes: `BLOCKED_MALICIOUS`, `PII_REDACTED` (when embedded secrets are part of an injection payload)
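The hidden Unicode check can be illustrated with a small sketch. This is illustrative TypeScript, not ThornGuard's actual character set or API; the range list is a simplified subset of the invisible and bidirectional characters such scanners target.

```typescript
// Illustrative subset: zero-width characters, bidi embeds/overrides,
// word joiners, and the BOM used as a zero-width no-break space.
const HIDDEN_UNICODE = /[\u200B-\u200F\u202A-\u202E\u2060-\u2064\uFEFF]/g;

// Report each hidden character found, as a U+XXXX code point label.
function findHiddenUnicode(text: string): string[] {
  return [...text.matchAll(HIDDEN_UNICODE)].map(
    (m) => "U+" + m[0].codePointAt(0)!.toString(16).toUpperCase().padStart(4, "0"),
  );
}

// Remove hidden characters before content reaches review or the LLM.
function stripHiddenUnicode(text: string): string {
  return text.replace(HIDDEN_UNICODE, "");
}
```

A scanner like this would flag the payload for audit (the code points found) and pass the stripped text downstream.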
### AML.T0010 — ML Supply Chain Compromise
In the MCP ecosystem, supply chain compromise manifests as tampered tool definitions — a server that silently changes what a tool does, adds new parameters, or alters its schema to exfiltrate data. Detections:

- TOFU schema pinning — On first observation of a tool, ThornGuard records a SHA-256 hash of its definition. Subsequent invocations are compared against the pinned schema; any drift triggers an alert or block, depending on policy configuration.
- Tool inventory and risk scoring — Every observed upstream tool is catalogued with a computed risk level based on its capabilities, parameter surface area, and behavioral history.
- Approval workflows — High-risk tool invocations can require explicit human approval before execution, preventing compromised tools from operating silently.
Event codes: `BLOCKED_APPROVAL`, `BLOCKED_POLICY`
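A minimal sketch of the TOFU pinning flow, assuming an in-memory pin store and a hash over a canonicalized JSON form (the store, helper names, and canonicalization choice are illustrative, not ThornGuard's actual implementation):

```typescript
import { createHash } from "node:crypto";

type PinStore = Map<string, string>; // tool name -> pinned SHA-256 hex digest

// Canonical JSON: object keys sorted so logically equal schemas hash equally.
function canonicalize(value: unknown): string {
  if (value === null || typeof value !== "object") return JSON.stringify(value);
  if (Array.isArray(value)) return "[" + value.map(canonicalize).join(",") + "]";
  const obj = value as Record<string, unknown>;
  const body = Object.keys(obj)
    .sort()
    .map((k) => JSON.stringify(k) + ":" + canonicalize(obj[k]))
    .join(",");
  return "{" + body + "}";
}

function schemaHash(definition: unknown): string {
  return createHash("sha256").update(canonicalize(definition)).digest("hex");
}

function checkTool(
  pins: PinStore,
  name: string,
  definition: unknown,
): "pinned" | "ok" | "drift" {
  const hash = schemaHash(definition);
  const pinned = pins.get(name);
  if (pinned === undefined) {
    pins.set(name, hash); // trust on first use
    return "pinned";
  }
  return pinned === hash ? "ok" : "drift";
}
```

Canonicalizing before hashing means cosmetic key reordering does not trip the detector, while any added parameter or changed description produces a drift result.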
### AML.T0024 — Exfiltration via ML Inference API
AI agents bridging multiple services create exfiltration paths that traditional network security does not cover. An agent might read credentials from one tool and pass them to another without any human in the loop. Detections:

- PII and secret redaction — Scans both outbound request parameters and inbound responses (including SSE streams) for 10+ pattern types: email addresses, SSNs, AWS keys, GCP keys, GitHub tokens, Slack tokens, private keys, JWTs, phone numbers, and credit cards (with IIN prefix + Luhn validation).
- Cross-server data flow governance — Enforces namespace isolation between MCP servers so data from one upstream cannot leak to another. Taint tracking follows sensitive values across tool invocations.
- Custom redaction rules — Enterprise customers can define regex-based redaction rules for domain-specific sensitive data (internal project codes, patient IDs, etc.).
Event codes: `PII_REDACTED`, `CUSTOM_REDACTION_AUDIT`, `BLOCKED_POLICY`
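The IIN prefix + Luhn validation mentioned for credit cards can be sketched as follows (the IIN list here is a simplified subset chosen for illustration):

```typescript
// Luhn checksum: double every second digit from the right, subtract 9
// from doubles above 9, and require the total to be divisible by 10.
function luhnValid(digits: string): boolean {
  let sum = 0;
  let double = false;
  for (let i = digits.length - 1; i >= 0; i--) {
    let d = digits.charCodeAt(i) - 48; // '0' => 0
    if (double) {
      d *= 2;
      if (d > 9) d -= 9;
    }
    sum += d;
    double = !double;
  }
  return sum % 10 === 0;
}

function looksLikeCardNumber(candidate: string): boolean {
  const digits = candidate.replace(/[\s-]/g, "");
  if (!/^\d{13,19}$/.test(digits)) return false;
  // A few major IIN prefixes: 4 Visa, 51-55 Mastercard, 34/37 Amex, 6011 Discover.
  const iinOk = /^(?:4|5[1-5]|3[47]|6011)/.test(digits);
  return iinOk && luhnValid(digits);
}
```

Requiring both a plausible issuer prefix and a valid checksum is what keeps arbitrary 16-digit numbers (order IDs, timestamps) from being redacted as cards.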
### AML.T0020 — Data Poisoning
When AI agents consume tool outputs and feed them back into downstream processes, a compromised upstream server can inject corrupted data that propagates through the pipeline. Detections:

- Behavioral anomaly detection — Monitors tool response patterns using three statistical methods:
  - EWMA (Exponentially Weighted Moving Average) — Detects gradual drift in response characteristics.
  - Page-Hinkley drift detection — Identifies abrupt changes in response distributions that indicate tampering.
  - Markov chain analysis — Models expected tool invocation sequences and flags anomalous transitions.
- Inbound SSE stream scanning — All response data from upstream servers passes through the redaction and scanning pipeline before reaching the AI client.
Event codes: `BLOCKED_POLICY`, `POLICY_AUDIT`
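An EWMA detector of the kind described can be sketched in a few lines: track an exponentially weighted mean and variance of some response metric (payload size, latency) and flag observations far from the baseline. The smoothing factor and threshold below are illustrative defaults, not ThornGuard's tuning.

```typescript
class EwmaDetector {
  private mean: number | null = null;
  private variance = 0;

  constructor(
    private readonly alpha = 0.2, // smoothing factor: higher = faster to adapt
    private readonly threshold = 3.0, // standard deviations before flagging
  ) {}

  // Returns true when `value` is anomalous relative to the smoothed baseline.
  observe(value: number): boolean {
    if (this.mean === null) {
      this.mean = value; // first observation seeds the baseline
      return false;
    }
    const deviation = value - this.mean;
    const std = Math.sqrt(this.variance);
    const anomalous = std > 0 && Math.abs(deviation) > this.threshold * std;
    // Exponentially weighted updates of mean and variance.
    this.mean += this.alpha * deviation;
    this.variance =
      (1 - this.alpha) * (this.variance + this.alpha * deviation * deviation);
    return anomalous;
  }
}
```

Because the anomaly check happens before the update, a single spike is flagged without immediately corrupting the baseline it is compared against.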
### AML.T0015 — Evade ML Model
Adversarial inputs designed to bypass detection are a concern at the transport layer: attackers may use encoding tricks, Unicode manipulation, or control characters to disguise malicious payloads. Detections:

- ANSI/VT control character stripping — Removes all terminal escape sequences from tool definitions and responses, eliminating a class of evasion techniques that exploit rendering differences between human review and machine parsing.
- Hidden Unicode detection — Catches zero-width characters, bidirectional overrides, and homoglyph substitutions that could make malicious content appear benign during review.
- Sanitization pipeline — Tool definitions and responses are normalized before security checks are applied, reducing the effectiveness of encoding-based evasion.
Event codes: `BLOCKED_MALICIOUS`
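A minimal ANSI/VT stripper looks roughly like this. It is a simplified sketch: it handles CSI sequences (colors, cursor movement) and single-character Fe escapes, but a production stripper would also cover OSC and DCS string sequences terminated by BEL or ST.

```typescript
// ESC followed by either a CSI sequence ("[" params intermediates final)
// or a single Fe escape byte in the @-_ range.
const ANSI_PATTERN = /\x1b(?:\[[0-9;?]*[ -\/]*[@-~]|[@-Z\\-_])/g;

function stripAnsi(input: string): string {
  return input.replace(ANSI_PATTERN, "");
}
```

Stripping rather than rejecting keeps benign tool output usable while removing the channel an attacker would use to hide content from a terminal-based reviewer.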
### AML.T0029 — Denial of ML Service
Overwhelming an MCP proxy or its upstream servers with requests degrades availability for legitimate users. Detections:

- Per-license rate limiting — Enforced via Cloudflare Durable Objects for strong consistency, with KV fallback for resilience. Limits are configurable per license tier.
- Custom policy engine — Policies can throttle or block specific tool invocations, target URLs, or request patterns in block or audit mode.
- Tiered enforcement — Individual and Enterprise plans have different rate limit ceilings, ensuring fair resource allocation.
Event codes: `BLOCKED_RATE_LIMIT`, `BLOCKED_POLICY`
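Per-license rate limiting of this kind is commonly implemented as a token bucket, the sort of counter state a Durable Object could hold for each license key. The class below is an illustrative sketch; capacity and refill rate are stand-ins for the tier-configured limits.

```typescript
class TokenBucket {
  private tokens: number;
  private lastRefill: number;

  constructor(
    private readonly capacity: number, // burst size
    private readonly refillPerSec: number, // sustained rate
  ) {
    this.tokens = capacity;
    this.lastRefill = 0;
  }

  // `now` is a timestamp in seconds; returns true if the request is allowed.
  allow(now: number): boolean {
    const elapsed = Math.max(0, now - this.lastRefill);
    this.tokens = Math.min(this.capacity, this.tokens + elapsed * this.refillPerSec);
    this.lastRefill = now;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}
```

A token bucket permits short bursts up to `capacity` while capping the sustained rate, which matches the per-tier ceilings described above better than a fixed window counter would.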
### AML.T0014 — ML-Enabled Product or Service Discovery
Reconnaissance against MCP infrastructure includes probing for available servers, enumerating tools, and scanning internal network endpoints through the proxy. Detections:

- SSRF blocking — Rejects requests targeting localhost, link-local addresses, cloud metadata endpoints (169.254.169.254), and private RFC 1918 ranges. Uses DNS-over-HTTPS resolution to defeat DNS rebinding attacks.
- IP whitelisting — Enterprise per-license restrictions limit which client IPs can access the proxy.
- Origin validation — Rejects requests from disallowed browser origins when enabled.
- Transport guardrails — Non-HTTPS targets are rejected outright, preventing downgrade probes.
Event codes: `BLOCKED_SSRF`, `BLOCKED_IP_WHITELIST`, `BLOCKED_ORIGIN`, `BLOCKED_INSECURE_TARGET`
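The resolved-address check at the heart of SSRF blocking can be sketched as below: after DNS resolution (over DoH, so a rebinding hostname cannot swap answers mid-flight), the resulting IP is tested against the blocked ranges. This is an illustrative IPv4-only sketch; a real filter must also cover IPv6 and mapped addresses.

```typescript
function isBlockedIPv4(ip: string): boolean {
  const parts = ip.split(".").map(Number);
  if (parts.length !== 4 || parts.some((p) => !Number.isInteger(p) || p < 0 || p > 255)) {
    return true; // malformed input: fail closed
  }
  const [a, b] = parts;
  if (a === 127) return true; // loopback
  if (a === 10) return true; // RFC 1918
  if (a === 172 && b >= 16 && b <= 31) return true; // RFC 1918
  if (a === 192 && b === 168) return true; // RFC 1918
  if (a === 169 && b === 254) return true; // link-local, incl. 169.254.169.254 metadata
  if (a === 0) return true; // "this network"
  return false;
}
```

Failing closed on malformed input matters here: ambiguous address spellings (octal, hex, integer forms) are a classic SSRF filter bypass.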
### AML.T0037 — Credential Access via ML Service
AI agents handle credentials throughout their operation — API keys in tool configurations, tokens in responses, secrets passed between services. An attacker who compromises the transport can harvest these credentials. Detections:

- OAuth 2.1 proxy — ThornGuard implements OAuth 2.1 with PKCE (S256) and token isolation. Upstream credentials are encrypted with AES-256-GCM and never exposed to the AI client in plaintext.
- Secret pattern redaction — Actively scans for and redacts AWS access keys, GCP service account keys, GitHub personal access tokens, Slack tokens, private key blocks, and JWTs in both request and response payloads.
- Encrypted credential storage — Upstream tokens obtained via OAuth exchange are stored encrypted in D1, with JTI-mapped proxy tokens providing an isolation layer.
Event codes: `PII_REDACTED`, `BLOCKED_AUTH`
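The PKCE S256 mechanic referenced above works as follows: the client generates a random verifier, sends its base64url-encoded SHA-256 digest as the challenge with the authorization request, and later proves possession by presenting the raw verifier at token exchange. A sketch (helper names are illustrative):

```typescript
import { createHash, randomBytes } from "node:crypto";

// Base64url per RFC 4648 section 5: '+' -> '-', '/' -> '_', no padding.
function base64url(buf: Buffer): string {
  return buf.toString("base64").replace(/\+/g, "-").replace(/\//g, "_").replace(/=+$/, "");
}

function makePkcePair(): { verifier: string; challenge: string } {
  const verifier = base64url(randomBytes(32)); // 43-char high-entropy secret
  const challenge = base64url(createHash("sha256").update(verifier).digest());
  return { verifier, challenge };
}

// Server-side check at token exchange time (S256 method).
function pkceMatches(verifier: string, challenge: string): boolean {
  return base64url(createHash("sha256").update(verifier).digest()) === challenge;
}
```

Because only the digest travels with the initial request, an attacker who intercepts the authorization code still cannot redeem it without the verifier.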
### AML.T0038 — Privilege Escalation via ML Service
An attacker may attempt to escalate privileges by invoking tools beyond their authorization level, manipulating approval workflows, or exploiting RBAC misconfigurations. Detections:

- RBAC team tokens — Admin and viewer roles with expiry and revocation metadata. Token lifecycle is tracked in D1 with structured audit trails.
- Approval workflows — High-risk tool invocations are matched against approval profiles. Requests are held pending explicit authorization, with capability caching to avoid repeated prompts for approved patterns.
- Structured policy rules — Scope, mode, and conditions are evaluated per-request. Policies can restrict tool access by license tier, team role, target URL, or RPC method.
- Custom blocklists — Per-license domain and command blocklists provide an additional layer of access restriction.
Event codes: `BLOCKED_APPROVAL`, `BLOCKED_POLICY`, `BLOCKED_AUTH`
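Structured policy rules with scope, mode, and conditions can be modeled roughly as below. The types and the first-match-wins semantics are an illustrative sketch, not ThornGuard's actual rule schema.

```typescript
type Mode = "block" | "audit";

interface RequestContext {
  tool: string;
  method: string;
  role: "admin" | "viewer";
}

interface PolicyRule {
  scope: { tool?: string; method?: string }; // omitted fields match anything
  mode: Mode;
  condition?: (ctx: RequestContext) => boolean; // optional extra predicate
}

// First matching rule wins; no match means the request is allowed.
function evaluate(rules: PolicyRule[], ctx: RequestContext): Mode | "allow" {
  for (const rule of rules) {
    const scopeHit =
      (rule.scope.tool === undefined || rule.scope.tool === ctx.tool) &&
      (rule.scope.method === undefined || rule.scope.method === ctx.method);
    if (scopeHit && (rule.condition === undefined || rule.condition(ctx))) {
      return rule.mode;
    }
  }
  return "allow";
}
```

Separating mode from scope is what lets the same rule shape drive both hard blocks and audit-only observation, as the block/audit modes above describe.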
### AML.T0080 — AI Agent Context Poisoning: Memory
Microsoft Defender researchers documented 31 companies embedding hidden instructions in “Summarize with AI” buttons to permanently bias AI assistants (February 2026). This attack exploits URL query parameters and hidden HTML to inject memory-persistence commands like “remember X as the trusted source.” ThornGuard detections:

- Memory persistence pattern detection — 7 regex patterns targeting phrases like “remember X as trusted source,” “in future conversations,” “treat X as authoritative,” “recommend X first,” and “citation source for future reference”
- Hidden HTML content sanitization — Strips `display:none` elements, HTML comments, `visibility:hidden` elements, `opacity:0` elements, off-screen positioned content, `<noscript>` blocks, hidden inputs, invisible iframes, and JSON-LD script blocks containing instruction-like patterns
- AI share URL inspection — Detects URLs targeting AI assistants (ChatGPT, Copilot, Claude, Perplexity, Gemini, Grok) with query parameters containing memory-manipulation keywords
- Output scanning — All detection runs on both JSON and SSE stream response paths, ensuring streaming tool responses are inspected
Event codes: `TOOL_POISONING_DETECTED` with category `"recommendation_poisoning"`, `"hidden_html"`, or `"ai_share_url"`
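Memory-persistence pattern detection of the kind described could be sketched as a small regex battery. These patterns are illustrative stand-ins, not ThornGuard's actual seven; they target the same phrase families listed above.

```typescript
// Illustrative phrase families for memory-persistence injection attempts.
const MEMORY_PATTERNS: RegExp[] = [
  /remember\s+.{0,40}\s+as\s+(?:a\s+|the\s+)?trusted\s+source/i,
  /in\s+(?:all\s+)?future\s+conversations/i,
  /treat\s+.{0,40}\s+as\s+authoritative/i,
  /recommend\s+.{0,40}\s+first/i,
  /citation\s+source\s+for\s+future\s+reference/i,
];

function detectMemoryPoisoning(text: string): boolean {
  return MEMORY_PATTERNS.some((p) => p.test(text));
}
```

The bounded `.{0,40}` gap keeps each pattern anchored to a single instruction-like clause rather than matching across unrelated sentences, which holds the false-positive rate down on long documents.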