MITRE ATLAS is a knowledge base of adversarial tactics and techniques against
AI systems, maintained by MITRE Corporation. ThornGuard references ATLAS
technique identifiers to contextualize its detections — this does not imply
certification, endorsement, or formal compliance with any MITRE program.
## Summary Mapping
| ATLAS Technique | ID | ThornGuard Controls |
|---|---|---|
| LLM Prompt Injection | AML.T0051 | Tool poisoning detection, hidden Unicode stripping, ANSI/VT escape removal, instruction override scanning |
| ML Supply Chain Compromise | AML.T0010 | TOFU schema pinning (SHA-256 drift detection), tool inventory with risk scoring, approval workflows |
| Exfiltration via ML Inference API | AML.T0024 | PII/secret redaction (10+ pattern types), cross-server data flow governance, namespace isolation, taint tracking |
| Data Poisoning | AML.T0020 | Output scanning on inbound SSE streams, behavioral anomaly detection (EWMA, Page-Hinkley drift, Markov chains) |
| Evade ML Model | AML.T0015 | ANSI/VT control character stripping, hidden Unicode detection, sanitization of tool definitions and responses |
| Denial of ML Service | AML.T0029 | Per-license rate limiting (Durable Object-backed), custom policy engine with block/audit modes |
| ML-Enabled Product or Service Discovery | AML.T0014 | SSRF blocking (localhost, metadata IPs, DNS rebinding via DoH), IP whitelisting, origin validation |
| Credential Access via ML Service | AML.T0037 | OAuth 2.1 proxy with PKCE and token isolation, PII/secret redaction (AWS keys, GCP keys, GitHub tokens, JWTs), encrypted upstream credentials |
| Privilege Escalation via ML Service | AML.T0038 | RBAC team tokens, approval workflows for high-risk tools, structured policy rules with scope/mode/conditions |
| AI Agent Context Poisoning: Memory | AML.T0080 | Memory persistence pattern detection (7 regex patterns), hidden HTML content sanitization (7 stripping categories), AI share URL inspection, output scanning in JSON and SSE streams |
## Detailed Technique Coverage
### AML.T0051 — LLM Prompt Injection
Prompt injection in the MCP context means an upstream tool server embeds hidden instructions in tool definitions or responses that override the AI client’s intended behavior. ThornGuard addresses this at the transport layer, before payloads reach the LLM. Detections:

- Tool poisoning scanner — Inspects tool definitions for instruction overrides, system prompt manipulation attempts, and embedded directives that could hijack agent behavior.
- Hidden Unicode detection — Flags invisible or homoglyph characters (zero-width joiners, right-to-left overrides, lookalike substitutions) used to smuggle instructions past human review.
- ANSI/VT escape stripping — Removes terminal control sequences from tool definitions and responses that could manipulate terminal rendering or inject invisible content.
- Output scanning — Analyzes inbound SSE stream data for patterns consistent with indirect prompt injection in tool responses.
Event codes: `BLOCKED_MALICIOUS`, `PII_REDACTED` (when embedded secrets are part of an injection payload)
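The hidden Unicode check can be illustrated with a small sketch. This is illustrative TypeScript, not ThornGuard's actual character set or API; the range list is a simplified subset of the invisible and bidirectional characters such scanners target.

```typescript
// Illustrative subset: zero-width characters, bidi embeds/overrides,
// word joiners, and the BOM used as a zero-width no-break space.
const HIDDEN_UNICODE = /[\u200B-\u200F\u202A-\u202E\u2060-\u2064\uFEFF]/g;

// Report each hidden character found, as a U+XXXX code point label.
function findHiddenUnicode(text: string): string[] {
  return [...text.matchAll(HIDDEN_UNICODE)].map(
    (m) => "U+" + m[0].codePointAt(0)!.toString(16).toUpperCase().padStart(4, "0"),
  );
}

// Remove hidden characters before content reaches review or the LLM.
function stripHiddenUnicode(text: string): string {
  return text.replace(HIDDEN_UNICODE, "");
}
```

A scanner like this would flag the payload for audit (the code points found) and pass the stripped text downstream.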
### AML.T0010 — ML Supply Chain Compromise
In the MCP ecosystem, supply chain compromise manifests as tampered tool definitions — a server that silently changes what a tool does, adds new parameters, or alters its schema to exfiltrate data. Detections:

- TOFU schema pinning — On first observation of a tool, ThornGuard records a SHA-256 hash of its definition. Subsequent invocations are compared against the pinned schema; any drift triggers an alert or block, depending on policy configuration.
- Tool inventory and risk scoring — Every observed upstream tool is catalogued with a computed risk level based on its capabilities, parameter surface area, and behavioral history.
- Approval workflows — High-risk tool invocations can require explicit human approval before execution, preventing compromised tools from operating silently.
Event codes: `BLOCKED_APPROVAL`, `BLOCKED_POLICY`
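A minimal sketch of the TOFU pinning flow, assuming an in-memory pin store and a hash over a canonicalized JSON form (the store, helper names, and canonicalization choice are illustrative, not ThornGuard's actual implementation):

```typescript
import { createHash } from "node:crypto";

type PinStore = Map<string, string>; // tool name -> pinned SHA-256 hex digest

// Canonical JSON: object keys sorted so logically equal schemas hash equally.
function canonicalize(value: unknown): string {
  if (value === null || typeof value !== "object") return JSON.stringify(value);
  if (Array.isArray(value)) return "[" + value.map(canonicalize).join(",") + "]";
  const obj = value as Record<string, unknown>;
  const body = Object.keys(obj)
    .sort()
    .map((k) => JSON.stringify(k) + ":" + canonicalize(obj[k]))
    .join(",");
  return "{" + body + "}";
}

function schemaHash(definition: unknown): string {
  return createHash("sha256").update(canonicalize(definition)).digest("hex");
}

function checkTool(
  pins: PinStore,
  name: string,
  definition: unknown,
): "pinned" | "ok" | "drift" {
  const hash = schemaHash(definition);
  const pinned = pins.get(name);
  if (pinned === undefined) {
    pins.set(name, hash); // trust on first use
    return "pinned";
  }
  return pinned === hash ? "ok" : "drift";
}
```

Canonicalizing before hashing means cosmetic key reordering does not trip the detector, while any added parameter or changed description produces a drift result.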
### AML.T0024 — Exfiltration via ML Inference API
AI agents bridging multiple services create exfiltration paths that traditional network security does not cover. An agent might read credentials from one tool and pass them to another without any human in the loop. Detections:

- PII and secret redaction — Scans both outbound request parameters and inbound responses (including SSE streams) for 10+ pattern types: email addresses, SSNs, AWS keys, GCP keys, GitHub tokens, Slack tokens, private keys, JWTs, phone numbers, and credit cards (with IIN prefix + Luhn validation).
- Cross-server data flow governance — Enforces namespace isolation between MCP servers so data from one upstream cannot leak to another. Taint tracking follows sensitive values across tool invocations.
- Custom redaction rules — Enterprise customers can define regex-based redaction rules for domain-specific sensitive data (internal project codes, patient IDs, etc.).
Event codes: `PII_REDACTED`, `CUSTOM_REDACTION_AUDIT`, `BLOCKED_POLICY`
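The IIN prefix + Luhn validation mentioned for credit cards can be sketched as follows (the IIN list here is a simplified subset chosen for illustration):

```typescript
// Luhn checksum: double every second digit from the right, subtract 9
// from doubles above 9, and require the total to be divisible by 10.
function luhnValid(digits: string): boolean {
  let sum = 0;
  let double = false;
  for (let i = digits.length - 1; i >= 0; i--) {
    let d = digits.charCodeAt(i) - 48; // '0' => 0
    if (double) {
      d *= 2;
      if (d > 9) d -= 9;
    }
    sum += d;
    double = !double;
  }
  return sum % 10 === 0;
}

function looksLikeCardNumber(candidate: string): boolean {
  const digits = candidate.replace(/[\s-]/g, "");
  if (!/^\d{13,19}$/.test(digits)) return false;
  // A few major IIN prefixes: 4 Visa, 51-55 Mastercard, 34/37 Amex, 6011 Discover.
  const iinOk = /^(?:4|5[1-5]|3[47]|6011)/.test(digits);
  return iinOk && luhnValid(digits);
}
```

Requiring both a plausible issuer prefix and a valid checksum is what keeps arbitrary 16-digit numbers (order IDs, timestamps) from being redacted as cards.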
### AML.T0020 — Data Poisoning
When AI agents consume tool outputs and feed them back into downstream processes, a compromised upstream server can inject corrupted data that propagates through the pipeline. Detections:

- Behavioral anomaly detection — Monitors tool response patterns using three statistical methods:
  - EWMA (Exponentially Weighted Moving Average) — Detects gradual drift in response characteristics.
  - Page-Hinkley drift detection — Identifies abrupt changes in response distributions that indicate tampering.
  - Markov chain analysis — Models expected tool invocation sequences and flags anomalous transitions.
- Inbound SSE stream scanning — All response data from upstream servers passes through the redaction and scanning pipeline before reaching the AI client.
Event codes: `BLOCKED_POLICY`, `POLICY_AUDIT`
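An EWMA detector of the kind described can be sketched in a few lines: track an exponentially weighted mean and variance of some response metric (payload size, latency) and flag observations far from the baseline. The smoothing factor and threshold below are illustrative defaults, not ThornGuard's tuning.

```typescript
class EwmaDetector {
  private mean: number | null = null;
  private variance = 0;

  constructor(
    private readonly alpha = 0.2, // smoothing factor: higher = faster to adapt
    private readonly threshold = 3.0, // standard deviations before flagging
  ) {}

  // Returns true when `value` is anomalous relative to the smoothed baseline.
  observe(value: number): boolean {
    if (this.mean === null) {
      this.mean = value; // first observation seeds the baseline
      return false;
    }
    const deviation = value - this.mean;
    const std = Math.sqrt(this.variance);
    const anomalous = std > 0 && Math.abs(deviation) > this.threshold * std;
    // Exponentially weighted updates of mean and variance.
    this.mean += this.alpha * deviation;
    this.variance =
      (1 - this.alpha) * (this.variance + this.alpha * deviation * deviation);
    return anomalous;
  }
}
```

Because the anomaly check happens before the update, a single spike is flagged without immediately corrupting the baseline it is compared against.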
### AML.T0015 — Evade ML Model
Adversarial inputs designed to bypass detection are a concern at the transport layer: attackers may use encoding tricks, Unicode manipulation, or control characters to disguise malicious payloads. Detections:

- ANSI/VT control character stripping — Removes all terminal escape sequences from tool definitions and responses, eliminating a class of evasion techniques that exploit rendering differences between human review and machine parsing.
- Hidden Unicode detection — Catches zero-width characters, bidirectional overrides, and homoglyph substitutions that could make malicious content appear benign during review.
- Sanitization pipeline — Tool definitions and responses are normalized before security checks are applied, reducing the effectiveness of encoding-based evasion.
Event codes: `BLOCKED_MALICIOUS`
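A minimal ANSI/VT stripper looks roughly like this. It is a simplified sketch: it handles CSI sequences (colors, cursor movement) and single-character Fe escapes, but a production stripper would also cover OSC and DCS string sequences terminated by BEL or ST.

```typescript
// ESC followed by either a CSI sequence ("[" params intermediates final)
// or a single Fe escape byte in the @-_ range.
const ANSI_PATTERN = /\x1b(?:\[[0-9;?]*[ -\/]*[@-~]|[@-Z\\-_])/g;

function stripAnsi(input: string): string {
  return input.replace(ANSI_PATTERN, "");
}
```

Stripping rather than rejecting keeps benign tool output usable while removing the channel an attacker would use to hide content from a terminal-based reviewer.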
### AML.T0029 — Denial of ML Service
Overwhelming an MCP proxy or its upstream servers with requests degrades availability for legitimate users. Detections:

- Per-license rate limiting — Enforced via Cloudflare Durable Objects for strong consistency, with KV fallback for resilience. Limits are configurable per license tier.
- Custom policy engine — Policies can throttle or block specific tool invocations, target URLs, or request patterns in block or audit mode.
- Tiered enforcement — Individual and Enterprise plans have different rate limit ceilings, ensuring fair resource allocation.
Event codes: `BLOCKED_RATE_LIMIT`, `BLOCKED_POLICY`
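Per-license rate limiting of this kind is commonly implemented as a token bucket, the sort of counter state a Durable Object could hold for each license key. The class below is an illustrative sketch; capacity and refill rate are stand-ins for the tier-configured limits.

```typescript
class TokenBucket {
  private tokens: number;
  private lastRefill: number;

  constructor(
    private readonly capacity: number, // burst size
    private readonly refillPerSec: number, // sustained rate
  ) {
    this.tokens = capacity;
    this.lastRefill = 0;
  }

  // `now` is a timestamp in seconds; returns true if the request is allowed.
  allow(now: number): boolean {
    const elapsed = Math.max(0, now - this.lastRefill);
    this.tokens = Math.min(this.capacity, this.tokens + elapsed * this.refillPerSec);
    this.lastRefill = now;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}
```

A token bucket permits short bursts up to `capacity` while capping the sustained rate, which matches the per-tier ceilings described above better than a fixed window counter would.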
### AML.T0014 — ML-Enabled Product or Service Discovery
Reconnaissance against MCP infrastructure includes probing for available servers, enumerating tools, and scanning internal network endpoints through the proxy. Detections:

- SSRF blocking — Rejects requests targeting localhost, link-local addresses, cloud metadata endpoints (169.254.169.254), and private RFC 1918 ranges. Uses DNS-over-HTTPS resolution to defeat DNS rebinding attacks.
- IP whitelisting — Enterprise per-license restrictions limit which client IPs can access the proxy.
- Origin validation — Rejects requests from disallowed browser origins when enabled.
- Transport guardrails — Non-HTTPS targets are rejected outright, preventing downgrade probes.
Event codes: `BLOCKED_SSRF`, `BLOCKED_IP_WHITELIST`, `BLOCKED_ORIGIN`, `BLOCKED_INSECURE_TARGET`
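The resolved-address check at the heart of SSRF blocking can be sketched as below: after DNS resolution (over DoH, so a rebinding hostname cannot swap answers mid-flight), the resulting IP is tested against the blocked ranges. This is an illustrative IPv4-only sketch; a real filter must also cover IPv6 and mapped addresses.

```typescript
function isBlockedIPv4(ip: string): boolean {
  const parts = ip.split(".").map(Number);
  if (parts.length !== 4 || parts.some((p) => !Number.isInteger(p) || p < 0 || p > 255)) {
    return true; // malformed input: fail closed
  }
  const [a, b] = parts;
  if (a === 127) return true; // loopback
  if (a === 10) return true; // RFC 1918
  if (a === 172 && b >= 16 && b <= 31) return true; // RFC 1918
  if (a === 192 && b === 168) return true; // RFC 1918
  if (a === 169 && b === 254) return true; // link-local, incl. 169.254.169.254 metadata
  if (a === 0) return true; // "this network"
  return false;
}
```

Failing closed on malformed input matters here: ambiguous address spellings (octal, hex, integer forms) are a classic SSRF filter bypass.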
### AML.T0037 — Credential Access via ML Service
AI agents handle credentials throughout their operation — API keys in tool configurations, tokens in responses, secrets passed between services. An attacker who compromises the transport can harvest these credentials. Detections:

- OAuth 2.1 proxy — ThornGuard implements OAuth 2.1 with PKCE (S256) and token isolation. Upstream credentials are encrypted with AES-256-GCM and never exposed to the AI client in plaintext.
- Secret pattern redaction — Actively scans for and redacts AWS access keys, GCP service account keys, GitHub personal access tokens, Slack tokens, private key blocks, and JWTs in both request and response payloads.
- Encrypted credential storage — Upstream tokens obtained via OAuth exchange are stored encrypted in D1, with JTI-mapped proxy tokens providing an isolation layer.
Event codes: `PII_REDACTED`, `BLOCKED_AUTH`
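The PKCE S256 mechanic referenced above works as follows: the client generates a random verifier, sends its base64url-encoded SHA-256 digest as the challenge with the authorization request, and later proves possession by presenting the raw verifier at token exchange. A sketch (helper names are illustrative):

```typescript
import { createHash, randomBytes } from "node:crypto";

// Base64url per RFC 4648 section 5: '+' -> '-', '/' -> '_', no padding.
function base64url(buf: Buffer): string {
  return buf.toString("base64").replace(/\+/g, "-").replace(/\//g, "_").replace(/=+$/, "");
}

function makePkcePair(): { verifier: string; challenge: string } {
  const verifier = base64url(randomBytes(32)); // 43-char high-entropy secret
  const challenge = base64url(createHash("sha256").update(verifier).digest());
  return { verifier, challenge };
}

// Server-side check at token exchange time (S256 method).
function pkceMatches(verifier: string, challenge: string): boolean {
  return base64url(createHash("sha256").update(verifier).digest()) === challenge;
}
```

Because only the digest travels with the initial request, an attacker who intercepts the authorization code still cannot redeem it without the verifier.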
### AML.T0038 — Privilege Escalation via ML Service
An attacker may attempt to escalate privileges by invoking tools beyond their authorization level, manipulating approval workflows, or exploiting RBAC misconfigurations. Detections:

- RBAC team tokens — Admin and viewer roles with expiry and revocation metadata. Token lifecycle is tracked in D1 with structured audit trails.
- Approval workflows — High-risk tool invocations are matched against approval profiles. Requests are held pending explicit authorization, with capability caching to avoid repeated prompts for approved patterns.
- Structured policy rules — Scope, mode, and conditions are evaluated per-request. Policies can restrict tool access by license tier, team role, target URL, or RPC method.
- Custom blocklists — Per-license domain and command blocklists provide an additional layer of access restriction.
Event codes: `BLOCKED_APPROVAL`, `BLOCKED_POLICY`, `BLOCKED_AUTH`
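Structured policy rules with scope, mode, and conditions can be modeled roughly as below. The types and the first-match-wins semantics are an illustrative sketch, not ThornGuard's actual rule schema.

```typescript
type Mode = "block" | "audit";

interface RequestContext {
  tool: string;
  method: string;
  role: "admin" | "viewer";
}

interface PolicyRule {
  scope: { tool?: string; method?: string }; // omitted fields match anything
  mode: Mode;
  condition?: (ctx: RequestContext) => boolean; // optional extra predicate
}

// First matching rule wins; no match means the request is allowed.
function evaluate(rules: PolicyRule[], ctx: RequestContext): Mode | "allow" {
  for (const rule of rules) {
    const scopeHit =
      (rule.scope.tool === undefined || rule.scope.tool === ctx.tool) &&
      (rule.scope.method === undefined || rule.scope.method === ctx.method);
    if (scopeHit && (rule.condition === undefined || rule.condition(ctx))) {
      return rule.mode;
    }
  }
  return "allow";
}
```

Separating mode from scope is what lets the same rule shape drive both hard blocks and audit-only observation, as the block/audit modes above describe.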
### AML.T0080 — AI Agent Context Poisoning: Memory
Microsoft Defender researchers documented 31 companies embedding hidden instructions in “Summarize with AI” buttons to permanently bias AI assistants (February 2026). This attack exploits URL query parameters and hidden HTML to inject memory-persistence commands like “remember X as the trusted source.” ThornGuard detections:

- Memory persistence pattern detection — 7 regex patterns targeting phrases like “remember X as trusted source,” “in future conversations,” “treat X as authoritative,” “recommend X first,” and “citation source for future reference”
- Hidden HTML content sanitization — Strips `display:none` elements, HTML comments, `visibility:hidden` elements, `opacity:0` elements, off-screen positioned content, `<noscript>` blocks, hidden inputs, invisible iframes, and JSON-LD script blocks containing instruction-like patterns
- AI share URL inspection — Detects URLs targeting AI assistants (ChatGPT, Copilot, Claude, Perplexity, Gemini, Grok) with query parameters containing memory-manipulation keywords
- Output scanning — All detection runs on both JSON and SSE stream response paths, ensuring streaming tool responses are inspected
Event codes: `TOOL_POISONING_DETECTED` with category `"recommendation_poisoning"`, `"hidden_html"`, or `"ai_share_url"`
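Memory-persistence pattern detection of the kind described could be sketched as a small regex battery. These patterns are illustrative stand-ins, not ThornGuard's actual seven; they target the same phrase families listed above.

```typescript
// Illustrative phrase families for memory-persistence injection attempts.
const MEMORY_PATTERNS: RegExp[] = [
  /remember\s+.{0,40}\s+as\s+(?:a\s+|the\s+)?trusted\s+source/i,
  /in\s+(?:all\s+)?future\s+conversations/i,
  /treat\s+.{0,40}\s+as\s+authoritative/i,
  /recommend\s+.{0,40}\s+first/i,
  /citation\s+source\s+for\s+future\s+reference/i,
];

function detectMemoryPoisoning(text: string): boolean {
  return MEMORY_PATTERNS.some((p) => p.test(text));
}
```

The bounded `.{0,40}` gap keeps each pattern anchored to a single instruction-like clause rather than matching across unrelated sentences, which holds the false-positive rate down on long documents.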