Machine Learning Security: Attack Taxonomy, Live CVEs, and Defense Priorities
A technical overview of machine learning security threats in 2026: NIST's adversarial ML taxonomy, MITRE ATLAS attack classes, the CVE-2025-62164 vLLM deserialization flaw, and actionable defense posture for security teams.
Machine learning security has matured into a distinct discipline with its own attack taxonomy, dedicated CVE pipeline, and defense tooling — yet most production ML deployments still treat model endpoints as trusted internal services. Two major public frameworks published within the past 14 months draw a sharper picture of the actual threat landscape, and a high-severity vLLM deserialization flaw from late 2025 illustrates exactly what the gap looks like in code.
The Attack Surface: What NIST and NCSC Now Formally Define
NIST AI 100-2 E2025 ↗, published March 2025, establishes the canonical taxonomy for adversarial machine learning. It organizes attacks into four broad classes: data poisoning, evasion, privacy, and abuse. Each class is further characterized by life-cycle stage (training vs. inference), attacker goal, and attacker knowledge (white-box vs. black-box). The document explicitly targets the standards gap — it is designed to inform future NIST guidance and procurement language, not just academic classification.
Just over a year later, in April 2026, the UK’s National Cyber Security Centre ↗ published a complementary framework that expands the taxonomy to seven attack classes:
- Model characterisation — probing model architecture and training methodology through query responses.
- Model inversion — recovering training data or model weights from output distributions.
- Training data poisoning — injecting malicious samples to alter decision boundaries.
- Malicious model training — compromising the training pipeline itself (compute infrastructure, dataset hosting, orchestration code).
- Model input manipulation — crafting inputs at inference time, including prompt injection against LLMs.
- Model artefact manipulation — modifying serialized model files post-training, before or during deployment.
- Model hardware attacks — physical-layer exploitation of inference hardware.
The NCSC notes that “the rapid development cycle, unique architectures, large model sizes, and prevalence of open-source components in ML systems create a significantly larger attack surface than traditional software.” That observation is borne out by the CVE record.
CVE-2025-62164: When Unsafe Deserialization Meets a Serving Framework
CVE-2025-62164 ↗, published November 20, 2025, carries a CVSS 3.1 base score of 8.8 (HIGH; vector AV:N/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H). It affects vLLM versions 0.10.2 through 0.11.0.
The root cause: vLLM’s Completions API endpoint accepts user-supplied prompt embeddings and deserializes them with torch.load() without sufficient input validation. PyTorch 2.8.0 disabled sparse tensor integrity checks by default; combined with vLLM’s unvalidated deserialization path, a low-privileged network attacker can craft a malicious tensor payload that triggers an out-of-bounds memory write (CWE-787 / CWE-123), resulting in denial of service or potential remote code execution.
Applicable CWEs: CWE-20 (Improper Input Validation), CWE-123 (Write-what-where Condition), CWE-502 (Deserialization of Untrusted Data), CWE-787 (Out-of-bounds Write).
Fixed version: vLLM 0.11.1. Organizations running vLLM behind an API gateway that strips arbitrary embedding inputs may have partial mitigation, but the fix should be applied regardless. This class of vulnerability — unsafe pickle/torch.load() deserialization — has appeared repeatedly across the ML stack and is not unique to vLLM.
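To make the failure mode concrete, here is a minimal sketch of the vulnerable pattern and a hardened variant. It is not vLLM's code; it assumes an endpoint that accepts a base64-encoded, torch-serialized tensor as a prompt embedding, and the validation shown is illustrative rather than the exact checks shipped in 0.11.1.

```python
# Illustrative sketch only; not vLLM source. Assumes a handler that receives a
# base64-encoded, torch-serialized tensor as a user-supplied prompt embedding.
import base64
import io

import torch


def load_prompt_embeds_unsafe(encoded: str) -> torch.Tensor:
    """Vulnerable pattern: user-controlled bytes flow straight into torch.load()."""
    raw = base64.b64decode(encoded)
    # With sparse tensor integrity checks disabled, a crafted payload can trigger
    # an out-of-bounds write (CWE-787) on top of the deserialization risk (CWE-502).
    return torch.load(io.BytesIO(raw))


def load_prompt_embeds_hardened(encoded: str, hidden_size: int) -> torch.Tensor:
    """Hardened pattern: restrict unpickling and validate the resulting tensor."""
    raw = base64.b64decode(encoded)
    # weights_only=True refuses arbitrary Python objects in the pickle stream.
    obj = torch.load(io.BytesIO(raw), map_location="cpu", weights_only=True)
    if not isinstance(obj, torch.Tensor):
        raise ValueError("prompt embedding must be a plain tensor")
    if obj.is_sparse or obj.layout is not torch.strided:
        raise ValueError("sparse / non-strided embeddings are rejected")
    if obj.dim() != 2 or obj.shape[-1] != hidden_size:
        raise ValueError("unexpected embedding shape")
    return obj.to(torch.float32)
```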
MITRE ATLAS: Operationalizing the Attack Lifecycle
MITRE ATLAS ↗ (Adversarial Threat Landscape for Artificial-Intelligence Systems) provides the practitioner-facing counterpart to the NIST taxonomy: a structured, ATT&CK-style matrix mapping 15 tactics and 66 techniques to real-world case studies. In October 2025, MITRE ATLAS added 14 new techniques specifically for agentic AI systems — autonomous agents that interact with external tools, APIs, and data sources — addressing attack paths that did not exist when the framework launched in 2020.
Relevant tactic categories for ML defenders include: ML attack staging (preparing poisoned datasets, weaponized model files), initial access (exploiting model serving endpoints, compromised model repositories), and impact (model degradation, backdoor activation, data exfiltration through model inversion).
Security teams already using ATT&CK Navigator can import the ATLAS layer directly. For teams building offensive red-team coverage for AI systems ↗, ATLAS provides the structured attack enumeration needed to scope exercises.
Supply Chain: The Open-Source Model Risk
A March 2026 DoD guidance document on AI/ML supply chain risks ↗ flags pre-trained model distribution as a primary vector. The concern is direct: organizations download weights from public repositories and load them into production without verifying provenance or scanning for serialized backdoors. Model artefact manipulation (NCSC class 6) requires no network access to the target environment — a compromised upstream weight file is sufficient.
Concrete risks in this category:
- Pickle-deserialized .pt/.bin files with embedded arbitrary Python code (a model supply-chain attack surface flagged as early as the 2017 BadNets work and still viable).
- GGUF/safetensors adoption reduces but does not eliminate deserialization risk; the safetensors format is considerably safer by design, but ecosystem migration is incomplete.
- Hugging Face Hub model cards do not constitute a security audit; automated scanning (e.g., with ModelScan or equivalent) is required before production use.
Defensive guardrails and scanning tooling ↗ for model inputs and outputs address inference-time threats but do not cover training-time or supply-chain compromise. The control must be applied at ingestion, not just at serving time.
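In code, an ingestion-time control can look like the sketch below: scan pickle-based artifacts before they ever reach a loader, and prefer safetensors where available. It assumes the ModelScan CLI is on the path; the invocation, exit-code handling, and format policy are illustrative assumptions, not a vendor recommendation.

```python
# Illustrative ingestion gate, not a complete pipeline. Tool names and flags are
# assumptions; check the scanner you actually deploy.
import subprocess
import sys
from pathlib import Path

import torch
from safetensors.torch import load_file  # tensor data only, no pickle


def scan_artifact(path: Path) -> bool:
    """Run ModelScan (or an equivalent scanner) against a downloaded artifact."""
    # `modelscan -p <path>`; a zero exit code is treated here as "no findings".
    result = subprocess.run(["modelscan", "-p", str(path)], capture_output=True, text=True)
    print(result.stdout)
    return result.returncode == 0


def ingest_model(path: Path) -> dict:
    """Policy sketch: safetensors loads directly; pickle formats must pass a scan."""
    if path.suffix == ".safetensors":
        return load_file(str(path))
    if path.suffix in {".pt", ".bin", ".pkl"}:
        if not scan_artifact(path):
            sys.exit(f"refusing to ingest {path}: scanner reported findings")
        # Even after a clean scan, restrict unpickling to tensor data.
        return torch.load(str(path), map_location="cpu", weights_only=True)
    raise ValueError(f"unsupported artifact format: {path.suffix}")
```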
What Defenders Should Do
Based on the NIST taxonomy, NCSC framework, and the live CVE record:
- Patch vLLM to 0.11.1 immediately if running 0.10.2–0.11.0. CVE-2025-62164 is network-exploitable with low privileges required.
- Audit all torch.load() call sites in inference code. Replace with torch.load(..., weights_only=True) or migrate model serialization to the safetensors format.
- Scan pre-trained model files before deployment. Run ModelScan or equivalent against every .pt, .bin, and .pkl artifact pulled from external sources, including Hugging Face Hub.
- Map your model serving endpoints against MITRE ATLAS using the ML attack staging and initial access tactics. Treat the model API as an untrusted external surface.
- Instrument model outputs for drift and anomalies. Backdoor activations and evasion attacks produce detectable distributional shifts when monitored at inference time ↗. Establish baselines at deployment and alert on deviation (a minimal sketch follows this list).
- Track NIST AI 100-2 E2025 and forthcoming NIST SP 600-series guidance. The taxonomy published in March 2025 is expected to anchor future federal procurement and compliance requirements. Policy timelines and EU AI Act alignment ↗ are tracking on a parallel path.
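Expanding on the output-monitoring item above, the example below baselines a per-request score (such as top-class softmax probability) at deployment and alerts when a rolling window drifts, using a two-sample Kolmogorov-Smirnov test. The choice of statistic, window size, and threshold are illustrative assumptions; the point is the baseline-then-alert structure.

```python
# Minimal output-drift monitor; the KS test, window size, and threshold are
# illustrative choices, not a prescription.
from collections import deque

import numpy as np
from scipy.stats import ks_2samp


class OutputDriftMonitor:
    def __init__(self, baseline_scores, window: int = 500, p_threshold: float = 0.01):
        self.baseline = np.asarray(baseline_scores, dtype=np.float64)  # captured at deployment
        self.recent = deque(maxlen=window)
        self.p_threshold = p_threshold

    def observe(self, score: float) -> bool:
        """Record one per-request score; return True when the window has drifted."""
        self.recent.append(score)
        if len(self.recent) < self.recent.maxlen:
            return False  # not enough live data yet
        _, p_value = ks_2samp(self.baseline, np.fromiter(self.recent, dtype=np.float64))
        # A small p-value means the live distribution no longer matches the baseline:
        # a possible signal of evasion traffic, backdoor triggers, or ordinary data drift.
        return p_value < self.p_threshold
```

In practice the monitored statistic might be embedding norms, token-level entropy, or per-class prediction rates rather than a single confidence score; the baseline-and-alert loop stays the same.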
Sources
- NIST AI 100-2 E2025 — Adversarial Machine Learning: A Taxonomy and Terminology of Attacks and Mitigations ↗. Published March 2025. Primary reference taxonomy for adversarial ML attacks and mitigations.
- NCSC — Understanding Adversarial Attacks Against Machine Learning and AI ↗. Published April 29, 2026. Seven-class attack framework with practitioner-focused threat modeling guidance.
- NVD / NIST — CVE-2025-62164 Detail ↗. vLLM unsafe deserialization, CVSS 8.8 HIGH. Fixed in vLLM 0.11.1.
- MITRE ATLAS — atlas.mitre.org ↗. ATT&CK-style adversarial ML technique matrix; 15 tactics, 66 techniques as of the October 2025 update.