
Machine Learning Security: Attack Taxonomy, Live CVEs, and Defense Priorities

A technical overview of machine learning security threats in 2026: NIST's adversarial ML taxonomy, MITRE ATLAS attack classes, the CVE-2025-62164 vLLM deserialization flaw, and actionable defense posture for security teams.

By AI Alert Desk · 8 min read

Machine learning security has matured into a distinct discipline with its own attack taxonomy, dedicated CVE pipeline, and defense tooling — yet most production ML deployments still treat model endpoints as trusted internal services. Two major public frameworks published within the past 14 months draw a sharper picture of the actual threat landscape, and a high-severity vLLM deserialization flaw from late 2025 illustrates exactly what the gap looks like in code.

The Attack Surface: What NIST and NCSC Now Formally Define

NIST AI 100-2 E2025, published March 2025, establishes the canonical taxonomy for adversarial machine learning. It organizes attacks along four axes: data poisoning, evasion, privacy, and abuse. Each axis maps to life cycle stage (training vs. inference), attacker goal, and attacker knowledge (white-box vs. black-box). The document explicitly targets the standards gap — it is designed to inform future NIST guidance and procurement language, not just academic classification.

A month later, the UK’s National Cyber Security Centre published a complementary framework that expands the taxonomy to seven attack classes:

  1. Model characterisation — probing model architecture and training methodology through query responses.
  2. Model inversion — recovering training data or model weights from output distributions.
  3. Training data poisoning — injecting malicious samples to alter decision boundaries.
  4. Malicious model training — compromising the training pipeline itself (compute infrastructure, dataset hosting, orchestration code).
  5. Model input manipulation — crafting inputs at inference time, including prompt injection against LLMs.
  6. Model artefact manipulation — modifying serialized model files post-training, before or during deployment.
  7. Model hardware attacks — physical-layer exploitation of inference hardware.

The NCSC notes that “the rapid development cycle, unique architectures, large model sizes, and prevalence of open-source components in ML systems create a significantly larger attack surface than traditional software.” That observation is borne out by the CVE record.

CVE-2025-62164: When Unsafe Deserialization Meets a Serving Framework

CVE-2025-62164, published November 20, 2025, carries a CVSS 3.1 base score of 8.8 HIGH (vector: AV:N/AC:L/PR:L/UI:N/S:U/C:H/I:H/A:H). It affects vLLM versions 0.10.2 through 0.11.0.

The root cause: vLLM’s Completions API endpoint accepts user-supplied prompt embeddings and deserializes them with torch.load() without sufficient input validation. PyTorch 2.8.0 disabled sparse tensor integrity checks by default; combined with vLLM’s unvalidated deserialization path, a low-privileged network attacker can craft a malicious tensor payload that triggers an out-of-bounds memory write (CWE-787 / CWE-123), resulting in denial of service or potential remote code execution.
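The actual exploit hinges on a malformed sparse tensor, but the underlying hazard class (CWE-502, deserialization of untrusted data) can be sketched with plain pickle, the machinery torch.load() builds on. The MaliciousPayload class below is purely illustrative, not the vLLM exploit; a harmless eval stands in for attacker-controlled execution:

```python
import pickle

# Illustrative only: shows why deserializing untrusted bytes is dangerous.
# pickle invokes __reduce__ at load time, so the attacker chooses what runs.
class MaliciousPayload:
    def __reduce__(self):
        # A real payload would reach for os.system or similar; evaluating a
        # harmless expression stands in for attacker-controlled code here.
        return (eval, ("len('pwned')",))

blob = pickle.dumps(MaliciousPayload())   # attacker-controlled bytes
result = pickle.loads(blob)               # deserialization executes eval(...)
print(result)                             # 5 — code ran during loads()
```

Any code path that hands untrusted bytes to this machinery inherits the same property, which is why hardened loading (weights_only, safetensors) matters well beyond this one CVE.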

Applicable CWEs: CWE-20 (Improper Input Validation), CWE-123 (Write-what-where Condition), CWE-502 (Deserialization of Untrusted Data), CWE-787 (Out-of-bounds Write).

Fixed version: vLLM 0.11.1. Organizations running vLLM behind an API gateway that strips arbitrary embedding inputs may have partial mitigation, but the fix should be applied regardless. This class of vulnerability — unsafe pickle/torch.load() deserialization — has appeared repeatedly across the ML stack and is not unique to vLLM.

MITRE ATLAS: Operationalizing the Attack Lifecycle

MITRE ATLAS (Adversarial Threat Landscape for Artificial-Intelligence Systems) provides the practitioner-facing counterpart to the NIST taxonomy: a structured, ATT&CK-style matrix mapping 15 tactics and 66 techniques to real-world case studies. In October 2025, MITRE ATLAS added 14 new techniques specifically for agentic AI systems — autonomous agents that interact with external tools, APIs, and data sources — addressing attack paths that did not exist when the framework launched in 2020.

Relevant tactic categories for ML defenders include: ML attack staging (preparing poisoned datasets, weaponized model files), initial access (exploiting model serving endpoints, compromised model repositories), and impact (model degradation, backdoor activation, data exfiltration through model inversion).

Security teams already using ATT&CK Navigator can import the ATLAS layer directly. For teams building offensive red-team coverage for AI systems, ATLAS provides the structured attack enumeration needed to scope exercises.
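As a sketch of that workflow, a Navigator layer is just a JSON document listing technique IDs. The field names below follow the ATT&CK Navigator layer schema; the domain string and technique ID shown are assumptions that should be verified against the current ATLAS data before use:

```python
import json

# Hypothetical minimal Navigator layer scoping one ATLAS technique for a
# red-team exercise. The "domain" value and technique ID are placeholders
# to check against the published ATLAS dataset.
layer = {
    "name": "AI red-team scope",
    "domain": "atlas-atlas",
    "techniques": [
        {"techniqueID": "AML.T0010", "comment": "supply chain compromise", "score": 1},
    ],
}

serialized = json.dumps(layer, indent=2)
# Round-trip to confirm the layer is well-formed JSON before importing it.
assert json.loads(serialized)["techniques"][0]["techniqueID"] == "AML.T0010"
```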

Supply Chain: The Open-Source Model Risk

A March 2026 DoD guidance document on AI/ML supply chain risks flags pre-trained model distribution as a primary vector. The concern is direct: organizations download weights from public repositories and load them into production without verifying provenance or scanning for serialized backdoors. Model artefact manipulation (NCSC class 6) requires no network access to the target environment — a compromised upstream weight file is sufficient.

Defensive guardrails and scanning tooling for model inputs and outputs address inference-time threats but do not cover training-time or supply-chain compromise. The control must be applied at ingestion, before an externally sourced artifact enters the pipeline, not just at serving time.
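As a minimal sketch of what ingestion-time scanning looks for, the standard-library pickletools module can walk a pickle stream without executing it and flag GLOBAL/STACK_GLOBAL opcodes, the mechanism a pickle payload uses to reach callables such as os.system. Dedicated scanners like ModelScan do this far more thoroughly; the helper below is illustrative only:

```python
import pickle
import pickletools

def find_pickle_imports(payload: bytes) -> list[str]:
    """List module.name references a pickle stream would import on load,
    without ever executing the stream."""
    names, string_stack = [], []
    for opcode, arg, _pos in pickletools.genops(payload):
        if opcode.name in ("SHORT_BINUNICODE", "BINUNICODE", "UNICODE"):
            string_stack.append(arg)          # candidate STACK_GLOBAL operands
        if opcode.name == "GLOBAL":
            names.append(arg.replace(" ", "."))
        elif opcode.name == "STACK_GLOBAL" and len(string_stack) >= 2:
            names.append(f"{string_stack[-2]}.{string_stack[-1]}")
    return names

# A benign pickle references no importable callables...
assert find_pickle_imports(pickle.dumps({"weights": [0.1, 0.2]})) == []

# ...while a __reduce__-based payload is flagged before any code runs.
class Suspicious:
    def __reduce__(self):
        return (print, ("side effect",))

assert "builtins.print" in find_pickle_imports(pickle.dumps(Suspicious()))
```

The point is architectural: the scan happens on bytes at ingestion, so a flagged artifact never reaches a loader at all.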

What Defenders Should Do

Based on the NIST taxonomy, NCSC framework, and the live CVE record:

  1. Patch vLLM to 0.11.1 immediately if running 0.10.2–0.11.0. CVE-2025-62164 is network-exploitable with low privileges required.
  2. Audit all torch.load() call sites in inference code. Replace with torch.load(..., weights_only=True) or migrate model serialization to safetensors format.
  3. Scan pre-trained model files before deployment. Run ModelScan or equivalent against every .pt, .bin, .pkl artifact pulled from external sources, including Hugging Face Hub.
  4. Map your model serving endpoints against MITRE ATLAS using the ML attack staging and initial access tactics. Treat the model API as an untrusted external surface.
  5. Instrument model outputs for drift and anomalies. Backdoor activations and evasion attacks produce detectable distributional shifts when monitored at inference time. Establish baselines at deployment and alert on deviation.
  6. Track NIST AI 100-2 E2025 and forthcoming NIST SP 600-series guidance. The taxonomy published in March 2025 is expected to anchor future federal procurement and compliance requirements; policy timelines and EU AI Act alignment are proceeding on a parallel track.
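A minimal version of the baseline-and-alert loop from item 5 needs nothing beyond the standard library. The z-score threshold and window sizes here are illustrative choices, not tuned values:

```python
import statistics

class DriftMonitor:
    """Flags inference-time score windows that drift from a deployment baseline."""

    def __init__(self, baseline: list[float], z_threshold: float = 3.0):
        self.mean = statistics.fmean(baseline)
        # Guard against a degenerate all-constant baseline.
        self.std = statistics.pstdev(baseline) or 1e-9
        self.z_threshold = z_threshold

    def is_anomalous(self, window: list[float]) -> bool:
        # z-score of the window mean against the deployment baseline
        z = abs(statistics.fmean(window) - self.mean) / (
            self.std / len(window) ** 0.5
        )
        return z > self.z_threshold

monitor = DriftMonitor(baseline=[0.4, 0.5, 0.6] * 10)
print(monitor.is_anomalous([0.49, 0.51, 0.50, 0.52]))  # False: near baseline
print(monitor.is_anomalous([0.92, 0.95, 0.91, 0.94]))  # True: shifted outputs
```

In production the monitored statistic would be something richer than a scalar mean (per-class confidence distributions, embedding norms), but the structure of establishing a baseline at deployment and alerting on deviation is the same.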

Sources

  1. NIST AI 100-2 E2025: Adversarial Machine Learning — A Taxonomy and Terminology
  2. NCSC: Understanding Adversarial Attacks Against Machine Learning and AI
  3. CVE-2025-62164 — NVD Detail
  4. MITRE ATLAS — Adversarial Threat Landscape for Artificial-Intelligence Systems
#machine-learning #adversarial-ml #cve #data-poisoning #supply-chain #model-security