AI/ML CVE Roundup: May 2026 — What Got Patched
A summary of AI and ML-adjacent CVEs disclosed in early–mid 2026 across model serving frameworks, LLM API gateways, agent SDKs, and ML training libraries.
The first half of 2026 produced a busy patch cycle for the ML infrastructure stack. Advisories dropped across model serving frameworks, LLM API gateways, agent SDKs, and ML training libraries. This roundup covers the disclosures worth tracking; we skip low-severity issues with no plausible exploitation path in standard deployments.
Severity labels follow CVSS v3.1 convention. Patch status is as of publication date (2026-05-06); check vendor advisories for updates.
CVE-2026-22778 — vLLM Heap Address Leak Enabling Remote Code Execution Chain
Severity: Critical (CVSS 9.8)
Affected component: vLLM 0.8.3 to before 0.14.1, multimodal endpoint
CWE: CWE-209 (Information Exposure Through an Error Message)
When an invalid image is sent to vLLM’s multimodal endpoint, the underlying PIL library throws an error and vLLM returns that error to the client — leaking a heap address in the process. With this leak, an attacker reduces ASLR entropy from roughly four billion guesses to about eight. NVD notes the leak can be chained with a heap overflow in the JPEG2000 decoder in OpenCV/FFmpeg to achieve remote code execution.
Exploitation: Network-reachable against any deployment that exposes the multimodal (image) input path to untrusted callers. The information leak alone is unauthenticated; the full RCE chain depends on the vulnerable image-decoding dependencies being present.
Patch status: Fixed in vLLM 0.14.1. Update and pin.
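The general defense against this class of leak is to keep decoder error text out of client responses. The following is a minimal sketch of that idea, not vLLM's actual fix: `decode_image`, `handle_image_request`, and `contains_address` are hypothetical names, and the stand-in decoder simply raises an error that embeds an object repr (which in CPython includes a memory address).

```python
import io
import logging
import re
import uuid

logger = logging.getLogger("image_endpoint")

# Matches hex pointers such as 0x7f3a2c1b4e50 that object reprs embed.
_ADDRESS_RE = re.compile(r"0x[0-9a-fA-F]{4,}")


def contains_address(text: str) -> bool:
    """Return True if the text contains something that looks like a pointer."""
    return _ADDRESS_RE.search(text) is not None


def decode_image(payload: bytes) -> None:
    """Hypothetical stand-in for an image decoder that rejects bad input.

    The error text embeds the repr of an in-memory buffer, which in CPython
    includes the object's address.
    """
    buffer = io.BytesIO(payload)
    raise ValueError(f"cannot identify image file {buffer!r}")


def handle_image_request(payload: bytes) -> dict:
    """Return a client-safe response; keep decoder details in server logs."""
    try:
        decode_image(payload)
    except Exception as exc:
        incident_id = uuid.uuid4().hex
        # Full detail goes to the server log, keyed by an opaque incident id.
        logger.warning("image decode failed [%s]: %s", incident_id, exc)
        return {
            "status": 400,
            "error": "invalid image input",
            "incident_id": incident_id,
        }
    return {"status": 200}
```

The client sees only a fixed message and an opaque incident id; operators can still correlate the id with the detailed log entry.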
CVE-2026-42208 — LiteLLM SQL Injection via Proxy API Key Check
Severity: Critical (CVSS 9.8)
Affected component: LiteLLM 1.81.16 to before 1.83.7, proxy API key check path
CWE: CWE-89 (SQL Injection)
A database query used during LiteLLM proxy API key checks mixed the caller-supplied key value into the query text instead of binding it as a separate parameter. An unauthenticated attacker could send a crafted Authorization header to any LLM API route (for example POST /chat/completions) and reach this query through the proxy’s error-handling path, enabling SQL injection against the proxy’s backing database.
Exploitation: Unauthenticated and network-reachable. Any LiteLLM proxy deployment in the affected version range that is exposed to untrusted clients is at risk; the injection is reachable through normal LLM API routes, not an administrative endpoint.
Patch status: Fixed in LiteLLM 1.83.7. This is the priority update for anyone running the LiteLLM gateway.
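The difference between mixing a value into query text and binding it as a parameter can be illustrated with a small sketch. This uses Python's standard `sqlite3` module and a made-up `api_keys` table; it is not LiteLLM's schema or code, and the function names are hypothetical.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with one illustrative API key."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE api_keys (token TEXT PRIMARY KEY, owner TEXT)")
    conn.execute("INSERT INTO api_keys VALUES ('sk-valid-123', 'alice')")
    return conn


def lookup_key_unsafe(conn: sqlite3.Connection, token: str):
    # Vulnerable pattern: the caller-supplied value is spliced into the SQL text.
    query = f"SELECT owner FROM api_keys WHERE token = '{token}'"
    return conn.execute(query).fetchone()


def lookup_key_safe(conn: sqlite3.Connection, token: str):
    # Bound parameter: the driver passes the value separately from the SQL text.
    return conn.execute(
        "SELECT owner FROM api_keys WHERE token = ?", (token,)
    ).fetchone()
```

With the input `' OR '1'='1`, the unsafe lookup matches a row even though no such key exists, while the bound-parameter lookup treats the whole string as a literal token and finds nothing.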
CVE-2026-27893 — vLLM Remote Code Execution via Hardcoded trust_remote_code
Severity: High (CVSS 8.8)
Affected component: vLLM 0.10.1 to before 0.18.0, model implementation files
CWE: CWE-94 (Improper Control of Generation of Code)
Two model implementation files in vLLM hardcoded trust_remote_code=True when loading sub-components, silently bypassing a user’s explicit --trust-remote-code=False security opt-out. This enables remote code execution via a malicious model repository even when the operator has deliberately disabled remote code trust — the user’s safety setting is ignored for those code paths.
Exploitation: Requires the server to load an attacker-influenced model repository. Notably, the usual mitigation (disabling trust_remote_code) does not protect against this path, which is what elevates the risk for operators who believed they had opted out.
Patch status: Fixed in vLLM 0.18.0.
Note: This is the recurring pattern in agentic and model-loading frameworks — code execution paths that fire on untrusted model content. The defensive posture is to treat any model artifact loaded from an untrusted source as untrusted code, and to verify that “safe” configuration flags are actually honored across the whole loading path.
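One way to check that a flag is honored across a codebase is to scan for call sites that hardcode it. The sketch below is an assumption-laden illustration, not a tool from the vLLM project: `find_hardcoded_trust` is a hypothetical helper that walks a Python syntax tree and reports calls passing the literal `trust_remote_code=True`, and `SAMPLE` is invented source text.

```python
import ast


def find_hardcoded_trust(source: str, filename: str = "<string>") -> list[tuple[int, str]]:
    """Return (line, callee) pairs where trust_remote_code is the literal True."""
    findings = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if not isinstance(node, ast.Call):
            continue
        for kw in node.keywords:
            if (
                kw.arg == "trust_remote_code"
                and isinstance(kw.value, ast.Constant)
                and kw.value.value is True
            ):
                findings.append((node.lineno, ast.unparse(node.func)))
    return findings


# Invented example source: one hardcoded call, one that forwards the user's setting.
SAMPLE = '''
def load(config, trust):
    a = AutoConfig.from_pretrained(config.path, trust_remote_code=True)
    b = AutoConfig.from_pretrained(config.path, trust_remote_code=trust)
    return a, b
'''
```

A scan like this only catches literal `True` values; a flag set indirectly (through a variable or a default) needs a review of the data flow rather than a syntax check.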
CVE-2026-7482 — Ollama Heap Out-of-Bounds Read in GGUF Model Loader
Severity: Critical (CVSS 9.1)
Affected component: Ollama before 0.17.1, GGUF model loader (/api/create)
CWE: CWE-125 (Out-of-Bounds Read)
Ollama’s /api/create endpoint accepts an attacker-supplied GGUF file in which the declared tensor offset and size exceed the file’s actual length. During quantization (in fs/ggml/gguf.go and server/quantization.go’s WriteTo()), the server reads past the allocated heap buffer. The leaked memory may include environment variables, API keys, system prompts, and other users’ conversation data, and can be exfiltrated by pushing the resulting model artifact through /api/push to an attacker-controlled registry. Both endpoints are unauthenticated in the upstream distribution.
Exploitation: Default deployments bind to 127.0.0.1, requiring local access. However, the documented OLLAMA_HOST=0.0.0.0 configuration is widely used in practice, and NVD notes large public-internet exposure has been observed. Network-accessible instances are remotely exploitable without authentication.
Patch status: Fixed in Ollama 0.17.1. This is a critical update for any Ollama deployment reachable beyond localhost.
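The missing check in this class of bug is a comparison between the declared tensor range and the real file length. The following sketch shows that check in isolation; `TensorInfo` and `validate_tensor_bounds` are hypothetical names, the descriptor is simplified, and this is not Ollama's code (which is written in Go, where integer overflow would also need handling).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TensorInfo:
    """Hypothetical descriptor; real GGUF headers carry more fields."""
    name: str
    offset: int  # byte offset of the tensor data, relative to the data section
    size: int    # declared byte length of the tensor data


def validate_tensor_bounds(tensors, data_start: int, file_size: int) -> None:
    """Raise ValueError if any declared tensor range falls outside the file."""
    for t in tensors:
        if t.offset < 0 or t.size < 0:
            raise ValueError(f"{t.name}: negative offset or size")
        end = data_start + t.offset + t.size
        if end > file_size:
            raise ValueError(
                f"{t.name}: declared range ends at byte {end}, "
                f"but the file has only {file_size} bytes"
            )
```

Running this validation before any tensor data is read means a file with inflated offsets or sizes is rejected up front instead of driving reads past the buffer.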
CVE-2026-26030 — Microsoft Semantic Kernel (Python) Remote Code Execution
Severity: Critical (CVSS 9.9)
Affected component: Semantic Kernel Python SDK before 1.39.4, InMemoryVectorStore filter functionality
CWE: CWE-94 (Improper Control of Generation of Code)
Microsoft’s Semantic Kernel Python SDK contains a remote code execution vulnerability in the InMemoryVectorStore filter functionality. Crafted filter input is evaluated in a way that permits arbitrary code execution within the host process — a serious exposure for any agent or RAG pipeline that builds vector-store filters from caller-influenced data.
Exploitation: Reachable in deployments that expose the in-memory vector store’s filtering to untrusted input. CVSS scoring reflects low attacker privilege and a scope change, consistent with the near-maximum 9.9 base score.
Patch status: Fixed in python-1.39.4. As a workaround, avoid using InMemoryVectorStore for production scenarios.
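A common way to avoid evaluating caller-influenced filter text as code is to parse it and interpret only an allowlisted subset. The sketch below illustrates that approach under stated assumptions: it is not Semantic Kernel's API or its patch, `matches` and `_evaluate` are hypothetical names, and the supported grammar (field names, constants, comparisons, `and`/`or`) is chosen only for illustration.

```python
import ast
import operator

# Only these comparison operators are interpreted; everything else is rejected.
_COMPARATORS = {
    ast.Eq: operator.eq,
    ast.NotEq: operator.ne,
    ast.Lt: operator.lt,
    ast.LtE: operator.le,
    ast.Gt: operator.gt,
    ast.GtE: operator.ge,
}


def _evaluate(node: ast.AST, record: dict):
    """Interpret an allowlisted expression node against a record."""
    if isinstance(node, ast.BoolOp) and isinstance(node.op, (ast.And, ast.Or)):
        values = [_evaluate(v, record) for v in node.values]
        return all(values) if isinstance(node.op, ast.And) else any(values)
    if (
        isinstance(node, ast.Compare)
        and len(node.ops) == 1
        and type(node.ops[0]) in _COMPARATORS
    ):
        left = _evaluate(node.left, record)
        right = _evaluate(node.comparators[0], record)
        return _COMPARATORS[type(node.ops[0])](left, right)
    if isinstance(node, ast.Name):
        if node.id not in record:
            raise ValueError(f"unknown field: {node.id}")
        return record[node.id]
    if isinstance(node, ast.Constant) and isinstance(node.value, (str, int, float, bool)):
        return node.value
    # Calls, attribute access, subscripts, lambdas, etc. all land here.
    raise ValueError(f"disallowed expression element: {type(node).__name__}")


def matches(filter_text: str, record: dict) -> bool:
    """Evaluate a restricted filter expression without eval() or exec()."""
    tree = ast.parse(filter_text, mode="eval")
    return bool(_evaluate(tree.body, record))
```

Because the interpreter never calls `eval`, an input such as `__import__('os').system('id')` is rejected as a disallowed `Call` node instead of being executed.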
CVE-2026-1839 — Hugging Face transformers Arbitrary Code Execution via Trainer Checkpoint
Severity: High (CVSS 7.8)
Affected component: Hugging Face transformers (Trainer class), when used with PyTorch 2.2 or later but below 2.6
CWE: CWE-502 (Deserialization of Untrusted Data)
The Trainer class’s _load_rng_state() method in src/transformers/trainer.py calls torch.load() without weights_only=True. On PyTorch versions below 2.6, the safe_globals() context manager provides no protection, so an attacker who supplies a malicious checkpoint file (such as rng_state.pth) can execute arbitrary code when the checkpoint is loaded. This affects training and fine-tuning workflows that resume from untrusted or attacker-influenced checkpoints.
Exploitation: Requires the victim to load a malicious checkpoint, so the attack surface is checkpoint provenance: downloaded checkpoints, shared training artifacts, or resumable jobs fed untrusted state. The NVD CVSS metrics rate this as a local attack vector that requires user interaction.
Mitigation / patch: Resolved in transformers v5.0.0rc3. Upgrade PyTorch to 2.6 or later, where torch.load() defaults to weights_only=True, and only resume training from checkpoints whose origin you control.
Summary
| CVE | Component | Severity | Patched |
|---|---|---|---|
| CVE-2026-22778 | vLLM heap leak → RCE chain | Critical 9.8 | Yes — 0.14.1 |
| CVE-2026-42208 | LiteLLM SQL injection | Critical 9.8 | Yes — 1.83.7 |
| CVE-2026-27893 | vLLM hardcoded trust_remote_code | High 8.8 | Yes — 0.18.0 |
| CVE-2026-7482 | Ollama GGUF heap OOB read | Critical 9.1 | Yes — 0.17.1 |
| CVE-2026-26030 | Semantic Kernel (Python) RCE | Critical 9.9 | Yes — python-1.39.4 |
| CVE-2026-1839 | transformers Trainer torch.load | High 7.8 | Yes — v5.0.0rc3 |
The pattern this cycle mirrors the past several quarters: serving infrastructure, model-loading paths, and LLM gateways remain the most exposed surface. The trust_remote_code and checkpoint-deserialization findings are worth treating as a design-level signal for any team running agentic or training workflows — code execution that fires on untrusted model content is a class of risk, not a single CVE.
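For teams tracking these fixes against deployed versions, the affected ranges stated in this roundup can be encoded as a simple lookup. This is a rough sketch under stated assumptions: `AFFECTED_RANGES`, `parse_version`, and `affected_by` are hypothetical names, the parser handles only plain dotted releases (so pre-release tags such as `5.0.0rc3` and prefixed tags such as `python-1.39.4` are deliberately left out), and a lower bound of `0` stands in for "all earlier versions".

```python
import re

# (package, CVE, first affected, first fixed), copied from the entries in this post.
AFFECTED_RANGES = [
    ("vllm", "CVE-2026-22778", "0.8.3", "0.14.1"),
    ("vllm", "CVE-2026-27893", "0.10.1", "0.18.0"),
    ("litellm", "CVE-2026-42208", "1.81.16", "1.83.7"),
    ("ollama", "CVE-2026-7482", "0", "0.17.1"),
]


def parse_version(text: str) -> tuple[int, ...]:
    """Parse a plain dotted release such as '0.14.1'; reject anything else."""
    if not re.fullmatch(r"\d+(\.\d+)*", text):
        raise ValueError(f"unsupported version format: {text!r}")
    return tuple(int(part) for part in text.split("."))


def affected_by(package: str, installed: str) -> list[str]:
    """Return the CVEs from this roundup whose range includes the version."""
    version = parse_version(installed)
    hits = []
    for name, cve, first_affected, first_fixed in AFFECTED_RANGES:
        if name != package:
            continue
        if parse_version(first_affected) <= version < parse_version(first_fixed):
            hits.append(cve)
    return hits
```

For example, `affected_by("vllm", "0.12.0")` falls inside both vLLM ranges listed above, while `affected_by("vllm", "0.18.0")` returns an empty list. Vendor advisories remain the authority on exact ranges.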
mlcves.com tracks ML-specific CVEs with searchable component and severity filters. Subscribe to their feed if you maintain a software bill of materials for your ML stack.
Sources
- NVD CVE Search: primary CVE record source.
- mlcves.com: ML-specific CVE database with component organization.