AI Alert
Monthly AI/ML CVE roundup covering LiteLLM, Ollama, vLLM vulnerabilities in inference and serving frameworks

AI/ML CVE Roundup: May 2026 — What Got Patched

A summary of AI and ML-adjacent CVEs disclosed in early–mid 2026 across model serving frameworks, LLM API gateways, agent SDKs, and ML training libraries.

By AI Alert Desk · 8 min read

The first months of 2026 produced a busy patch cycle for the ML infrastructure stack. This roundup covers the disclosures worth tracking; we skip low-severity issues with no plausible exploitation path in standard deployments.

Severity labels follow CVSS v3.1 convention. Patch status is as of publication date (2026-05-06); check vendor advisories for updates.
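For readers mapping scores to labels in their own tooling, the CVSS v3.1 qualitative scale can be sketched as below. The function name is ours, chosen for illustration; the thresholds are the ones published in the CVSS v3.1 specification.

```python
def cvss_v31_severity(score: float) -> str:
    """Map a CVSS v3.1 base score to its qualitative severity rating."""
    if not 0.0 <= score <= 10.0:
        raise ValueError(f"CVSS base score out of range: {score}")
    if score == 0.0:
        return "None"
    if score < 4.0:
        return "Low"       # 0.1 to 3.9
    if score < 7.0:
        return "Medium"    # 4.0 to 6.9
    if score < 9.0:
        return "High"      # 7.0 to 8.9
    return "Critical"      # 9.0 to 10.0
```

Under this scale, the 8.8 and 7.8 entries in this roundup land in High and everything at 9.0 or above is Critical.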


CVE-2026-22778 — vLLM Heap Address Leak Enabling Remote Code Execution Chain

Severity: Critical (CVSS 9.8)
Affected component: vLLM 0.8.3 to before 0.14.1, multimodal endpoint
CWE: CWE-209 (Information Exposure Through an Error Message)

When an invalid image is sent to vLLM’s multimodal endpoint, the underlying PIL library throws an error and vLLM returns that error to the client — leaking a heap address in the process. With this leak, an attacker reduces ASLR entropy from roughly four billion guesses to about eight. NVD notes the leak can be chained with a heap overflow in the JPEG2000 decoder in OpenCV/FFmpeg to achieve remote code execution.

Exploitation: Network-reachable against any deployment that exposes the multimodal (image) input path to untrusted callers. The information leak alone is unauthenticated; the full RCE chain depends on the vulnerable image-decoding dependencies being present.

Patch status: Fixed in vLLM 0.14.1. Update and pin.
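For teams auditing their own services for the same class of leak, here is a minimal sketch of keeping exception detail server-side and returning a generic error to the caller. It is not vLLM's actual fix. The function names, the logger name, and the assumption that addresses appear as `0x...` hex strings in exception text are ours.

```python
import logging
import re

logger = logging.getLogger("image_input")  # hypothetical logger name

# Matches pointer-like hex values such as 0x7f3a2c1b4d60.
_HEX_ADDRESS = re.compile(r"0x[0-9a-fA-F]+")


def redact_addresses(message: str) -> str:
    """Replace anything that looks like a memory address with a placeholder."""
    return _HEX_ADDRESS.sub("0x<redacted>", message)


def client_error_for(exc: Exception) -> dict:
    """Build the response body for a failed image decode.

    The full exception goes to the server log only; the client receives a
    fixed message that carries no exception text at all.
    """
    logger.warning("image decode failed: %r", exc)
    return {
        "error": "invalid_image",
        "message": "The supplied image could not be decoded.",
    }
```

Returning a fixed message is the stronger control; `redact_addresses` is only a fallback for places where some exception text must be surfaced.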


CVE-2026-42208 — LiteLLM SQL Injection via Proxy API Key Check

Severity: Critical (CVSS 9.8)
Affected component: LiteLLM 1.81.16 to before 1.83.7, proxy API key check path
CWE: CWE-89 (SQL Injection)

A database query used during LiteLLM proxy API key checks mixed the caller-supplied key value into the query text instead of binding it as a separate parameter. An unauthenticated attacker could send a crafted Authorization header to any LLM API route (for example POST /chat/completions) and reach this query through the proxy’s error-handling path, enabling SQL injection against the proxy’s backing database.

Exploitation: Unauthenticated and network-reachable. Any LiteLLM proxy deployment in the affected version range that is exposed to untrusted clients is at risk; the injection is reachable through normal LLM API routes, not an administrative endpoint.

Patch status: Fixed in LiteLLM 1.83.7. This is the priority update for anyone running the LiteLLM gateway.
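The difference between mixing a value into query text and binding it as a parameter can be illustrated with the standard-library `sqlite3` module. This is a generic sketch, not LiteLLM's code: the table, column, and function names are hypothetical, and LiteLLM's real backing database and query differ.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory database holding one valid key (demo data only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE api_keys (token TEXT PRIMARY KEY, owner TEXT)")
    conn.execute("INSERT INTO api_keys VALUES ('sk-valid', 'alice')")
    return conn


def lookup_unsafe(conn: sqlite3.Connection, token: str) -> list:
    # Vulnerable pattern: caller-controlled text becomes part of the SQL itself.
    query = f"SELECT owner FROM api_keys WHERE token = '{token}'"
    return conn.execute(query).fetchall()


def lookup_safe(conn: sqlite3.Connection, token: str) -> list:
    # Bound parameter: the driver passes the value as data, never as SQL.
    return conn.execute(
        "SELECT owner FROM api_keys WHERE token = ?", (token,)
    ).fetchall()
```

With the payload `x' OR '1'='1`, `lookup_unsafe` returns the row for `alice` even though no such key exists, while `lookup_safe` returns an empty list.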


CVE-2026-27893 — vLLM Remote Code Execution via Hardcoded trust_remote_code

Severity: High (CVSS 8.8)
Affected component: vLLM 0.10.1 to before 0.18.0, model implementation files
CWE: CWE-94 (Improper Control of Generation of Code)

Two model implementation files in vLLM hardcoded trust_remote_code=True when loading sub-components, silently bypassing a user’s explicit --trust-remote-code=False security opt-out. This enables remote code execution via a malicious model repository even when the operator has deliberately disabled remote code trust — the user’s safety setting is ignored for those code paths.

Exploitation: Requires the server to load an attacker-influenced model repository. Notably, the usual mitigation (disabling trust_remote_code) does not protect against this path, which is what elevates the risk for operators who believed they had opted out.

Patch status: Fixed in vLLM 0.18.0.

Note: This is the recurring pattern in agentic and model-loading frameworks — code execution paths that fire on untrusted model content. The defensive posture is to treat any model artifact loaded from an untrusted source as untrusted code, and to verify that “safe” configuration flags are actually honored across the whole loading path.
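One way to act on that posture is to inspect a downloaded model directory before any loader touches it. The sketch below is a heuristic under stated assumptions: it assumes a Hugging Face-style repository layout in which custom code is signaled by bundled `.py` files or by an `auto_map` entry in a `config.json`. The function names are hypothetical, and a clean result does not prove a repository is safe.

```python
import json
from pathlib import Path


def _contains_key(node, key: str) -> bool:
    """Return True if `key` appears anywhere in a nested JSON structure."""
    if isinstance(node, dict):
        return key in node or any(_contains_key(v, key) for v in node.values())
    if isinstance(node, list):
        return any(_contains_key(v, key) for v in node)
    return False


def repo_ships_custom_code(model_dir: str) -> bool:
    """Flag model directories that appear to carry executable custom code."""
    root = Path(model_dir)
    # Any bundled Python source is treated as potentially executable.
    if any(root.rglob("*.py")):
        return True
    # Check every config.json, including those of nested sub-components.
    for cfg in root.rglob("config.json"):
        try:
            data = json.loads(cfg.read_text(encoding="utf-8"))
        except (OSError, json.JSONDecodeError):
            return True  # unreadable config: fail closed
        if _contains_key(data, "auto_map"):
            return True
    return False
```

Walking nested directories matters here because the vLLM issue involved sub-components, not only the top-level model.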


CVE-2026-7482 — Ollama Heap Out-of-Bounds Read in GGUF Model Loader

Severity: Critical (CVSS 9.1)
Affected component: Ollama before 0.17.1, GGUF model loader (/api/create)
CWE: CWE-125 (Out-of-Bounds Read)

Ollama’s /api/create endpoint accepts an attacker-supplied GGUF file in which the declared tensor offset and size exceed the file’s actual length. During quantization (in fs/ggml/gguf.go and server/quantization.go’s WriteTo()), the server reads past the allocated heap buffer. The leaked memory may include environment variables, API keys, system prompts, and other users’ conversation data, and can be exfiltrated by pushing the resulting model artifact through /api/push to an attacker-controlled registry. Both endpoints are unauthenticated in the upstream distribution.
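The missing check is, in essence, a comparison of each declared tensor extent against the real file length. The following is an illustrative sketch of that validation, not Ollama's code (which is written in Go) and not its actual patch. The `TensorInfo` type and the idea of a `data_start` offset for the tensor data region are simplifying assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TensorInfo:
    name: str
    offset: int  # byte offset relative to the start of the tensor data region
    size: int    # declared size in bytes


def out_of_bounds_tensors(
    file_size: int, data_start: int, tensors: list[TensorInfo]
) -> list[str]:
    """Return the names of tensors whose declared extent exceeds the file."""
    bad = []
    for t in tensors:
        end = data_start + t.offset + t.size
        # Reject negative values and any extent that runs past end of file.
        if t.offset < 0 or t.size < 0 or end > file_size:
            bad.append(t.name)
    return bad
```

A loader would refuse the file if this list is non-empty, before allocating buffers or reading any tensor data. In languages with fixed-width integers, the addition itself also needs an overflow check.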

Exploitation: Default deployments bind to 127.0.0.1, requiring local access. However, the documented OLLAMA_HOST=0.0.0.0 configuration is widely used in practice, and NVD notes large public-internet exposure has been observed. Network-accessible instances are remotely exploitable without authentication.

Patch status: Fixed in Ollama 0.17.1. This is a critical update for any Ollama deployment reachable beyond localhost.
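To triage which hosts need the update first, a small check on the configured bind address can help. This sketch assumes the value of `OLLAMA_HOST` is a bare address, an `address:port` pair, or a URL. The helper name is hypothetical, and unknown hostnames are conservatively treated as exposed.

```python
import ipaddress
from urllib.parse import urlsplit


def is_loopback_bind(host_value: str) -> bool:
    """Return True if the configured bind address is loopback-only."""
    value = host_value.strip()
    if not value:
        return True  # unset: the default bind is 127.0.0.1, per the text above
    try:
        # Handles bare addresses such as "127.0.0.1", "0.0.0.0", or "::1".
        return ipaddress.ip_address(value).is_loopback
    except ValueError:
        pass
    try:
        # Handles "host:port", "[::1]:port", and full URLs.
        host = urlsplit(value if "://" in value else "//" + value).hostname
    except ValueError:
        return False
    if host is None:
        return False
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        return False  # unknown hostname: assume it is reachable
```

For example, `is_loopback_bind(os.environ.get("OLLAMA_HOST", ""))` could be run per host (with `import os`). A `False` result says nothing about firewalls or reverse proxies in front of the service, so treat it as a prompt to investigate, not a verdict.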


CVE-2026-26030 — Microsoft Semantic Kernel (Python) Remote Code Execution

Severity: Critical (CVSS 9.9)
Affected component: Semantic Kernel Python SDK before 1.39.4, InMemoryVectorStore filter functionality
CWE: CWE-94 (Improper Control of Generation of Code)

Microsoft’s Semantic Kernel Python SDK contains a remote code execution vulnerability in the InMemoryVectorStore filter functionality. Crafted filter input is evaluated in a way that permits arbitrary code execution within the host process — a serious exposure for any agent or RAG pipeline that builds vector-store filters from caller-influenced data.

Exploitation: Reachable in deployments that expose the in-memory vector store’s filtering to untrusted input. CVSS scoring reflects low attacker privilege and a scope change, consistent with the near-maximum 9.9 base score.

Patch status: Fixed in python-1.39.4. As a workaround, avoid using InMemoryVectorStore for production scenarios.
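Independent of the SDK fix, applications can avoid passing caller-influenced expressions to any filter engine by building predicates from an allowlist of fields and operators. The sketch below shows that general pattern. It does not use or reproduce the Semantic Kernel API, and the field names, operator names, and `build_filter` helper are all hypothetical.

```python
import operator
from typing import Any, Callable

ALLOWED_FIELDS = {"category", "year", "author"}  # hypothetical schema
ALLOWED_OPS = {
    "eq": operator.eq,
    "ne": operator.ne,
    "lt": operator.lt,
    "gt": operator.gt,
}


def build_filter(field: str, op: str, value: Any) -> Callable[[dict], bool]:
    """Build a record predicate from validated parts; nothing is evaluated as code."""
    if field not in ALLOWED_FIELDS:
        raise ValueError(f"field not allowed: {field!r}")
    if op not in ALLOWED_OPS:
        raise ValueError(f"operator not allowed: {op!r}")
    if not isinstance(value, (str, int, float, bool)):
        raise ValueError("filter value must be a plain scalar")
    compare = ALLOWED_OPS[op]

    def predicate(record: dict) -> bool:
        if field not in record:
            return False
        try:
            return bool(compare(record[field], value))
        except TypeError:
            return False  # incomparable types never match

    return predicate
```

A caller would then use `build_filter("year", "gt", 2024)` and apply the returned predicate to each record, so untrusted input only ever selects among predefined behaviors.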


CVE-2026-1839 — Hugging Face transformers Arbitrary Code Execution via Trainer Checkpoint

Severity: High (CVSS 7.8)
Affected component: Hugging Face transformers (Trainer class), when running with PyTorch 2.2 or later but below 2.6
CWE: CWE-502 (Deserialization of Untrusted Data)

The Trainer class’s _load_rng_state() method in src/transformers/trainer.py calls torch.load() without weights_only=True. On PyTorch versions below 2.6, the safe_globals() context manager provides no protection, so an attacker who supplies a malicious checkpoint file (such as rng_state.pth) can execute arbitrary code when the checkpoint is loaded. This affects training and fine-tuning workflows that resume from untrusted or attacker-influenced checkpoints.

Exploitation: Requires the victim to load a malicious checkpoint, so the attack surface is checkpoint provenance: downloaded checkpoints, shared training artifacts, or resumable jobs fed untrusted state. NVD's CVSS metrics rate this as a local attack vector that requires user interaction.

Mitigation / patch: Resolved in transformers v5.0.0rc3. Upgrade PyTorch to 2.6 or later, where torch.load() defaults to weights_only=True, and only resume training from checkpoints whose origin you control.
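The underlying risk is that pickle-based loading can import and call arbitrary globals named in the file. The standard library's documented "restricting globals" pattern shows the idea behind allowlist-style loading. This is a conceptual sketch using plain `pickle`, not PyTorch's `weights_only` implementation, and the allowlist contents are an assumption for illustration.

```python
import io
import pickle

# Only these (module, name) pairs may be resolved during unpickling.
_ALLOWED_GLOBALS = {("collections", "OrderedDict")}


class AllowlistUnpickler(pickle.Unpickler):
    """Unpickler that refuses to resolve any global outside the allowlist."""

    def find_class(self, module: str, name: str):
        if (module, name) in _ALLOWED_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"blocked global: {module}.{name}")


def restricted_loads(data: bytes):
    """Deserialize `data`, rejecting payloads that reference other globals."""
    return AllowlistUnpickler(io.BytesIO(data)).load()
```

Plain containers of numbers and strings load normally, while a payload whose `__reduce__` points at something like `os.system` fails with `UnpicklingError` before anything is called. For real checkpoints, passing `weights_only=True` to `torch.load()` is the relevant control.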


Summary

| CVE | Component | Severity | Patched |
| --- | --- | --- | --- |
| CVE-2026-22778 | vLLM heap leak → RCE chain | Critical 9.8 | Yes (0.14.1) |
| CVE-2026-42208 | LiteLLM SQL injection | Critical 9.8 | Yes (1.83.7) |
| CVE-2026-27893 | vLLM hardcoded trust_remote_code | High 8.8 | Yes (0.18.0) |
| CVE-2026-7482 | Ollama GGUF heap OOB read | Critical 9.1 | Yes (0.17.1) |
| CVE-2026-26030 | Semantic Kernel (Python) RCE | Critical 9.9 | Yes (python-1.39.4) |
| CVE-2026-1839 | transformers Trainer torch.load | High 7.8 | Yes (v5.0.0rc3) |
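For the Python packages in the summary table, a quick local audit can be sketched with `importlib.metadata`. The PyPI distribution names below are our assumptions, Ollama is omitted because it is not a Python package, and the version parser is deliberately simplistic; a real audit should use a proper version library such as `packaging`.

```python
import re
from importlib import metadata

# Fixed versions from the summary table. For vLLM, 0.18.0 covers both CVEs.
PATCHED = {
    "vllm": "0.18.0",
    "litellm": "1.83.7",
    "semantic-kernel": "1.39.4",
    "transformers": "5.0.0rc3",
}

_VERSION = re.compile(r"^v?(\d+)\.(\d+)(?:\.(\d+))?(?:\.?rc(\d+))?")


def version_key(text: str) -> tuple:
    """Turn a version string into a sortable tuple (simplified ordering)."""
    text = text.strip()
    match = _VERSION.match(text)
    if match is None:
        raise ValueError(f"unrecognized version: {text!r}")
    major, minor, patch, rc = match.groups()
    rest = text[match.end():]
    if rc is not None:
        rank = int(rc)            # release candidates sort by their number
    elif re.match(r"\.?(a|b|dev)", rest):
        rank = -1                 # alpha, beta, dev builds sort before any rc
    else:
        rank = float("inf")       # final releases sort after every rc
    return (int(major), int(minor), int(patch or 0), rank)


def is_patched(installed: str, fixed: str) -> bool:
    return version_key(installed) >= version_key(fixed)


def audit() -> dict:
    """Report patch status for whichever listed packages are installed."""
    report = {}
    for name, fixed in PATCHED.items():
        try:
            report[name] = is_patched(metadata.version(name), fixed)
        except metadata.PackageNotFoundError:
            continue  # package not present in this environment
    return report
```

Calling `audit()` returns a mapping such as package name to `True` or `False`; packages that are not installed are simply left out.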

The pattern this cycle mirrors the past several quarters: serving infrastructure, model-loading paths, and LLM gateways remain the most exposed surface. The trust_remote_code and checkpoint-deserialization findings are worth treating as a design-level signal for any team running agentic or training workflows — code execution that fires on untrusted model content is a class of risk, not a single CVE.

mlcves.com tracks ML-specific CVEs with searchable component and severity filters. Subscribe to their feed if you maintain a software bill of materials for your ML stack.


Sources

  1. NVD CVE Search
  2. mlcves.com — ML CVE Database