AI Alert

CVE-2026-7845: Hash collision in Langchain-Chatchat lets attackers swap pasted images

A weak-hash flaw in Langchain-Chatchat up to 0.3.1.3 lets an adjacent attacker overwrite pasted images by colliding MD5 hashes computed from PIL.Image.tobytes. No vendor patch has shipped.

By Theo Voss · 8 min read

A weak-hash vulnerability tracked as CVE-2026-7845 affects every release of chatchat-space/Langchain-Chatchat up to and including 0.3.1.3. The flaw sits in the Vision Chat paste-image handler, where the server uses an MD5 hash computed over PIL.Image.tobytes() output as the on-disk filename. Because tobytes() discards palette and metadata information, an attacker on the same network can craft two visually different PNGs that hash to the same name, then overwrite a victim’s uploaded image so the vision LLM analyzes attacker-controlled content.

The bug was reported to the project on April 13, 2026 as issue #5462. The maintainers have not responded. A working proof-of-concept and exploitation walk-through are public.

Affected

All releases of chatchat-space/Langchain-Chatchat up to and including 0.3.1.3. No patched version has been published.

The vulnerability

Langchain-Chatchat is a self-hosted RAG and chat platform built on top of LangChain, with a Streamlit-based web UI. Its dialogue page accepts pasted images for vision-model conversations. The handler computes an identifier for each paste with a single line of code, paraphrased from the public write-up:

name = hashlib.md5(paste_image.image_data.tobytes()).hexdigest() + ".png"

There are two problems stacked on top of each other.

First, MD5 has been a broken hash for collision resistance for nearly two decades. Using it for any kind of integrity-binding identifier is a CWE-328 finding on its own.

Second, and operationally more important: PIL’s Image.tobytes() method serializes only the raw pixel buffer. For an image opened in palette mode (P), that buffer is the array of palette indices. The actual color palette, the image dimensions in some cases, the color mode, transparency tables, and any ancillary PNG chunks are all dropped. That means two PNGs with identical index arrays but different palettes — one drawing an Apple logo, the other drawing a Google logo, to use the proof-of-concept’s example — produce byte-for-byte identical input to MD5. They get the same filename.
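The collision is easy to reproduce with Pillow directly. Here is a minimal sketch, independent of the Langchain-Chatchat codebase, that builds two paletted images sharing an index array but differing in palette, then shows they produce the same MD5 digest:

```python
import hashlib

from PIL import Image  # Pillow

def make_paletted(indices, palette):
    # Build a 4x4 palette-mode ("P") image from an index array and a
    # 768-entry (256 colors x RGB) palette list.
    img = Image.new("P", (4, 4))
    img.putdata(indices)
    img.putpalette(palette)
    return img

# Identical index arrays for both images: an alternating 0/1 pattern.
indices = [0, 1] * 8

# Different palettes: index 1 maps to red in one image, blue in the other.
red = [0, 0, 0, 255, 0, 0] + [0] * 762
blue = [0, 0, 0, 0, 0, 255] + [0] * 762

a = make_paletted(indices, red)
b = make_paletted(indices, blue)

# tobytes() serializes only the raw index buffer and drops the palette,
# so two visually different images hash to the same MD5 digest.
h_a = hashlib.md5(a.tobytes()).hexdigest()
h_b = hashlib.md5(b.tobytes()).hexdigest()
print(h_a == h_b)  # True: same filename for different images
```

The index buffers are byte-for-byte equal, so any hash over them — strong or weak — collides; MD5 merely adds a second, independent weakness.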

The tobytes() choice converts a weak-hash gripe into a working filename collision. The exploit chain described in the public PoC is straightforward:

  1. The attacker pre-builds a pair of P-mode PNGs that share the same pixel-index array but render different images.
  2. A victim uploads or pastes image A into a Vision Chat session. The server stores it under the MD5-of-tobytes() filename.
  3. The attacker, present on the same network and authenticated to the instance, uploads image B via the platform’s /v1/files endpoint.
  4. Image B writes to the same filename, overwriting A on disk.
  5. When the backend later fetches the image to send to the vision model, the model receives the attacker’s content. The chat session believes it is reasoning about the victim’s image.
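The overwrite step in the chain above can be sketched as a toy storage handler. The function name and the placeholder byte strings are illustrative, not the project's actual code; only the naming pattern mirrors the vulnerable line:

```python
import hashlib

def store_pasted_image(storage: dict, pixel_bytes: bytes, png_bytes: bytes) -> str:
    # Mirrors the vulnerable pattern: the filename is derived from the raw
    # pixel buffer (tobytes()-style), but the artifact stored is the PNG.
    name = hashlib.md5(pixel_bytes).hexdigest() + ".png"
    storage[name] = png_bytes  # silent overwrite on a name collision
    return name

storage = {}
indices = bytes([0, 1] * 8)  # index array shared by both crafted PNGs
victim_png = b"...victim PNG, red palette..."      # placeholder bytes
attacker_png = b"...attacker PNG, blue palette..."  # placeholder bytes

n1 = store_pasted_image(storage, indices, victim_png)
n2 = store_pasted_image(storage, indices, attacker_png)
print(n1 == n2)                     # True: identical filename
print(storage[n1] == attacker_png)  # True: victim's upload replaced
```

Nothing in the handler detects the second write, which is why the vision model later receives the attacker's content with no error anywhere in the pipeline.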

The CVSS profile reflects the constraints: AV:A (adjacent network), AC:H (the attacker has to land the overwrite during a timing window), PR:L (the attacker needs an authenticated account on the instance), and VI:L (integrity-only impact, no direct confidentiality or availability damage). NVD scores it 2.6. That is mathematically correct and operationally misleading. For deployments that use Langchain-Chatchat as a multi-user vision assistant — the configuration the feature was built for — the practical effect is that one tenant can poison another tenant’s image inputs to the LLM.

The exploit also opens a race-condition channel: if the upload-then-fetch window is short, the attacker can target the brief interval between the victim’s paste and the vision model’s image fetch. The PoC documents API call sequences for this scenario.

Mitigation

There is no patched release. The upstream issue remains open and the project has not commented since April. The recommended source-level fix from the disclosure is to hash the complete encoded PNG byte stream with SHA-256, which preserves palette and metadata in the input to the hash:

import hashlib
import io

# Encode the complete PNG — palette, mode, and metadata included — then hash.
buffer = io.BytesIO()
paste_image.image_data.save(buffer, format="PNG")
name = hashlib.sha256(buffer.getvalue()).hexdigest() + ".png"
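To see why hashing the encoded stream closes the hole, here is a hedged check using Pillow: two paletted images with identical index arrays but different palettes collide under the old tobytes() scheme and diverge under the new one. The `fixed_name` helper is mine, not code from the disclosure:

```python
import hashlib
import io

from PIL import Image  # Pillow

def fixed_name(img: Image.Image) -> str:
    # Hash the fully encoded PNG so palette differences change the digest.
    buf = io.BytesIO()
    img.save(buf, format="PNG")
    return hashlib.sha256(buf.getvalue()).hexdigest() + ".png"

indices = [0, 1] * 8
a = Image.new("P", (4, 4))
a.putdata(indices)
a.putpalette([255, 0, 0] * 256)  # all-red palette
b = Image.new("P", (4, 4))
b.putdata(indices)
b.putpalette([0, 0, 255] * 256)  # all-blue palette

print(a.tobytes() == b.tobytes())      # True: old hash input collides
print(fixed_name(a) == fixed_name(b))  # False: encoded PNGs differ
```

The PNG encoder writes the palette into the PLTE chunk, so the two files differ on disk and the SHA-256 digests diverge.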

Operators running Langchain-Chatchat in any multi-user configuration should apply that change locally or restrict the deployment until upstream merges a fix. Specifically:

  1. Patch in place. Replace the MD5-of-tobytes() line in libs/chatchat-server/chatchat/webui_pages/dialogue/dialogue.py with a SHA-256 hash over the full PNG-encoded buffer, or namespace filenames per session/user so collisions cannot cross trust boundaries.
  2. Treat the upload directory as a per-session boundary. Even with a hashed filename, store pasted images under a per-session or per-user prefix so a collision in another tenant’s namespace cannot overwrite anyone else’s content.
  3. Restrict instance access. The bug requires PR:L and AV:A. Limit who has accounts on the instance and which networks can reach it; do not expose multi-tenant Langchain-Chatchat instances to untrusted users without the patch.
  4. Watch upstream. Pin to 0.3.1.3 or the version you have audited, monitor issue #5462 for a maintainer response, and re-audit before bumping.
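The per-session namespacing from item 2 can be sketched in a few lines. `namespaced_name`, the session IDs, and the placeholder PNG bytes are hypothetical names for illustration, not part of the project:

```python
import hashlib
import os

def namespaced_name(session_id: str, png_bytes: bytes) -> str:
    # Hash the full encoded PNG, then prefix with the session ID so a
    # collision can never cross into another tenant's files.
    digest = hashlib.sha256(png_bytes).hexdigest()
    return os.path.join(session_id, digest + ".png")

# Different encoded PNGs (same indices, different palettes) now get
# different digests, and different sessions get disjoint directories
# even for identical uploads.
n_victim = namespaced_name("session-a", b"\x89PNG...red palette...")
n_attacker = namespaced_name("session-b", b"\x89PNG...blue palette...")
print(n_victim == n_attacker)  # False
```

Either defense alone breaks the published exploit chain; together they also contain any future collision in whatever hash is chosen.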

The advisory is a low-severity CVE on paper, but it is also a clear illustration of why hash inputs in ML platforms need to include the metadata the model will actually see. A vision LLM does not consume tobytes(); it consumes the rendered PNG, including the palette. Hashing anything less than the artifact you serve is an integrity gap waiting for a collision PoC, and this one already has one.

Sources

  1. NVD — CVE-2026-7845
  2. 3em0/cve_repo — Vuln-1 tobytes Hash Collision write-up
  3. Langchain-Chatchat issue #5462 (security report)
#cve #langchain-chatchat #weak-hash #vision-llm #md5 #image-collision