GHSA-6PR9-RP53-2PMC
Vulnerability from github – Published: 2026-06-17 14:06 – Updated: 2026-06-17 14:06
Summary
vLLM's /v1/audio/transcriptions endpoint limits compressed upload size but not decoded PCM output. A 25MB OPUS file decodes to ~5.7GB of float32 PCM and drives peak RSS to ~14.9GB from a single request. Tested on vLLM v0.19.0.
Details
SpeechToTextProcessor rejects uploads over VLLM_MAX_AUDIO_CLIP_FILESIZE_MB (default 25MB) based on compressed byte length, but the audio decoder in audio.py accumulates all decoded frames into memory with no size limit before returning:
```python
# speech_to_text.py L184-189
if len(audio_data) / 1024 ** 2 > self.max_audio_filesize_mb:
    raise VLLMValidationError(...)
y, sr = load_audio(buf, sr=self.asr_config.sample_rate)  # decoded size unchecked

# audio.py L77-107
chunks: list[npt.NDArray] = []
for frame in container.decode(stream):
    chunks.append(frame.to_ndarray())
audio = np.concatenate(chunks, axis=-1).astype(np.float32)  # single contiguous allocation
```
A 25MB OPUS file at 6kbps encodes ~8.7 hours of audio. Decoding produces ~5.7GB of float32 PCM (232x amplification), and np.concatenate then allocates a second contiguous array, bringing peak RSS to ~14.9GB from a single request. SpeechToTextConfig.max_audio_clip_s (default 30s) applies only after the full decode and does not prevent the allocation.
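The amplification figure can be roughly sanity-checked with a few lines of arithmetic. The sketch below takes the ~8.7 hour duration from the advisory and assumes the audio decodes as 48 kHz mono float32; the sample rate and channel count are assumptions, since the advisory does not state them. The helper name `decoded_pcm_bytes` is hypothetical and not part of vLLM.

```python
# Rough estimate of decoded PCM size for the case described above.
# Assumptions (not stated in the advisory): 48 kHz sample rate, mono output.

COMPRESSED_BYTES = 25 * 1024 * 1024   # 25MB upload limit
DURATION_S = 31_320                   # ~8.7 hours, as reported in the advisory
SAMPLE_RATE_HZ = 48_000               # assumed decode rate
CHANNELS = 1                          # assumed mono
BYTES_PER_SAMPLE = 4                  # float32


def decoded_pcm_bytes(duration_s: int, sample_rate_hz: int,
                      channels: int = 1, bytes_per_sample: int = 4) -> int:
    """Return the size in bytes of uncompressed PCM for the given duration."""
    return duration_s * sample_rate_hz * channels * bytes_per_sample


decoded = decoded_pcm_bytes(DURATION_S, SAMPLE_RATE_HZ, CHANNELS, BYTES_PER_SAMPLE)
ratio = decoded / COMPRESSED_BYTES

print(f"decoded PCM: {decoded / 1e9:.2f} GB")
print(f"amplification: {ratio:.0f}x")
```

Under these assumptions the estimate comes out at about 6.0 GB and roughly 230x, which is in the same range as the ~5.7GB and 232x reported above. The small gap is expected, because the reported figures come from a measured file rather than from this idealized calculation.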
Impact
An unauthenticated attacker can exhaust server memory with a small number of concurrent requests, each a valid upload within the documented size limit. Severity was assessed with reference to prior OOM vulnerability reports in vLLM.
Fix
A fix for this vulnerability was merged here: https://github.com/vllm-project/vllm/pull/44970
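The merged change is not reproduced here. As a general illustration of how a decode loop can be bounded, the sketch below caps the total decoded size while frames are being accumulated and aborts as soon as the cap would be exceeded, so memory use stays proportional to the limit rather than to the input. It is a minimal sketch under stated assumptions, not the code from the pull request: `accumulate_bounded`, `DecodedAudioTooLarge`, and `max_bytes` are hypothetical names, and stdlib `array` buffers stand in for the NumPy frames used in `audio.py` to keep the example dependency-free.

```python
from array import array
from typing import Iterable


class DecodedAudioTooLarge(ValueError):
    """Hypothetical error raised when decoded audio exceeds the configured cap."""


def accumulate_bounded(frames: Iterable[array], max_bytes: int) -> array:
    """Append float32 frames into one buffer, refusing to grow past max_bytes.

    The size check runs before each append, so decoding stops at the first
    frame that would cross the limit instead of after the full decode.
    """
    out = array("f")  # float32 sample buffer
    for frame in frames:
        projected = (len(out) + len(frame)) * out.itemsize
        if projected > max_bytes:
            raise DecodedAudioTooLarge(
                f"decoded audio would reach {projected} bytes, limit is {max_bytes}"
            )
        out.extend(frame)
    return out


def fake_decoder(n_frames: int, frame_len: int):
    """Stand-in for a streaming decoder: yields silent frames lazily."""
    for _ in range(n_frames):
        yield array("f", [0.0] * frame_len)


# Two 10-sample frames (80 bytes) fit under a 100-byte cap.
samples = accumulate_bounded(fake_decoder(2, 10), max_bytes=100)
print(len(samples))
```

Because the decoder is consumed lazily, a request that exceeds the cap is rejected after decoding only as many frames as the cap allows. A limit derived from the maximum supported clip duration would be one plausible way to choose `max_bytes`, though that choice is an assumption here.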
{
"affected": [
{
"package": {
"ecosystem": "PyPI",
"name": "vllm"
},
"ranges": [
{
"events": [
{
"introduced": "0"
},
{
"last_affected": "0.23.0"
}
],
"type": "ECOSYSTEM"
}
]
}
],
"aliases": [
"CVE-2026-54233"
],
"database_specific": {
"cwe_ids": [
"CWE-409"
],
"github_reviewed": true,
"github_reviewed_at": "2026-06-17T14:06:22Z",
"nvd_published_at": null,
"severity": "MODERATE"
},
"details": "### Summary\nvLLM\u0027s `/v1/audio/transcriptions` endpoint limits compressed upload size but not decoded PCM output. A 25MB OPUS file expands to ~14.9GB of float32 PCM at decode time. Tested on vLLM v0.19.0.\n\n### Details\n`SpeechToTextProcessor` rejects uploads over `VLLM_MAX_AUDIO_CLIP_FILESIZE_MB` (default 25MB) based on compressed byte length, but the audio decoder in `audio.py` accumulates all decoded frames into memory with no size limit before returning:\n\n```python\n# speech_to_text.py L184-189\nif len(audio_data) / 1024 ** 2 \u003e self.max_audio_filesize_mb:\n raise VLLMValidationError(...)\ny, sr = load_audio(buf, sr=self.asr_config.sample_rate) # decoded size unchecked\n\n# audio.py L77-107\nchunks: list[npt.NDArray] = []\nfor frame in container.decode(stream):\n chunks.append(frame.to_ndarray())\naudio = np.concatenate(chunks, axis=-1).astype(np.float32) # single contiguous allocation\n```\n\nA 25MB OPUS file at 6kbps encodes ~8.7 hours of audio. Decoding produces ~5.7GB of float32 PCM (232x amplification), and `np.concatenate` then allocates a second contiguous array, bringing peak RSS to ~14.9GB from a single request. `SpeechToTextConfig.max_audio_clip_s` (default 30s) applies only after the full decode and does not prevent the allocation.\n\n### Impact\nAn unauthenticated attacker can exhaust server memory with a small number of concurrent requests, each a valid upload within the documented size limit. Severity was assessed with reference to prior OOM vulnerability reports in vLLM.\n\n### Fix\n\nA fix for this vulnerability was merged here: https://github.com/vllm-project/vllm/pull/44970",
"id": "GHSA-6pr9-rp53-2pmc",
"modified": "2026-06-17T14:06:22Z",
"published": "2026-06-17T14:06:22Z",
"references": [
{
"type": "WEB",
"url": "https://github.com/vllm-project/vllm/security/advisories/GHSA-6pr9-rp53-2pmc"
},
{
"type": "WEB",
"url": "https://github.com/vllm-project/vllm/pull/44970"
},
{
"type": "WEB",
"url": "https://github.com/vllm-project/vllm/commit/1b1359c33269446f13c05da9a90c25174cbea590"
},
{
"type": "PACKAGE",
"url": "https://github.com/vllm-project/vllm"
},
{
"type": "WEB",
"url": "https://github.com/vllm-project/vllm/releases/tag/v0.23.1rc0"
}
],
"schema_version": "1.4.0",
"severity": [
{
"score": "CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:H",
"type": "CVSS_V3"
}
],
"summary": "vLLM: OOM Denial of Service via Audio Decompression Bomb"
}