fix(weather): stop logging API keys and auth headers#188
fix(weather): stop logging API keys and auth headers#188pdettori wants to merge 2 commits intokagenti:mainfrom
Conversation
esnible
left a comment
There was a problem hiding this comment.
This PR seems to solve the problem, but it seems nearly as large at the weather agent itself. Do we want a large weather agent? I think it should be very small, as it is our most typical example, so that humans and AI can easily read and understand it.
pdettori
left a comment
There was a problem hiding this comment.
Review: Key validation and redaction with non-OpenAI providers
The key validation logic (has_valid_api_key) looks good — it only rejects exact placeholder matches (dummy, changeme, etc.), so real keys from any provider will pass fine.
However, the SecretRedactionFilter has a gap for non-OpenAI providers:
agent.py:31 — _API_KEY_RE only matches sk-* prefixed keys:
_API_KEY_RE = re.compile(r"(sk-[a-zA-Z0-9]{3})[a-zA-Z0-9]+")In CI we use RHOAI MaaS (<model>--maas-apicast-production.apps.prod.rhoai.rh-aiservices-bu.com:443) where keys are 32-char lowercase alphanumeric strings with no sk- prefix. These would not be redacted if they appear in log messages outside of Authorization headers.
The Bearer \S+ pattern covers the auth header case, but httpx/openai debug logs or error messages may surface the key in other formats.
Suggestion: Broaden the API key regex to catch generic long alphanumeric tokens, or consider an approach that redacts the configured llm_api_key value directly (since Configuration already has it). For example:
# Option A: redact any long alphanumeric string that looks like a secret
_API_KEY_RE = re.compile(r"(sk-[a-zA-Z0-9]{3})[a-zA-Z0-9]+|(?<=[^a-zA-Z0-9])([a-z0-9]{4})[a-z0-9]{28,}(?=[^a-zA-Z0-9]|$)")
# Option B: redact the actual configured key value
_configured_key = os.environ.get("LLM_API_KEY", "")
if len(_configured_key) > 8:
_LITERAL_KEY_RE = re.compile(re.escape(_configured_key))Not a CI blocker — the validation itself is fine — but worth hardening before merge.
mrsabath
left a comment
There was a problem hiding this comment.
LGTM — clean security fix with solid defense-in-depth.
Well-structured fix that addresses a real vulnerability (API keys in pod logs via the log_authorization_header middleware and DEBUG-level logging). The three-layer redaction approach (Bearer pattern, sk-* pattern, literal configured key) is good defense-in-depth. Test coverage is thorough with 17 new tests covering edge cases including RHOAI MaaS keys.
Areas reviewed: Python, Tests, Security, Commit conventions, PR format
Commits: 4 commits, all signed-off ✓
CI status: all passing ✓
Minor suggestions below — none are blockers.
| args = record.args if isinstance(record.args, tuple) else (record.args,) | ||
| new_args = [] | ||
| for arg in args: | ||
| if isinstance(arg, str): |
There was a problem hiding this comment.
suggestion: record.args can also be a dict (for %(key)s-style log formatting). The current code wraps non-tuple values in a tuple, which would wrap a dict rather than iterating its values. Low risk since dict-style logging is rare, but for completeness:
if isinstance(record.args, dict):
record.args = {k: self._redact(v) if isinstance(v, str) else v for k, v in record.args.items()}
elif record.args:
args = record.args if isinstance(record.args, tuple) else (record.args,)
...Not a blocker — could be a follow-up.
| """Check if the API key is usable. | ||
|
|
||
| Local LLMs (Ollama, vLLM, etc.) accept any key including placeholders, | ||
| so placeholder keys are only flagged when pointing at a remote API. |
There was a problem hiding this comment.
nit: from urllib.parse import urlparse — consider moving to top-level imports for consistency. Inline imports are fine for avoiding circular deps, but that's not the case here.
Remove the middleware that logged Authorization headers on every request, exposing API keys in pod logs. Defense-in-depth changes: - Change root log level from DEBUG to INFO - Add SecretRedactionFilter (Bearer tokens + literal configured key) - Add has_valid_api_key check with clear user-facing error message - Simplified from prior iteration per review feedback (esnible, mrsabath): dropped sk-* regex (literal key covers it), inlined is_local_llm, removed log_warnings(), handled dict args in filter 18 new tests cover redaction and API key validation edge cases. Closes kagenti#119 Assisted-By: Claude (Anthropic AI) <noreply@anthropic.com> Signed-off-by: Paolo Dettori <dettori@us.ibm.com>
e822c2d to
13e4882
Compare
Assisted-By: Claude (Anthropic AI) <noreply@anthropic.com> Signed-off-by: Paolo Dettori <dettori@us.ibm.com>
Summary
LogAuthorizationMiddlewarethat logged full Authorization headers (including API keys) on every requestDEBUGtoINFOto prevent httpx/openai libraries from dumping request headersSecretRedactionFilteron the root logger — redactsBearer ...tokens and the literal configuredLLM_API_KEYvalue from any log messagehas_valid_api_keycheck inConfiguration— returns a clear user-facing error when the API key is a placeholder and the API base is remoteSimplified from prior iteration per review feedback: dropped the
sk-*regex (literal key + Bearer cover all cases), inlinedis_local_llm, removedlog_warnings(), and properly handle dict args in the filter.Root Cause
Three issues combined to expose the OpenAI API key in K8s pod logs:
agent.py— middleware explicitly logged the Authorization header valueagent.py—DEBUGlog level caused httpx/openai to also dump headersconfiguration.py— no validation of the API key, so blank/dummy keys produced confusing errorsTest Plan
tests/a2a/test_weather_secret_redaction.py:SecretRedactionFilter: redacts Bearer tokens (case-insensitive), literal configured key, handles tuple and dict args, preserves non-secret messagesConfigurationApiKeyValidation: placeholder/empty keys invalid for remote APIs, valid for local LLMs, real keys validkubectl logsCloses #119