Add TLS client certificate expiry monitoring #2914
base: unstable
Signed-off-by: Yiwen Zhang <zhangyiwen1221@gmail.com>
# Emit a warning log when a client-presented TLS certificate gets close to its
# expiration date. The value is expressed in days. Set to 0 to disable the
# warning; no connections are rejected when the threshold is crossed.
#
# tls-client-cert-expiry-warn-threshold 10
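To make the documented semantics concrete, here is a minimal sketch of the threshold check in Python rather than the server's C. The function names are hypothetical. It also assumes the comparison is inclusive and that an already-expired certificate still counts as within the threshold; the config comment does not settle either point.

```python
from datetime import datetime, timedelta, timezone

SECONDS_PER_DAY = 86400


def days_until_expiry(not_after, now):
    """Whole days left before not_after (negative once the cert has expired)."""
    return int((not_after - now).total_seconds() // SECONDS_PER_DAY)


def should_warn(not_after, now, threshold_days):
    """Return True when a warning would be emitted for this certificate.

    threshold_days mirrors tls-client-cert-expiry-warn-threshold:
    0 disables the warning entirely. The inclusive comparison is an
    assumption of this sketch, not something the PR states.
    """
    if threshold_days <= 0:
        return False
    return not_after - now <= timedelta(days=threshold_days)
```

With a threshold of 10, a certificate expiring in 4 days would warn and one expiring in 30 days would not; in neither case is the connection rejected.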
It's not clear to me why the server should be responsible for emitting these types of events. Can clients not track this information themselves?
Today, LL_WARNING is reserved for events that need immediate server operator intervention; in this case it's not the server that has an issue but the end client.
Hi @madolson, thanks for the feedback! I understand the concern. Ideally clients could track this themselves, but many client libraries or environments don’t expose the right hooks. A lightweight server-side signal helps operators be proactive, especially when client visibility is limited.
It’s also not uncommon for database cores to surface these kinds of client-side behaviors. For example, Cassandra emits similar warnings to help operators understand issues that can directly impact customer traffic, even if the server itself is healthy.
I’m open to adjusting the severity if LL_WARNING feels too strong; the main goal is simply to provide centralized visibility.
Hi @madolson, thanks again for the feedback. I’ve lowered the log level to LL_NOTICE so it no longer reads like a server error. The KPI in INFO stays the same to drive proactive alerting.
This signal helps the server as well as clients: if a shared client cert quietly expires, thousands of clients may start reconnecting and re-handshaking at once, creating a connection storm. With the INFO metric + NOTICE log in place, our SRE partners can wire up alerts and rotate the cert ahead of time, keeping both the clients and the server stable.
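As a rough illustration of the alerting side, the sketch below parses INFO-style `key:value` text and checks the `client_cert_min_seconds_until_expiry` field named in the PR description. The helper names and the alert window are hypothetical. Treating the initial value -1 as "no certificate observed yet" is an assumption of this sketch, as is parsing the whole INFO output rather than one section.

```python
def parse_info(text):
    """Parse INFO-style output into a dict, skipping blanks and '# Section' headers."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition(":")
        if sep:
            fields[key] = value
    return fields


def cert_expiry_alert(info_text, alert_days):
    """Return True when the smallest observed time-to-expiry is within alert_days."""
    raw = parse_info(info_text).get("client_cert_min_seconds_until_expiry")
    if raw is None:
        return False  # field absent, e.g. a server without this change
    seconds = int(raw)
    if seconds < 0:
        return False  # assumed meaning of the initial -1: nothing observed yet
    return seconds <= alert_days * 86400
```

A monitoring job could feed this the text returned by the INFO command and page when it returns True.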
client_cert_min_seconds_until_expiry (initially -1).

Configurable warnings: Introduce tls-client-cert-expiry-warn-threshold so operators can enable proactive alerts.

Example configuration:
    tls-client-cert-expiry-warn-threshold 10

Example warning:
    TLS client certificate for id=147 addr=10.1.2.3:54128 fd=15 name=*redacted* expires in 4 days (threshold 10 days).

24-hour deduplication: To avoid flooding logs, each certificate is fingerprinted (SHA-256) and stored in client_cert_expiry_warned with a 24-hour suppression window. The same certificate will trigger at most one warning per day.
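The deduplication described above can be sketched as follows, in Python rather than the server's C. The class name is hypothetical and the dict merely stands in for client_cert_expiry_warned. Hashing the certificate's DER bytes is an assumption, since the description does not say exactly what is fingerprinted. Eviction of stale entries is omitted.

```python
import hashlib

SUPPRESS_SECONDS = 24 * 60 * 60  # 24-hour suppression window


class ExpiryWarningDeduper:
    """Tracks when each certificate fingerprint last produced a warning."""

    def __init__(self):
        # SHA-256 fingerprint -> unix time of the last warning
        self.client_cert_expiry_warned = {}

    def should_log(self, cert_der, now):
        """Return True if a warning for this certificate should be logged at time now."""
        fingerprint = hashlib.sha256(cert_der).digest()
        last = self.client_cert_expiry_warned.get(fingerprint)
        if last is not None and now - last < SUPPRESS_SECONDS:
            return False  # already warned inside the window
        self.client_cert_expiry_warned[fingerprint] = now
        return True
```

Keying on the certificate fingerprint rather than the connection means thousands of clients sharing one certificate produce a single log line per window.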