MON-4036: Add NodeExporterConfig to ClusterMonitoring API#2744
MON-4036: Add NodeExporterConfig to ClusterMonitoring API#2744danielmellado wants to merge 1 commit intoopenshift:masterfrom
Conversation
|
Pipeline controller notification For optional jobs, comment This repository is configured in: LGTM mode |
|
@danielmellado: This pull request references MON-4036 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
Hello @danielmellado! Some important instructions when contributing to openshift/api: |
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds a new optional 🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 inconclusive)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
Review Summary by QodoAdd NodeExporterConfig to ClusterMonitoring API with collector configuration support
WalkthroughsDescription• Added NodeExporterConfig to ClusterMonitoringSpec with comprehensive configuration support for the node-exporter agent • Implemented 10 individual metric collectors (buddyInfo, cpuFreq, ethtool, ksmd, mountStats, netClass, netDev, processes, systemd, tcpStat) with per-collector enable/disable toggles and optional configuration • Added support for nodeSelector, resources, tolerations, maxProcs, and ignoredNetworkDevices configuration options • Generated OpenAPI schemas, Swagger documentation, and deepcopy methods for all new node-exporter configuration types • Updated CRD manifests (main, feature-gated, and payload) with complete schema definitions and validation rules • Added 13 comprehensive test cases validating both valid configurations and constraint violations Diagramflowchart LR
A["ClusterMonitoringSpec"] -->|"adds field"| B["NodeExporterConfig"]
B -->|"configures"| C["Node-Exporter Agent"]
B -->|"supports"| D["10 Collectors"]
B -->|"supports"| E["Pod Configuration"]
D -->|"examples"| F["buddyInfo, cpuFreq, ethtool, ksmd, mountStats"]
E -->|"includes"| G["nodeSelector, resources, tolerations, maxProcs"]
File Changes1. config/v1alpha1/types_cluster_monitoring.go
|
Code Review by Qodo
1. nodeExporterConfig not feature-gated
|
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
| // nodeExporterConfig is an optional field that can be used to configure the node-exporter agent | ||
| // that runs as a DaemonSet in the openshift-monitoring namespace. The node-exporter agent collects | ||
| // hardware and OS-level metrics from every node in the cluster. | ||
| // When omitted, this means no opinion and the platform is left to choose a reasonable default, which is subject to change over time. | ||
| // +optional | ||
| NodeExporterConfig NodeExporterConfig `json:"nodeExporterConfig,omitempty,omitzero"` |
There was a problem hiding this comment.
1. nodeexporterconfig not feature-gated 📘 Rule violation ✓ Correctness
The new spec.nodeExporterConfig API surface is introduced without an +openshift:enable:FeatureGate=... marker (and corresponding FeatureGate definition). This can make the new API available without controlled rollout.
Agent Prompt
## Issue description
`spec.nodeExporterConfig` is a newly introduced API surface but is not behind an OpenShift FeatureGate (missing `+openshift:enable:FeatureGate=...` and the FeatureGate registration).
## Issue Context
The compliance checklist requires new APIs to be feature-gated for safe rollout.
## Fix Focus Areas
- config/v1alpha1/types_cluster_monitoring.go[106-116]
- features/features.go[1-120]
ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
| // NodeExporterIgnoredNetworkDevice is a regular expression pattern that matches a network device name | ||
| // to be excluded from node-exporter metric collection for collectors such as netdev, netclass, and ethtool. | ||
| // Must be a valid regular expression and at most 1024 characters. | ||
| // +kubebuilder:validation:MinLength=1 | ||
| // +kubebuilder:validation:MaxLength=1024 | ||
| type NodeExporterIgnoredNetworkDevice string |
There was a problem hiding this comment.
2. nodeexporterignorednetworkdevice minlength undocumented 📘 Rule violation ✓ Correctness
The comment for NodeExporterIgnoredNetworkDevice does not document the +kubebuilder:validation:MinLength=1 constraint. This makes the API behavior for empty strings unclear to users.
Agent Prompt
## Issue description
`NodeExporterIgnoredNetworkDevice` has a `MinLength=1` validation marker but its comment does not document the minimum-length / non-empty requirement.
## Issue Context
API documentation must reflect all kubebuilder validation markers so users understand constraints and omitted/empty behavior.
## Fix Focus Areas
- config/v1alpha1/types_cluster_monitoring.go[673-678]
ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
| units: | ||
| description: |- | ||
| units is a list of regular expression patterns that match systemd units to be included | ||
| by the systemd collector. | ||
| units is optional. | ||
| By default, the list is empty, so the collector exposes no metrics for systemd units. | ||
| Each entry must be a valid regular expression and at most 1024 characters. | ||
| Maximum length for this list is 50. | ||
| Minimum length for this list is 1. | ||
| Entries in this list must be unique. | ||
| items: | ||
| description: |- | ||
| NodeExporterSystemdUnit is a regular expression pattern that matches a systemd unit name. | ||
| Must be at most 1024 characters. | ||
| maxLength: 1024 | ||
| minLength: 1 | ||
| type: string | ||
| maxItems: 50 | ||
| minItems: 1 | ||
| type: array | ||
| x-kubernetes-list-type: set | ||
| required: | ||
| - enabled | ||
| type: object | ||
| tcpStat: | ||
| description: |- | ||
| tcpStat configures the tcpstat collector, which collects TCP connection statistics. | ||
| tcpStat is optional. | ||
| When omitted, this means no opinion and the platform is left to choose a reasonable default, | ||
| which is subject to change over time. The current default is disabled. | ||
| properties: | ||
| enabled: | ||
| description: |- | ||
| enabled enables or disables the tcpstat collector. | ||
| This field is required. | ||
| Valid values are "Enabled" and "Disabled". | ||
| When set to "Enabled", the tcpstat collector is active and TCP connection statistics are collected. | ||
| When set to "Disabled", the tcpstat collector is inactive. | ||
| enum: | ||
| - Enabled | ||
| - Disabled | ||
| type: string | ||
| required: | ||
| - enabled | ||
| type: object | ||
| type: object | ||
| ignoredNetworkDevices: | ||
| description: |- | ||
| ignoredNetworkDevices is a list of regular expression patterns that match network devices | ||
| to be excluded from the relevant collector configuration such as netdev, netclass, and ethtool. | ||
| ignoredNetworkDevices is optional. | ||
|
|
||
| When omitted, the Cluster Monitoring Operator uses a predefined list of devices to be excluded | ||
| to minimize the impact on memory usage. | ||
| When set as an empty list, no devices are excluded. | ||
| If you modify this setting, monitor the prometheus-k8s deployment closely for excessive memory usage. | ||
| Maximum length for this list is 50. | ||
| Minimum length for this list is 1. | ||
| Each entry must be at most 1024 characters long. | ||
| items: | ||
| description: |- | ||
| NodeExporterIgnoredNetworkDevice is a regular expression pattern that matches a network device name | ||
| to be excluded from node-exporter metric collection for collectors such as netdev, netclass, and ethtool. | ||
| Must be a valid regular expression and at most 1024 characters. | ||
| maxLength: 1024 | ||
| minLength: 1 | ||
| type: string |
There was a problem hiding this comment.
3. Regex validity not enforced 📘 Rule violation ✓ Correctness
The API documentation states certain fields/entries must be valid regular expressions, but the generated CRD schema only enforces length and does not enforce regex validity. This can lead to invalid configurations being accepted by the API server and failing later at runtime.
Agent Prompt
## Issue description
The CRD documentation claims some strings must be valid regular expressions, but the schema only enforces length. This violates the requirement that documented constraints be enforced by validation rules.
## Issue Context
Kubernetes schema validation can enforce patterns or use CEL/XValidation rules; if regex validity cannot be reliably validated at schema level, the documentation should be adjusted to match actual enforcement.
## Fix Focus Areas
- config/v1alpha1/types_cluster_monitoring.go[655-678]
- config/v1alpha1/types_cluster_monitoring.go[932-951]
- config/v1alpha1/zz_generated.crd-manifests/0000_10_config-operator_01_clustermonitorings.crd.yaml[1521-1587]
ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
| // ignoredNetworkDevices is a list of regular expression patterns that match network devices | ||
| // to be excluded from the relevant collector configuration such as netdev, netclass, and ethtool. | ||
| // ignoredNetworkDevices is optional. | ||
| // | ||
| // When omitted, the Cluster Monitoring Operator uses a predefined list of devices to be excluded | ||
| // to minimize the impact on memory usage. | ||
| // When set as an empty list, no devices are excluded. | ||
| // If you modify this setting, monitor the prometheus-k8s deployment closely for excessive memory usage. | ||
| // Maximum length for this list is 50. | ||
| // Minimum length for this list is 1. | ||
| // Each entry must be at most 1024 characters long. | ||
| // +kubebuilder:validation:MaxItems=50 | ||
| // +kubebuilder:validation:MinItems=1 | ||
| // +listType=set |
There was a problem hiding this comment.
4. Empty ignorednetworkdevices blocked 🐞 Bug ✓ Correctness
The API says users can set ignoredNetworkDevices to an empty list to exclude no devices, but the Go validation tags generate CRD minItems: 1, making an empty list invalid. This breaks the documented configuration knob and prevents users from clearing the platform default exclusion list.
Agent Prompt
## Issue description
`NodeExporterConfig.ignoredNetworkDevices` is documented to allow an empty list (`[]`) to mean “exclude no devices”, but validation enforces `minItems: 1`, so the empty list is rejected by the CRD. This is an API contract/behavior mismatch.
## Issue Context
- The Go type comment promises a special meaning for an empty list.
- The Go kubebuilder marker `MinItems=1` generates CRD `minItems: 1`, which blocks `[]`.
- Tests currently do not cover the empty-list case.
## Fix Focus Areas
- config/v1alpha1/types_cluster_monitoring.go[655-670]
- config/v1alpha1/zz_generated.crd-manifests/0000_10_config-operator_01_clustermonitorings.crd.yaml[1567-1591]
- payload-manifests/crds/0000_10_config-operator_01_clustermonitorings.crd.yaml[1567-1591]
- config/v1alpha1/zz_generated.featuregated-crd-manifests/clustermonitorings.config.openshift.io/ClusterMonitoringConfig.yaml[1567-1591]
- config/v1alpha1/tests/clustermonitorings.config.openshift.io/ClusterMonitoringConfig.yaml[580-789]
ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (1)
config/v1alpha1/tests/clustermonitorings.config.openshift.io/ClusterMonitoringConfig.yaml (1)
661-789: Add a contract test forignoredNetworkDevices: [].The new suite does not assert the documented empty-list behavior for
ignoredNetworkDevices. A dedicated case will prevent schema/docs drift from slipping through.🧪 Suggested test case
+ - name: Should be able to create NodeExporterConfig with empty ignoredNetworkDevices + initial: | + apiVersion: config.openshift.io/v1alpha1 + kind: ClusterMonitoring + spec: + nodeExporterConfig: + ignoredNetworkDevices: [] + expected: | + apiVersion: config.openshift.io/v1alpha1 + kind: ClusterMonitoring + spec: + nodeExporterConfig: + ignoredNetworkDevices: []🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@config/v1alpha1/tests/clustermonitorings.config.openshift.io/ClusterMonitoringConfig.yaml` around lines 661 - 789, Add a new contract test case for NodeExporterConfig to assert the documented behavior when ignoredNetworkDevices is an empty array: create a test entry (similar to the existing cases) named something like "Should accept NodeExporterConfig with empty ignoredNetworkDevices" where spec.nodeExporterConfig.ignoredNetworkDevices: [] in the initial YAML and the expected output mirrors that (or no expectedError), ensuring the test references NodeExporterConfig and ignoredNetworkDevices so the suite validates empty-list handling and prevents schema/docs drift.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@config/v1alpha1/types_cluster_monitoring.go`:
- Around line 655-668: The comment and validation for ignoredNetworkDevices
conflict: the docs state an empty list is valid (meaning "no devices are
excluded") but the kubebuilder tag +kubebuilder:validation:MinItems=1 prevents
[] from being used. Fix by removing or relaxing the MinItems constraint on
ignoredNetworkDevices (i.e., delete the +kubebuilder:validation:MinItems=1 tag
or change it to allow 0) so an empty slice is accepted; keep the MaxItems
constraint and listType as-is and update any unit tests or schema consumers if
they assume MinItems=1.
In
`@payload-manifests/crds/0000_10_config-operator_01_clustermonitorings.crd.yaml`:
- Around line 1522-1541: The schema currently only enforces length/count for
systemd.units[] (NodeExporterSystemdUnit) and ignoredNetworkDevices[]; add
OpenAPI pattern validation to each items schema to ensure values are valid
regular expressions (e.g., add a "pattern" field under the items for
systemd.units and for ignoredNetworkDevices that encodes the allowed regex
syntax you want to accept—preferably aligned to the RE2 flavor used downstream),
keep maxLength/minLength and descriptions, and update the
NodeExporterSystemdUnit/ignoredNetworkDevices items to reference the new pattern
so invalid regex strings are rejected by the CRD validation.
- Around line 1568-1590: The description for ignoredNetworkDevices says an empty
list is allowed but the schema sets minItems: 1; update the schema to match the
description by changing the minItems constraint for ignoredNetworkDevices to 0
(or remove the minItems line) so an empty array is accepted, and keep maxItems:
50 and the items block (NodeExporterIgnoredNetworkDevice,
maxLength/minLength/type) unchanged; ensure any references or validation that
rely on minItems in code or docs are adjusted accordingly.
---
Nitpick comments:
In
`@config/v1alpha1/tests/clustermonitorings.config.openshift.io/ClusterMonitoringConfig.yaml`:
- Around line 661-789: Add a new contract test case for NodeExporterConfig to
assert the documented behavior when ignoredNetworkDevices is an empty array:
create a test entry (similar to the existing cases) named something like "Should
accept NodeExporterConfig with empty ignoredNetworkDevices" where
spec.nodeExporterConfig.ignoredNetworkDevices: [] in the initial YAML and the
expected output mirrors that (or no expectedError), ensuring the test references
NodeExporterConfig and ignoredNetworkDevices so the suite validates empty-list
handling and prevents schema/docs drift.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: f41b0979-fba6-4d38-bfe5-c83eb2d1eb31
⛔ Files ignored due to path filters (3)
config/v1alpha1/zz_generated.crd-manifests/0000_10_config-operator_01_clustermonitorings.crd.yamlis excluded by!**/zz_generated.crd-manifests/*config/v1alpha1/zz_generated.featuregated-crd-manifests/clustermonitorings.config.openshift.io/ClusterMonitoringConfig.yamlis excluded by!**/zz_generated.featuregated-crd-manifests/**openapi/generated_openapi/zz_generated.openapi.gois excluded by!openapi/**
📒 Files selected for processing (5)
config/v1alpha1/tests/clustermonitorings.config.openshift.io/ClusterMonitoringConfig.yamlconfig/v1alpha1/types_cluster_monitoring.goconfig/v1alpha1/zz_generated.deepcopy.goconfig/v1alpha1/zz_generated.swagger_doc_generated.gopayload-manifests/crds/0000_10_config-operator_01_clustermonitorings.crd.yaml
| description: |- | ||
| units is a list of regular expression patterns that match systemd units to be included | ||
| by the systemd collector. | ||
| units is optional. | ||
| By default, the list is empty, so the collector exposes no metrics for systemd units. | ||
| Each entry must be a valid regular expression and at most 1024 characters. | ||
| Maximum length for this list is 50. | ||
| Minimum length for this list is 1. | ||
| Entries in this list must be unique. | ||
| items: | ||
| description: |- | ||
| NodeExporterSystemdUnit is a regular expression pattern that matches a systemd unit name. | ||
| Must be at most 1024 characters. | ||
| maxLength: 1024 | ||
| minLength: 1 | ||
| type: string | ||
| maxItems: 50 | ||
| minItems: 1 | ||
| type: array | ||
| x-kubernetes-list-type: set |
There was a problem hiding this comment.
Regex validity is documented but not enforced in schema.
systemd.units[] and ignoredNetworkDevices[] are documented as “valid regular expression” values, but current validation only checks length/count. Invalid regex strings can be admitted and fail downstream.
Also applies to: 1581-1588
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@payload-manifests/crds/0000_10_config-operator_01_clustermonitorings.crd.yaml`
around lines 1522 - 1541, The schema currently only enforces length/count for
systemd.units[] (NodeExporterSystemdUnit) and ignoredNetworkDevices[]; add
OpenAPI pattern validation to each items schema to ensure values are valid
regular expressions (e.g., add a "pattern" field under the items for
systemd.units and for ignoredNetworkDevices that encodes the allowed regex
syntax you want to accept—preferably aligned to the RE2 flavor used downstream),
keep maxLength/minLength and descriptions, and update the
NodeExporterSystemdUnit/ignoredNetworkDevices items to reference the new pattern
so invalid regex strings are rejected by the CRD validation.
payload-manifests/crds/0000_10_config-operator_01_clustermonitorings.crd.yaml
Show resolved
Hide resolved
There was a problem hiding this comment.
- qodo's finding about a missing FeatureGate is a false positive. The entire ClusterMonitoring CRD is already gated behind
ClusterMonitoringConfig(line 39), so no per-field gate is needed. - both bots suggest adding OpenAPI pattern validation to enforce regex syntax on
ignoredNetworkDevicesandsystemd.unitsentries. This is not feasible; there is no OpenAPI pattern or CEL function that can validate whether a string is a valid regex. The realistic fix is to soften the documentation language (see inline comment).
a54edbe to
98a2cb7
Compare
|
Thanks for the review! Good catch on the MinItems=1 contradiction! I initially tried just removing the marker, but make lint isn't happy without it: So I went with It explicitly allows empty lists, keeps the linter happy, and the documented "When set as an empty list, no devices are excluded" behavior now works as expected. Also added a test for |
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (1)
config/v1alpha1/types_cluster_monitoring.go (1)
652-655:⚠️ Potential issue | 🟡 Minor
maxProcs: 0still serializes as “omitted” for typed clients.With
json:"maxProcs,omitempty", a generated Go client cannot send an explicit0; it gets dropped before theMinimum=1validation runs. If the API contract intends to reject0instead of silently taking the default, this field needs presence tracking (for example*int32), and the new NodeExporter coverage should include amaxProcs: 0case once that is fixed.In Go's JSON encoding, does a field tagged `json:"maxProcs,omitempty"` omit a zero-valued `int32`, making an explicit `0` indistinguishable from an omitted field?🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@config/v1alpha1/types_cluster_monitoring.go` around lines 652 - 655, Summary: a plain int32 with `json:"...,omitempty"` hides explicit 0 values, so change MaxProcs to a pointer type to track presence. Replace the MaxProcs field type from int32 to *int32 (i.e., MaxProcs *int32) in the ClusterMonitoring spec so callers can send explicit 0 vs omitted; keep or adjust the kubebuilder validation tags to apply to the pointer as needed and ensure validation still enforces Minimum=1 when present. Update any related code paths that dereference MaxProcs (e.g., validators, defaults) to handle nil vs non-nil, and add a NodeExporter test case that sends maxProcs: 0 to verify the API rejects or accepts per the contract.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@config/v1alpha1/types_cluster_monitoring.go`:
- Around line 656-670: Replace the json tag on the IgnoredNetworkDevices field
in types_cluster_monitoring.go so a non-nil empty slice is preserved (use the
omitzero tag supported in Go 1.24+); i.e., change the struct tag for
IgnoredNetworkDevices to include ",omitzero" (so nil slices are omitted but
non-nil empty slices marshal to []), and add a typed round-trip regression test
(e.g., TestRoundTripIgnoredNetworkDevices) that constructs a CR with
IgnoredNetworkDevices set to a non-nil empty slice, kube-type
serializes/deserializes it, and asserts the field remains an empty slice (not
nil) to verify the semantic difference is preserved.
---
Duplicate comments:
In `@config/v1alpha1/types_cluster_monitoring.go`:
- Around line 652-655: Summary: a plain int32 with `json:"...,omitempty"` hides
explicit 0 values, so change MaxProcs to a pointer type to track presence.
Replace the MaxProcs field type from int32 to *int32 (i.e., MaxProcs *int32) in
the ClusterMonitoring spec so callers can send explicit 0 vs omitted; keep or
adjust the kubebuilder validation tags to apply to the pointer as needed and
ensure validation still enforces Minimum=1 when present. Update any related code
paths that dereference MaxProcs (e.g., validators, defaults) to handle nil vs
non-nil, and add a NodeExporter test case that sends maxProcs: 0 to verify the
API rejects or accepts per the contract.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Pro
Run ID: 5ab267a4-337e-406f-b418-aad8042b31a1
⛔ Files ignored due to path filters (6)
config/v1alpha1/zz_generated.crd-manifests/0000_10_config-operator_01_clustermonitorings.crd.yamlis excluded by!**/zz_generated.crd-manifests/*config/v1alpha1/zz_generated.deepcopy.gois excluded by!**/zz_generated*config/v1alpha1/zz_generated.featuregated-crd-manifests/clustermonitorings.config.openshift.io/ClusterMonitoringConfig.yamlis excluded by!**/zz_generated.featuregated-crd-manifests/**config/v1alpha1/zz_generated.swagger_doc_generated.gois excluded by!**/zz_generated*openapi/generated_openapi/zz_generated.openapi.gois excluded by!openapi/**,!**/zz_generated*openapi/openapi.jsonis excluded by!openapi/**
📒 Files selected for processing (3)
config/v1alpha1/tests/clustermonitorings.config.openshift.io/ClusterMonitoringConfig.yamlconfig/v1alpha1/types_cluster_monitoring.gopayload-manifests/crds/0000_10_config-operator_01_clustermonitorings.crd.yaml
98a2cb7 to
4a67cf2
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@config/v1alpha1/types_cluster_monitoring.go`:
- Around line 667-671: The JSON struct tag for the IgnoredNetworkDevices field
currently uses `json:"ignoredNetworkDevices,omitempty"`, which causes non-nil
empty slices to be omitted and prevents Go clients from expressing an explicit
empty list; update the tag on the IgnoredNetworkDevices field (type
NodeExporterIgnoredNetworkDevice) to
`json:"ignoredNetworkDevices,omitempty,omitzero"` so non-nil empty slices
marshal as [] while preserving backward compatibility with omitted nil slices.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Organization UI (inherited)
Review profile: CHILL
Plan: Pro
Run ID: 8a1585c7-2ece-47bc-9d61-8ed9d50910c3
⛔ Files ignored due to path filters (4)
config/v1alpha1/zz_generated.crd-manifests/0000_10_config-operator_01_clustermonitorings.crd.yamlis excluded by!**/zz_generated.crd-manifests/*config/v1alpha1/zz_generated.featuregated-crd-manifests/clustermonitorings.config.openshift.io/ClusterMonitoringConfig.yamlis excluded by!**/zz_generated.featuregated-crd-manifests/**openapi/generated_openapi/zz_generated.openapi.gois excluded by!openapi/**openapi/openapi.jsonis excluded by!openapi/**
📒 Files selected for processing (5)
config/v1alpha1/tests/clustermonitorings.config.openshift.io/ClusterMonitoringConfig.yamlconfig/v1alpha1/types_cluster_monitoring.goconfig/v1alpha1/zz_generated.deepcopy.goconfig/v1alpha1/zz_generated.swagger_doc_generated.gopayload-manifests/crds/0000_10_config-operator_01_clustermonitorings.crd.yaml
4a67cf2 to
d58b16d
Compare
There was a problem hiding this comment.
♻️ Duplicate comments (1)
payload-manifests/crds/0000_10_config-operator_01_clustermonitorings.crd.yaml (1)
1520-1541:⚠️ Potential issue | 🟠 MajorRegex inputs are still accepted without schema-level validation
systemd.units[]andignoredNetworkDevices[]are documented as regex patterns, but the CRD schema only validates type/length. Invalid regex values can still be admitted and only fail later in controller reconciliation. Please add admission-time validation for regex compilability to fail fast.Also applies to: 1580-1588
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@payload-manifests/crds/0000_10_config-operator_01_clustermonitorings.crd.yaml` around lines 1520 - 1541, Add admission-time validation so regex entries are rejected if they don't compile: for the systemd.units[] items (NodeExporterSystemdUnit) and ignoredNetworkDevices[] items, add x-kubernetes-validations CEL rules on the string item schema that exercise regex compilation (e.g. a CEL rule that calls matches with a constant like self.matches("") or equivalent) so invalid regex patterns fail admission instead of being accepted by type/length-only checks; update both the items blocks for NodeExporterSystemdUnit and the corresponding ignoredNetworkDevices item schema accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In
`@payload-manifests/crds/0000_10_config-operator_01_clustermonitorings.crd.yaml`:
- Around line 1520-1541: Add admission-time validation so regex entries are
rejected if they don't compile: for the systemd.units[] items
(NodeExporterSystemdUnit) and ignoredNetworkDevices[] items, add
x-kubernetes-validations CEL rules on the string item schema that exercise regex
compilation (e.g. a CEL rule that calls matches with a constant like
self.matches("") or equivalent) so invalid regex patterns fail admission instead
of being accepted by type/length-only checks; update both the items blocks for
NodeExporterSystemdUnit and the corresponding ignoredNetworkDevices item schema
accordingly.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Organization UI (inherited)
Review profile: CHILL
Plan: Pro
Run ID: 4f705a8f-5f6c-42c4-8faf-87b5822c3ff7
⛔ Files ignored due to path filters (3)
config/v1alpha1/zz_generated.crd-manifests/0000_10_config-operator_01_clustermonitorings.crd.yamlis excluded by!**/zz_generated.crd-manifests/*config/v1alpha1/zz_generated.featuregated-crd-manifests/clustermonitorings.config.openshift.io/ClusterMonitoringConfig.yamlis excluded by!**/zz_generated.featuregated-crd-manifests/**openapi/generated_openapi/zz_generated.openapi.gois excluded by!openapi/**
📒 Files selected for processing (5)
config/v1alpha1/tests/clustermonitorings.config.openshift.io/ClusterMonitoringConfig.yamlconfig/v1alpha1/types_cluster_monitoring.goconfig/v1alpha1/zz_generated.deepcopy.goconfig/v1alpha1/zz_generated.swagger_doc_generated.gopayload-manifests/crds/0000_10_config-operator_01_clustermonitorings.crd.yaml
saschagrunert
left a comment
There was a problem hiding this comment.
Two new comments on undocumented cross-field interactions. The previously acknowledged items (omitzero on ignoredNetworkDevices, min length doc) are still pending a push.
| // When omitted, this means no opinion and the platform is left to choose a reasonable default, | ||
| // which is subject to change over time. The current default is "Enabled". | ||
| // +optional | ||
| UseNetlink NodeExporterNetlinkState `json:"useNetlink,omitempty"` |
There was a problem hiding this comment.
The comment doesn't explain what happens when enabled is Disabled but useNetlink is set. Per conventions, field interactions must be documented. Add a line like:
// When the netclass collector is disabled, this field is ignored.
Consider also enforcing this with:
// +kubebuilder:validation:XValidation:rule="!has(self.useNetlink) || self.enabled == 'Enabled'",message="useNetlink can only be specified when the netclass collector is enabled"There was a problem hiding this comment.
Good catch, thanks! Added the cross-field documentation to the useNetlink comment:
// When the netclass collector is disabled, this field is ignored and must not be set.
| // +kubebuilder:validation:MinItems=1 | ||
| // +listType=set | ||
| // +optional | ||
| Units []NodeExporterSystemdUnit `json:"units,omitempty"` |
There was a problem hiding this comment.
Same as useNetlink: the comment doesn't explain what happens when enabled is Disabled but units is populated. Add:
// When the systemd collector is disabled, this field is ignored.
Consider enforcing with:
// +kubebuilder:validation:XValidation:rule="!has(self.units) || self.enabled == 'Enabled'",message="units can only be specified when the systemd collector is enabled"There was a problem hiding this comment.
Similar pattern here ;)
|
Yes, was about to push ;) Let me add these comments and regenerate. Thanks! |
d58b16d to
dc35dac
Compare
|
@danielmellado: This pull request references MON-4036 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target either version "4.22." or "openshift-4.22.", but it targets "openshift-5.0" instead. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
There was a problem hiding this comment.
♻️ Duplicate comments (2)
payload-manifests/crds/0000_10_config-operator_01_clustermonitorings.crd.yaml (1)
1535-1543:⚠️ Potential issue | 🟠 MajorReject invalid regex values at admission time
systemd.units[](Line 1539) andignoredNetworkDevices[](Line 1594) still only validate length/type, so malformed regex strings can be admitted and then fail later at controller reconcile time.Proposed schema-level validation
items: description: |- NodeExporterSystemdUnit is a string that is interpreted as a Go regular expression pattern by the controller to match systemd unit names. Invalid regular expressions will cause a controller-level error at runtime. Must be at least 1 character and at most 1024 characters. maxLength: 1024 minLength: 1 type: string + x-kubernetes-validations: + - message: must be a valid regular expression + rule: '"".matches(self) || !"".matches(self)' @@ items: description: |- NodeExporterIgnoredNetworkDevice is a string that is interpreted as a Go regular expression pattern by the controller to match network device names to exclude from node-exporter metric collection for collectors such as netdev, netclass, and ethtool. Invalid regular expressions will cause a controller-level error at runtime. Must be at least 1 character and at most 1024 characters. maxLength: 1024 minLength: 1 type: string + x-kubernetes-validations: + - message: must be a valid regular expression + rule: '"".matches(self) || !"".matches(self)'Please verify this CEL pattern behavior against your target kube-apiserver version; if not supported, enforce regex parsing in admission/controller before persisting invalid specs. As per coding guidelines, "Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity."
Also applies to: 1589-1598
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@payload-manifests/crds/0000_10_config-operator_01_clustermonitorings.crd.yaml` around lines 1535 - 1543, The CRD currently only enforces min/max length for NodeExporterSystemdUnit strings (used by systemd.units[] and ignoredNetworkDevices[]) so invalid regexes can be persisted and later fail; update validation to reject malformed regex at admission by adding a schema-level CEL (or OpenAPI x-kubernetes-validations/CEL) expression that attempts to compile the regex (or uses a safe pattern check supported by your target kube-apiserver) for both fields (systemd.units and ignoredNetworkDevices) so only valid regex strings are admitted, or if CEL is unsupported for your kube-apiserver, add admission/validation logic in the controller/admission webhook that parses/compiles NodeExporterSystemdUnit values and returns a rejection with a clear error when compilation fails.config/v1alpha1/types_cluster_monitoring.go (1)
292-306:⚠️ Potential issue | 🟠 MajorDrop
omitemptyfromignoredNetworkDevicesto preserve explicit[].Line 298 gives
[]different semantics from omission, but Line 306 still usesomitempty, which drops non-nil empty slices during marshaling. That makes the documented override unreachable from typed Go clients. Usejson:"ignoredNetworkDevices,omitzero"here;omitempty,omitzerowould still strip[]. The YAML fixture mentioned in the PR objectives would not catch this typed-client path.Proposed fix
- IgnoredNetworkDevices []NodeExporterIgnoredNetworkDevice `json:"ignoredNetworkDevices,omitempty"` + IgnoredNetworkDevices []NodeExporterIgnoredNetworkDevice `json:"ignoredNetworkDevices,omitzero"`Read-only verification
Expected: the field tag resolves to
json:"ignoredNetworkDevices,omitzero", and there is Go-side round-trip coverage in addition to the YAML fixture coverage.#!/bin/bash set -euo pipefail echo "Go version:" rg -n '^go ' go.mod echo echo "Current field tag:" rg -n 'IgnoredNetworkDevices.*json:"ignoredNetworkDevices' config/v1alpha1/types_cluster_monitoring.go echo echo "Nearby omitzero usage in this API:" rg -n 'omitzero' config/v1alpha1/types_cluster_monitoring.go | head -n 20 echo echo "Coverage related to ignoredNetworkDevices:" rg -n -C2 'ignoredNetworkDevices|IgnoredNetworkDevices' config/v1alpha1 echo echo "Go-side typed coverage hints:" rg -n -C2 'IgnoredNetworkDevices|ignoredNetworkDevices|Marshal|Unmarshal' --type go config/v1alpha1🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@config/v1alpha1/types_cluster_monitoring.go` around lines 292 - 306, The JSON tag on the IgnoredNetworkDevices field uses `omitempty` which drops non-nil empty slices and prevents preserving an explicit [] override; change the struct tag on IgnoredNetworkDevices to `json:"ignoredNetworkDevices,omitzero"` (remove `omitempty`) in types_cluster_monitoring.go so typed Go clients will round-trip an explicit empty slice, and add a small Go unit test that marshals and unmarshals a struct with IgnoredNetworkDevices set to an empty slice to verify the field is preserved through JSON round-trip.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@config/v1alpha1/types_cluster_monitoring.go`:
- Around line 292-306: The JSON tag on the IgnoredNetworkDevices field uses
`omitempty` which drops non-nil empty slices and prevents preserving an explicit
[] override; change the struct tag on IgnoredNetworkDevices to
`json:"ignoredNetworkDevices,omitzero"` (remove `omitempty`) in
types_cluster_monitoring.go so typed Go clients will round-trip an explicit
empty slice, and add a small Go unit test that marshals and unmarshals a struct
with IgnoredNetworkDevices set to an empty slice to verify the field is
preserved through JSON round-trip.
In
`@payload-manifests/crds/0000_10_config-operator_01_clustermonitorings.crd.yaml`:
- Around line 1535-1543: The CRD currently only enforces min/max length for
NodeExporterSystemdUnit strings (used by systemd.units[] and
ignoredNetworkDevices[]) so invalid regexes can be persisted and later fail;
update validation to reject malformed regex at admission by adding a
schema-level CEL (or OpenAPI x-kubernetes-validations/CEL) expression that
attempts to compile the regex (or uses a safe pattern check supported by your
target kube-apiserver) for both fields (systemd.units and ignoredNetworkDevices)
so only valid regex strings are admitted, or if CEL is unsupported for your
kube-apiserver, add admission/validation logic in the controller/admission
webhook that parses/compiles NodeExporterSystemdUnit values and returns a
rejection with a clear error when compilation fails.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Organization UI (inherited)
Review profile: CHILL
Plan: Pro
Run ID: 1a7543b4-c0b1-4731-9c5d-cd9ac2c226fe
⛔ Files ignored due to path filters (3)
config/v1alpha1/zz_generated.crd-manifests/0000_10_config-operator_01_clustermonitorings.crd.yamlis excluded by!**/zz_generated.crd-manifests/*config/v1alpha1/zz_generated.featuregated-crd-manifests/clustermonitorings.config.openshift.io/ClusterMonitoringConfig.yamlis excluded by!**/zz_generated.featuregated-crd-manifests/**openapi/generated_openapi/zz_generated.openapi.gois excluded by!openapi/**
📒 Files selected for processing (5)
config/v1alpha1/tests/clustermonitorings.config.openshift.io/ClusterMonitoringConfig.yamlconfig/v1alpha1/types_cluster_monitoring.goconfig/v1alpha1/zz_generated.deepcopy.goconfig/v1alpha1/zz_generated.swagger_doc_generated.gopayload-manifests/crds/0000_10_config-operator_01_clustermonitorings.crd.yaml
🚧 Files skipped from review as they are similar to previous changes (1)
- config/v1alpha1/tests/clustermonitorings.config.openshift.io/ClusterMonitoringConfig.yaml
Add configuration for the node-exporter agent that runs as a DaemonSet in openshift-monitoring, collecting hardware and OS-level metrics from every node in the cluster. Signed-off-by: Daniel Mellado <dmellado@fedoraproject.org>
dc35dac to
4c675db
Compare
|
@danielmellado: all tests passed! Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
Add configuration for the node-exporter agent that runs as a
DaemonSet in openshift-monitoring, collecting hardware and
OS-level metrics from every node in the cluster.
Signed-off-by: Daniel Mellado dmellado@fedoraproject.org