Skip to content

Add hv-metrics role#781

Merged
openshift-merge-bot[bot] merged 1 commit intoredhat-performance:mainfrom
agurenko:hv-metrics
Mar 23, 2026
Merged

Add hv-metrics role#781
openshift-merge-bot[bot] merged 1 commit intoredhat-performance:mainfrom
agurenko:hv-metrics

Conversation

@agurenko
Copy link
Copy Markdown
Contributor

@agurenko agurenko commented Mar 6, 2026

This PR adds two new roles to setup metrics collection from the hypervisor using prometheus server setup on a bastion node and node_exporter running on hypervisors. Optionally (on by default) it will install grafana with a basic dashboard.

hv-metrics-server role for the bastion
hv-metrics-exporter role for the hypervisors

This is a lighter version of proposed #779 that will be shelved for now.

@openshift-ci
Copy link
Copy Markdown

openshift-ci Bot commented Mar 6, 2026

Hi @agurenko. Thanks for your PR.

I'm waiting for a redhat-performance member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work.

Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@agurenko
Copy link
Copy Markdown
Contributor Author

agurenko commented Mar 6, 2026

/cc @akrzos @afcollins @mcornea please take a look

@openshift-ci openshift-ci Bot requested review from afcollins, akrzos and mcornea March 6, 2026 11:57
@openshift-ci
Copy link
Copy Markdown

openshift-ci Bot commented Mar 6, 2026

@agurenko: GitHub didn't allow me to request PR reviews from the following users: please, take, a, look.

Note that only redhat-performance members and repo collaborators can review this PR, and authors cannot review their own PRs.

Details

In response to this:

/cc @akrzos @afcollins @mcornea please take a look

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@agurenko
Copy link
Copy Markdown
Contributor Author

agurenko commented Mar 9, 2026

Since @afcollins successfully tested role on his setup, I've made final changes to add it to the setup-bastion.yml and setup-hv.yaml as disabled by default. I've also updated the all.yaml.sample and hv.yaml.sample along with the vmno documentation (which is a prime candidate for this). Please let me know if there are any other comments

@agurenko agurenko marked this pull request as ready for review March 9, 2026 12:05
@openshift-ci openshift-ci Bot requested a review from rsevilla87 March 9, 2026 12:05
Copy link
Copy Markdown
Member

@akrzos akrzos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is already providing great insight to hypervisors I am working with, left some comments for discussion. I think there was more experiments between using quadlet vs podman pod IIUC before we intend to merge this.

Comment thread ansible/roles/hv-metrics/README.md Outdated
Comment thread ansible/roles/hv-metrics/vars/main.yaml Outdated
Comment thread ansible/roles/hv-metrics/tasks/install_client.yaml Outdated
@agurenko
Copy link
Copy Markdown
Contributor Author

@akrzos @afcollins I think this is ready for prime time, except maybe look into the dashboard metrics themselves. Let's get a clear go on those before we merge. Functionality-wise it's ready and looks solid for clean installs.

Comment thread ansible/roles/hv-metrics/tasks/cleanup.yaml Outdated
Comment thread ansible/roles/hv-metrics-server/tasks/main.yaml Outdated
Comment thread ansible/roles/hv-metrics/tasks/sanity.yaml Outdated
Comment thread ansible/roles/hv-metrics/tasks/install_client.yaml Outdated
Comment thread ansible/roles/hv-metrics/tasks/install_client.yaml Outdated
Comment thread docs/troubleshooting.md Outdated
Comment thread ansible/roles/hv-metrics/tasks/main.yaml Outdated
Comment thread ansible/vars/all.sample.yml Outdated
Comment thread ansible/vars/all.sample.yml Outdated
Comment thread ansible/roles/hv-metrics-server/tasks/main.yaml Outdated
Copy link
Copy Markdown
Member

@akrzos akrzos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please rebase and squash the commits here so we have a single clean commit for review.

Comment thread docs/deploy-vmno.md Outdated
Comment thread ansible/vars/all.sample.yml Outdated
Comment thread ansible/roles/hv-metrics-server/vars/main.yaml Outdated
Comment thread ansible/roles/hv-metrics-exporter/tasks/main.yaml Outdated
Comment thread ansible/roles/hv-metrics-server/tasks/main.yaml Outdated
Comment thread ansible/roles/hv-metrics-server/tasks/main.yaml Outdated
Comment thread ansible/roles/hv-metrics-server/templates/hv-metrics.kube.j2 Outdated
@agurenko agurenko force-pushed the hv-metrics branch 2 times, most recently from 471ca7e to 3dc7eaf Compare March 19, 2026 16:03
Comment thread ansible/roles/hv-metrics-server/vars/main.yaml Outdated
Comment thread ansible/roles/hv-metrics-server/README.md Outdated
Comment thread docs/deploy-vmno.md Outdated
Comment thread ansible/roles/hv-metrics-server/templates/prometheus.container.j2 Outdated
Comment thread ansible/roles/hv-metrics-server/templates/grafana.container.j2 Outdated
Comment thread ansible/roles/hv-metrics-server/tasks/main.yaml Outdated
Comment thread ansible/roles/hv-metrics-server/templates/grafana.container.j2 Outdated
@mcornea
Copy link
Copy Markdown
Collaborator

mcornea commented Mar 20, 2026

What do you think about moving the role vars to role defaults? Right now role vars take higher precedence than the play vars_files which prevents users from overriding them in hv.yml.

Comment thread ansible/roles/hv-metrics-server/templates/prometheus.yaml.j2 Outdated
@agurenko
Copy link
Copy Markdown
Contributor Author

What do you think about moving the role vars to role defaults? Right now role vars take higher precedence than the play vars_files which prevents users from overriding them in hv.yml.

I don't think that's true according to https://docs.ansible.com/projects/ansible/latest/playbook_guide/playbooks_variables.html#id44

12. Play vars

13. Play vars_prompt

14. Play vars_files

Defaults are lower priority, but vars_files (that are used in setup-bastion.yml and hv-setup.yml) still takes precedence.

@mcornea
Copy link
Copy Markdown
Collaborator

mcornea commented Mar 20, 2026

What do you think about moving the role vars to role defaults? Right now role vars take higher precedence than the play vars_files which prevents users from overriding them in hv.yml.

I don't think that's true according to https://docs.ansible.com/projects/ansible/latest/playbook_guide/playbooks_variables.html#id44

Hmm, according to the doc:

14. Play vars_files
15. Role vars (as defined in Role directory structure)

@akrzos
Copy link
Copy Markdown
Member

akrzos commented Mar 20, 2026

What do you think about moving the role vars to role defaults? Right now role vars take higher precedence than the play vars_files which prevents users from overriding them in hv.yml.

I don't think that's true according to https://docs.ansible.com/projects/ansible/latest/playbook_guide/playbooks_variables.html#id44

Hmm, according to the doc:

14. Play vars_files
15. Role vars (as defined in Role directory structure)

We should follow the same pattern here and have vars under defaults instead of a vars directory. This is what enables us to easily override the vars for different roles from the vars_files at the play level.

The reason it overrides is because:

In general, Ansible gives precedence to variables that were defined more recently, more actively, and with more explicit scope. Variables in the defaults folder inside a role are easily overridden. Anything in the vars directory of the role overrides previous versions of that variable in the namespace. Host or inventory variables override role defaults, but explicit includes such as the vars directory or an include_vars task override inventory variables.

@agurenko agurenko force-pushed the hv-metrics branch 4 times, most recently from 0503d9e to e4b0d21 Compare March 20, 2026 14:53
@agurenko agurenko requested review from akrzos and mcornea March 20, 2026 14:55
@agurenko agurenko force-pushed the hv-metrics branch 3 times, most recently from febda33 to 60236a8 Compare March 23, 2026 14:49
Copy link
Copy Markdown
Member

@akrzos akrzos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@openshift-ci openshift-ci Bot added the lgtm label Mar 23, 2026
@openshift-ci
Copy link
Copy Markdown

openshift-ci Bot commented Mar 23, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: akrzos

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-bot openshift-merge-bot Bot merged commit 6dadf0b into redhat-performance:main Mar 23, 2026
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants