Add prometheus exporter for nova-bigvm by leust · Pull Request #388 · sapcc/nova

leust · 2022-12-20T10:58:22Z

Currently exposing the following metrics:

Counter nova_bigvm_host_errors{error, vc, host, rp}
Counter nova_bigvm_no_candidate_error{hv_size}
Gauge nova_bigvm_host_freeing_up{vc, host, rp}
Gauge nova_bigvm_free_hosts_count{}
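
A minimal sketch of how metrics with these names and labels could be declared with prometheus_client. The help strings, the label values and the dedicated `registry` are assumptions made for illustration; this is not the code from the PR.

```python
from prometheus_client import (
    CollectorRegistry, Counter, Gauge, generate_latest)

# A dedicated registry (assumption) keeps the example isolated from the
# library's process-wide default registry.
registry = CollectorRegistry()

host_errors = Counter(
    'nova_bigvm_host_errors',
    'Errors hit while handling a BigVM host (illustrative help text).',
    ['error', 'vc', 'host', 'rp'], registry=registry)
no_candidate_error = Counter(
    'nova_bigvm_no_candidate_error',
    'No candidate host found for a hypervisor size (illustrative).',
    ['hv_size'], registry=registry)
host_freeing_up = Gauge(
    'nova_bigvm_host_freeing_up',
    'Set to 1 while a host is being freed up (illustrative).',
    ['vc', 'host', 'rp'], registry=registry)
free_hosts_count = Gauge(
    'nova_bigvm_free_hosts_count',
    'Number of free BigVM hosts (illustrative).',
    registry=registry)

# Hypothetical label values, only used to produce some samples.
host_errors.labels(
    error='freeing', vc='vc-a', host='node001', rp='rp-1').inc()
no_candidate_error.labels(hv_size='6144').inc()
host_freeing_up.labels(vc='vc-a', host='node001', rp='rp-1').set(1)
free_hosts_count.set(3)

# Text exposition as an HTTP scrape would return it.
exposition = generate_latest(registry).decode()
print(exposition)
```

In current client releases the counter samples are exposed with a `_total` suffix, so dashboards would query `nova_bigvm_host_errors_total` rather than the bare name.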

joker-at-work

This is part 1 of the review. I don't think I can check the logic stuff today.

joker-at-work · 2023-02-22T10:50:57Z

nova/conf/base.py

+        'bigvm_exporter_listen_port',
+        default=9847,
+        help="""
+Port where the BigVM prometheus exporter to listen for HTTP requests.


Typo: "to listen" doesn't fit with "where". It needs to be something like "exporter listens".

joker-at-work · 2023-02-22T10:53:32Z

nova/bigvm/exporter.py

+
+
+def start_bigvm_exporter():
+    port = int(CONF.bigvm_exporter_listen_port or 9847)


Is this int necessary? The option is defined as IntOpt, so I would assume we don't need it here.
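
A small sketch of what the expression does if the option is an IntOpt with default=9847, as the diff suggests. `resolve_port` is a hypothetical stand-in for the line under review and no oslo.config is involved. Since an integer option already yields an int, `int()` changes nothing, and the `or 9847` part only matters for falsy values.

```python
DEFAULT_PORT = 9847  # default taken from the option definition in the diff


def resolve_port(configured):
    """Hypothetical stand-in for int(CONF.<option> or 9847)."""
    return int(configured or DEFAULT_PORT)


# An integer option already yields an int, so int() is a no-op here.
assert resolve_port(9900) == 9900
# The `or` fallback only triggers for falsy values: None (no value at all)
# and 0, which would otherwise be passed on as the port number.
assert resolve_port(None) == DEFAULT_PORT
assert resolve_port(0) == DEFAULT_PORT
```

Under that assumption, reading the option directly would behave the same for every non-zero port.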

joker-at-work · 2023-02-22T10:56:21Z

nova/bigvm/exporter.py

+
+def start_bigvm_exporter():
+    port = int(CONF.bigvm_exporter_listen_port or 9847)
+    start_http_server(port, registry=REGISTRY)


Does this start a new process or a new thread?

If it starts a thread, does it start an eventlet greenthread (because eventlet patched the threading module) or does it spawn a native thread?

If it spawns a greenthread, does the prometheus_client library do anything blocking that could prevent the manager from running properly?

If it spawns a native thread, we cannot use logging (or anything else that takes a threading.Lock) anywhere inside that native thread or we risk a hanging service.

Everything behind it is based on a daemonic threading.Thread, which, to my understanding, is patched by eventlet.

You didn't answer one question: does prometheus_client use anything that would block the process? eventlet greenthreads are not preempted, but give up the CPU when they would do blocking operations. This needs library support (or usage of one of the eventlet-patched functions). If the greenthread doesn't give up the CPU on blocking operations, no other greenthread will run.

This code adds one more greenthread to the pgt() output. Here it is:

2 <greenlet.greenlet object at 0x7f802a44d510 (otid=0x7f8031506a00) suspended active started>
  File "/var/lib/openstack/lib/python3.8/site-packages/eventlet/green/thread.py", line 42, in __thread_body
    func(*args, **kwargs)
  File "/usr/lib/python3.8/threading.py", line 890, in _bootstrap
    self._bootstrap_inner()
  File "/var/lib/openstack/lib/python3.8/site-packages/eventlet/green/thread.py", line 63, in wrap_bootstrap_inner
    bootstrap_inner()
  File "/usr/lib/python3.8/threading.py", line 932, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.8/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/lib/python3.8/socketserver.py", line 232, in serve_forever
    ready = selector.select(poll_interval)
  File "/usr/lib/python3.8/selectors.py", line 323, in select
    r, w, _ = self._select(self._readers, self._writers, [], timeout)
  File "/var/lib/openstack/lib/python3.8/site-packages/eventlet/green/select.py", line 80, in select
    return hub.switch()
  File "/var/lib/openstack/lib/python3.8/site-packages/eventlet/hubs/hub.py", line 313, in switch
    return self.greenlet.switch()

Looking at the stack trace, socketserver.py shows up, which is the module the prometheus library uses to expose the endpoint.

So this runs in a greenthread.
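
The trace ends in socketserver's serve_forever running inside a threading.Thread. A rough, stdlib-only sketch of that pattern (a server loop in a daemonic thread) is below. It is not prometheus_client's code, and `MetricsHandler` and its payload are made up. Under eventlet monkey patching the same threading.Thread would be a greenthread, and the select() call inside serve_forever would be the patched one that yields to the hub, as the trace shows.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class MetricsHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that returns a fixed metrics payload."""

    def do_GET(self):
        body = b'nova_bigvm_free_hosts_count 3.0\n'
        self.send_response(200)
        self.send_header('Content-Type', 'text/plain; version=0.0.4')
        self.send_header('Content-Length', str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


# Port 0 lets the OS pick a free port; the PR's default would be 9847.
httpd = ThreadingHTTPServer(('127.0.0.1', 0), MetricsHandler)
port = httpd.server_address[1]

# serve_forever() loops on select(); running it in a daemonic thread
# leaves the caller free to continue with its own work.
server_thread = threading.Thread(target=httpd.serve_forever, daemon=True)
server_thread.start()

# Scrape the endpoint once, the way a Prometheus server would.
conn = http.client.HTTPConnection('127.0.0.1', port, timeout=5)
conn.request('GET', '/metrics')
scraped = conn.getresponse().read()
conn.close()

httpd.shutdown()
httpd.server_close()
```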

joker-at-work · 2023-02-22T10:58:20Z

nova/bigvm/exporter.py

+    def __init__(self, registry=REGISTRY):
+        self.host_errors_counter = \
+            Counter('nova_bigvm_host_errors',
+                    'Nova BigVM errors counter.',


I think this needs to be more descriptive, as I could not tell you what this means without looking into the code that increases the counter.

joker-at-work · 2023-02-22T11:01:16Z

nova/bigvm/exporter.py

+
+        self.free_hosts_count_gauge = \
+            Gauge('nova_bigvm_free_hosts_count',
+                  'Nova BigVM hosts available.',


Could you expand on the description a little more, please? Hosts available for what? Is this the count of the resource providers that are ready for a BigVM deployment?

joker-at-work · 2023-02-22T11:03:39Z

nova/bigvm/exporter.py

+        self.host_freeing_up_gauge.remove(
+            rp['vc'], rp['host'], rp['rp']['name'])


What happens if that provider is not in here, because we just restarted and don't have any data about this provider anymore?
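
Depending on the prometheus_client release, `remove()` on a label set that was never created may raise KeyError (to my knowledge, some releases delete the entry unconditionally). A hedged sketch of a defensive wrapper follows; `safe_remove`, the help text and the label values are hypothetical.

```python
from prometheus_client import CollectorRegistry, Gauge

registry = CollectorRegistry()
host_freeing_up = Gauge(
    'nova_bigvm_host_freeing_up',
    'Set while a host is being freed up (illustrative help text).',
    ['vc', 'host', 'rp'], registry=registry)


def safe_remove(gauge, *labelvalues):
    """Drop a label set, tolerating one that was never recorded."""
    try:
        gauge.remove(*labelvalues)
    except KeyError:
        # Nothing to remove, e.g. the service restarted and lost its state.
        pass


host_freeing_up.labels('vc-a', 'node001', 'rp-1').set(1)
safe_remove(host_freeing_up, 'vc-a', 'node001', 'rp-1')  # known label set
safe_remove(host_freeing_up, 'vc-b', 'node002', 'rp-2')  # unknown label set
```

Swallowing the KeyError keeps the cleanup path idempotent, which seems to be what a restart scenario needs.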

grandchild

https://github.com/prometheus/client_python#disabling-default-collector-metrics

It sounds like we have to disable these automatic metrics? Or do we want them? I don't think we need GC and process stats.

nova/bigvm/exporter.py

leust · 2023-04-26T10:37:33Z

It sounds like we have to disable these automatic metrics?

I haven't seen any other metric apart from the ones defined by us, while testing the exporter endpoint.
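
One possible explanation, assuming the exporter's `REGISTRY` is a dedicated `CollectorRegistry` rather than the library's default one: a fresh registry only contains what is registered on it, so the GC, platform and process collectors never show up. A minimal sketch under that assumption (the help text is illustrative):

```python
from prometheus_client import CollectorRegistry, Gauge, generate_latest

# A registry created by hand starts empty. The default collectors
# (GC, platform, process) are attached to prometheus_client.REGISTRY only.
own_registry = CollectorRegistry()

free_hosts_count = Gauge(
    'nova_bigvm_free_hosts_count',
    'Number of free BigVM hosts (illustrative help text).',
    registry=own_registry)
free_hosts_count.set(2)

own_output = generate_latest(own_registry).decode()
metric_names = {metric.name for metric in own_registry.collect()}
print(sorted(metric_names))
```

If the default registry were used instead, the README section linked above describes unregistering the default collectors from it.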

joker-at-work

Looks good, but still 2 questions remaining.

joker-at-work · 2023-08-25T08:24:02Z

nova/bigvm/exporter.py

+
+def start_bigvm_exporter():
+    port = int(CONF.bigvm_exporter_listen_port or 9847)
+    start_http_server(port, registry=REGISTRY)


You didn't answer one question: does prometheus_client use anything that would block the process? eventlet greenthreads are not preempted, but give up the CPU when they would do blocking operations. This needs library support (or usage of one of the eventlet-patched functions). If the greenthread doesn't give up the CPU on blocking operations, no other greenthread will run.

joker-at-work · 2023-08-25T08:30:10Z

nova/bigvm/manager.py

+            else:
+                free_hosts += 1
+
+        bigvm_metrics.set_free_hosts_count(free_hosts)


Would it make sense to show the free hosts per hv_size instead of the overall number, because it depends on the size if we can spawn certain instances?

Added hv_size label.
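
A sketch of what a per-size gauge could look like. The provider tuples, the help text and this `set_free_hosts_count` signature are hypothetical and only illustrate the `hv_size` label; they are not taken from manager.py.

```python
from collections import Counter as Tally

from prometheus_client import CollectorRegistry, Gauge

registry = CollectorRegistry()
free_hosts_count = Gauge(
    'nova_bigvm_free_hosts_count',
    'Free BigVM hosts per hypervisor size (illustrative help text).',
    ['hv_size'], registry=registry)

# Hypothetical (hv_size, is_free) pairs standing in for the providers.
providers = [('3072', True), ('3072', False),
             ('6144', True), ('6144', True),
             ('12288', False)]


def set_free_hosts_count(gauge, providers):
    """Set one sample per hv_size, including an explicit 0."""
    sizes = {hv_size for hv_size, _ in providers}
    free = Tally(hv_size for hv_size, is_free in providers if is_free)
    for hv_size in sizes:
        # Tally returns 0 for sizes without any free host.
        gauge.labels(hv_size=hv_size).set(free[hv_size])


set_free_hosts_count(free_hosts_count, providers)
```

Writing an explicit 0 for sizes without a free host keeps the series present, so an alert on "no free host of size X" does not have to deal with a missing sample.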

Exposing the following metrics:

Counter nova_bigvm_host_errors{error, vc, host, rp}
Counter nova_bigvm_no_candidate_error{hv_size}
Gauge nova_bigvm_host_freeing_up{vc, host, rp}
Gauge nova_bigvm_free_hosts_count{hv_size}

Change-Id: I050eeb1036910c03428eaa8aad7e992f241f6f51

leust mentioned this pull request Dec 20, 2022

Add nova-bigvm prometheus exporter sapcc/helm-charts#4457

Open

grandchild approved these changes Dec 23, 2022


leust requested a review from joker-at-work January 11, 2023 16:43

joker-at-work reviewed Feb 22, 2023


grandchild requested changes Apr 14, 2023


leust force-pushed the bigvm_exporter branch from 6c26778 to 254e045 Compare May 2, 2023 11:39

joker-at-work reviewed Aug 25, 2023


leust force-pushed the bigvm_exporter branch 2 times, most recently from f3ffa01 to 60b8dd1 Compare December 11, 2023 14:52

leust force-pushed the bigvm_exporter branch from 60b8dd1 to be76d9d Compare December 13, 2023 15:43

leust requested a review from joker-at-work December 13, 2023 15:44

joker-at-work approved these changes Jan 24, 2024


grandchild approved these changes Jan 24, 2024
