Conversation
joker-at-work
left a comment
There was a problem hiding this comment.
This is part 1 of the review. I don't think I can check the logic stuff today.
nova/conf/base.py
Outdated
| 'bigvm_exporter_listen_port', | ||
| default=9847, | ||
| help=""" | ||
| Port where the BigVM prometheus exporter to listen for HTTP requests. |
There was a problem hiding this comment.
typo "to listen" doesn't fit with "where". needs to be something like "exporter listens"
nova/bigvm/exporter.py
Outdated
|
|
||
|
|
||
| def start_bigvm_exporter(): | ||
| port = int(CONF.bigvm_exporter_listen_port or 9847) |
There was a problem hiding this comment.
Is this int necessary? The option is defined as IntOpt, so I would assume we don't need it here.
|
|
||
| def start_bigvm_exporter(): | ||
| port = int(CONF.bigvm_exporter_listen_port or 9847) | ||
| start_http_server(port, registry=REGISTRY) |
There was a problem hiding this comment.
Does this start a new process or a new thread?
If it starts a thread, does it start an eventlet greenthread (because eventlet patched the threading module) or does it spawn a native thread?
If it spawns a greenthread, does the prometheus_client library do anything blocking that could hinder the manager to run properly?
If it spawns a native thread, we cannot use logging (or anything else take takes a threading.Lock) anywhere inside that native thread or we risk a hanging service.
There was a problem hiding this comment.
Everything behind is based on daemonic threading.Thread, which to my understanding is patched by eventlet.
There was a problem hiding this comment.
You missed to answer one question: Does prometheus_client use anything that would block the process. eventlet greenthreads are not preempted, but give up the CPU when they would do blocking operations. This needs library support (or usage of one of the eventlet-patched functions). If the greenthread doesn't give up the CPU on blocking operations, no other greenthread will run.
There was a problem hiding this comment.
This code adds one more greenthread to the pgt() output. Here is it:
2 <greenlet.greenlet object at 0x7f802a44d510 (otid=0x7f8031506a00) suspended active started>
File "/var/lib/openstack/lib/python3.8/site-packages/eventlet/green/thread.py", line 42, in __thread_body
func(*args, **kwargs)
File "/usr/lib/python3.8/threading.py", line 890, in _bootstrap
self._bootstrap_inner()
File "/var/lib/openstack/lib/python3.8/site-packages/eventlet/green/thread.py", line 63, in wrap_bootstrap_inner
bootstrap_inner()
File "/usr/lib/python3.8/threading.py", line 932, in _bootstrap_inner
self.run()
File "/usr/lib/python3.8/threading.py", line 870, in run
self._target(*self._args, **self._kwargs)
File "/usr/lib/python3.8/socketserver.py", line 232, in serve_forever
ready = selector.select(poll_interval)
File "/usr/lib/python3.8/selectors.py", line 323, in select
r, w, _ = self._select(self._readers, self._writers, [], timeout)
File "/var/lib/openstack/lib/python3.8/site-packages/eventlet/green/select.py", line 80, in select
return hub.switch()
File "/var/lib/openstack/lib/python3.8/site-packages/eventlet/hubs/hub.py", line 313, in switch
return self.greenlet.switch()
Looking at the stack trace there is the socketserver.py which is the module that's being used by the prometheus library to expose the endpoint.
So this runs in a greenthread.
nova/bigvm/exporter.py
Outdated
| def __init__(self, registry=REGISTRY): | ||
| self.host_errors_counter = \ | ||
| Counter('nova_bigvm_host_errors', | ||
| 'Nova BigVM errors counter.', |
There was a problem hiding this comment.
I think this needs to be more descriptive as I could not tell you, what this means without looking into the code that increases the counter.
nova/bigvm/exporter.py
Outdated
|
|
||
| self.free_hosts_count_gauge = \ | ||
| Gauge('nova_bigvm_free_hosts_count', | ||
| 'Nova BigVM hosts available.', |
There was a problem hiding this comment.
Could you expand on the description a little more, please? hosts available for what? Is this the count of the resource-providers that are ready for a BigVM deployment?
nova/bigvm/exporter.py
Outdated
| self.host_freeing_up_gauge.remove( | ||
| rp['vc'], rp['host'], rp['rp']['name']) |
There was a problem hiding this comment.
What happens if that provider is not in here, because we just restarted and don't have any data about this provider anymore?
grandchild
left a comment
There was a problem hiding this comment.
https://github.com/prometheus/client_python#disabling-default-collector-metrics
It sounds like we have to disable these automatic metrics? Or do we want them? I don't think we need GC and process stats.
I haven't seen any other metric apart from the ones defined by us, while testing the exporter endpoint. |
joker-at-work
left a comment
There was a problem hiding this comment.
Looks good, but still 2 questions remaining.
|
|
||
| def start_bigvm_exporter(): | ||
| port = int(CONF.bigvm_exporter_listen_port or 9847) | ||
| start_http_server(port, registry=REGISTRY) |
There was a problem hiding this comment.
You missed to answer one question: Does prometheus_client use anything that would block the process. eventlet greenthreads are not preempted, but give up the CPU when they would do blocking operations. This needs library support (or usage of one of the eventlet-patched functions). If the greenthread doesn't give up the CPU on blocking operations, no other greenthread will run.
nova/bigvm/manager.py
Outdated
| else: | ||
| free_hosts += 1 | ||
|
|
||
| bigvm_metrics.set_free_hosts_count(free_hosts) |
There was a problem hiding this comment.
Would it make sense to show the free hosts per hv_size instead of the overall number, because it depends on the size if we can spawn certain instances?
f3ffa01 to
60b8dd1
Compare
Exposing the following metrics:
Counter nova_bigvm_host_errors{error, vc, host, rp}
Counter nova_bigvm_no_candidate_error{hv_size}
Gauge nova_bigvm_host_freeing_up{vc, host, rp}
Gauge nova_bigvm_free_hosts_count{hv_size}
Change-Id: I050eeb1036910c03428eaa8aad7e992f241f6f51
60b8dd1 to
be76d9d
Compare
Currently exposing the following metrics:
Counter nova_bigvm_host_errors{error, vc, host, rp}
Counter nova_bigvm_no_candidate_error{hv_size}
Gauge nova_bigvm_host_freeing_up{vc, host, rp}
Gauge nova_bigvm_free_hosts_count{}