diff --git a/LICENSE-3rdparty.csv b/LICENSE-3rdparty.csv index cd7f8f2a8ef49..423d28d0365e4 100644 --- a/LICENSE-3rdparty.csv +++ b/LICENSE-3rdparty.csv @@ -80,4 +80,5 @@ tuf,PyPI,Apache-2.0,Copyright (c) 2010 New York University tuf,PyPI,MIT,Copyright (c) 2010 New York University urllib3,PyPI,MIT,Copyright (c) 2008-2020 Andrey Petrov and contributors. vertica-python,PyPI,Apache-2.0,"Copyright 2013 Justin Berka, Alex Kim, Siting Ren" +voltdbclient,PyPI,MIT,Copyright (c) Volt Active Data wrapt,PyPI,BSD-3-Clause,"Copyright (c) 2013-2026, Graham Dumpleton" diff --git a/agent_requirements.in b/agent_requirements.in index 8075c8d95ef92..af73daa91eba6 100644 --- a/agent_requirements.in +++ b/agent_requirements.in @@ -66,4 +66,5 @@ supervisor==4.3.0 tuf==4.0.0 urllib3==2.6.3 vertica-python==1.4.0 +voltdbclient==14.2.0 wrapt==2.1.2 diff --git a/voltdb/README.md b/voltdb/README.md index 2f8255ca420bc..b6d7ef670b332 100644 --- a/voltdb/README.md +++ b/voltdb/README.md @@ -30,55 +30,81 @@ No additional installation is needed on your server. 2. Edit the `voltdb.d/conf.yaml` file, in the `conf.d/` folder at the root of your Agent's configuration directory to start collecting your VoltDB performance data. See the [sample voltdb.d/conf.yaml][4] for all available configuration options. + The integration supports two transports. Pick the one that matches your network topology: + + **Native binary client** - direct connection to a database node on the VoltDB client port (default `21212`), using the [VoltDB Python client][12]. Recommended when the Agent host can reach the database nodes directly: + ```yaml init_config: instances: - - url: http://localhost:8080 + - host: localhost + port: 21212 username: datadog-agent password: "" ``` -3. [Restart the Agent][5]. + For failover across cluster members, use `hosts` instead of `host`. 
The Agent connects to the first reachable entry and silently fails over to the others if the active node becomes unavailable: -#### TLS support - -If [TLS/SSL][6] is enabled on the client HTTP port: + ```yaml + instances: + - hosts: + - voltdb-1.example:21212 + - voltdb-2.example:21212 + - voltdb-3.example:21212 + username: datadog-agent + password: "" + ``` -1. Export your certificate CA file in PEM format: + **HTTP/JSON via the VoltDB Management Center (VMC)** - useful when database nodes aren't directly reachable but the VMC endpoint is, or when you prefer to keep the existing HTTP/JSON wire format. Set `url` to the VMC endpoint: - ```bash - keytool -exportcert -file /path/to/voltdb-ca.pem -keystore -storepass -alias voltdb -rfc + ```yaml + instances: + - url: http://vmc.example.com:8080 + username: datadog-agent + password: "" ``` -1. Export your certificate in PEM format: + The check picks the transport based on which option is set: `url` selects HTTP mode, `host`/`hosts` selects native mode. HTTP mode keeps all the options earlier releases supported (`password_hashed`, `tls_cert`, `tls_ca_cert`, `tls_verify`, `proxy`, `headers`, etc.) - see the [sample config][4] for the full list. - ```bash - openssl pkcs12 -nodes -in -out /path/to/voltdb.pem -password pass: - ``` + **Backwards compatibility**: existing configurations that point `url` at the legacy HTTP endpoint continue to work without changes. This release adds the native binary transport as an option; it does not remove the HTTP one. The `url`-style config still emits the same metrics and service checks against the same VMC or HTTP-enabled database node it always pointed at. - The resulting file should contain the _unencrypted_ private key and the certificate: +3. [Restart the Agent][5]. - ``` - -----BEGIN PRIVATE KEY----- - - -----END PRIVATE KEY----- - -----BEGIN CERTIFICATE----- - - -----END CERTIFICATE----- - ``` +#### TLS support -2. 
In your instance configuration, point `url` to the TLS-enabled client endpoint, and set the `tls_cert` and `tls_ca_cert` options. For example: +If [TLS/SSL][6] is enabled on the VoltDB client port, set `use_ssl: true` and point `ssl_config_file` at a VoltDB SSL properties file that describes how to locate the truststore (and optionally a client keystore for mutual TLS): - ```yaml - instances: - - # ... - url: https://localhost:8443 - tls_cert: /path/to/voltdb.pem - tls_ca_cert: /path/to/voltdb-ca.pem - ``` +```yaml +instances: + - host: localhost + port: 21212 + username: datadog-agent + password: "" + use_ssl: true + ssl_config_file: /etc/voltdb/ssl.properties +``` -3. [Restart the Agent][5]. +The properties file is the same format VoltDB's own tools (`sqlcmd`, `voltadmin`) consume. The native Python client supports Java keystores (`.jks`), PKCS12 (`.p12`/`.pfx`), and PEM. A minimal one-way TLS configuration looks like: + +```properties +# /etc/voltdb/ssl.properties +truststore=/etc/voltdb/certs/truststore.jks +truststorepassword= +``` + +For mutual TLS, also add a keystore that identifies the Agent to the server: + +```properties +truststore=/etc/voltdb/certs/truststore.jks +truststorepassword= +keystore=/etc/voltdb/certs/agent-keystore.jks +keystorepassword= +``` + +If you have a PEM CA bundle instead of a Java keystore, you can either point `ssl_config_file` directly at the PEM file (it is treated as the truststore), or reference it explicitly with `cacerts=` inside the properties file. + +When the Agent runs in a container, make sure the properties file and every path it references are mounted into the container. See the [VoltDB TLS/SSL documentation][6] for details on generating keystores with `keytool` and rotating certificates. #### Log collection @@ -140,3 +166,5 @@ Need help? Contact [Datadog support][11]. 
[9]: https://github.com/DataDog/integrations-core/blob/master/voltdb/metadata.csv [10]: https://github.com/DataDog/integrations-core/blob/master/voltdb/assets/service_checks.json [11]: https://docs.datadoghq.com/help/ +[12]: https://pypi.org/project/voltdbclient/ +[13]: https://docs.datadoghq.com/agent/configuration/secrets-management/ diff --git a/voltdb/assets/configuration/spec.yaml b/voltdb/assets/configuration/spec.yaml index 1fe8b77e61950..0837d3375da1f 100644 --- a/voltdb/assets/configuration/spec.yaml +++ b/voltdb/assets/configuration/spec.yaml @@ -11,24 +11,115 @@ files: - template: instances options: + - name: host + description: | + Host of the VoltDB cluster member to connect to via the native binary + protocol. Use this (with `port`) for direct connections to a database + node. Set `url` instead to talk to the VoltDB Management Center (VMC) + over HTTP/JSON. For failover across multiple cluster members, use + `hosts` instead. + fleet_configurable: true + display_priority: 5 + value: + type: string + example: localhost + + - name: hosts + description: | + List of VoltDB cluster members to try when connecting via the native + binary protocol. Each entry is either `hostname` (uses the global + `port`) or `hostname:port`. The Agent connects to the first + reachable entry and silently fails over to subsequent entries if + the active node becomes unavailable. Takes precedence over `host`. + fleet_configurable: true + display_priority: 5 + value: + type: array + items: + type: string + example: + - voltdb-1.example:21212 + - voltdb-2.example:21212 + - voltdb-3.example:21212 + - name: url - description: URL to a VoltDB client endpoint. + description: | + URL of a VoltDB HTTP/JSON endpoint, typically served by the + VoltDB Management Center (VMC). When set, the integration uses the + HTTP/JSON transport instead of the native binary client. The + `username` and `password` options are required in this mode. 
+ + For direct connections to a database node, prefer `host` (and + optionally `port`), which uses the native Python client. + display_priority: 5 + value: + type: string + example: http://localhost:8080 + + - name: port + description: | + Native client port of the VoltDB cluster member. + + See: https://docs.voltdb.com/UsingVoltDB/HostConfigPortOpts.php fleet_configurable: true - formats: ["url"] - required: true + display_priority: 4 + value: + type: integer + example: 21212 + + - name: username + description: The username to use to authenticate with VoltDB. display_priority: 3 value: type: string - example: http://localhost:8080 + example: - - name: password_hashed + - name: password + description: The password to use to authenticate with VoltDB. + display_priority: 3 + secret: true + value: + type: string + example: + + - name: use_ssl description: | - Set to true if the `password` value refers to a hashed version of the password. - display_priority: 1 + Set to `true` to connect to VoltDB using TLS. + + See: https://docs.voltdb.com/UsingVoltDB/SecuritySSL.php + display_priority: 2 value: type: boolean example: false + - name: ssl_config_file + description: | + Path to a VoltDB SSL configuration file that defines the Java keystore + and truststore files used by the native Python client. + + See: https://docs.voltdb.com/UsingVoltDB/SecuritySSL.php + display_priority: 2 + value: + type: string + example: + + - name: connect_timeout + description: Connection timeout (in seconds) when establishing the native client connection. + display_priority: 1 + value: + type: number + example: 8 + + - name: procedure_timeout + description: | + Timeout (in seconds) for individual stored procedure calls. Set to + `0` to wait indefinitely for a response. 
+ display_priority: 1 + value: + type: number + example: 60 + default: 60 + - name: statistics_components fleet_configurable: true description: | @@ -70,14 +161,19 @@ files: - SNAPSHOTSTATUS - TABLE + - name: password_hashed + description: | + Only applicable to the HTTP/VMC transport (`url` is set). Set to + `true` if the `password` value is the SHA-256 hex digest of the + password instead of the cleartext. The native binary client does + not support pre-hashed passwords. + display_priority: 1 + value: + type: boolean + example: false + - template: instances/http overrides: - username.display_priority: 2 - username.required: true - username.description: The username to use to authenticate with VoltDB. - password.display_priority: 2 - password.required: true - password.description: The password to use to authenticate with VoltDB. auth_type.hidden: true ntlm_domain.hidden: true kerberos_auth.hidden: true @@ -91,6 +187,9 @@ files: aws_region.hidden: true aws_host.hidden: true aws_service.hidden: true + username.hidden: true + password.hidden: true + connect_timeout.hidden: true - template: instances/db - template: instances/default diff --git a/voltdb/changelog.d/23667.added b/voltdb/changelog.d/23667.added new file mode 100644 index 0000000000000..a3b544955d773 --- /dev/null +++ b/voltdb/changelog.d/23667.added @@ -0,0 +1 @@ +Add support for the native [VoltDB Python client](https://pypi.org/project/voltdbclient/) (binary protocol on the VoltDB client port, default `21212`) for direct database-node connections. Configure with `host` (single node) or `hosts` (a list of cluster members for connect-time failover). The optional `port` option sets the default port for `hosts` entries that omit one. TLS is configured through `use_ssl` and `ssl_config_file` pointing at a VoltDB SSL properties file (JKS, PKCS12, or PEM). Statistics columns are now resolved by name against the VoltDB response metadata so the check tolerates VoltDB releases that add columns to or drop columns from `@Statistics` output. 
The existing HTTP/JSON transport (with `url`, `password_hashed`, PEM-based TLS, and proxy options) is fully preserved for users connecting through the VoltDB Management Center (VMC) — existing `url`-based configurations continue to work unchanged. diff --git a/voltdb/datadog_checks/voltdb/check.py b/voltdb/datadog_checks/voltdb/check.py index c868d06967763..d111cd1765111 100644 --- a/voltdb/datadog_checks/voltdb/check.py +++ b/voltdb/datadog_checks/voltdb/check.py @@ -3,13 +3,12 @@ # Licensed under a 3-clause BSD style license (see LICENSE) from typing import Any, List, Optional, cast # noqa: F401 -import requests # noqa: F401 - from datadog_checks.base import AgentCheck from datadog_checks.base.utils.db import QueryManager from .client import Client -from .config import Config +from .config import MODE_HTTP, Config +from .http_client import HttpClient from .types import Instance @@ -20,15 +19,32 @@ def __init__(self, name, init_config, instances): # type: (str, dict, list) -> None super(VoltDBCheck, self).__init__(name, init_config, instances) - self._config = Config(cast(Instance, self.instance), debug=self.log.debug) - self.register_secret(self._config.password) - self._client = Client( - url=self._config.url, - http_get=self.http.get, - username=self._config.username, - password=self._config.password, - password_hashed=self._config.password_hashed, + self._config = Config( + cast(Instance, self.instance), + debug=self.log.debug, + warning=self.log.warning, ) + if self._config.password: + self.register_secret(self._config.password) + if self._config.mode == MODE_HTTP: + self._client = HttpClient( + url=self._config.url, + http_get=self.http.get, + username=self._config.username, + password=self._config.password, + password_hashed=self._config.password_hashed, + ) + else: + self._client = Client( + endpoints=self._config.endpoints, + username=self._config.username, + password=self._config.password, + use_ssl=self._config.use_ssl, + 
ssl_config_file=self._config.ssl_config_file, + connect_timeout=self._config.connect_timeout, + procedure_timeout=self._config.procedure_timeout, + log=self.log, + ) self._query_manager = QueryManager( self, @@ -38,37 +54,22 @@ def __init__(self, name, init_config, instances): ) self.check_initializations.append(self._query_manager.compile_queries) - def _raise_for_status_with_details(self, response): - # type: (requests.Response) -> None - try: - response.raise_for_status() - except Exception as exc: - message = 'Error response from VoltDB: {}'.format(exc) - try: - # Try including detailed error message from response. - details = response.json()['statusstring'] - except Exception: - pass - else: - message += ' (details: {})'.format(details) - raise Exception(message) from exc - def _fetch_version(self): # type: () -> Optional[str] # See: https://docs.voltdb.com/UsingVoltDB/sysprocsysteminfo.php#sysprocsysinforetvalovervw - response = self._client.request('@SystemInformation', parameters=['OVERVIEW']) - self._raise_for_status_with_details(response) + response = self._client.call_procedure('@SystemInformation', ['OVERVIEW']) + self._client.raise_for_status(response) - data = response.json() - rows = data['results'][0]['data'] # type: List[tuple] + rows = response.tables[0].tuples # type: List[list] # NOTE: there will be one VERSION row per server in the cluster. # Arbitrarily use the first one we see. - for _, column, value in rows: + for row in rows: + _, column, value = row[0], row[1], row[2] if column == 'VERSION': return self._transform_version(value) - self.log.debug('VERSION column not found: %s', [column for _, column, _ in rows]) + self.log.debug('VERSION column not found: %s', [row[1] for row in rows]) return None def _transform_version(self, raw): @@ -109,17 +110,58 @@ def _check_can_connect_and_submit_version(self): def _execute_query_raw(self, query): # type: (str) -> List[tuple] - # Ad-hoc format, close to the HTTP API format. 
- # Eg 'A:[B, C]' -> '?Procedure=A&Parameters=[B, C]' - procedure, _, parameters = query.partition(":") - - response = self._client.request(procedure, parameters=parameters) - self._raise_for_status_with_details(response) - - data = response.json() - return data['results'][0]['data'] + # Ad-hoc format: 'A:[B, C]' -> procedure A called with parameters [B, C]. + procedure, params = _parse_query(query) + + response = self._client.call_procedure(procedure, params) + self._client.raise_for_status(response) + + table = response.tables[0] + sources = self._config.query_sources.get(query) + if not sources: + # Custom query or no source mapping: return rows as-is for QueryManager + # to consume positionally. + return [tuple(row) for row in table.tuples] + + # Project the response onto the source columns declared in queries.py, + # looking them up by name. Missing columns become None so newer/older + # VoltDB releases that add or drop columns don't break the check. + col_index = {col.name: i for i, col in enumerate(table.columns)} + indices = [col_index.get(source) if source else None for source in sources] + missing = [s for s, i in zip(sources, indices) if s and i is None] + if missing: + self.log.debug( + 'VoltDB response for %s is missing columns %s; values will be reported as None.', + procedure, + missing, + ) + + return [tuple(row[i] if i is not None else None for i in indices) for row in table.tuples] + + def cancel(self): + # type: () -> None + self._client.close() def check(self, _): # type: (Any) -> None self._check_can_connect_and_submit_version() self._query_manager.execute() + + +def _parse_query(query): + # type: (str) -> tuple + procedure, _, params_str = query.partition(':') + procedure = procedure.strip() + params_str = params_str.strip() + if not params_str: + return procedure, [] + if params_str.startswith('[') and params_str.endswith(']'): + params_str = params_str[1:-1] + parts = [p.strip() for p in params_str.split(',') if p.strip()] + params = [] + 
for part in parts: + try: + params.append(int(part)) + except ValueError: + params.append(part) + return procedure, params diff --git a/voltdb/datadog_checks/voltdb/client.py b/voltdb/datadog_checks/voltdb/client.py index 29efb4a241a04..0531823ced578 100644 --- a/voltdb/datadog_checks/voltdb/client.py +++ b/voltdb/datadog_checks/voltdb/client.py @@ -1,50 +1,169 @@ # (C) Datadog, Inc. 2020-present # All rights reserved # Licensed under a 3-clause BSD style license (see LICENSE) -import json -from typing import Callable, Union # noqa: F401 -from urllib.parse import urljoin +from typing import List, Optional, Tuple # noqa: F401 -import requests +import voltdbclient + + +class VoltDBError(Exception): + """Raised when a VoltDB procedure call returns a non-success status.""" + + def __init__(self, status, status_string): + # type: (int, Optional[str]) -> None + super().__init__('VoltDB procedure failed (status={}): {}'.format(status, status_string)) + self.status = status + self.status_string = status_string class Client(object): """ - A wrapper around the VoltDB HTTP JSON interface. + A wrapper around the VoltDB native Python client. - See: https://docs.voltdb.com/UsingVoltDB/ProgLangJson.php + Accepts one or more `(host, port)` endpoints. On a connect failure the + client transparently tries the next endpoint, so the Agent can keep + collecting metrics as long as at least one cluster member is reachable. 
+ + See: https://pypi.org/project/voltdbclient/ """ - def __init__(self, url, http_get, username, password, password_hashed=False): - # type: (str, Callable[..., requests.Response], str, str, bool) -> None - self._api_url = urljoin(url, '/api/1.0/') - self._auth = VoltDBAuth(username, password, password_hashed) - self._http_get = http_get - - def request(self, procedure, parameters=None): - # type: (str, Union[str, list]) -> requests.Response - url = self._api_url - auth = self._auth - params = {'Procedure': procedure} - - if parameters: - if not isinstance(parameters, str): - parameters = json.dumps(parameters) - params['Parameters'] = parameters - - return self._http_get(url, auth=auth, params=params) # SKIP_HTTP_VALIDATION - - -class VoltDBAuth(requests.auth.AuthBase): - def __init__(self, username, password, password_hashed): - # type: (str, str, bool) -> None - self._username = username - self._password = password - self._password_hashed = password_hashed - - def __call__(self, r): - # type: (requests.PreparedRequest) -> requests.PreparedRequest - # See: https://docs.voltdb.com/UsingVoltDB/ProgLangJson.php - params = {'User': self._username, 'Hashedpassword' if self._password_hashed else 'Password': self._password} - r.prepare_url(r.url, params) - return r + # ClientResponse status code for success. 
+ # See: voltdbclient.VoltResponse.status + SUCCESS = 1 + + def __init__( + self, + endpoints, + username='', + password='', + use_ssl=False, + ssl_config_file=None, + connect_timeout=8, + procedure_timeout=None, + log=None, + ): + # type: (List[Tuple[str, int]], str, str, bool, Optional[str], Optional[float], Optional[float], object) -> None + if not endpoints: + raise ValueError('Client requires at least one (host, port) endpoint') + self._endpoints = list(endpoints) + self._username = username or '' + self._password = password or '' + self._use_ssl = use_ssl + self._ssl_config_file = ssl_config_file + self._connect_timeout = connect_timeout + self._procedure_timeout = procedure_timeout + self._fser = None # type: Optional[voltdbclient.FastSerializer] + self._active = None # type: Optional[Tuple[str, int]] + self._log = log + + def _log_debug(self, *args): + if self._log is not None: + self._log.debug(*args) + + def _log_warning(self, *args): + if self._log is not None: + self._log.warning(*args) + + def _open(self, host, port): + # type: (str, int) -> voltdbclient.FastSerializer + return voltdbclient.FastSerializer( + host=host, + port=port, + usessl=self._use_ssl, + ssl_config_file=self._ssl_config_file, + username=self._username, + password=self._password, + connect_timeout=self._connect_timeout, + procedure_timeout=self._procedure_timeout, + default_cacerts=False, + ) + + def _connect_any(self): + # type: () -> voltdbclient.FastSerializer + """Try each configured endpoint until one connects. Raises the last + exception if every endpoint fails.""" + last_exc = None + for host, port in self._endpoints: + try: + fser = self._open(host, port) + except Exception as exc: # noqa: BLE001 + self._log_warning('VoltDB endpoint %s:%d unreachable (%s); trying the next one.', host, port, exc) + last_exc = exc + continue + self._active = (host, port) + self._log_debug('VoltDB connected to %s:%d', host, port) + return fser + # Exhausted all endpoints. 
+ assert last_exc is not None + raise last_exc + + def _get_connection(self): + # type: () -> voltdbclient.FastSerializer + if self._fser is None: + self._fser = self._connect_any() + return self._fser + + def close(self): + # type: () -> None + if self._fser is not None: + try: + self._fser.close() + except Exception: + pass + self._fser = None + self._active = None + + @property + def endpoints(self): + # type: () -> List[Tuple[str, int]] + return list(self._endpoints) + + @property + def active_endpoint(self): + # type: () -> Optional[Tuple[str, int]] + return self._active + + def call_procedure(self, procedure, params=None): + # type: (str, Optional[list]) -> voltdbclient.VoltResponse + params = list(params) if params else [] + param_types = [_infer_volt_type(p) for p in params] + # If we already have a connection, try it first. If it errors, close + # and retry once against the full endpoint list. This handles the + # common case where the active node went down between check runs. + had_connection = self._fser is not None + try: + fser = self._get_connection() + proc = voltdbclient.VoltProcedure(fser, procedure, param_types) + return proc.call(params) + except Exception: + self.close() + if not had_connection: + # First attempt already iterated every endpoint via _connect_any. + raise + + # Second attempt: reconnect to any endpoint and retry the call once. 
+ self._log_debug('VoltDB call to %s failed; reconnecting and retrying once.', procedure) + fser = self._get_connection() + try: + proc = voltdbclient.VoltProcedure(fser, procedure, param_types) + return proc.call(params) + except Exception: + self.close() + raise + + def raise_for_status(self, response): + # type: (voltdbclient.VoltResponse) -> None + if response.status != self.SUCCESS: + raise VoltDBError(response.status, response.statusString) + + +def _infer_volt_type(value): + # type: (object) -> int + fs = voltdbclient.FastSerializer + if isinstance(value, bool): + return fs.VOLTTYPE_TINYINT + if isinstance(value, int): + return fs.VOLTTYPE_INTEGER + if isinstance(value, float): + return fs.VOLTTYPE_FLOAT + return fs.VOLTTYPE_STRING diff --git a/voltdb/datadog_checks/voltdb/config.py b/voltdb/datadog_checks/voltdb/config.py index 483ef1408d296..38859ba959f97 100644 --- a/voltdb/datadog_checks/voltdb/config.py +++ b/voltdb/datadog_checks/voltdb/config.py @@ -1,7 +1,7 @@ # (C) Datadog, Inc. 
2020-present # All rights reserved # Licensed under a 3-clause BSD style license (see LICENSE) -from typing import Callable, List, Optional # noqa: F401 +from typing import Callable, List, Optional, Tuple # noqa: F401 from urllib.parse import urlparse from datadog_checks.base import ConfigurationError, is_affirmative @@ -10,71 +10,174 @@ from .types import Instance # noqa: F401 DEFAULT_STATISTICS_COMPONENTS = [ - "COMMANDLOG", - "CPU", - "GC", - "INDEX", - "IOSTATS", - "LATENCY", - "MEMORY", - "PROCEDURE", - "SNAPSHOTSTATUS", - "TABLE", + 'COMMANDLOG', + 'CPU', + 'GC', + 'INDEX', + 'IOSTATS', + 'LATENCY', + 'MEMORY', + 'PROCEDURE', + 'SNAPSHOTSTATUS', + 'TABLE', ] STATISTICS_COMPONENTS_MAP = { - "COMMANDLOG": queries.CommandLogMetrics, - "CPU": queries.CPUMetrics, - "EXPORT": queries.ExportMetrics, - "GC": queries.GCMetrics, - "IDLETIME": queries.IdleTimeMetrics, - "IMPORT": queries.ImportMetrics, - "INDEX": queries.IndexMetrics, - "IOSTATS": queries.IOStatsMetrics, - "LATENCY": queries.LatencyMetrics, - "MEMORY": queries.MemoryMetrics, - "PROCEDURE": queries.ProcedureMetrics, - "PROCEDUREOUTPUT": queries.ProcedureOutputMetrics, - "PROCEDUREPROFILE": queries.ProcedureProfileMetrics, - "QUEUE": queries.QueueMetrics, - "SNAPSHOTSTATUS": queries.SnapshotStatusMetrics, - "TABLE": queries.TableMetrics, + 'COMMANDLOG': queries.CommandLogMetrics, + 'CPU': queries.CPUMetrics, + 'EXPORT': queries.ExportMetrics, + 'GC': queries.GCMetrics, + 'IDLETIME': queries.IdleTimeMetrics, + 'IMPORT': queries.ImportMetrics, + 'INDEX': queries.IndexMetrics, + 'IOSTATS': queries.IOStatsMetrics, + 'LATENCY': queries.LatencyMetrics, + 'MEMORY': queries.MemoryMetrics, + 'PROCEDURE': queries.ProcedureMetrics, + 'PROCEDUREOUTPUT': queries.ProcedureOutputMetrics, + 'PROCEDUREPROFILE': queries.ProcedureProfileMetrics, + 'QUEUE': queries.QueueMetrics, + 'SNAPSHOTSTATUS': queries.SnapshotStatusMetrics, + 'TABLE': queries.TableMetrics, } +DEFAULT_PORT = 21212 + + +def _strip_sources(query_def): + 
"""Split the `source` annotations out of a query definition. + + Returns a tuple of: + - a copy of the query definition suitable for QueryManager (no `source` keys), and + - a list of VoltDB source column names, one per column entry. + """ + columns = query_def['columns'] + cleaned_columns = [] + sources = [] + for column in columns: + if isinstance(column, dict) and 'source' in column: + source = column['source'] + sources.append(source) + cleaned_columns.append({k: v for k, v in column.items() if k != 'source'}) + else: + sources.append(None) + cleaned_columns.append(column) + cleaned = dict(query_def) + cleaned['columns'] = cleaned_columns + return cleaned, sources + + +MODE_NATIVE = 'native' +MODE_HTTP = 'http' + + +def _parse_hostport(entry, default_port): + # type: (str, int) -> Tuple[str, int] + """Parse a `host` or `host:port` string into a (host, port) tuple.""" + entry = entry.strip() + if not entry: + raise ConfigurationError("'hosts' entries must be non-empty 'host' or 'host:port' strings") + if ':' in entry: + host, _, port_str = entry.rpartition(':') + host = host.strip() + try: + port = int(port_str) + except ValueError: + raise ConfigurationError("'hosts' entry {!r} has an invalid port".format(entry)) + else: + host = entry + port = default_port + if not host: + raise ConfigurationError("'hosts' entry {!r} has an empty hostname".format(entry)) + if port <= 0: + raise ConfigurationError("'hosts' entry {!r} has a non-positive port".format(entry)) + return host, port + + +def _resolve_endpoints(host, hosts, default_port): + # type: (Optional[str], Optional[List[str]], int) -> List[Tuple[str, int]] + """Build the ordered endpoint list for the native client. + + `hosts` (a list) takes precedence over `host` (a single string) when both + are set so users can opt into failover by adding a `hosts:` entry without + having to remove their existing `host:`. 
+ """ + if hosts: + if not isinstance(hosts, list): + raise ConfigurationError("'hosts' must be a list of 'host' or 'host:port' strings") + return [_parse_hostport(entry, default_port) for entry in hosts] + if host: + return [(host, default_port)] + return [] + class Config(object): - def __init__(self, instance, debug=lambda *args: None): - # type: (Instance, Callable) -> None + def __init__(self, instance, debug=lambda *args: None, warning=lambda *args: None): + # type: (Instance, Callable, Callable) -> None self._debug = debug + host = instance.get('host') # type: Optional[str] + port = instance.get('port') # type: Optional[int] + hosts = instance.get('hosts') # type: Optional[List[str]] url = instance.get('url') # type: Optional[str] - username = instance.get('username') # type: Optional[str] - password = instance.get('password') # type: Optional[str] - statistics_components = instance.get('statistics_components', DEFAULT_STATISTICS_COMPONENTS) + username = instance.get('username', '') # type: str + password = instance.get('password', '') # type: str password_hashed = is_affirmative(instance.get('password_hashed', False)) # type: bool + use_ssl = instance.get('use_ssl') + ssl_config_file = instance.get('ssl_config_file') # type: Optional[str] + connect_timeout = instance.get('connect_timeout', 8) # type: float + # `procedure_timeout` of 0 (or negative) disables the timeout, matching + # the `voltdbclient` semantics where `None` means wait forever. 
+ procedure_timeout = instance.get('procedure_timeout', 60) # type: Optional[float] + if procedure_timeout is not None and procedure_timeout <= 0: + procedure_timeout = None + statistics_components = instance.get('statistics_components', DEFAULT_STATISTICS_COMPONENTS) tags = instance.get('tags', []) # type: List[str] - if not url: - raise ConfigurationError('url is required') - - if not username or not password: - raise ConfigurationError('username and password are required') - - parsed_url = urlparse(url) - - host = parsed_url.hostname - if not host: # pragma: no cover # Mostly just type safety. - raise ConfigurationError('URL must contain a host') - - port = parsed_url.port - if not port: - port = 443 if parsed_url.scheme == 'https' else 80 - self._debug('No port detected, defaulting to port %d', port) + # Mode selection: presence of `url` activates the HTTP transport (talks + # to the VoltDB Management Center HTTP/JSON endpoint), otherwise we use + # the native binary client against `host`/`port`. The two transports + # share the rest of the configuration (auth, statistics components, + # custom queries, tags). 
+ if url: + mode = MODE_HTTP + parsed = urlparse(url) + url_host = parsed.hostname + if not url_host: # pragma: no cover + raise ConfigurationError("URL must contain a host") + url_port = parsed.port + if not url_port: + url_port = 443 if parsed.scheme == 'https' else 80 + self._debug('No port detected in url, defaulting to port %d', url_port) + if not username or not password: + raise ConfigurationError("'username' and 'password' are required when 'url' is set") + netloc = (url_host, url_port) + else: + mode = MODE_NATIVE + if port is None: + port = DEFAULT_PORT + elif not isinstance(port, int) or port <= 0: + raise ConfigurationError('port must be a positive integer') + use_ssl = is_affirmative(use_ssl) if use_ssl is not None else False + endpoints = _resolve_endpoints(host, hosts, port) + if not endpoints: + raise ConfigurationError( + "either 'host' or 'hosts' is required for the native transport " + "(or set 'url' to use the HTTP/VMC transport)" + ) + netloc = endpoints[0] + # Keep `host`/`port` reflecting the *first* endpoint so log/tag messages + # match what users see in their config when they specified a single host. + host = netloc[0] + port = netloc[1] if not isinstance(statistics_components, list): raise ConfigurationError("'statistics_components' must be a list of strings") self.queries = [] + # Map from query string to the ordered list of VoltDB column names that + # back each output column. Used at runtime to look up values by name. + self.query_sources = {} # type: dict for elem in statistics_components: if not isinstance(elem, str): raise ConfigurationError( @@ -83,14 +186,27 @@ def __init__(self, instance, debug=lambda *args: None): if elem not in STATISTICS_COMPONENTS_MAP: raise ConfigurationError( "Statistic component '{}' is not supported. 
Must be one of [{}].".format( - elem, ", ".join(STATISTICS_COMPONENTS_MAP.keys()) + elem, ', '.join(STATISTICS_COMPONENTS_MAP.keys()) ) ) - self.queries.append(STATISTICS_COMPONENTS_MAP[elem]) + query_def, sources = _strip_sources(STATISTICS_COMPONENTS_MAP[elem]) + self.queries.append(query_def) + if sources: + self.query_sources[query_def['query']] = sources + self.mode = mode self.url = url - self.netloc = (host, port) + self.host = host + self.port = port + self.netloc = netloc + # Endpoints list — populated only for native mode. HTTP mode connects to + # a single URL. + self.endpoints = endpoints if mode == MODE_NATIVE else [netloc] self.username = username self.password = password self.password_hashed = password_hashed + self.use_ssl = use_ssl + self.ssl_config_file = ssl_config_file + self.connect_timeout = connect_timeout + self.procedure_timeout = procedure_timeout self.tags = tags diff --git a/voltdb/datadog_checks/voltdb/config_models/defaults.py b/voltdb/datadog_checks/voltdb/config_models/defaults.py index ab0af4f6e3968..f19875bc1d164 100644 --- a/voltdb/datadog_checks/voltdb/config_models/defaults.py +++ b/voltdb/datadog_checks/voltdb/config_models/defaults.py @@ -24,6 +24,10 @@ def instance_auth_type(): return 'basic' +def instance_connect_timeout(): + return 8 + + def instance_disable_generic_tags(): return False @@ -36,6 +40,10 @@ def instance_enable_legacy_tags_normalization(): return True +def instance_host(): + return 'localhost' + + def instance_kerberos_auth(): return 'disabled' @@ -68,6 +76,14 @@ def instance_persist_connections(): return False +def instance_port(): + return 21212 + + +def instance_procedure_timeout(): + return 60 + + def instance_request_size(): return 16 @@ -92,9 +108,17 @@ def instance_tls_verify(): return True +def instance_url(): + return 'http://localhost:8080' + + def instance_use_global_custom_queries(): return 'true' def instance_use_legacy_auth_encoding(): return True + + +def instance_use_ssl(): + return False diff 
--git a/voltdb/datadog_checks/voltdb/config_models/instance.py b/voltdb/datadog_checks/voltdb/config_models/instance.py index 9f74e0f5c7d79..d2e54a8e5c90f 100644 --- a/voltdb/datadog_checks/voltdb/config_models/instance.py +++ b/voltdb/datadog_checks/voltdb/config_models/instance.py @@ -85,6 +85,8 @@ class InstanceConfig(BaseModel): enable_legacy_tags_normalization: Optional[bool] = None extra_headers: Optional[MappingProxyType[str, Any]] = None headers: Optional[MappingProxyType[str, Any]] = None + host: Optional[str] = None + hosts: Optional[tuple[str, ...]] = None kerberos_auth: Optional[Literal['required', 'optional', 'disabled']] = None kerberos_cache: Optional[str] = None kerberos_delegate: Optional[bool] = None @@ -97,14 +99,17 @@ class InstanceConfig(BaseModel): min_collection_interval: Optional[float] = None ntlm_domain: Optional[str] = None only_custom_queries: Optional[bool] = None - password: str + password: Optional[str] = None password_hashed: Optional[bool] = None persist_connections: Optional[bool] = None + port: Optional[int] = None + procedure_timeout: Optional[float] = None proxy: Optional[Proxy] = None read_timeout: Optional[float] = None request_size: Optional[float] = None service: Optional[str] = None skip_proxy: Optional[bool] = None + ssl_config_file: Optional[str] = None statistics_components: Optional[tuple[str, ...]] = None tags: Optional[tuple[str, ...]] = None timeout: Optional[float] = None @@ -116,10 +121,11 @@ class InstanceConfig(BaseModel): tls_protocols_allowed: Optional[tuple[str, ...]] = None tls_use_host_header: Optional[bool] = None tls_verify: Optional[bool] = None - url: str + url: Optional[str] = None use_global_custom_queries: Optional[str] = None use_legacy_auth_encoding: Optional[bool] = None - username: str + use_ssl: Optional[bool] = None + username: Optional[str] = None @model_validator(mode='before') def _initial_validation(cls, values): diff --git a/voltdb/datadog_checks/voltdb/data/conf.yaml.example 
b/voltdb/datadog_checks/voltdb/data/conf.yaml.example index 5c10b52eb79af..078357c3f7114 100644 --- a/voltdb/datadog_checks/voltdb/data/conf.yaml.example +++ b/voltdb/datadog_checks/voltdb/data/conf.yaml.example @@ -57,25 +57,81 @@ init_config: # instances: - ## @param url - string - required - ## URL to a VoltDB client endpoint. + - + ## @param host - string - optional - default: localhost + ## Host of the VoltDB cluster member to connect to via the native binary + ## protocol. Use this (with `port`) for direct connections to a database + ## node. Set `url` instead to talk to the VoltDB Management Center (VMC) + ## over HTTP/JSON. For failover across multiple cluster members, use + ## `hosts` instead. + # + # host: localhost + + ## @param hosts - list of strings - optional + ## List of VoltDB cluster members to try when connecting via the native + ## binary protocol. Each entry is either `hostname` (uses the global + ## `port`) or `hostname:port`. The Agent connects to the first + ## reachable entry and silently fails over to subsequent entries if + ## the active node becomes unavailable. Takes precedence over `host`. + # + # hosts: + # - voltdb-1.example:21212 + # - voltdb-2.example:21212 + # - voltdb-3.example:21212 + + ## @param url - string - optional - default: http://localhost:8080 + ## URL of a VoltDB HTTP/JSON endpoint, typically served by the + ## VoltDB Management Center (VMC). When set, the integration uses the + ## HTTP/JSON transport instead of the native binary client. The + ## `username` and `password` options are required in this mode. + ## + ## For direct connections to a database node, prefer `host` (and + ## optionally `port`), which uses the native Python client. # - - url: http://localhost:8080 + # url: http://localhost:8080 - ## @param username - string - required + ## @param port - integer - optional - default: 21212 + ## Native client port of the VoltDB cluster member. 
+ ## + ## See: https://docs.voltdb.com/UsingVoltDB/HostConfigPortOpts.php + # + # port: 21212 + + ## @param username - string - optional ## The username to use to authenticate with VoltDB. # - username: + # username: - ## @param password - string - required + ## @param password - string - optional ## The password to use to authenticate with VoltDB. # - password: + # password: - ## @param password_hashed - boolean - optional - default: false - ## Set to true if the `password` value refers to a hashed version of the password. + ## @param use_ssl - boolean - optional - default: false + ## Set to `true` to connect to VoltDB using TLS. + ## + ## See: https://docs.voltdb.com/UsingVoltDB/SecuritySSL.php # - # password_hashed: false + # use_ssl: false + + ## @param ssl_config_file - string - optional + ## Path to a VoltDB SSL configuration file that defines the Java keystore + ## and truststore files used by the native Python client. + ## + ## See: https://docs.voltdb.com/UsingVoltDB/SecuritySSL.php + # + # ssl_config_file: + + ## @param connect_timeout - number - optional - default: 8 + ## Connection timeout (in seconds) when establishing the native client connection. + # + # connect_timeout: 8 + + ## @param procedure_timeout - number - optional - default: 60 + ## Timeout (in seconds) for individual stored procedure calls. Set to + ## `0` to wait indefinitely for a response. + # + # procedure_timeout: 60 ## @param statistics_components - list of strings - optional ## List VoltDB components to collect metrics. A subset of components are collected by default. @@ -111,6 +167,14 @@ instances: # - SNAPSHOTSTATUS # - TABLE + ## @param password_hashed - boolean - optional - default: false + ## Only applicable to the HTTP/VMC transport (`url` is set). Set to + ## `true` if the `password` value is the SHA-256 hex digest of the + ## password instead of the cleartext. The native binary client does + ## not support pre-hashed passwords. 
+ # + # password_hashed: false + ## @param proxy - mapping - optional ## This overrides the `proxy` setting in `init_config`. ## @@ -233,11 +297,6 @@ instances: # # timeout: 10 - ## @param connect_timeout - number - optional - ## The connect timeout for accessing services. Defaults to `timeout`. - # - # connect_timeout: - ## @param read_timeout - number - optional ## The read timeout for accessing services. Defaults to `timeout`. # diff --git a/voltdb/datadog_checks/voltdb/http_client.py b/voltdb/datadog_checks/voltdb/http_client.py new file mode 100644 index 0000000000000..2a624063f9a1d --- /dev/null +++ b/voltdb/datadog_checks/voltdb/http_client.py @@ -0,0 +1,110 @@ +# (C) Datadog, Inc. 2026-present +# All rights reserved +# Licensed under a 3-clause BSD style license (see LICENSE) +""" +HTTP/JSON client for the VoltDB Management Center (VMC). + +Used when the integration is configured with a `url` option. The wire format +matches https://docs.voltdb.com/UsingVoltDB/ProgLangJson.php and is exposed +through the same response shape as the native client (`response.tables[i]. +columns[j].name`, `response.tables[i].tuples`) so the check code can be +agnostic to which transport is in use. 
+""" + +import json +from typing import Callable, List, Optional, Union # noqa: F401 +from urllib.parse import urljoin + +import requests + +from .client import VoltDBError + + +class HttpColumn(object): + __slots__ = ('name',) + + def __init__(self, name): + # type: (str) -> None + self.name = name + + +class HttpTable(object): + __slots__ = ('columns', 'tuples') + + def __init__(self, schema, data): + # type: (Optional[list], list) -> None + self.columns = [HttpColumn(entry['name']) for entry in (schema or [])] + self.tuples = data or [] + + +class HttpResponse(object): + __slots__ = ('status', 'statusString', 'tables') + + SUCCESS = 1 + + def __init__(self, json_data): + # type: (dict) -> None + self.status = json_data.get('status') + self.statusString = json_data.get('statusstring') + self.tables = [HttpTable(r.get('schema'), r.get('data')) for r in json_data.get('results') or []] + + +class HttpClient(object): + """A wrapper around the VoltDB HTTP/JSON interface (port 8080 by default, + typically served through the VoltDB Management Center). 
+ + See: https://docs.voltdb.com/UsingVoltDB/ProgLangJson.php + """ + + SUCCESS = HttpResponse.SUCCESS + + def __init__(self, url, http_get, username, password, password_hashed=False): + # type: (str, Callable[..., requests.Response], str, str, bool) -> None + self._api_url = urljoin(url, '/api/1.0/') + self._auth = VoltDBAuth(username, password, password_hashed) + self._http_get = http_get + + def call_procedure(self, procedure, params=None): + # type: (str, Union[str, list, None]) -> HttpResponse + if params is None: + parameters = '' + elif isinstance(params, str): + parameters = params + else: + parameters = json.dumps(list(params)) + + query = {'Procedure': procedure} + if parameters: + query['Parameters'] = parameters + + response = self._http_get(self._api_url, auth=self._auth, params=query) # SKIP_HTTP_VALIDATION + response.raise_for_status() + return HttpResponse(response.json()) + + def raise_for_status(self, response): + # type: (HttpResponse) -> None + if response.status != self.SUCCESS: + raise VoltDBError(response.status, response.statusString) + + def close(self): + # type: () -> None + # Connection pooling is handled by the underlying requests Session. 
+ return None + + +class VoltDBAuth(requests.auth.AuthBase): + def __init__(self, username, password, password_hashed): + # type: (str, str, bool) -> None + self._username = username + self._password = password + self._password_hashed = password_hashed + + def __call__(self, r): + # type: (requests.PreparedRequest) -> requests.PreparedRequest + # See: https://docs.voltdb.com/UsingVoltDB/ProgLangJson.php + params = { + 'User': self._username, + 'Hashedpassword' if self._password_hashed else 'Password': self._password, + } + r.prepare_url(r.url, params) + return r diff --git a/voltdb/datadog_checks/voltdb/queries.py b/voltdb/datadog_checks/voltdb/queries.py index c05ed6f63e68a..555dd1745e4a0 100644 --- a/voltdb/datadog_checks/voltdb/queries.py +++ b/voltdb/datadog_checks/voltdb/queries.py @@ -1,6 +1,14 @@ # (C) Datadog, Inc. 2020-present # All rights reserved # Licensed under a 3-clause BSD style license (see LICENSE) +# +# Each column declares a `source` field — the VoltDB column name to read from the +# `@Statistics` response. The check looks up these names against the VoltTable +# column metadata at runtime, so the integration tolerates VoltDB versions that +# add or remove columns. Columns missing on the server are submitted as None +# (tags drop out, numeric metrics are skipped). +# +# See: https://docs.voltdb.com/UsingVoltDB/sysprocstatistics.php # See: https://docs.voltdb.com/UsingVoltDB/sysprocstatistics.php#sysprocstatcpu # One row per server. 
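Reviewer note on the name-based lookup described in the `queries.py` header comment above: the following is a minimal sketch of how values could be read by `source` name instead of by position. It is not the code from this PR. The helper name `rows_by_source` and the fake table are hypothetical; the only thing assumed from the diff is the response shape, where a table exposes `columns` (objects with a `name`) and `tuples`.

```python
from types import SimpleNamespace


def rows_by_source(table, sources):
    """Reorder each tuple of `table` to follow `sources`, looking columns up by name.

    Hypothetical helper: `table` only needs `.columns` (objects with a `.name`)
    and `.tuples` (sequences of values), the shape the diff describes.
    """
    # Map each server-side column name to its position in the response.
    index_by_name = {column.name: i for i, column in enumerate(table.columns)}
    rows = []
    for values in table.tuples:
        row = []
        for source in sources:
            position = index_by_name.get(source)
            # A column the server does not report becomes None.
            row.append(values[position] if position is not None else None)
        rows.append(row)
    return rows


# A fake CPU statistics table; TIMESTAMP is present but never requested.
table = SimpleNamespace(
    columns=[SimpleNamespace(name=n) for n in ('TIMESTAMP', 'HOST_ID', 'HOSTNAME', 'PERCENT_USED')],
    tuples=[[1700000000000, 0, 'voltdb-1', 12.5]],
)
result = rows_by_source(table, ['HOST_ID', 'HOSTNAME', 'PERCENT_USED', 'NOT_REPORTED'])
print(result)  # [[0, 'voltdb-1', 12.5, None]]
```

With a lookup of this kind, the `None, # TIMESTAMP` placeholders removed in the hunks below are no longer needed, and a column that a given VoltDB version does not return yields `None` instead of shifting every later value by one position.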
@@ -8,10 +16,9 @@ 'name': 'cpu', 'query': '@Statistics:[CPU]', 'columns': [ - None, # TIMESTAMP - {'name': 'host_id', 'type': 'tag'}, - {'name': 'voltdb_hostname', 'type': 'tag'}, - {'name': 'cpu.percent_used', 'type': 'gauge'}, + {'source': 'HOST_ID', 'name': 'host_id', 'type': 'tag'}, + {'source': 'HOSTNAME', 'name': 'voltdb_hostname', 'type': 'tag'}, + {'source': 'PERCENT_USED', 'name': 'cpu.percent_used', 'type': 'gauge'}, ], } @@ -21,20 +28,19 @@ 'name': 'memory', 'query': '@Statistics:[MEMORY]', 'columns': [ - None, # TIMESTAMP - {'name': 'host_id', 'type': 'tag'}, - {'name': 'voltdb_hostname', 'type': 'tag'}, - {'name': 'memory.rss', 'type': 'gauge'}, - {'name': 'memory.java.used', 'type': 'gauge'}, - {'name': 'memory.java.unused', 'type': 'gauge'}, - {'name': 'memory.tuple_data', 'type': 'gauge'}, - {'name': 'memory.tuple_allocated', 'type': 'gauge'}, - {'name': 'memory.index', 'type': 'gauge'}, - {'name': 'memory.string', 'type': 'gauge'}, - {'name': 'memory.tuple_count', 'type': 'gauge'}, - {'name': 'memory.pooled', 'type': 'gauge'}, - {'name': 'memory.physical', 'type': 'gauge'}, - {'name': 'memory.java.max_heap', 'type': 'gauge'}, + {'source': 'HOST_ID', 'name': 'host_id', 'type': 'tag'}, + {'source': 'HOSTNAME', 'name': 'voltdb_hostname', 'type': 'tag'}, + {'source': 'RSS', 'name': 'memory.rss', 'type': 'gauge'}, + {'source': 'JAVAUSED', 'name': 'memory.java.used', 'type': 'gauge'}, + {'source': 'JAVAUNUSED', 'name': 'memory.java.unused', 'type': 'gauge'}, + {'source': 'TUPLEDATA', 'name': 'memory.tuple_data', 'type': 'gauge'}, + {'source': 'TUPLEALLOCATED', 'name': 'memory.tuple_allocated', 'type': 'gauge'}, + {'source': 'INDEXMEMORY', 'name': 'memory.index', 'type': 'gauge'}, + {'source': 'STRINGMEMORY', 'name': 'memory.string', 'type': 'gauge'}, + {'source': 'TUPLECOUNT', 'name': 'memory.tuple_count', 'type': 'gauge'}, + {'source': 'POOLEDMEMORY', 'name': 'memory.pooled', 'type': 'gauge'}, + {'source': 'PHYSICALMEMORY', 'name': 'memory.physical', 
'type': 'gauge'}, + {'source': 'JAVAMAXHEAP', 'name': 'memory.java.max_heap', 'type': 'gauge'}, ], } @@ -44,21 +50,14 @@ 'name': 'snapshot_status', 'query': '@Statistics:[SNAPSHOTSTATUS]', 'columns': [ - None, # TIMESTAMP - {'name': 'host_id', 'type': 'tag'}, - {'name': 'voltdb_hostname', 'type': 'tag'}, - {'name': 'table', 'type': 'tag'}, - None, # PATH - {'name': 'filename', 'type': 'tag'}, - None, # NONCE - None, # TXNID (Transaction ID) - None, # START_TIME - None, # END_TIME - {'name': 'snapshot_status.size', 'type': 'gauge'}, - {'name': 'snapshot_status.duration', 'type': 'gauge'}, - {'name': 'snapshot_status.throughput', 'type': 'gauge'}, - None, # RESULT (Can't translate string to int yet) - {'name': 'type', 'type': 'tag'}, + {'source': 'HOST_ID', 'name': 'host_id', 'type': 'tag'}, + {'source': 'HOSTNAME', 'name': 'voltdb_hostname', 'type': 'tag'}, + {'source': 'TABLE', 'name': 'table', 'type': 'tag'}, + {'source': 'FILENAME', 'name': 'filename', 'type': 'tag'}, + {'source': 'SIZE', 'name': 'snapshot_status.size', 'type': 'gauge'}, + {'source': 'DURATION', 'name': 'snapshot_status.duration', 'type': 'gauge'}, + {'source': 'THROUGHPUT', 'name': 'snapshot_status.throughput', 'type': 'gauge'}, + {'source': 'TYPE', 'name': 'type', 'type': 'tag'}, ], } @@ -69,14 +68,33 @@ 'name': 'commandlog', 'query': '@Statistics:[COMMANDLOG, 1]', 'columns': [ - None, # TIMESTAMP - {'name': 'host_id', 'type': 'tag'}, - {'name': 'voltdb_hostname', 'type': 'tag'}, - {'name': 'commandlog.outstanding_bytes', 'type': 'gauge'}, - {'name': 'commandlog.outstanding_transactions', 'type': 'gauge'}, - {'name': 'commandlog.in_use_segment_count', 'type': 'gauge'}, - {'name': 'commandlog.segment_count', 'type': 'gauge'}, - {'name': 'commandlog.fsync_interval', 'type': 'gauge'}, + {'source': 'HOST_ID', 'name': 'host_id', 'type': 'tag'}, + {'source': 'HOSTNAME', 'name': 'voltdb_hostname', 'type': 'tag'}, + { + 'source': 'OUTSTANDING_BYTES', + 'name': 'commandlog.outstanding_bytes', + 'type': 
'gauge', + }, + { + 'source': 'OUTSTANDING_TXNS', + 'name': 'commandlog.outstanding_transactions', + 'type': 'gauge', + }, + { + 'source': 'IN_USE_SEGMENT_COUNT', + 'name': 'commandlog.in_use_segment_count', + 'type': 'gauge', + }, + { + 'source': 'SEGMENT_COUNT', + 'name': 'commandlog.segment_count', + 'type': 'gauge', + }, + { + 'source': 'FSYNC_INTERVAL', + 'name': 'commandlog.fsync_interval', + 'type': 'gauge', + }, ], } @@ -86,26 +104,68 @@ 'name': 'procedure', 'query': '@Statistics:[PROCEDURE, 1]', 'columns': [ - None, # TIMESTAMP - {'name': 'host_id', 'type': 'tag'}, - {'name': 'voltdb_hostname', 'type': 'tag'}, - {'name': 'site_id', 'type': 'tag'}, - {'name': 'partition_id', 'type': 'tag'}, - {'name': 'procedure', 'type': 'tag'}, - {'name': 'procedure.invocations', 'type': 'monotonic_count'}, - {'name': 'procedure.timed_invocations', 'type': 'monotonic_count'}, - {'name': 'procedure.min_execution_time', 'type': 'gauge'}, - {'name': 'procedure.max_execution_time', 'type': 'gauge'}, - {'name': 'procedure.avg_execution_time', 'type': 'gauge'}, - {'name': 'procedure.min_result_size', 'type': 'gauge'}, - {'name': 'procedure.max_result_size', 'type': 'gauge'}, - {'name': 'procedure.avg_result_size', 'type': 'gauge'}, - {'name': 'procedure.min_parameter_set_size', 'type': 'gauge'}, - {'name': 'procedure.max_parameter_set_size', 'type': 'gauge'}, - {'name': 'procedure.avg_parameter_set_size', 'type': 'gauge'}, - {'name': 'procedure.aborts', 'type': 'monotonic_count'}, - {'name': 'procedure.failures', 'type': 'monotonic_count'}, - None, # TRANSACTIONAL + {'source': 'HOST_ID', 'name': 'host_id', 'type': 'tag'}, + {'source': 'HOSTNAME', 'name': 'voltdb_hostname', 'type': 'tag'}, + {'source': 'SITE_ID', 'name': 'site_id', 'type': 'tag'}, + {'source': 'PARTITION_ID', 'name': 'partition_id', 'type': 'tag'}, + {'source': 'PROCEDURE', 'name': 'procedure', 'type': 'tag'}, + { + 'source': 'INVOCATIONS', + 'name': 'procedure.invocations', + 'type': 'monotonic_count', + }, + { 
+ 'source': 'TIMED_INVOCATIONS', + 'name': 'procedure.timed_invocations', + 'type': 'monotonic_count', + }, + { + 'source': 'MIN_EXECUTION_TIME', + 'name': 'procedure.min_execution_time', + 'type': 'gauge', + }, + { + 'source': 'MAX_EXECUTION_TIME', + 'name': 'procedure.max_execution_time', + 'type': 'gauge', + }, + { + 'source': 'AVG_EXECUTION_TIME', + 'name': 'procedure.avg_execution_time', + 'type': 'gauge', + }, + { + 'source': 'MIN_RESULT_SIZE', + 'name': 'procedure.min_result_size', + 'type': 'gauge', + }, + { + 'source': 'MAX_RESULT_SIZE', + 'name': 'procedure.max_result_size', + 'type': 'gauge', + }, + { + 'source': 'AVG_RESULT_SIZE', + 'name': 'procedure.avg_result_size', + 'type': 'gauge', + }, + { + 'source': 'MIN_PARAMETER_SET_SIZE', + 'name': 'procedure.min_parameter_set_size', + 'type': 'gauge', + }, + { + 'source': 'MAX_PARAMETER_SET_SIZE', + 'name': 'procedure.max_parameter_set_size', + 'type': 'gauge', + }, + { + 'source': 'AVG_PARAMETER_SET_SIZE', + 'name': 'procedure.avg_parameter_set_size', + 'type': 'gauge', + }, + {'source': 'ABORTS', 'name': 'procedure.aborts', 'type': 'monotonic_count'}, + {'source': 'FAILURES', 'name': 'procedure.failures', 'type': 'monotonic_count'}, ], 'extras': [ { @@ -122,19 +182,19 @@ 'name': 'latency', 'query': '@Statistics:[LATENCY]', 'columns': [ - None, # TIMESTAMP - {'name': 'host_id', 'type': 'tag'}, - {'name': 'voltdb_hostname', 'type': 'tag'}, - {'name': 'latency.interval', 'type': 'gauge'}, - {'name': 'latency.count', 'type': 'gauge'}, - {'name': 'latency.transactions_per_sec', 'type': 'gauge'}, - {'name': 'latency.p50', 'type': 'gauge'}, - {'name': 'latency.p95', 'type': 'gauge'}, - {'name': 'latency.p99', 'type': 'gauge'}, - {'name': 'latency.p999', 'type': 'gauge'}, - {'name': 'latency.p9999', 'type': 'gauge'}, - {'name': 'latency.p99999', 'type': 'gauge'}, - {'name': 'latency.max', 'type': 'gauge'}, + {'source': 'HOST_ID', 'name': 'host_id', 'type': 'tag'}, + {'source': 'HOSTNAME', 'name': 
'voltdb_hostname', 'type': 'tag'}, + {'source': 'INTERVAL', 'name': 'latency.interval', 'type': 'gauge'}, + {'source': 'COUNT', 'name': 'latency.count', 'type': 'gauge'}, + {'source': 'TPS', 'name': 'latency.transactions_per_sec', 'type': 'gauge'}, + {'source': 'P50', 'name': 'latency.p50', 'type': 'gauge'}, + {'source': 'P95', 'name': 'latency.p95', 'type': 'gauge'}, + {'source': 'P99', 'name': 'latency.p99', 'type': 'gauge'}, + # VoltDB names these percentile columns with dots (P99.9, P99.99, P99.999). + {'source': 'P99.9', 'name': 'latency.p999', 'type': 'gauge'}, + {'source': 'P99.99', 'name': 'latency.p9999', 'type': 'gauge'}, + {'source': 'P99.999', 'name': 'latency.p99999', 'type': 'gauge'}, + {'source': 'MAX', 'name': 'latency.max', 'type': 'gauge'}, ], } @@ -144,13 +204,28 @@ 'name': 'gc', 'query': '@Statistics:[GC, 1]', 'columns': [ - None, # TIMESTAMP - {'name': 'host_id', 'type': 'tag'}, - {'name': 'voltdb_hostname', 'type': 'tag'}, - {'name': 'gc.newgen_gc_count', 'type': 'monotonic_count'}, - {'name': 'gc.newgen_avg_gc_time', 'type': 'gauge'}, - {'name': 'gc.oldgen_gc_count', 'type': 'monotonic_count'}, - {'name': 'gc.oldgen_avg_gc_time', 'type': 'gauge'}, + {'source': 'HOST_ID', 'name': 'host_id', 'type': 'tag'}, + {'source': 'HOSTNAME', 'name': 'voltdb_hostname', 'type': 'tag'}, + { + 'source': 'NEWGEN_GC_COUNT', + 'name': 'gc.newgen_gc_count', + 'type': 'monotonic_count', + }, + { + 'source': 'NEWGEN_AVG_GC_TIME', + 'name': 'gc.newgen_avg_gc_time', + 'type': 'gauge', + }, + { + 'source': 'OLDGEN_GC_COUNT', + 'name': 'gc.oldgen_gc_count', + 'type': 'monotonic_count', + }, + { + 'source': 'OLDGEN_AVG_GC_TIME', + 'name': 'gc.oldgen_avg_gc_time', + 'type': 'gauge', + }, ], } @@ -160,15 +235,25 @@ 'name': 'iostats', 'query': '@Statistics:[IOSTATS, 1]', 'columns': [ - None, # TIMESTAMP - {'name': 'host_id', 'type': 'tag'}, - {'name': 'voltdb_hostname', 'type': 'tag'}, - None, # CONNECTION_ID - {'name': 'connection_hostname', 'type': 'tag'}, - {'name': 
'io.bytes_read', 'type': 'monotonic_count'}, - {'name': 'io.messages_read', 'type': 'monotonic_count'}, - {'name': 'io.bytes_written', 'type': 'monotonic_count'}, - {'name': 'io.messages_written', 'type': 'monotonic_count'}, + {'source': 'HOST_ID', 'name': 'host_id', 'type': 'tag'}, + {'source': 'HOSTNAME', 'name': 'voltdb_hostname', 'type': 'tag'}, + {'source': 'CONNECTION_HOSTNAME', 'name': 'connection_hostname', 'type': 'tag'}, + {'source': 'BYTES_READ', 'name': 'io.bytes_read', 'type': 'monotonic_count'}, + { + 'source': 'MESSAGES_READ', + 'name': 'io.messages_read', + 'type': 'monotonic_count', + }, + { + 'source': 'BYTES_WRITTEN', + 'name': 'io.bytes_written', + 'type': 'monotonic_count', + }, + { + 'source': 'MESSAGES_WRITTEN', + 'name': 'io.messages_written', + 'type': 'monotonic_count', + }, ], } @@ -178,23 +263,34 @@ 'name': 'table', 'query': '@Statistics:[TABLE, 1]', 'columns': [ - None, # TIMESTAMP - {'name': 'host_id', 'type': 'tag'}, - {'name': 'voltdb_hostname', 'type': 'tag'}, - {'name': 'site_id', 'type': 'tag'}, - {'name': 'partition_id', 'type': 'tag'}, - {'name': 'table', 'type': 'tag'}, - {'name': 'table_type', 'type': 'tag'}, - {'name': 'table.tuple_count', 'type': 'gauge'}, - {'name': 'table.tuple_allocated_memory', 'type': 'gauge'}, - {'name': 'table.tuple_data_memory', 'type': 'gauge'}, - {'name': 'table.string_data_memory', 'type': 'gauge'}, - {'name': 'table.tuple_limit', 'type': 'gauge'}, # May be null. - {'name': 'table.percent_full', 'type': 'gauge'}, - # The following two columns were added in V10 only. Leave them out for now, as we target v8.4. 
- # See: https://docs.voltdb.com/ReleaseNotes/index.php - # {'name': 'distributed_replication', 'type': 'tag', 'boolean': True}, - # None, # EXPORT + {'source': 'HOST_ID', 'name': 'host_id', 'type': 'tag'}, + {'source': 'HOSTNAME', 'name': 'voltdb_hostname', 'type': 'tag'}, + {'source': 'SITE_ID', 'name': 'site_id', 'type': 'tag'}, + {'source': 'PARTITION_ID', 'name': 'partition_id', 'type': 'tag'}, + {'source': 'TABLE_NAME', 'name': 'table', 'type': 'tag'}, + {'source': 'TABLE_TYPE', 'name': 'table_type', 'type': 'tag'}, + {'source': 'TUPLE_COUNT', 'name': 'table.tuple_count', 'type': 'gauge'}, + { + 'source': 'TUPLE_ALLOCATED_MEMORY', + 'name': 'table.tuple_allocated_memory', + 'type': 'gauge', + }, + { + 'source': 'TUPLE_DATA_MEMORY', + 'name': 'table.tuple_data_memory', + 'type': 'gauge', + }, + { + 'source': 'STRING_DATA_MEMORY', + 'name': 'table.string_data_memory', + 'type': 'gauge', + }, + { + 'source': 'TUPLE_LIMIT', + 'name': 'table.tuple_limit', + 'type': 'gauge', + }, # May be null. 
+ {'source': 'PERCENT_FULL', 'name': 'table.percent_full', 'type': 'gauge'}, ], } @@ -204,18 +300,22 @@ 'name': 'index', 'query': '@Statistics:[INDEX, 1]', 'columns': [ - None, # TIMESTAMP - {'name': 'host_id', 'type': 'tag'}, - {'name': 'voltdb_hostname', 'type': 'tag'}, - {'name': 'site_id', 'type': 'tag'}, - {'name': 'partition_id', 'type': 'tag'}, - {'name': 'index', 'type': 'tag'}, - {'name': 'table', 'type': 'tag'}, - {'name': 'index_type', 'type': 'tag'}, - {'name': 'is_unique', 'type': 'tag', 'boolean': True}, - {'name': 'is_countable', 'type': 'tag', 'boolean': True}, - {'name': 'index.entry_count', 'type': 'gauge'}, - {'name': 'index.memory_estimate', 'type': 'gauge'}, + {'source': 'HOST_ID', 'name': 'host_id', 'type': 'tag'}, + {'source': 'HOSTNAME', 'name': 'voltdb_hostname', 'type': 'tag'}, + {'source': 'SITE_ID', 'name': 'site_id', 'type': 'tag'}, + {'source': 'PARTITION_ID', 'name': 'partition_id', 'type': 'tag'}, + {'source': 'INDEX_NAME', 'name': 'index', 'type': 'tag'}, + {'source': 'TABLE_NAME', 'name': 'table', 'type': 'tag'}, + {'source': 'INDEX_TYPE', 'name': 'index_type', 'type': 'tag'}, + {'source': 'IS_UNIQUE', 'name': 'is_unique', 'type': 'tag', 'boolean': True}, + { + 'source': 'IS_COUNTABLE', + 'name': 'is_countable', + 'type': 'tag', + 'boolean': True, + }, + {'source': 'ENTRY_COUNT', 'name': 'index.entry_count', 'type': 'gauge'}, + {'source': 'MEMORY_ESTIMATE', 'name': 'index.memory_estimate', 'type': 'gauge'}, ], } @@ -226,22 +326,33 @@ 'name': 'export', 'query': '@Statistics:[EXPORT, 1]', 'columns': [ - None, # TIMESTAMP - {'name': 'host_id', 'type': 'tag'}, - {'name': 'voltdb_hostname', 'type': 'tag'}, - {'name': 'site_id', 'type': 'tag'}, - {'name': 'partition_id', 'type': 'tag'}, - {'name': 'export_source', 'type': 'tag'}, - {'name': 'export_target', 'type': 'tag'}, - {'name': 'active', 'type': 'tag'}, - {'name': 'export.records_queued', 'type': 'monotonic_count'}, - {'name': 'export.records_pending', 'type': 'gauge'}, - {'name': 
'_source.last_queued_ms', 'type': 'source'}, - {'name': '_source.last_acked_ms', 'type': 'source'}, - {'name': 'export.latency.avg', 'type': 'gauge'}, - {'name': 'export.latency.max', 'type': 'gauge'}, - {'name': 'export.queue_gap', 'type': 'gauge'}, - {'name': 'export_status', 'type': 'tag'}, + {'source': 'HOST_ID', 'name': 'host_id', 'type': 'tag'}, + {'source': 'HOSTNAME', 'name': 'voltdb_hostname', 'type': 'tag'}, + {'source': 'SITE_ID', 'name': 'site_id', 'type': 'tag'}, + {'source': 'PARTITION_ID', 'name': 'partition_id', 'type': 'tag'}, + {'source': 'SOURCE', 'name': 'export_source', 'type': 'tag'}, + {'source': 'TARGET', 'name': 'export_target', 'type': 'tag'}, + {'source': 'ACTIVE', 'name': 'active', 'type': 'tag'}, + { + 'source': 'TUPLE_COUNT', + 'name': 'export.records_queued', + 'type': 'monotonic_count', + }, + {'source': 'TUPLE_PENDING', 'name': 'export.records_pending', 'type': 'gauge'}, + { + 'source': 'LAST_QUEUED_TIMESTAMP', + 'name': '_source.last_queued_ms', + 'type': 'source', + }, + { + 'source': 'LAST_ACKED_TIMESTAMP', + 'name': '_source.last_acked_ms', + 'type': 'source', + }, + {'source': 'AVERAGE_LATENCY', 'name': 'export.latency.avg', 'type': 'gauge'}, + {'source': 'MAX_LATENCY', 'name': 'export.latency.max', 'type': 'gauge'}, + {'source': 'QUEUE_GAP', 'name': 'export.queue_gap', 'type': 'gauge'}, + {'source': 'STATUS', 'name': 'export_status', 'type': 'tag'}, ], 'extras': [ { @@ -273,16 +384,19 @@ 'name': 'import', 'query': '@Statistics:[IMPORT, 1]', 'columns': [ - None, # TIMESTAMP - {'name': 'host_id', 'type': 'tag'}, - {'name': 'voltdb_hostname', 'type': 'tag'}, - {'name': 'site_id', 'type': 'tag'}, - {'name': 'importer_name', 'type': 'tag'}, - {'name': 'procedure_name', 'type': 'tag'}, - {'name': 'import.successes', 'type': 'monotonic_gauge'}, - {'name': 'import.failures', 'type': 'monotonic_gauge'}, - {'name': 'import.outstanding_requests', 'type': 'gauge'}, - {'name': 'import.retries', 'type': 'monotonic_gauge'}, + {'source': 
'HOST_ID', 'name': 'host_id', 'type': 'tag'}, + {'source': 'HOSTNAME', 'name': 'voltdb_hostname', 'type': 'tag'}, + {'source': 'SITE_ID', 'name': 'site_id', 'type': 'tag'}, + {'source': 'IMPORTER_NAME', 'name': 'importer_name', 'type': 'tag'}, + {'source': 'PROCEDURE_NAME', 'name': 'procedure_name', 'type': 'tag'}, + {'source': 'SUCCESSES', 'name': 'import.successes', 'type': 'monotonic_gauge'}, + {'source': 'FAILURES', 'name': 'import.failures', 'type': 'monotonic_gauge'}, + { + 'source': 'OUTSTANDING_REQUESTS', + 'name': 'import.outstanding_requests', + 'type': 'gauge', + }, + {'source': 'RETRIES', 'name': 'import.retries', 'type': 'monotonic_gauge'}, ], } @@ -292,18 +406,23 @@ 'name': 'queue', 'query': '@Statistics:[QUEUE, 1]', 'columns': [ - None, # TIMESTAMP - {'name': 'host_id', 'type': 'tag'}, - {'name': 'voltdb_hostname', 'type': 'tag'}, - {'name': 'site_id', 'type': 'tag'}, - {'name': 'queue.current_depth', 'type': 'gauge'}, + {'source': 'HOST_ID', 'name': 'host_id', 'type': 'tag'}, + {'source': 'HOSTNAME', 'name': 'voltdb_hostname', 'type': 'tag'}, + {'source': 'SITE_ID', 'name': 'site_id', 'type': 'tag'}, + {'source': 'CURRENT_DEPTH', 'name': 'queue.current_depth', 'type': 'gauge'}, # The next metric is the number of tasks that left the queue in the past 5 seconds. # We compute a rate by dividing this value by 5. 
- {'name': '_source.poll_count', 'type': 'source'}, - {'name': 'queue.avg_wait', 'type': 'gauge'}, - {'name': 'queue.max_wait', 'type': 'gauge'}, + {'source': 'POLL_COUNT', 'name': '_source.poll_count', 'type': 'source'}, + {'source': 'AVG_WAIT', 'name': 'queue.avg_wait', 'type': 'gauge'}, + {'source': 'MAX_WAIT', 'name': 'queue.max_wait', 'type': 'gauge'}, + ], + 'extras': [ + { + 'name': 'queue.poll_count_per_sec', + 'expression': '_source.poll_count / 5.0', + 'submit_type': 'gauge', + } ], - 'extras': [{'name': 'queue.poll_count_per_sec', 'expression': '_source.poll_count / 5.0', 'submit_type': 'gauge'}], } # See: https://docs.voltdb.com/UsingVoltDB/sysprocstatistics.php#sysprocstatidletime @@ -312,16 +431,15 @@ 'name': 'idletime', 'query': '@Statistics:[IDLETIME, 1]', 'columns': [ - None, # TIMESTAMP - {'name': 'host_id', 'type': 'tag'}, - {'name': 'voltdb_hostname', 'type': 'tag'}, - {'name': 'site_id', 'type': 'tag'}, - {'name': 'idletime.wait', 'type': 'monotonic_gauge'}, - {'name': 'idletime.wait.pct', 'type': 'gauge'}, - {'name': 'idletime.avg_wait', 'type': 'gauge'}, - {'name': 'idletime.min_wait', 'type': 'gauge'}, - {'name': 'idletime.max_wait', 'type': 'gauge'}, - {'name': 'idletime.stddev', 'type': 'gauge'}, + {'source': 'HOST_ID', 'name': 'host_id', 'type': 'tag'}, + {'source': 'HOSTNAME', 'name': 'voltdb_hostname', 'type': 'tag'}, + {'source': 'SITE_ID', 'name': 'site_id', 'type': 'tag'}, + {'source': 'COUNT', 'name': 'idletime.wait', 'type': 'monotonic_gauge'}, + {'source': 'PERCENT', 'name': 'idletime.wait.pct', 'type': 'gauge'}, + {'source': 'AVG', 'name': 'idletime.avg_wait', 'type': 'gauge'}, + {'source': 'MIN', 'name': 'idletime.min_wait', 'type': 'gauge'}, + {'source': 'MAX', 'name': 'idletime.max_wait', 'type': 'gauge'}, + {'source': 'STDDEV', 'name': 'idletime.stddev', 'type': 'gauge'}, ], } @@ -331,14 +449,37 @@ 'name': 'procedureoutput', 'query': '@Statistics:[PROCEDUREOUTPUT]', 'columns': [ - None, # TIMESTAMP - {'name': 'procedure', 
'type': 'tag'}, - {'name': 'procedureoutput.weighted_perc', 'type': 'gauge'}, - {'name': 'procedureoutput.invocations', 'type': 'monotonic_gauge'}, - {'name': 'procedureoutput.min_result_size', 'type': 'gauge'}, - {'name': 'procedureoutput.max_result_size', 'type': 'gauge'}, - {'name': 'procedureoutput.avg_result_size', 'type': 'gauge'}, - {'name': 'procedureoutput.total_result_size', 'type': 'gauge'}, + {'source': 'PROCEDURE', 'name': 'procedure', 'type': 'tag'}, + { + 'source': 'WEIGHTED_PERC', + 'name': 'procedureoutput.weighted_perc', + 'type': 'gauge', + }, + { + 'source': 'INVOCATIONS', + 'name': 'procedureoutput.invocations', + 'type': 'monotonic_gauge', + }, + { + 'source': 'MIN_RESULT_SIZE', + 'name': 'procedureoutput.min_result_size', + 'type': 'gauge', + }, + { + 'source': 'MAX_RESULT_SIZE', + 'name': 'procedureoutput.max_result_size', + 'type': 'gauge', + }, + { + 'source': 'AVG_RESULT_SIZE', + 'name': 'procedureoutput.avg_result_size', + 'type': 'gauge', + }, + { + 'source': 'TOTAL_RESULT_SIZE_MB', + 'name': 'procedureoutput.total_result_size', + 'type': 'gauge', + }, ], } @@ -349,14 +490,29 @@ 'name': 'procedureprofile', 'query': '@Statistics:[PROCEDUREPROFILE]', 'columns': [ - None, # TIMESTAMP - {'name': 'procedure', 'type': 'tag'}, - {'name': 'procedureprofile.weighted_perc', 'type': 'gauge'}, - {'name': 'procedureprofile.invocations', 'type': 'monotonic_gauge'}, - {'name': 'procedureprofile.avg_time', 'type': 'gauge'}, - {'name': 'procedureprofile.min_time', 'type': 'gauge'}, - {'name': 'procedureprofile.max_time', 'type': 'gauge'}, - {'name': 'procedureprofile.aborts', 'type': 'monotonic_gauge'}, - {'name': 'procedureprofile.failures', 'type': 'monotonic_gauge'}, + {'source': 'PROCEDURE', 'name': 'procedure', 'type': 'tag'}, + { + 'source': 'WEIGHTED_PERC', + 'name': 'procedureprofile.weighted_perc', + 'type': 'gauge', + }, + { + 'source': 'INVOCATIONS', + 'name': 'procedureprofile.invocations', + 'type': 'monotonic_gauge', + }, + {'source': 
'AVG', 'name': 'procedureprofile.avg_time', 'type': 'gauge'}, + {'source': 'MIN', 'name': 'procedureprofile.min_time', 'type': 'gauge'}, + {'source': 'MAX', 'name': 'procedureprofile.max_time', 'type': 'gauge'}, + { + 'source': 'ABORTS', + 'name': 'procedureprofile.aborts', + 'type': 'monotonic_gauge', + }, + { + 'source': 'FAILURES', + 'name': 'procedureprofile.failures', + 'type': 'monotonic_gauge', + }, ], } diff --git a/voltdb/datadog_checks/voltdb/types.py b/voltdb/datadog_checks/voltdb/types.py index 32c58abd7d1ab..a9bb5ecc64128 100644 --- a/voltdb/datadog_checks/voltdb/types.py +++ b/voltdb/datadog_checks/voltdb/types.py @@ -6,14 +6,18 @@ Instance = TypedDict( 'Instance', { + # Deprecated: use 'host' and 'port' instead. Kept for backwards + # compatibility with users who still pass the legacy HTTP URL. 'url': str, + 'host': str, + 'port': int, 'username': str, 'password': str, - 'password_hashed': bool, + 'use_ssl': bool, + 'ssl_config_file': str, + 'connect_timeout': float, + 'procedure_timeout': float, 'statistics_components': List[str], - 'tls_verify': bool, - 'tls_cert': str, - 'tls_ca_cert': str, 'tags': List[str], 'custom_queries': List[dict], }, diff --git a/voltdb/pyproject.toml b/voltdb/pyproject.toml index 7bcd78a7989ef..a5d007532e6b5 100644 --- a/voltdb/pyproject.toml +++ b/voltdb/pyproject.toml @@ -35,7 +35,9 @@ dynamic = [ ] [project.optional-dependencies] -deps = [] +deps = [ + "voltdbclient==14.2.0", +] [project.urls] Source = "https://github.com/DataDog/integrations-core" diff --git a/voltdb/tests/common.py b/voltdb/tests/common.py index 1aa17c6293cba..48fb2fa14530d 100644 --- a/voltdb/tests/common.py +++ b/voltdb/tests/common.py @@ -122,7 +122,14 @@ 'voltdb.table.string_data_memory', 'voltdb.table.percent_full', ], - {'host_id', 'voltdb_hostname', 'site_id', 'partition_id', 'table', 'table_type'}, + { + 'host_id', + 'voltdb_hostname', + 'site_id', + 'partition_id', + 'table', + 'table_type', + }, ), ( # INDEX @@ -160,14 +167,12 @@ 
TLS_ENABLED = is_affirmative(os.environ.get('TLS_ENABLED')) TLS_CERTS_DIR = os.path.join(HERE, 'compose', 'certs') -TLS_CERT = os.path.join(TLS_CERTS_DIR, 'client.pem') -TLS_CA_CERT = os.path.join(TLS_CERTS_DIR, 'ca.pem') -TLS_PASSWORD = 'tlspass' +# voltdbclient supports pointing `ssl_config_file` at a PEM truststore directly, +# so we reuse the same `ca.pem` the docker compose fixture already ships. +TLS_CONFIG_FILE = os.path.join(TLS_CERTS_DIR, 'ca.pem') VOLTDB_DEPLOYMENT = os.path.join(HERE, 'compose', 'deployment-tls.xml' if TLS_ENABLED else 'deployment.xml') -VOLTDB_SCHEME = 'https' if TLS_ENABLED else 'http' -VOLTDB_CLIENT_PORT = 8443 if TLS_ENABLED else 8080 -VOLTDB_URL = '{}://{}:{}'.format(VOLTDB_SCHEME, HOST, VOLTDB_CLIENT_PORT) +VOLTDB_CLIENT_PORT = 21212 SERVICE_CHECK_TAGS = ['host:{}'.format(HOST), 'port:{}'.format(VOLTDB_CLIENT_PORT)] @@ -175,8 +180,9 @@ VOLTDB_IMAGE = os.environ['VOLTDB_IMAGE'] BASE_INSTANCE = { - 'url': VOLTDB_URL, + 'host': HOST, + 'port': VOLTDB_CLIENT_PORT, 'username': 'doggo', - 'password': 'doggopass', # SHA256: e81255cee7bd2c4fbb4c8d6e9d6ba1d33a912bdfa9901dc9acfb2bd7f3e8eeb1 + 'password': 'doggopass', 'tags': ['test:voltdb'], } # type: Instance diff --git a/voltdb/tests/compose/docker-compose.yaml b/voltdb/tests/compose/docker-compose.yaml index d4334eae666b6..bdbb9b28d9569 100644 --- a/voltdb/tests/compose/docker-compose.yaml +++ b/voltdb/tests/compose/docker-compose.yaml @@ -11,7 +11,7 @@ services: - ./log4j.xml:/opt/voltdb/voltdb/log4j.xml - ${DD_LOG_1}:/var/log/voltdb.log ports: - - ${VOLTDB_CLIENT_PORT}:${VOLTDB_CLIENT_PORT} # JSON Interface: https://docs.voltdb.com/UsingVoltDB/ProgLangJson.php + - ${VOLTDB_CLIENT_PORT}:${VOLTDB_CLIENT_PORT} # Native client port: https://docs.voltdb.com/UsingVoltDB/HostConfigPortOpts.php voltdb1: image: ${VOLTDB_IMAGE} diff --git a/voltdb/tests/conftest.py b/voltdb/tests/conftest.py index 3002923cd734a..749bc578e4c00 100644 --- a/voltdb/tests/conftest.py +++ b/voltdb/tests/conftest.py @@ 
-26,9 +26,21 @@ def dd_environment(instance): schema = f.read() conditions = [ - CheckDockerLogs(compose_file, patterns=['Server completed initialization'], service='voltdb0'), - CheckDockerLogs(compose_file, patterns=['Server completed initialization'], service='voltdb1'), - CheckDockerLogs(compose_file, patterns=['Server completed initialization'], service='voltdb2'), + CheckDockerLogs( + compose_file, + patterns=['Server completed initialization'], + service='voltdb0', + ), + CheckDockerLogs( + compose_file, + patterns=['Server completed initialization'], + service='voltdb1', + ), + CheckDockerLogs( + compose_file, + patterns=['Server completed initialization'], + service='voltdb2', + ), CreateSchema(compose_file, schema, container_name='voltdb0'), EnsureExpectedMetricsShowUp(instance), ] @@ -42,13 +54,18 @@ def dd_environment(instance): if common.TLS_ENABLED: # Must refer to a path within the Agent container. instance = instance.copy() - instance['tls_cert'] = '/tmp/voltdb-certs/client.pem' - instance['tls_ca_cert'] = '/tmp/voltdb-certs/ca.pem' + instance['ssl_config_file'] = '/tmp/voltdb-certs/ca.pem' e2e_metadata = {'docker_volumes': ['{}:/tmp/voltdb-certs'.format(common.TLS_CERTS_DIR)]} else: e2e_metadata = {} - with docker_run(compose_file, conditions=conditions, env_vars=env_vars, mount_logs=True, attempts=2): + with docker_run( + compose_file, + conditions=conditions, + env_vars=env_vars, + mount_logs=True, + attempts=2, + ): yield instance, e2e_metadata @@ -68,8 +85,8 @@ def instance(): ] if common.TLS_ENABLED: - instance['tls_cert'] = common.TLS_CERT - instance['tls_ca_cert'] = common.TLS_CA_CERT + instance['use_ssl'] = True + instance['ssl_config_file'] = common.TLS_CONFIG_FILE return instance @@ -79,26 +96,246 @@ def instance_all(instance): # type: (Instance) -> Instance instance = common.BASE_INSTANCE.copy() instance['statistics_components'] = [ - "COMMANDLOG", - "CPU", - "EXPORT", - "GC", - "IDLETIME", - "IMPORT", - "INDEX", - "IOSTATS", - 
"LATENCY", - "MEMORY", - "PROCEDURE", - "PROCEDUREOUTPUT", - "PROCEDUREPROFILE", - "QUEUE", - "SNAPSHOTSTATUS", - "TABLE", + 'COMMANDLOG', + 'CPU', + 'EXPORT', + 'GC', + 'IDLETIME', + 'IMPORT', + 'INDEX', + 'IOSTATS', + 'LATENCY', + 'MEMORY', + 'PROCEDURE', + 'PROCEDUREOUTPUT', + 'PROCEDUREPROFILE', + 'QUEUE', + 'SNAPSHOTSTATUS', + 'TABLE', ] return instance + # Column headers for each `@Statistics` component, matching the positional + # layout of rows in tests/fixtures/mock_results.json. Used by the mock to expose + # `VoltTable.columns` so that the check can look up values by name. + + +MOCK_COLUMN_HEADERS = { + 'CPU': ['TIMESTAMP', 'HOST_ID', 'HOSTNAME', 'PERCENT_USED'], + 'MEMORY': [ + 'TIMESTAMP', + 'HOST_ID', + 'HOSTNAME', + 'RSS', + 'JAVAUSED', + 'JAVAUNUSED', + 'TUPLEDATA', + 'TUPLEALLOCATED', + 'INDEXMEMORY', + 'STRINGMEMORY', + 'TUPLECOUNT', + 'POOLEDMEMORY', + 'PHYSICALMEMORY', + 'JAVAMAXHEAP', + ], + 'SNAPSHOTSTATUS': [ + 'TIMESTAMP', + 'HOST_ID', + 'HOSTNAME', + 'TABLE', + 'PATH', + 'FILENAME', + 'NONCE', + 'TXNID', + 'START_TIME', + 'END_TIME', + 'SIZE', + 'DURATION', + 'THROUGHPUT', + 'RESULT', + 'TYPE', + ], + 'COMMANDLOG, 1': [ + 'TIMESTAMP', + 'HOST_ID', + 'HOSTNAME', + 'OUTSTANDING_BYTES', + 'OUTSTANDING_TXNS', + 'IN_USE_SEGMENT_COUNT', + 'SEGMENT_COUNT', + 'FSYNC_INTERVAL', + ], + 'PROCEDURE, 1': [ + 'TIMESTAMP', + 'HOST_ID', + 'HOSTNAME', + 'SITE_ID', + 'PARTITION_ID', + 'PROCEDURE', + 'INVOCATIONS', + 'TIMED_INVOCATIONS', + 'MIN_EXECUTION_TIME', + 'MAX_EXECUTION_TIME', + 'AVG_EXECUTION_TIME', + 'MIN_RESULT_SIZE', + 'MAX_RESULT_SIZE', + 'AVG_RESULT_SIZE', + 'MIN_PARAMETER_SET_SIZE', + 'MAX_PARAMETER_SET_SIZE', + 'AVG_PARAMETER_SET_SIZE', + 'ABORTS', + 'FAILURES', + 'TRANSACTIONAL', + ], + 'LATENCY': [ + 'TIMESTAMP', + 'HOST_ID', + 'HOSTNAME', + 'INTERVAL', + 'COUNT', + 'TPS', + 'P50', + 'P95', + 'P99', + 'P99.9', + 'P99.99', + 'P99.999', + 'MAX', + ], + 'GC, 1': [ + 'TIMESTAMP', + 'HOST_ID', + 'HOSTNAME', + 'NEWGEN_GC_COUNT', + 
'NEWGEN_AVG_GC_TIME', + 'OLDGEN_GC_COUNT', + 'OLDGEN_AVG_GC_TIME', + ], + 'IOSTATS, 1': [ + 'TIMESTAMP', + 'HOST_ID', + 'HOSTNAME', + 'CONNECTION_ID', + 'CONNECTION_HOSTNAME', + 'BYTES_READ', + 'MESSAGES_READ', + 'BYTES_WRITTEN', + 'MESSAGES_WRITTEN', + ], + 'TABLE, 1': [ + 'TIMESTAMP', + 'HOST_ID', + 'HOSTNAME', + 'SITE_ID', + 'PARTITION_ID', + 'TABLE_NAME', + 'TABLE_TYPE', + 'TUPLE_COUNT', + 'TUPLE_ALLOCATED_MEMORY', + 'TUPLE_DATA_MEMORY', + 'STRING_DATA_MEMORY', + 'TUPLE_LIMIT', + 'PERCENT_FULL', + ], + 'INDEX, 1': [ + 'TIMESTAMP', + 'HOST_ID', + 'HOSTNAME', + 'SITE_ID', + 'PARTITION_ID', + 'INDEX_NAME', + 'TABLE_NAME', + 'INDEX_TYPE', + 'IS_UNIQUE', + 'IS_COUNTABLE', + 'ENTRY_COUNT', + 'MEMORY_ESTIMATE', + ], + 'EXPORT, 1': [ + 'TIMESTAMP', + 'HOST_ID', + 'HOSTNAME', + 'SITE_ID', + 'PARTITION_ID', + 'SOURCE', + 'TARGET', + 'ACTIVE', + 'TUPLE_COUNT', + 'TUPLE_PENDING', + 'LAST_QUEUED_TIMESTAMP', + 'LAST_ACKED_TIMESTAMP', + 'AVERAGE_LATENCY', + 'MAX_LATENCY', + 'QUEUE_GAP', + 'STATUS', + ], + 'IMPORT, 1': [ + 'TIMESTAMP', + 'HOST_ID', + 'HOSTNAME', + 'SITE_ID', + 'IMPORTER_NAME', + 'PROCEDURE_NAME', + 'SUCCESSES', + 'FAILURES', + 'OUTSTANDING_REQUESTS', + 'RETRIES', + ], + 'QUEUE, 1': [ + 'TIMESTAMP', + 'HOST_ID', + 'HOSTNAME', + 'SITE_ID', + 'CURRENT_DEPTH', + 'POLL_COUNT', + 'AVG_WAIT', + 'MAX_WAIT', + ], + 'IDLETIME, 1': [ + 'TIMESTAMP', + 'HOST_ID', + 'HOSTNAME', + 'SITE_ID', + 'COUNT', + 'PERCENT', + 'AVG', + 'MIN', + 'MAX', + 'STDDEV', + ], + 'PROCEDUREOUTPUT': [ + 'TIMESTAMP', + 'PROCEDURE', + 'WEIGHTED_PERC', + 'INVOCATIONS', + 'MIN_RESULT_SIZE', + 'MAX_RESULT_SIZE', + 'AVG_RESULT_SIZE', + 'TOTAL_RESULT_SIZE_MB', + ], + 'PROCEDUREPROFILE': [ + 'TIMESTAMP', + 'PROCEDURE', + 'WEIGHTED_PERC', + 'INVOCATIONS', + 'AVG', + 'MIN', + 'MAX', + 'ABORTS', + 'FAILURES', + ], +} + + +def _mock_columns(header_names): + columns = [] + for name in header_names: + col = mock.MagicMock() + col.name = name + columns.append(col) + return columns + 
@pytest.fixture(scope='session') def mock_results(): @@ -106,22 +343,35 @@ def mock_results(): with open(os.path.join(common.HERE, 'fixtures', 'mock_results.json'), 'r') as f: mocked_data = json.load(f) - def mocked_response(data): - m = mock.MagicMock() - m.json = lambda: {"results": [{"data": data}]} - return m + def mocked_response(rows, header_names): + table = mock.MagicMock() + table.tuples = rows + table.columns = _mock_columns(header_names) + resp = mock.MagicMock() + resp.status = 1 # Client.SUCCESS + resp.statusString = None + resp.tables = [table] + return resp - def mocked_request(procedure, parameters=None): - if procedure == '@SystemInformation' and parameters == ['OVERVIEW']: - return mocked_response([["host-0", "VERSION", "8.4"]]) + def mocked_call_procedure(procedure, params=None): + params = params or [] + if procedure == '@SystemInformation' and params == ['OVERVIEW']: + return mocked_response([['host-0', 'VERSION', '8.4']], ['HOST_ID', 'KEY', 'VALUE']) if procedure != '@Statistics': - raise Exception("Bad procedure name") - parameters = parameters.strip('[').strip(']') - if parameters not in mocked_data: - raise Exception("Invalid parameter %s" % parameters) - - return mocked_response(mocked_data[parameters]) + raise Exception('Bad procedure name: %s' % procedure) + # @Statistics params look like ['CPU'] or ['COMMANDLOG', 1]. 
+ if len(params) == 1: + key = params[0] + else: + key = '{}, {}'.format(params[0], params[1]) + if key not in mocked_data: + raise Exception('Invalid parameter %s' % key) + return mocked_response(mocked_data[key], MOCK_COLUMN_HEADERS[key]) with mock.patch('datadog_checks.voltdb.check.Client') as m: - m.return_value.request = mocked_request + client = m.return_value + client.SUCCESS = 1 + client.call_procedure = mocked_call_procedure + client.raise_for_status = lambda r: None + client.close = lambda: None yield diff --git a/voltdb/tests/test_integration.py b/voltdb/tests/test_integration.py index 985094e4e4a59..3a88638fe5335 100644 --- a/voltdb/tests/test_integration.py +++ b/voltdb/tests/test_integration.py @@ -1,12 +1,10 @@ # (C) Datadog, Inc. 2020-present # All rights reserved # Licensed under a 3-clause BSD style license (see LICENSE) -import hashlib from typing import Callable # noqa: F401 import mock import pytest -import requests from datadog_checks.base.stubs.aggregator import AggregatorStub # noqa: F401 from datadog_checks.base.stubs.datadog_agent import DatadogAgentStub # noqa: F401 @@ -27,33 +25,19 @@ def test_check(self, aggregator, instance): assertions.assert_service_checks(aggregator, instance) assertions.assert_metrics(aggregator) - def test_password_hashed(self, aggregator, instance): - # type: (AggregatorStub, Instance) -> None - instance = instance.copy() - instance['password'] = hashlib.sha256(instance['password'].encode()).hexdigest() - instance['password_hashed'] = True - - check = VoltDBCheck('voltdb', {}, [instance]) - check.run() - - assertions.assert_service_checks(aggregator, instance) - assertions.assert_metrics(aggregator) - def test_failure_connection_refused(self, aggregator, instance): # type: (AggregatorStub, Instance) -> None instance = instance.copy() - instance['url'] = 'http://doesnotexist:8080' + instance['host'] = 'doesnotexist' # Speed up the test - instance["timeout"] = 2 + instance['connect_timeout'] = 2 check = 
VoltDBCheck('voltdb', {}, [instance]) - with pytest.raises(Exception) as ctx: + with pytest.raises(Exception): check.check(instance) - error = str(ctx.value) - assert error - tags = ['host:doesnotexist', 'port:8080'] + tags = ['host:doesnotexist', 'port:{}'.format(instance.get('port', 21212))] assertions.assert_service_checks(aggregator, instance, connect_status=VoltDBCheck.CRITICAL, tags=tags) def test_failure_unauthorized(self, aggregator, instance): @@ -63,39 +47,11 @@ def test_failure_unauthorized(self, aggregator, instance): check = VoltDBCheck('voltdb', {}, [instance]) - with pytest.raises(Exception) as ctx: + with pytest.raises(Exception): check.check(instance) - error = str(ctx.value) - assert '401 Client Error: Unauthorized' in error assertions.assert_service_checks(aggregator, instance, connect_status=VoltDBCheck.CRITICAL) - def test_http_error(self, aggregator, instance): - # type: (AggregatorStub, Instance) -> None - check = VoltDBCheck('voltdb', {}, [instance]) - - with mock.patch('requests.Session.get', side_effect=requests.RequestException('Something failed')): - error = check.run() - - assert 'Something failed' in error - - assertions.assert_service_checks(aggregator, instance, connect_status=VoltDBCheck.CRITICAL) - aggregator.assert_all_metrics_covered() # No metrics collected. - - def test_http_response_error(self, aggregator, instance): - # type: (AggregatorStub, Instance) -> None - check = VoltDBCheck('voltdb', {}, [instance]) - - resp = requests.Response() - resp.status_code = 503 - with mock.patch('requests.Session.get', return_value=resp): - error = check.run() - - assert '503 Server Error' in error - - assertions.assert_service_checks(aggregator, instance, connect_status=VoltDBCheck.CRITICAL) - aggregator.assert_all_metrics_covered() # No metrics collected. 
- def test_custom_tags(self, aggregator, instance): # type: (AggregatorStub, Instance) -> None instance = instance.copy() @@ -120,10 +76,18 @@ def __init__(self, client, app): self._client = client self._app = app - def request(self, procedure, parameters=None): + def call_procedure(self, procedure, params=None): if procedure == '@SystemInformation': return self._app() - return self._client.request(procedure, parameters=parameters) + return self._client.call_procedure(procedure, params=params) + + def raise_for_status(self, response): + # Mock responses already have status set by the test app. + if response.status != Client.SUCCESS: + self._client.raise_for_status(response) + + def close(self): + self._client.close() @pytest.mark.integration @@ -153,9 +117,17 @@ def test_default(self, instance, datadog_agent): def test_malformed(self, instance, datadog_agent): # type: (Instance, DatadogAgentStub) -> None def app(): - r = mock.MagicMock() - r.json.return_value = {'results': [{'data': [('0', 'VERSION', 'not_a_version_string')]}]} - return r + table = mock.MagicMock() + table.tuples = [('0', 'VERSION', 'not_a_version_string')] + table.columns = [ + mock.MagicMock(**{'name': 'HOST_ID'}), + mock.MagicMock(**{'name': 'KEY'}), + mock.MagicMock(**{'name': 'VALUE'}), + ] + resp = mock.MagicMock() + resp.status = Client.SUCCESS + resp.tables = [table] + return resp check_id = 'test' check = VoltDBCheck('voltdb', {}, [instance]) @@ -183,9 +155,17 @@ def app(): def test_no_version_column(self, aggregator, instance, datadog_agent): # type: (AggregatorStub, Instance, DatadogAgentStub) -> None def app(): - r = mock.MagicMock() - r.json.return_value = {'results': [{'data': [('0', 'THIS_IS_NOT_VERSION', 'test')]}]} - return r + table = mock.MagicMock() + table.tuples = [('0', 'THIS_IS_NOT_VERSION', 'test')] + table.columns = [ + mock.MagicMock(**{'name': 'HOST_ID'}), + mock.MagicMock(**{'name': 'KEY'}), + mock.MagicMock(**{'name': 'VALUE'}), + ] + resp = mock.MagicMock() + 
resp.status = Client.SUCCESS + resp.tables = [table] + return resp check_id = 'test' check = VoltDBCheck('voltdb', {}, [instance]) diff --git a/voltdb/tests/test_unit.py b/voltdb/tests/test_unit.py index 60cbdd19e3c44..1f072fdcf1b27 100644 --- a/voltdb/tests/test_unit.py +++ b/voltdb/tests/test_unit.py @@ -9,7 +9,7 @@ from datadog_checks.base import ConfigurationError from datadog_checks.dev.utils import get_metadata_metrics -from datadog_checks.voltdb.check import VoltDBCheck +from datadog_checks.voltdb.check import VoltDBCheck, _parse_query from datadog_checks.voltdb.config import Config from datadog_checks.voltdb.types import Instance # noqa: F401 @@ -19,22 +19,25 @@ @pytest.mark.parametrize( 'instance, match', [ - pytest.param({'username': 'doggo', 'password': 'doggopass'}, 'url is required', id='url-missing'), pytest.param( - {'url': 'http://:8080', 'username': 'doggo', 'password': 'doggopass'}, - 'URL must contain a host', - id='url-no-host', + {'username': 'doggo', 'password': 'doggopass'}, + "either 'host' or 'hosts' is required", + id='host-missing', ), - pytest.param({'url': 'http://:8080'}, 'username and password are required', id='creds-missing'), pytest.param( - {'url': 'http://localhost:8080', 'username': 'doggo'}, - 'username and password are required', - id='creds-username-only', + { + 'host': 'localhost', + 'port': 0, + 'username': 'doggo', + 'password': 'doggopass', + }, + 'port must be a positive integer', + id='port-invalid', ), pytest.param( - {'url': 'http://localhost:8080', 'password': 'doggopass'}, - 'username and password are required', - id='creds-password-only', + {'url': 'http://localhost:8080'}, + "'username' and 'password' are required when 'url' is set", + id='http-mode-needs-credentials', ), ], ) @@ -53,24 +56,511 @@ def test_config_errors(instance, match): ) def test_custom_tags(instance, tags): # type: (Instance, Optional[list]) -> None - instance = {'url': 'http://localhost:8000', 'username': 'doggo', 'password': 'doggopass'} + 
instance = {'host': 'localhost', 'username': 'doggo', 'password': 'doggopass'} if tags is not None: instance['tags'] = tags config = Config(instance) assert config.tags == tags +def test_default_port(): + # type: () -> None + config = Config({'host': 'localhost', 'username': 'doggo', 'password': 'doggopass'}) + assert config.netloc == ('localhost', 21212) + + +def test_custom_port(): + # type: () -> None + config = Config( + { + 'host': 'localhost', + 'port': 31212, + 'username': 'doggo', + 'password': 'doggopass', + } + ) + assert config.netloc == ('localhost', 31212) + + +def test_no_credentials(): + # type: () -> None + # Native client allows empty credentials when the cluster does not require auth. + config = Config({'host': 'localhost'}) + assert config.username == '' + assert config.password == '' + + +@pytest.mark.parametrize( + 'instance, expected', + [ + pytest.param({'host': 'localhost'}, 60, id='default'), + pytest.param({'host': 'localhost', 'procedure_timeout': 30}, 30, id='explicit'), + pytest.param({'host': 'localhost', 'procedure_timeout': 0}, None, id='zero-disables'), + pytest.param({'host': 'localhost', 'procedure_timeout': -1}, None, id='negative-disables'), + ], +) +def test_procedure_timeout_default(instance, expected): + """procedure_timeout defaults to 60s so a hung VoltDB procedure can't block + the check forever. 
Setting it to 0 (or any non-positive number) restores + the 'wait indefinitely' behavior.""" + config = Config(instance) + assert config.procedure_timeout == expected + + @pytest.mark.parametrize( - 'url, netloc', + 'url, expected_netloc', [ - pytest.param('http://localhost', ('localhost', 80), id='http'), - pytest.param('https://localhost', ('localhost', 443), id='https'), + pytest.param('http://localhost:8080', ('localhost', 8080), id='http-explicit-port'), + pytest.param('https://voltdb.example:8443', ('voltdb.example', 8443), id='https-explicit-port'), + pytest.param('http://my-cluster', ('my-cluster', 80), id='http-default-port'), + pytest.param('https://my-cluster', ('my-cluster', 443), id='https-default-port'), + ], +) +def test_url_activates_http_mode(url, expected_netloc): + """Setting `url` selects the HTTP/VMC transport. The URL's host and port are + used directly (no port-coercion to 21212 — that's the native client port).""" + from datadog_checks.voltdb.config import MODE_HTTP + + config = Config({'url': url, 'username': 'u', 'password': 'p'}) + assert config.mode == MODE_HTTP + assert config.url == url + assert config.netloc == expected_netloc + + +def test_host_without_url_uses_native_mode(): + """Setting `host` (without `url`) selects the native binary transport.""" + from datadog_checks.voltdb.config import MODE_NATIVE + + config = Config({'host': 'db-1.example', 'username': 'u', 'password': 'p'}) + assert config.mode == MODE_NATIVE + assert config.netloc == ('db-1.example', 21212) + assert config.endpoints == [('db-1.example', 21212)] + + +def test_hosts_list_expands_to_endpoints(): + """`hosts:` accepts either bare hostnames (using the global `port`) or + 'host:port' strings. 
Endpoints are tried in order.""" + config = Config( + { + 'hosts': ['db-1.example', 'db-2.example:21222', 'db-3.example'], + 'port': 21232, + 'username': 'u', + 'password': 'p', + } + ) + assert config.endpoints == [ + ('db-1.example', 21232), + ('db-2.example', 21222), + ('db-3.example', 21232), + ] + # netloc points at the first endpoint for stable tag values. + assert config.netloc == ('db-1.example', 21232) + + +def test_hosts_takes_precedence_over_host(): + """If both `host` and `hosts` are set, `hosts` wins so users can opt into + failover by just adding a `hosts:` entry.""" + config = Config({'host': 'ignored.example', 'hosts': ['db-1.example', 'db-2.example']}) + assert config.endpoints == [('db-1.example', 21212), ('db-2.example', 21212)] + + +@pytest.mark.parametrize( + 'instance, match', + [ + pytest.param( + {'hosts': ['db-1.example:abc']}, + 'has an invalid port', + id='non-numeric-port', + ), + pytest.param( + {'hosts': ['db-1.example:0']}, + 'non-positive port', + id='zero-port', + ), + pytest.param( + {'hosts': ['']}, + 'non-empty', + id='empty-entry', + ), + pytest.param( + {'hosts': 'db-1.example'}, + "'hosts' must be a list", + id='hosts-not-a-list', + ), ], ) -def test_default_port(url, netloc): - # type: (str, tuple) -> None - config = Config({'url': url, 'username': 'doggo', 'password': 'doggopass'}) - assert config.netloc == netloc +def test_hosts_validation_errors(instance, match): + with pytest.raises(ConfigurationError, match=match): + Config(instance) + + +def test_client_failover_tries_each_endpoint(monkeypatch): + """When the first endpoint refuses connection, the client tries the next one.""" + from datadog_checks.voltdb.client import Client + + attempts = [] + + class FakeFser: + def close(self): + pass + + def fake_init(host, port, **_): + attempts.append((host, port)) + if host == 'down.example': + raise ConnectionRefusedError('first node is down') + return FakeFser() + + monkeypatch.setattr(Client, '_open', lambda self, host, port: 
fake_init(host, port)) + + client = Client( + endpoints=[('down.example', 21212), ('up.example', 21212)], + ) + fser = client._get_connection() + assert isinstance(fser, FakeFser) + assert attempts == [('down.example', 21212), ('up.example', 21212)] + assert client.active_endpoint == ('up.example', 21212) + + +def test_client_raises_when_no_endpoint_is_reachable(monkeypatch): + """If every endpoint refuses connection, the client surfaces the last error.""" + from datadog_checks.voltdb.client import Client + + def always_refuse(self, host, port): + raise ConnectionRefusedError('{}:{} is down'.format(host, port)) + + monkeypatch.setattr(Client, '_open', always_refuse) + + client = Client(endpoints=[('a.example', 21212), ('b.example', 21212)]) + with pytest.raises(ConnectionRefusedError, match='b.example:21212 is down'): + client._get_connection() + assert client.active_endpoint is None + + +def test_client_requires_at_least_one_endpoint(): + from datadog_checks.voltdb.client import Client + + with pytest.raises(ValueError, match='at least one'): + Client(endpoints=[]) + + +def test_client_call_procedure_returns_response(monkeypatch): + """Happy path: the client opens a connection, hands it to VoltProcedure, + and returns the response object.""" + import mock + import voltdbclient + + from datadog_checks.voltdb.client import Client + + fake_fser = mock.MagicMock() + monkeypatch.setattr(Client, '_open', lambda self, host, port: fake_fser) + + fake_response = mock.MagicMock() + fake_proc = mock.MagicMock() + fake_proc.call.return_value = fake_response + monkeypatch.setattr(voltdbclient, 'VoltProcedure', lambda fser, name, types: fake_proc) + + client = Client(endpoints=[('h.example', 21212)]) + resp = client.call_procedure('@Statistics', ['CPU', 0]) + assert resp is fake_response + fake_proc.call.assert_called_once_with(['CPU', 0]) + + +def test_client_call_procedure_retries_once_on_stale_connection(monkeypatch): + """If a procedure call fails on an existing connection, 
the client closes, + reconnects, and retries once. Verifies the second-attempt path.""" + import mock + import voltdbclient + + from datadog_checks.voltdb.client import Client + + opens = [] + + def fake_open(self, host, port): + f = mock.MagicMock(name='fser-{}'.format(len(opens))) + opens.append(f) + return f + + monkeypatch.setattr(Client, '_open', fake_open) + + good_response = mock.MagicMock(name='good') + call_count = {'n': 0} + + def fake_proc(fser, name, types): + proc = mock.MagicMock() + + def call(params, timeout=None): + call_count['n'] += 1 + if call_count['n'] == 1: + # Pretend the first call (on the cached connection) fails mid-flight. + raise BrokenPipeError('mid-flight failure') + return good_response + + proc.call.side_effect = call + return proc + + monkeypatch.setattr(voltdbclient, 'VoltProcedure', fake_proc) + + client = Client(endpoints=[('h.example', 21212)]) + # Prime the cached connection so the next call_procedure goes through the + # 'had_connection = True' branch. + client._get_connection() + resp = client.call_procedure('@Ping') + assert resp is good_response + assert call_count['n'] == 2 # initial failure + retry success + + +def test_client_raise_for_status(): + from datadog_checks.voltdb.client import Client, VoltDBError + + client = Client(endpoints=[('h.example', 21212)]) + # Success path: should not raise. + ok_resp = type('R', (), {'status': Client.SUCCESS, 'statusString': None})() + client.raise_for_status(ok_resp) + # Failure path: VoltDBError carries the status code and string. 
+ bad_resp = type('R', (), {'status': -2, 'statusString': 'connection lost'})() + with pytest.raises(VoltDBError, match='connection lost') as exc: + client.raise_for_status(bad_resp) + assert exc.value.status == -2 + assert exc.value.status_string == 'connection lost' + + +def test_client_close_is_idempotent(monkeypatch): + """close() can run safely whether or not a connection has been opened, and + swallows exceptions from FastSerializer.close().""" + import mock + + from datadog_checks.voltdb.client import Client + + client = Client(endpoints=[('h.example', 21212)]) + client.close() # no-op when nothing is open + assert client.active_endpoint is None + + bad_fser = mock.MagicMock() + bad_fser.close.side_effect = OSError('underlying socket already dead') + monkeypatch.setattr(Client, '_open', lambda self, host, port: bad_fser) + client._get_connection() + assert client.active_endpoint == ('h.example', 21212) + client.close() # must not propagate the OSError + assert client.active_endpoint is None + + +def test_infer_volt_type_distinguishes_bool_int_float_string(): + from voltdbclient import FastSerializer + + from datadog_checks.voltdb.client import _infer_volt_type + + # bool must come before int (bool is a subclass of int in Python). 
+ assert _infer_volt_type(True) == FastSerializer.VOLTTYPE_TINYINT + assert _infer_volt_type(42) == FastSerializer.VOLTTYPE_INTEGER + assert _infer_volt_type(3.14) == FastSerializer.VOLTTYPE_FLOAT + assert _infer_volt_type('CPU') == FastSerializer.VOLTTYPE_STRING + + +def test_http_client_serializes_list_params_as_json(): + """`HttpClient.call_procedure` accepts both pre-serialized parameter strings + and Python lists; lists must be JSON-encoded the way VoltDB's HTTP/JSON + interface expects.""" + import json + + import mock + + from datadog_checks.voltdb.http_client import HttpClient + + calls = [] + + def fake_get(url, auth=None, params=None, **_): + calls.append(params) + resp = mock.MagicMock() + resp.raise_for_status = lambda: None + resp.json = lambda: {'status': 1, 'results': []} + return resp + + client = HttpClient(url='http://vmc.example:8080', http_get=fake_get, username='u', password='p') + client.call_procedure('@Statistics', ['CPU', 0]) + client.call_procedure('@Statistics', '[CPU, 0]') # passthrough string + client.call_procedure('@Ping') # no parameters + + assert calls[0] == {'Procedure': '@Statistics', 'Parameters': json.dumps(['CPU', 0])} + assert calls[1] == {'Procedure': '@Statistics', 'Parameters': '[CPU, 0]'} + assert calls[2] == {'Procedure': '@Ping'} + + +def test_http_client_raise_for_status(): + from datadog_checks.voltdb.client import VoltDBError + from datadog_checks.voltdb.http_client import HttpClient, HttpResponse + + client = HttpClient(url='http://vmc.example:8080', http_get=lambda *a, **k: None, username='u', password='p') + ok = HttpResponse({'status': 1, 'results': []}) + client.raise_for_status(ok) + + bad = HttpResponse({'status': 0, 'statusstring': 'unauthorized', 'results': []}) + with pytest.raises(VoltDBError, match='unauthorized'): + client.raise_for_status(bad) + + +def test_url_takes_precedence_over_host(): + """When both `url` and `host` are set, the HTTP transport is chosen — the + URL points at the VMC endpoint and 
+    `host` is ignored."""
+    from datadog_checks.voltdb.config import MODE_HTTP
+
+    config = Config({'host': 'db-1.example', 'url': 'http://vmc.example:8080', 'username': 'u', 'password': 'p'})
+    assert config.mode == MODE_HTTP
+    assert config.netloc == ('vmc.example', 8080)
+
+
+def test_password_hashed_only_kept_for_http():
+    """`password_hashed` is forwarded to the HTTP client; the native client
+    ignores it (handled at client-construction time in check.py)."""
+    config = Config({'url': 'http://vmc:8080', 'username': 'u', 'password': 'abc', 'password_hashed': True})
+    assert config.password_hashed is True
+
+
+def test_http_mode_end_to_end(aggregator, dd_run_check):
+    """When `url` is set, the check uses the HTTP transport and unwraps the
+    JSON response into the same `tables[].columns[].name` / `tuples` shape the
+    native code path uses."""
+    import mock
+
+    def fake_get(url, auth=None, params=None, **_):
+        proc = params['Procedure']
+        resp = mock.MagicMock()
+        resp.status_code = 200
+        resp.raise_for_status = lambda: None
+        if proc == '@SystemInformation':
+            resp.json = lambda: {
+                'status': 1,
+                'results': [
+                    {
+                        'schema': [{'name': 'HOST_ID'}, {'name': 'KEY'}, {'name': 'VALUE'}],
+                        'data': [[0, 'VERSION', '14.2']],
+                    }
+                ],
+            }
+        elif proc == '@Statistics' and '"CPU"' in params['Parameters']:
+            resp.json = lambda: {
+                'status': 1,
+                'results': [
+                    {
+                        'schema': [
+                            {'name': 'TIMESTAMP'},
+                            {'name': 'HOST_ID'},
+                            {'name': 'HOSTNAME'},
+                            {'name': 'PERCENT_USED'},
+                        ],
+                        'data': [[1234567890, 7, 'host-X', 42.5]],
+                    }
+                ],
+            }
+        else:
+            resp.json = lambda: {'status': 1, 'results': []}
+        return resp
+
+    instance = {
+        'url': 'http://vmc.example:8080',
+        'username': 'doggo',
+        'password': 'doggopass',
+        'statistics_components': ['CPU'],
+        'tags': ['live:test'],
+    }
+    with mock.patch('requests.Session.get', side_effect=fake_get):
+        check = VoltDBCheck('voltdb', {}, [instance])
+        dd_run_check(check)
+
+    aggregator.assert_metric(
+        'voltdb.cpu.percent_used',
+        value=42.5,
+        tags=['host_id:7', 'voltdb_hostname:host-X', 'live:test'],
+    )
+
+
+@pytest.mark.parametrize(
+    'query, expected_procedure, expected_params',
+    [
+        pytest.param(
+            '@SystemInformation:[OVERVIEW]',
+            '@SystemInformation',
+            ['OVERVIEW'],
+            id='single-string',
+        ),
+        pytest.param('@Statistics:[CPU]', '@Statistics', ['CPU'], id='one-string'),
+        pytest.param(
+            '@Statistics:[COMMANDLOG, 1]',
+            '@Statistics',
+            ['COMMANDLOG', 1],
+            id='string-and-int',
+        ),
+        pytest.param('HeroStats', 'HeroStats', [], id='no-params'),
+        pytest.param('Proc:[]', 'Proc', [], id='empty-list'),
+    ],
+)
+def test_parse_query(query, expected_procedure, expected_params):
+    procedure, params = _parse_query(query)
+    assert procedure == expected_procedure
+    assert params == expected_params
+
+
+def test_columns_resolved_by_name(aggregator, dd_run_check):
+    """The check looks up columns by name, so the server can return extra columns
+    in any order without breaking the integration."""
+    import mock
+
+    def _make_table(headers, rows):
+        table = mock.MagicMock()
+        table.tuples = rows
+        cols = []
+        for n in headers:
+            c = mock.MagicMock()
+            c.name = n
+            cols.append(c)
+        table.columns = cols
+        return table
+
+    def _make_response(table):
+        r = mock.MagicMock()
+        r.status = 1
+        r.statusString = None
+        r.tables = [table]
+        return r
+
+    def fake_call(procedure, params=None):
+        params = params or []
+        if procedure == '@SystemInformation':
+            return _make_response(_make_table(['HOST_ID', 'KEY', 'VALUE'], [(0, 'VERSION', '14.2')]))
+        # @Statistics CPU response with columns shuffled and an extra column.
+        if procedure == '@Statistics' and params and params[0] == 'CPU':
+            headers = [
+                'EXTRA_NEW_COL',
+                'PERCENT_USED',
+                'TIMESTAMP',
+                'HOSTNAME',
+                'HOST_ID',
+            ]
+            rows = [(999, 42.5, 1234567890, 'voltdb-host-X', 7)]
+            return _make_response(_make_table(headers, rows))
+        # Other statistics: missing entirely.
+        return _make_response(_make_table([], []))
+
+    with mock.patch('datadog_checks.voltdb.check.Client') as m:
+        client = m.return_value
+        client.SUCCESS = 1
+        client.call_procedure = fake_call
+        client.raise_for_status = lambda r: None
+        client.close = lambda: None
+
+        instance = {
+            'host': 'localhost',
+            'port': 21212,
+            'statistics_components': ['CPU'],
+            'tags': ['live:test'],
+        }
+        check = VoltDBCheck('voltdb', {}, [instance])
+        dd_run_check(check)
+
+    aggregator.assert_metric(
+        'voltdb.cpu.percent_used',
+        value=42.5,
+        tags=['host_id:7', 'voltdb_hostname:voltdb-host-X', 'live:test'],
+    )


 def test_metrics_with_fixtures(mock_results, aggregator, dd_run_check, instance_all):
@@ -83,7 +573,7 @@ def test_metrics_with_fixtures(mock_results, aggregator, dd_run_check, instance_
     for m in metrics:
         aggregator.assert_metric(m['name'], tags=m['tags'], metric_type=m['type'])

-    # Ensure we're mapping the response correctly
+    # Ensure we're mapping the response correctly
     aggregator.assert_metric('voltdb.memory.tuple_count', value=2847267.0)
     aggregator.assert_metric('voltdb.memory.java.max_heap', value=531998.0)
diff --git a/voltdb/tests/utils.py b/voltdb/tests/utils.py
index 93443a6ba79b7..367d66dc6f61b 100644
--- a/voltdb/tests/utils.py
+++ b/voltdb/tests/utils.py
@@ -4,7 +4,6 @@ from subprocess import PIPE, STDOUT, Popen
 from datadog_checks.base.utils.common import ensure_bytes
-from datadog_checks.base.utils.http import RequestsWrapper
 from datadog_checks.dev.errors import SubprocessError
 from datadog_checks.dev.structures import LazyFunction
 from datadog_checks.voltdb.client import Client
@@ -56,27 +55,30 @@ class EnsureExpectedMetricsShowUp(LazyFunction):
     def __init__(self, instance):
         # type: (Instance) -> None
-        http = RequestsWrapper(instance, {})
-        self._client = Client(url=instance['url'], http_get=http.get, username='admin', password='admin')
+        self._client = Client(
+            endpoints=[(instance['host'], instance.get('port', 21212))],
+            username='admin',
+            password='admin',
+            use_ssl=instance.get('use_ssl', False),
+            ssl_config_file=instance.get('ssl_config_file'),
+        )
     def __call__(self):
         # type: () -> None
-        # Call procedures to make PROCEDURE and PROCEDUREDETAIL metrics show up...
-        # Built-in procedure.
-        r = self._client.request('Hero.insert', parameters=[0, 'Bits'])
-        assert r.status_code == 200
-        assert r.json()["status"] == 1
-        # Custom procedure.
-        r = self._client.request('LookUpHero', parameters=[0])
-        assert r.status_code == 200
-        data = r.json()
-        assert data["status"] == 1
-        rows = data["results"][0]["data"]
-        assert rows == [[0, "Bits"]]
+        try:
+            # Call procedures to make PROCEDURE and PROCEDUREDETAIL metrics show up...
+            r = self._client.call_procedure('Hero.insert', [0, 'Bits'])
+            self._client.raise_for_status(r)

-        # Create a snapshot to make SNAPSHOTSTATUS metrics appear.
-        # See: https://docs.voltdb.com/UsingVoltDB/sysprocsave.php
-        block_transactions = 0  # We don't really care, but this is required.
-        r = self._client.request('@SnapshotSave', parameters=['/tmp/voltdb/backup/', 'heroes', block_transactions])
-        assert r.status_code == 200
-        assert r.json()["status"] == 1
+            r = self._client.call_procedure('LookUpHero', [0])
+            self._client.raise_for_status(r)
+            rows = r.tables[0].tuples
+            assert rows == [[0, 'Bits']]
+
+            # Create a snapshot to make SNAPSHOTSTATUS metrics appear.
+            # See: https://docs.voltdb.com/UsingVoltDB/sysprocsave.php
+            block_transactions = 0  # We don't really care, but this is required.
+            r = self._client.call_procedure('@SnapshotSave', ['/tmp/voltdb/backup/', 'heroes', block_transactions])
+            self._client.raise_for_status(r)
+        finally:
+            self._client.close()
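
The `test_parse_query` cases in this diff pin down the query-string format: a procedure name, optionally followed by `:` and a bracketed, comma-separated parameter list in which integer-looking tokens become `int`. The following is a minimal sketch of a parser that satisfies those five cases. The name `parse_query_sketch` is hypothetical and the body is an assumption made for illustration; the PR's actual `_parse_query` is not shown in this excerpt and may be implemented differently.

```python
def parse_query_sketch(query):
    """Split 'Proc:[a, 1]' into ('Proc', ['a', 1]).

    Hypothetical helper, written only to mirror the parametrized test cases;
    it is not the integration's real `_parse_query`.
    """
    # Everything before the first ':' is the procedure name.
    procedure, sep, raw = query.partition(':')
    if not sep:
        # No parameter list at all, e.g. 'HeroStats'.
        return procedure, []

    inner = raw.strip()
    if inner.startswith('[') and inner.endswith(']'):
        inner = inner[1:-1]

    params = []
    for token in inner.split(','):
        token = token.strip()
        if not token:
            # Covers the empty list 'Proc:[]'.
            continue
        try:
            # Assumption: tokens that parse as integers are sent as ints.
            params.append(int(token))
        except ValueError:
            params.append(token)
    return procedure, params


print(parse_query_sketch('@Statistics:[COMMANDLOG, 1]'))
```

Under these assumptions, string parameters cannot themselves contain commas or colons; a real parser would need quoting rules if such values have to be supported.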