HDDS-14327. [Website v2][Docs][Administrator Guide] Reorganize Decommissioning and Maintenance Modes for Datanodes #241

Merged
sarvekshayr merged 2 commits into apache:HDDS-9225-website-v2 from sarvekshayr:HDDS-14327
Jan 21, 2026

@sarvekshayr
Contributor

What changes were proposed in this pull request?

Following #181 (comment), this PR adds an overview page that introduces decommissioning and maintenance modes, and moves the details of each topic into its own sub-page.

What is the link to the Apache Jira?

HDDS-14327

How was this patch tested?

[Image: Screenshot 2026-01-13 at 4.36.22 PM]

Check off which of the following tests were done on this change. If additional testing was done, please elaborate here as well.

  • The CI checks on my fork are passing
  • I verified the rendered content using a local preview
  • I manually verified the steps provided in this change work as described

…issioning and Maintenance Modes for Datanodes

HDDS-14327. [Website v2][Docs][Administrator Guide] Reorganize Decommissioning and Maintenance Modes for Datanodes

lint error
Contributor

@Gargi-jais11 Gargi-jais11 left a comment

Thanks @sarvekshayr for the patch.
Just some minor comments.

@Gargi-jais11
Contributor

@sarvekshayr
The NodeDecommissionMetrics class tracks both decommissioning and maintenance workflows. The documentation currently lists only 2 metrics, but the class exposes more that are useful for monitoring decommissioning and maintenance mode.
Could you please double-check that class and verify whether we need to include more of its metrics in the documentation?

@Gargi-jais11
Contributor

Gargi-jais11 commented Jan 14, 2026

Below are the metrics I am talking about. Please double-check whether these need to be documented as well.
NodeDecommissionMetrics (entire section missing)
This metrics class is not documented. It tracks decommissioning and maintenance workflows.
Aggregate metrics (cluster-wide):

  • DecommissioningMaintenanceNodesTotal (node_decommission_metrics_decommissioning_maintenance_nodes_total): Number of nodes tracked for decommissioning and maintenance
  • RecommissionNodesTotal (node_decommission_metrics_recommission_nodes_total): Number of nodes tracked for recommissioning
  • PipelinesWaitingToCloseTotal (node_decommission_metrics_pipelines_waiting_to_close_total): Number of nodes tracked with pipelines waiting to close
  • ContainersUnderReplicatedTotal (node_decommission_metrics_containers_under_replicated_total): Number of under-replicated containers in tracked nodes
  • ContainersUnClosedTotal (node_decommission_metrics_containers_un_closed_total): Number of containers not fully closed in tracked nodes
  • ContainersSufficientlyReplicatedTotal (node_decommission_metrics_containers_sufficiently_replicated_total): Number of containers sufficiently replicated in tracked nodes

Per-host metrics (tagged by datanode hostname):

  • UnderReplicatedDN (node_decommission_metrics_under_replicated_dn): Number of under-replicated containers for the specific host
  • PipelinesWaitingToCloseDN (node_decommission_metrics_pipelines_waiting_to_close_dn): Number of pipelines waiting to close for the specific host
  • SufficientlyReplicatedDN (node_decommission_metrics_sufficiently_replicated_dn): Number of sufficiently replicated containers for the specific host
  • UnclosedContainersDN (node_decommission_metrics_unclosed_containers_dn): Number of containers not fully closed for the specific host
  • StartTimeDN (node_decommission_metrics_start_time_dn): Timestamp when decommissioning was started for the specific host
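
For anyone cross-checking the two names given for each metric above: the names in parentheses appear to be the snake_case form of the class name followed by the snake_case form of the metric name. Here is a minimal sketch of that mapping, inferred only from the entries in this list and not taken from the Ozone code. The helper name `prometheus_name` is hypothetical.

```python
import re


def prometheus_name(metric: str, source: str = "NodeDecommissionMetrics") -> str:
    """Derive the snake_case name shown in parentheses in the list above.

    Assumption: the rule is inferred from the listed examples only.
    An underscore goes wherever a lowercase letter or digit is followed
    by an uppercase letter, so runs of capitals such as "DN" stay together.
    """
    def snake(text: str) -> str:
        return re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", text).lower()

    return f"{snake(source)}_{snake(metric)}"


if __name__ == "__main__":
    for name in ("ContainersUnClosedTotal", "UnderReplicatedDN", "StartTimeDN"):
        print(name, "->", prometheus_name(name))
```

This can help when checking that a documented name matches the one in the list, but the metrics output of a running SCM remains the authoritative source.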

@jojochuang
Contributor

That looks like a sizeable update. Can we merge it as is? I can open a follow-up Jira to address those issues.

@Gargi-jais11
Contributor

> That looks like a sizeable update. Can we merge it as is? I can open a follow-up Jira to address those issues.

That makes sense

@sarvekshayr
Contributor Author

Sorry for the delay, I've addressed the comments.

@sarvekshayr sarvekshayr requested review from Gargi-jais11 and jojochuang and removed request for Gargi-jais11 January 19, 2026 09:08
@sarvekshayr sarvekshayr merged commit 43c5b39 into apache:HDDS-9225-website-v2 Jan 21, 2026
11 checks passed
@sarvekshayr
Contributor Author

Thanks @jojochuang and @Gargi-jais11 for the review.
