fix: containerd config compatible with both 1.x and 2.x CRI plugins #5999

Open
kriscoleman wants to merge 1 commit into main from fix/containerd-config-cri-v1-compat
@kriscoleman kriscoleman commented May 5, 2026

Summary

Fixes CRI v1 RuntimeService registration failure during kubeadm preflight on release v2026.05.05-0:

validate CRI v1 runtime API for endpoint "unix:///run/containerd/containerd.sock":
rpc error: code = Unimplemented desc = unknown service runtime.v1.RuntimeService

Root cause

containerd_configure() used a delete-and-append approach for the SystemdCgroup runc option:

  1. sed -i '/containerd.runtimes.runc.options/d' — deleted the [...runc.options] TOML section header
  2. Appended a new section using the containerd 1.x plugin name (io.containerd.grpc.v1.cri)

This caused two problems:

| Problem | Impact |
| --- | --- |
| Deleting the section header orphaned key-value pairs (SystemdCgroup = false, BinaryName, etc.) under the wrong TOML section | Config corruption: orphaned keys end up under [...runtimes.runc] instead of [...runtimes.runc.options] |
| Appended section uses the io.containerd.grpc.v1.cri plugin name | Containerd 2.x (AL2023, Ubuntu 24.04) uses io.containerd.cri.v1.runtime; the 1.x name is ignored |
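
To make the two problems concrete, here is a minimal sketch that replays the old delete-and-append steps against a scratch file. The sample config is a hypothetical, heavily trimmed stand-in for a containerd 2.x config.toml, not the real generated file, and the snippet assumes GNU sed for `sed -i`.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical, trimmed stand-in for a containerd 2.x config.toml.
cfg="$(mktemp)"
cat > "$cfg" <<'EOF'
version = 3
[plugins.'io.containerd.cri.v1.runtime'.containerd.runtimes.runc]
  runtime_type = 'io.containerd.runc.v2'
  [plugins.'io.containerd.cri.v1.runtime'.containerd.runtimes.runc.options]
    BinaryName = ''
    SystemdCgroup = false
EOF

# Step 1 of the old approach: delete the runc.options section header.
sed -i '/containerd.runtimes.runc.options/d' "$cfg"

# Step 2: append a replacement section under the containerd 1.x plugin name.
cat >> "$cfg" <<'EOF'
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
  SystemdCgroup = true
EOF

# BinaryName and SystemdCgroup = false now sit directly under [...runtimes.runc],
# and the only runc.options header left uses the 1.x plugin name.
total_entries="$(grep -c 'SystemdCgroup' "$cfg")"
echo "SystemdCgroup entries: $total_entries"
cat "$cfg"
```

In this sketch the file ends up with two SystemdCgroup entries, which lines up with the "2 (false orphaned, true appended)" row in the comparison further down.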

Additionally, config_path was set to /etc/containerd/certs.d (added in #5945) but the directory was never created before containerd restarted.

Fix

  • Replace delete-and-append with in-place substitution: sed 's/SystemdCgroup = false/SystemdCgroup = true/' modifies the value where it already exists, preserving the TOML structure for both containerd 1.x and 2.x
  • Create /etc/containerd/certs.d before containerd restarts so the config_path directory exists
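
A rough sketch of the two fix steps, run against scratch files rather than the real /etc/containerd paths. The helper name `fix_systemd_cgroup` and the trimmed 1.x/2.x samples are illustrative assumptions, not the code in install.sh; GNU sed is assumed for `sed -i`.

```shell
#!/usr/bin/env bash
set -euo pipefail

# fix_systemd_cgroup is a hypothetical helper name, not the function in install.sh.
fix_systemd_cgroup() {
  local config="$1"
  # Flip the value where it already lives; section headers are left untouched.
  sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' "$config"
}

workdir="$(mktemp -d)"

# Trimmed 1.x-style and 2.x-style samples; only the plugin name differs.
cat > "$workdir/v1.toml" <<'EOF'
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
  SystemdCgroup = false
EOF
cat > "$workdir/v2.toml" <<'EOF'
[plugins.'io.containerd.cri.v1.runtime'.containerd.runtimes.runc.options]
  SystemdCgroup = false
EOF

for f in "$workdir/v1.toml" "$workdir/v2.toml"; do
  fix_systemd_cgroup "$f"
done

# Stand-in for /etc/containerd/certs.d so the sketch runs without root.
certs_dir="$workdir/etc/containerd/certs.d"
mkdir -p "$certs_dir"

cat "$workdir/v1.toml" "$workdir/v2.toml"
```

Because the substitution never touches the section header, the same command applies whichever plugin name wraps runc.options.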

Files changed

  • addons/containerd/template/base/install.sh — template (source of truth)
  • addons/containerd/1.7.{25-29}/install.sh — active 1.7.x versions
  • addons/containerd/1.6.{28,31,32,33}/install.sh — active 1.6.x versions

CMX Validation

Tested on two Ubuntu 24.04 CMX VMs (containerd 2.2.1 from OS repos) with kubeadm v1.34.3.

Side-by-side comparison

| Metric | BROKEN (main) | FIXED (this branch) |
| --- | --- | --- |
| runc.options section | Wrong plugin name (grpc.v1.cri, appended at line 273) | Correct 2.x plugin name (cri.v1.runtime, preserved at line 100) |
| SystemdCgroup entries | 2 (false orphaned @108, true @274) | 1 (true @109, in-place) |
| TOML warnings in containerd log | 13 (4× "Ignoring unknown key in TOML for plugin") | 0 |
| /etc/containerd/certs.d | Missing | Exists |
| CRI v1 API | Reachable (containerd 2.2.1 tolerates corruption) | Reachable |
| kubeadm preflight CRI check | Passes on Ubuntu 24.04 | Passes on Ubuntu 24.04 |

BROKEN config warnings (containerd 2.2.1 log)

level=warning msg="Ignoring unknown key in TOML for plugin" key=containerd plugin=io.containerd.grpc.v1.cri
level=warning msg="Ignoring unknown key in TOML for plugin" key="containerd runtimes" plugin=io.containerd.grpc.v1.cri
level=warning msg="Ignoring unknown key in TOML for plugin" key="containerd runtimes runc" plugin=io.containerd.grpc.v1.cri
level=warning msg="Ignoring unknown key in TOML for plugin" key="containerd runtimes runc options" plugin=io.containerd.grpc.v1.cri

FIXED config log

(clean — no TOML warnings)
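
For anyone repeating the log comparison, a small sketch of how the "Ignoring unknown key" lines could be counted from a saved log. The scratch file reuses the four warning lines quoted above; on a live host the same grep could presumably be fed from `journalctl -u containerd` instead, which is an assumption about how the logs were collected.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Scratch log holding the four warnings quoted in this PR.
log="$(mktemp)"
cat > "$log" <<'EOF'
level=warning msg="Ignoring unknown key in TOML for plugin" key=containerd plugin=io.containerd.grpc.v1.cri
level=warning msg="Ignoring unknown key in TOML for plugin" key="containerd runtimes" plugin=io.containerd.grpc.v1.cri
level=warning msg="Ignoring unknown key in TOML for plugin" key="containerd runtimes runc" plugin=io.containerd.grpc.v1.cri
level=warning msg="Ignoring unknown key in TOML for plugin" key="containerd runtimes runc options" plugin=io.containerd.grpc.v1.cri
EOF

# grep -c exits non-zero when nothing matches, so "|| true" keeps set -e happy
# in the clean (zero warnings) case.
unknown_key_warnings="$(grep -c 'Ignoring unknown key in TOML for plugin' "$log" || true)"
echo "unknown-key TOML warnings: $unknown_key_warnings"
```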

Note on reproduction

The exact CRI Unimplemented error could not be reproduced on Ubuntu 24.04 — containerd 2.2.1 tolerates the config corruption (it logs warnings but still initializes the CRI plugin). The original failure in v2026.05.05-0 likely occurs on Amazon Linux 2023 or another OS with a containerd 2.x build that is stricter about config validation. CMX currently only supports Ubuntu VMs, so AL2023 testing requires the full testgrid.

Test plan

  • Testgrid: K8s 1.34.x + containerd latest on Amazon Linux 2023
  • Testgrid: K8s 1.34.x + containerd latest on Ubuntu 24.04
  • Testgrid: K8s 1.34.x + containerd latest on Ubuntu 22.04
  • Testgrid: K8s 1.34.x + containerd latest on Rocky 9
  • Verify SystemdCgroup = true is set correctly in /etc/containerd/config.toml post-install
  • Verify /etc/containerd/certs.d directory exists post-install
  • Verify zero TOML warnings in containerd journal on containerd 2.x systems
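
The config and directory checks in the plan could be scripted along these lines. This is only a sketch: `verify_containerd_install` is a hypothetical helper, and the optional root prefix is an assumption added so the checks can be exercised against a scratch directory instead of a real host.

```shell
#!/usr/bin/env bash
set -euo pipefail

# verify_containerd_install is a hypothetical helper. "$1" is an optional root
# prefix; with no argument it checks the real /etc/containerd paths.
verify_containerd_install() {
  local root="${1:-}"
  local config="$root/etc/containerd/config.toml"
  local certs="$root/etc/containerd/certs.d"

  grep -q 'SystemdCgroup = true' "$config" \
    || { echo "SystemdCgroup = true not found in $config"; return 1; }
  if grep -q 'SystemdCgroup = false' "$config"; then
    echo "stale SystemdCgroup = false still present in $config"
    return 1
  fi
  [ -d "$certs" ] || { echo "$certs is missing"; return 1; }
  echo "ok"
}

# Exercise the helper against a scratch root that mimics a fixed host.
root="$(mktemp -d)"
mkdir -p "$root/etc/containerd/certs.d"
printf '%s\n' \
  "[plugins.'io.containerd.cri.v1.runtime'.containerd.runtimes.runc.options]" \
  "  SystemdCgroup = true" > "$root/etc/containerd/config.toml"

result="$(verify_containerd_install "$root")"
echo "$result"
```

On an installed node the helper would be called with no argument; the journal check for TOML warnings is left out here because it depends on how logs are collected in testgrid.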

The previous containerd_configure() approach deleted the
[...runtimes.runc.options] TOML section header and appended a
replacement using the containerd 1.x plugin name
(io.containerd.grpc.v1.cri). This caused two problems:

1. Deleting the section header orphaned key-value pairs (like
   SystemdCgroup) under the wrong TOML section, corrupting the config.

2. The appended section used the 1.x plugin name which containerd 2.x
   (shipped by AL2023, Ubuntu 24.04) does not recognize, so the CRI
   runtime plugin never received the SystemdCgroup=true setting.

Replace the delete-and-append approach with an in-place sed substitution
(SystemdCgroup = false → true) that works regardless of which CRI plugin
name wraps the runc.options section. Also create /etc/containerd/certs.d
before restarting containerd so the config_path directory exists.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
github-actions Bot commented May 5, 2026
@kriscoleman kriscoleman marked this pull request as ready for review May 6, 2026 12:31
@kriscoleman kriscoleman requested a review from a team as a code owner May 6, 2026 12:31