
Silently ignore container update failure when the container is already gone#107

Merged
enp0s3 merged 6 commits into openshift-virtualization:main from enp0s3:fix-oci-race on May 12, 2026

Conversation

Member

@enp0s3 enp0s3 commented May 12, 2026

Signed-off-by: Igor Bezukh ibezukh@redhat.com

As described in #105, running the poststart hook while systemd manages the cgroup is problematic.
In this PR we assume that a failed update against a container that no longer exists is a reproduction
of that issue, and we silently ignore it. In every other case the error is bubbled up to the runtime.

In addition, to prevent regressions, tooling and tests were added to cover the OCI hook rendering
and OCI hook scripting use cases.

enp0s3 added 6 commits May 11, 2026 22:21
Signed-off-by: Igor Bezukh <ibezukh@redhat.com>
the tool will help with automated testing
of the generated OCI hook script.

Signed-off-by: Igor Bezukh <ibezukh@redhat.com>
Signed-off-by: Igor Bezukh <ibezukh@redhat.com>
now it is possible to validate the OCI hook script
by running "make lint-hook-script".

ShellCheck should be installed on the target
execution environment.

Signed-off-by: Igor Bezukh <ibezukh@redhat.com>
when the poststart hook is used together with the systemd cgroup
driver, there can be a race condition between the hook run
and the systemd cgroup cleanup. It can happen with short-lived
containers, such as containers that copy small files or run a single
quick one-shot command.

the proposed solution is to ignore the update failure in case
the container PID doesn't exist or if it's a zombie. If the failure
happens on a running container, the error will be returned.

in addition, the script was refactored for the sake of simplicity:
we prefer to avoid multiple nested if conditions since they are hard
for the reviewer to follow.

Signed-off-by: Igor Bezukh <ibezukh@redhat.com>
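
For readers following the commit above, here is a minimal bash sketch of the "ignore only if the PID is gone or a zombie" rule. It is not the hook script from this PR: the function names `container_is_gone` and `update_or_ignore` and the `PROC_ROOT` variable are hypothetical, chosen for illustration. The only outside fact it relies on is the documented `/proc/<pid>/stat` layout (`pid (comm) state ...`).

```shell
#!/usr/bin/env bash
# Hypothetical sketch; names are illustrative and not taken from the PR.

# PROC_ROOT (an assumption) lets a test point the check at a fake /proc tree.
PROC_ROOT="${PROC_ROOT:-/proc}"

# Returns 0 if the PID no longer exists or is a zombie, 1 otherwise.
container_is_gone() {
    local pid="$1"
    local stat_file="${PROC_ROOT}/${pid}/stat"
    local stat state

    # No stat entry: the process has already exited and been reaped.
    [[ -r "${stat_file}" ]] || return 0

    # The process can still vanish between the check and the read.
    stat="$(cat "${stat_file}" 2>/dev/null)" || return 0

    # The line is "pid (comm) state ..."; comm may contain spaces or
    # parentheses, so drop everything up to the last ") " before
    # taking the first remaining field.
    state="${stat##*) }"
    state="${state%% *}"

    [[ "${state}" == "Z" ]]
}

# Runs the given update command for the container with the given PID.
# A failure is swallowed only when the container is gone; any other
# failure is returned to the caller unchanged.
update_or_ignore() {
    local pid="$1"
    shift
    local rc=0
    "$@" || rc=$?
    if (( rc != 0 )) && container_is_gone "${pid}"; then
        return 0
    fi
    return "${rc}"
}
```

The early returns keep the check flat, in the same spirit as the refactoring described in the commit message. The update command itself is passed in by the caller, so the sketch makes no assumption about how the real hook invokes the runtime.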
the tests are run by a bash script.
fake components such as /proc, crun and container data
are located at OCI-hook/testData and documented in the
readme file.

tests will be triggered by `make test`

Signed-off-by: Igor Bezukh <ibezukh@redhat.com>
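
As a rough illustration of testing with fake components, the sketch below puts a stub `crun` on `PATH` and builds a throwaway `/proc` tree in a temporary directory. Everything in it is an assumption made for the example: the `FAKE_CRUN_RC` and `FAKE_CRUN_LOG` variables, the `hook_under_test` stand-in, and the `demo-container` argument are hypothetical and do not reflect the layout of OCI-hook/testData or the real hook script.

```shell
#!/usr/bin/env bash
# Hypothetical harness: a fake crun on PATH plus a fake /proc tree.
set -u

workdir="$(mktemp -d)"
trap 'rm -rf "${workdir}"' EXIT

# Fake crun: logs its arguments and exits with FAKE_CRUN_RC (default 0).
mkdir -p "${workdir}/bin" "${workdir}/proc/100"
cat > "${workdir}/bin/crun" <<'EOF'
#!/usr/bin/env bash
echo "$*" >> "${FAKE_CRUN_LOG}"
exit "${FAKE_CRUN_RC:-0}"
EOF
chmod +x "${workdir}/bin/crun"
export PATH="${workdir}/bin:${PATH}"
export FAKE_CRUN_LOG="${workdir}/crun.log"

# Fake /proc: PID 100 is "running"; PID 200 has no entry, so it is gone.
echo "100 (sleep) S 1 100 100 0 -1" > "${workdir}/proc/100/stat"

# Stand-in for the hook under test (an assumption, not the PR's script):
# fail only if the update fails while the PID still has a /proc entry.
# The crun arguments are placeholders; the fake ignores them.
hook_under_test() {
    local pid="$1"
    crun update demo-container && return 0
    [[ -e "${workdir}/proc/${pid}/stat" ]] && return 1
    return 0
}

failures=0

# check NAME EXPECTED_RC FAKE_CRUN_RC PID
check() {
    local name="$1" expected="$2" rc=0
    export FAKE_CRUN_RC="$3"
    hook_under_test "$4" || rc=$?
    if (( rc == expected )); then
        echo "PASS: ${name}"
    else
        echo "FAIL: ${name} (got ${rc}, expected ${expected})"
        failures=$(( failures + 1 ))
    fi
}

check "update succeeds"                 0 0 100
check "update fails, container running" 1 1 100
check "update fails, container gone"    0 1 200
```

A real runner would finish by exiting non-zero when `failures` is greater than zero, so that `make test` fails on any regression.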

openshift-ci Bot commented May 12, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign enp0s3 for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Member Author

enp0s3 commented May 12, 2026

/cc @Barakmor1

@openshift-ci openshift-ci Bot requested a review from Barakmor1 May 12, 2026 11:42
Member

@Barakmor1 Barakmor1 left a comment

Thank you!

@openshift-ci openshift-ci Bot added the lgtm label May 12, 2026
@enp0s3 enp0s3 merged commit b031358 into openshift-virtualization:main May 12, 2026
1 of 2 checks passed


2 participants