Skip to content

fix(actions): retry npm publish, not just the visibility poll#26

Merged
jan-kubica merged 3 commits into
mainfrom
fix/npm-publish-retry-publish
May 13, 2026
Merged

fix(actions): retry npm publish, not just the visibility poll#26
jan-kubica merged 3 commits into
mainfrom
fix/npm-publish-retry-publish

Conversation

@jan-kubica
Copy link
Copy Markdown
Contributor

Addresses codex P2 review on stella/stdnum#95.

The regression

The composite ran `npm publish` exactly once, then polled `npm view` between sleeps. That covers eventual consistency (publish appeared to fail but actually went through) but does not retry a real transient failure — registry 5xx, network blip, TLS error — where the artifact never reached the registry. The previous inline loops in stdnum and the napi-rs repos retried publish up to 5x; migrating to the composite as-shipped would weaken that.

Fix

Wrap `npm publish` in the retry loop. A single `already_published` check between attempts covers both failure modes:

  • publish fails AND artifact landed (eventual consistency) → exit 0
  • publish fails AND artifact didn't land → sleep, retry

Backoff lengthened to 5s, 10s, 15s, 20s between attempts (50s total across 5 attempts) — gives the registry a real chance to recover.

Test plan

The previous loop ran `npm publish` exactly once and then polled
`npm view` between sleeps. That handles the eventual-consistency
case (publish appeared to fail but actually went through) but does
NOT retry a real transient failure — registry 5xx, network blip,
TLS error — where the artifact never reached the registry.

stdnum's previous inline publish loop did retry the publish itself
up to 5 times; that semantic is restored here, with a single
`already_published` check inside the loop covering both modes:

  - publish fails AND artifact landed (eventual consistency) → exit 0
  - publish fails AND artifact didn't land → next attempt

Backoff lengthened to 5s, 10s, 15s, 20s between attempts (50s total
across 5 attempts) to give the registry a real chance to recover
from a transient.

Addresses codex P2 on stella/stdnum#95.
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6001bc2924

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +110 to +113
if (( attempt == MAX_ATTEMPTS )); then
printf '::error::Failed to publish %s@%s after %d attempts.\n' \
"${PACKAGE_NAME}" "${PACKAGE_VERSION}" "${attempt}" >&2
exit 1
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Poll after the final publish failure before exiting

If the fifth npm publish is the first attempt that reaches the registry but exits non-zero before the version is visible, this branch exits immediately after one already_published check. That reintroduces the eventual-consistency failure for the last attempt: a package can be accepted and become visible seconds later, but the action has already failed. Please keep polling/sleeping after the final failed publish as well before returning failure.

Useful? React with 👍 / 👎.

…blish

If the 5th and final `npm publish` is the first attempt that reaches
the registry but exits non-zero before the version is visible, the
previous control flow ran one `already_published` check and exited
immediately. That collapses the eventual-consistency window to ~0s
for the last attempt and reintroduces the exact failure mode the
retry loop was meant to avoid.

After all publish attempts fail, run an additional 1/2/3/4/5s
polling loop (15s) for the version to appear. Worst-case total time
across all 5 publish attempts + final poll is ~65s.

Addresses codex P2 review on PR #26.
Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a retry mechanism for the npm publication process to handle transient network errors and registry eventual consistency. The implementation uses a loop to attempt the publish up to five times with an incremental backoff strategy. Feedback was provided regarding the GitHub Actions workflow commands, noting that the error message should be sent to standard output instead of standard error to ensure that UI annotations are correctly generated in the GitHub interface.

Comment on lines +111 to +112
printf '::error::Failed to publish %s@%s after %d attempts.\n' \
"${PACKAGE_NAME}" "${PACKAGE_VERSION}" "${attempt}" >&2
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

According to the GitHub Actions documentation, workflow commands like ::error:: must be written to standard output (stdout) to be processed correctly and create annotations in the UI. Redirecting this message to standard error (>&2) prevents the annotation from being created, though the message will still appear in the logs. To enable UI annotations, please remove the >&2 redirection. This advice also applies to other ::error:: calls in this file.

Suggested change
printf '::error::Failed to publish %s@%s after %d attempts.\n' \
"${PACKAGE_NAME}" "${PACKAGE_VERSION}" "${attempt}" >&2
printf '::error::Failed to publish %s@%s after %d attempts.\n' \
"${PACKAGE_NAME}" "${PACKAGE_VERSION}" "${attempt}"

@jan-kubica
Copy link
Copy Markdown
Contributor Author

CC on behalf of @jan-kubica

Codex P2 addressed in the latest commit. After all 5 publish attempts fail, an additional 15s polling window (1+2+3+4+5s sleep cadence) runs before declaring failure — so an attempt-5 publish that lands but lags in registry visibility still gets credited. Worst-case total time across publish + final poll is ~65s.

GitHub Actions parses `::error::`/`::warning::`/`::notice::` workflow
commands out of the runner's log stream to create UI annotations. Per
the docs, those commands must be written to stdout — `>&2` skips the
annotation pipeline and the message only shows up in raw logs.

Remove `>&2` from every `::error::` printf. The remaining `>&2` on
`cat "${PUBLISH_LOG}" >&2` is kept: that's the npm publish failure
output, which is genuine stderr content rather than a workflow
command.

Addresses gemini medium on PR #26.
@jan-kubica
Copy link
Copy Markdown
Contributor Author

CC on behalf of @jan-kubica

Gemini medium addressed in the latest commit. All `::error::` printfs now go to stdout (the annotation processor reads workflow commands from stdout per GitHub docs). The remaining `>&2` redirect on `cat "$PUBLISH_LOG" >&2` is kept because that's genuine stderr content (npm publish's own error output), not a workflow command.

@jan-kubica jan-kubica merged commit 24dbde1 into main May 13, 2026
1 check passed
@jan-kubica jan-kubica deleted the fix/npm-publish-retry-publish branch May 13, 2026 14:35
@github-actions github-actions Bot locked and limited conversation to collaborators May 13, 2026
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant