Skip to content

Remote deploy dirties working tree by writing func.yaml before pipeline runs #3679

@lkingland

Description

@lkingland

Background

func.yaml has historically been written to disk only after a
successful deploy
. This is a deliberate invariant: the on-disk
func.yaml represents committed, deployed state, and the working tree
stays clean if a deploy fails. The invariant is even called out
explicitly in pkg/functions/function.go:450:

// Write Function struct (metadata) to Disk at f.Root
func (f Function) Write() (err error) {
    // Skip writing (and dirtying the work tree) if there were no modifications.
    ...
}

User mental model: if I func deploy with a one-off CLI flag and the
deploy fails, my source tree is unchanged and I can retry with
different flags without first reverting func.yaml.

What #3663 did and why

#3663 fixed a real bug: CLI flags like --image-pull-secret,
--service-account, and --deployer weren't reaching remote deploys
because the remote pipeline uploads sources from disk to a PVC, and
the in-memory f (with the CLI overrides applied) was never persisted
to disk before the upload. The on-cluster deploy step then read a stale
func.yaml from the PVC.

The fix in #3663 was to call f.Write() before invoking the pipeline:

// cmd/deploy.go (post-#3663)
if cfg.Remote {
    if err = f.Write(); err != nil {  // ← new
        return
    }
    var url string
    // Invoke a remote build/push/deploy pipeline
    ...
}

That fixes the user-facing bug (CLI flags now take effect on remote
deploys) but at the cost of violating the on-disk invariant: every
remote deploy now persists CLI-driven mutations to disk before the
pipeline succeeds. If the deploy fails, the user is left with a dirty
working tree, and a git status after a failed deploy now shows
changes the user did not commit to.

Why it matters

  • Retry semantics. Re-running func deploy after a failure now
    starts from a mutated func.yaml. The user has to know to revert
    before retrying with different flags, or accept that their first
    attempt's flags are now baked in.
  • Source hygiene / VCS noise. Failed deploys leave the working
    tree dirty for users in git, complicating bisects, status views, and
    pre-commit hooks.
  • Mental-model break. The invariant "func.yaml reflects
    deployed state"
    no longer holds; it now reflects "state of last
    deploy attempt."
  • Symmetry with local deploy. Local deploys still respect the
    invariant. Remote deploys now don't. The behavior of --remote
    diverges from non-remote in a non-obvious way.

Possible solutions

Solution A: Rollback on failure

Snapshot the on-disk func.yaml (bytes + mode) before f.Write(),
and restore the snapshot on any error from the pipeline.

original, _ := os.ReadFile(filepath.Join(f.Root, FunctionFile))  // may not exist
existed := original != nil

if err = f.Write(); err != nil { return }
defer func() {
    if err == nil { return }
    if existed {
        _ = os.WriteFile(filepath.Join(f.Root, FunctionFile), original, 0644)
    } else {
        _ = os.Remove(filepath.Join(f.Root, FunctionFile))
    }
}()

Edge cases. Several:

  • Concurrent edits: if the user opens func.yaml in an editor
    between our write and the rollback (long pipeline), the rollback
    silently overwrites their work. Mitigation: detect mtime drift and
    skip rollback with a warning, but the warning is itself surprising.
  • Process killed mid-rollback: SIGINT/SIGTERM during the deploy
    leaves the dirty state. No clean fix without WAL semantics.
  • File metadata: mode, ownership, mtime should be preserved. The
    snippet above is the simple shape; doing it faithfully is a few
    more lines.
  • Ergonomic regression for partial successes: if the on-cluster
    step partially succeeds (the resource is reconciling but the CLI
    surfaces a transient error), rolling back func.yaml leaves it
    out of sync with what's actually deployed. Hard to know which
    failure modes warrant rollback.
  • Repeats per-write site: every code path that mutates f and
    triggers a pre-deploy write needs the same protection. Easy to
    forget when adding a new feature.

Simple to write, semantically muddy.

Solution B: Inject the in-memory func.yaml into the upload stream (preferred)

The remote pipeline uploads sources via sourcesAsTarStream(f) in
pkg/pipelines/tekton/pipelines_provider.go:282. The function already
takes the in-memory fn.Function, walks f.Root, and tars files as
it finds them. The natural injection point is the file walk itself:

  • For every regular file other than func.yaml, read and tar from
    disk as today.
  • For func.yaml, skip the on-disk read and synthesize the tar entry
    from the in-memory f's serialized form.
  • If func.yaml doesn't exist on disk (brand-new function never
    saved), inject it explicitly after the walk.

The on-disk file is never touched until the existing post-success
f.Write() runs. The f.Write() call introduced by #3663 in
cmd/deploy.go can be reverted as part of this work.

Implementation sketch:

  1. Factor the func.yaml byte production out of Function.Write in
    pkg/functions/function.go:450 into a small helper, e.g.
    func (f Function) MarshalYAML() ([]byte, error). Both Write
    and the tar injection call it. (Write currently does
    yaml.Marshal(&f) plus the schema-header prefix — the helper
    must produce the same bytes.)

  2. In sourcesAsTarStream (pkg/pipelines/tekton/pipelines_provider.go),
    pre-compute the bytes once and inject during the walk:

    funcYamlBytes, err := f.MarshalYAML()
    if err != nil { _ = pw.CloseWithError(err); return }
    
    injected := false
    err = filepath.Walk(f.Root, func(p string, fi fs.FileInfo, err error) error {
        relp, _ := filepath.Rel(f.Root, p)
        if relp == FunctionFile && fi.Mode().IsRegular() {
            hdr := &tar.Header{
                Typeflag: tar.TypeReg,
                Name:     "source/" + FunctionFile,
                Size:     int64(len(funcYamlBytes)),
                Mode:     int64(fi.Mode().Perm()),
                ModTime:  time.Now(),
                Uid:      nobodyID, Gid: nobodyID,
                Uname:    "nobody", Gname: "nobody",
            }
            if err := tw.WriteHeader(hdr); err != nil { return err }
            if _, err := tw.Write(funcYamlBytes); err != nil { return err }
            injected = true
            return nil
        }
        // ... existing on-disk path ...
    })
    
    if !injected {
        // Function never saved to disk — inject explicitly.
        hdr := &tar.Header{
            Typeflag: tar.TypeReg,
            Name:     "source/" + FunctionFile,
            Size:     int64(len(funcYamlBytes)),
            Mode:     0o644,
            ModTime:  time.Now(),
            Uid:      nobodyID, Gid: nobodyID,
            Uname:    "nobody", Gname: "nobody",
        }
        _ = tw.WriteHeader(hdr)
        _, _ = tw.Write(funcYamlBytes)
    }
  3. Revert the f.Write() call added by fix: write func.yaml before remote deploy upload #3663 in cmd/deploy.go.

  4. Tests:

    • Unit-test sourcesAsTarStream: pass an f whose in-memory
      func.yaml differs from disk; assert the tar entry for
      source/func.yaml matches the in-memory bytes.
    • Update / move the assertions from cmd/deploy_test.go added by
      fix: write func.yaml before remote deploy upload #3663 (which currently verify on-disk func.yaml after a remote
      deploy attempt) to instead verify on-disk state is unchanged,
      and tar contents reflect CLI flags.
    • Integration test that confirms --image-pull-secret,
      --service-account, --deployer flow through to the on-cluster
      deploy step on a remote deploy — i.e. the bug fix: write func.yaml before remote deploy upload #3663 fixed
      stays fixed.

Why this is cleaner than Solution A.

  • No state-restoration semantics, no concurrent-edit hazards, no
    process-kill hazards.
  • Localized: one helper extraction + ~15 lines in the tar walker +
    one revert in cmd/deploy.go.
  • Preserves the on-disk invariant exactly. func.yaml is still
    written only after successful deploy, by the existing
    post-success path.
  • Symmetric with local deploys.
  • Future-proofs: any new CLI flag that mutates f automatically
    flows through correctly without further plumbing.

Solution C: Stage to a temp directory

Copy the source tree to a temp directory, write the modified
func.yaml into the temp copy, tar from there, clean up.

Mechanical but doubles disk I/O and has to faithfully recreate
permissions, symlinks, ownership. Strictly worse than B in every
dimension. Mentioned for completeness.

Other shapes (out of scope)

  • Send func.yaml over a separate channel (ConfigMap, sidecar arg)
    rather than embedding it in the source tar. Cleaner separation of
    code from config but a much larger refactor of the pipeline
    contract.
  • Make f.Write() itself a no-op when called pre-deploy (track
    "committed vs. uncommitted" inside f). Adds state to
    Function; doesn't help the remote case unless paired with one of
    the solutions above.

Recommendation

Solution B, scoped as described. The change is small, localized,
and preserves the invariant the codebase already documents. Solution A
is a fallback if for some reason refactoring the tar walker turns out
to be harder than expected, but no such reason is currently apparent.

Why this wasn't fixed in #3663

#3663 was a focused bug fix for missing CLI-flag propagation on remote
deploys, from an active contributor. The mid-deploy f.Write() call
is the smallest possible patch that fixes the user-facing bug. Asking
the contributor to refactor the tar walker and extract a YAML helper
would have widened scope significantly and gated the fix on cleanup
work. This issue captures the cleanup as separable, picker-uppable
follow-up.

File pointers

  • pkg/functions/function.go:450Function.Write (extract
    MarshalYAML helper)
  • pkg/pipelines/tekton/pipelines_provider.go:282
    sourcesAsTarStream (inject during walk)
  • cmd/deploy.go (around the post-#3663 f.Write() call) — revert
  • cmd/deploy_test.go and pkg/pipelines/tekton/pipelines_provider_test.go
    — update / migrate tests

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions