
feat(datastore): race-id DDS create (alternative to claims, v1 alpha)#27325

Draft
kian-thompson wants to merge 5 commits into microsoft:main from kian-thompson:users/copilot/race-id-dds-create

Conversation

@kian-thompson
Contributor

Description

Alternative to PRs #27286 / #27291 (claims). Solves the "two clients racing to lazily create the same singleton DDS" problem in the DDS attach path instead of via a generic first-writer-wins (FWW) key→value primitive. Optimistic local application is preserved; a cost is paid only when a race actually occurs.

This is v1, alpha-tagged, and intentionally narrow. Larger pieces (race-id handles, data-store-level races, async onLost, public IChannel.dispose()) are deferred to follow-ups.

API surface (alpha)

// New overload on IFluidDataStoreRuntime — opts the call into race semantics.
createChannel(raceId: string, type: string, raceOptions: { onLost?: OnRaceLost }): IChannel;

// Propagated on the wire so all clients see the race id with the attach op.
interface IAttachMessage { /* ...existing... */ raceId?: string; }

// Loser callback so the app can merge edits into the winner channel.
type OnRaceLost = (loser: IChannel, winnerChannelId: string) => void;

// Diagnostic event.
runtime.on("raceResolved", ({ raceId, winnerChannelId, loserChannelIds }) => ...);
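
As an illustration of what an OnRaceLost callback might do, here is a minimal sketch that copies loser entries into the winner. Everything in it is an assumption for illustration: `MapLikeChannel`, `OnRaceLostSketch`, `makeMergeOnLost`, and `FakeMapChannel` are hypothetical stand-ins, not Fluid types, and the "winner keeps its value on conflict" policy is just one possible merge rule.

```typescript
// Hypothetical stand-in for a map-shaped DDS; the real IChannel lives in Fluid packages.
interface MapLikeChannel {
	readonly id: string;
	entries(): Array<[string, unknown]>;
	has(key: string): boolean;
	set(key: string, value: unknown): void;
}

// Same shape as OnRaceLost in the PR, narrowed to the stand-in channel type.
type OnRaceLostSketch = (loser: MapLikeChannel, winnerChannelId: string) => void;

// Builds a loser callback that copies loser-only keys into the winner.
// Assumed policy: when both sides wrote a key, the winner's value is kept.
function makeMergeOnLost(
	lookupChannel: (id: string) => MapLikeChannel | undefined,
): OnRaceLostSketch {
	return (loser, winnerChannelId) => {
		const winner = lookupChannel(winnerChannelId);
		if (winner === undefined) {
			return; // Winner not available locally; a real app would defer or retry.
		}
		for (const [key, value] of loser.entries()) {
			if (!winner.has(key)) {
				winner.set(key, value);
			}
		}
	};
}

// In-memory channel used only to exercise the sketch.
class FakeMapChannel implements MapLikeChannel {
	readonly id: string;
	private readonly data = new Map<string, unknown>();
	constructor(id: string) {
		this.id = id;
	}
	entries(): Array<[string, unknown]> {
		return Array.from(this.data.entries());
	}
	has(key: string): boolean {
		return this.data.has(key);
	}
	set(key: string, value: unknown): void {
		this.data.set(key, value);
	}
	get(key: string): unknown {
		return this.data.get(key);
	}
}

const winner = new FakeMapChannel("canvas#w");
const loser = new FakeMapChannel("canvas#l");
winner.set("title", "from winner");
loser.set("title", "from loser");
loser.set("color", "blue");

const channels = new Map<string, FakeMapChannel>([
	[winner.id, winner],
	[loser.id, loser],
]);
const onLost = makeMergeOnLost((id) => channels.get(id));
onLost(loser, winner.id);
```

Because the callback is synchronous in v1, any merge it performs has to complete without awaiting; the sketch keeps to plain synchronous reads and writes for that reason.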

The runtime mints a unique internal channel id (${raceId}#${guid}) — racing clients agree only on the race id, not the channel id.
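
The id format can be sketched as follows. This is an assumption-laden illustration, not the runtime's code: `mintChannelId` and `raceIdOf` are hypothetical helpers, the guid is passed in rather than generated, and parsing on the last `#` is a guess that relies on the guid never containing `#`.

```typescript
// Hypothetical helpers illustrating the `${raceId}#${guid}` format from the PR.
const SEPARATOR = "#";

// Mints an internal channel id. Empty race ids are rejected, as in the PR's validation tests.
function mintChannelId(raceId: string, guid: string): string {
	if (raceId.length === 0) {
		throw new Error("raceId must be non-empty");
	}
	return `${raceId}${SEPARATOR}${guid}`;
}

// Recovers the race id from a minted channel id.
// Assumption: the last '#' is the separator, so a race id may itself contain '#'.
function raceIdOf(channelId: string): string | undefined {
	const index = channelId.lastIndexOf(SEPARATOR);
	return index > 0 ? channelId.slice(0, index) : undefined;
}

// Two clients racing on "canvas" mint different channel ids but share the race id.
const idA = mintChannelId("canvas", "1b4e28ba-2fa1-11d2-883f-0016d3cca427");
const idB = mintChannelId("canvas", "6fa459ea-ee8a-3ca4-894e-db77e160355e");
```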

Resolution semantics

FWW by sequenced attach-op order:

  1. First attach for a given raceId wins; its channel id is the canonical winner.
  2. Later attaches with the same raceId are dropped.
  3. Channel ops addressed to known-loser channel ids are dropped on every client deterministically (loserToWinner map).
  4. When the local client loses, onLost(loserChannel, winnerChannelId) fires (scheduled via queueMicrotask) so the app can salvage local edits before continuing on the winner channel.
  5. loserToWinner is persisted in a .races summary blob so mid-session joiners drop late ops correctly.
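
The resolution rules above can be sketched as a small state machine. This is a simplified model under stated assumptions, not the code in FluidDataStoreRuntime: `RaceResolver`, `AttachOp`, `ChannelOp`, and `AttachOutcome` are hypothetical names, and context teardown, onLost scheduling, and summary persistence are left out.

```typescript
// Hypothetical, simplified shapes of sequenced ops.
interface AttachOp {
	raceId?: string;
	channelId: string;
}
interface ChannelOp {
	channelId: string;
}

type AttachOutcome =
	| { kind: "untracked" }
	| { kind: "won"; winnerChannelId: string }
	| { kind: "lost"; winnerChannelId: string };

class RaceResolver {
	// raceId -> channel id of the first sequenced attach.
	private readonly winners = new Map<string, string>();
	// loser channel id -> winner channel id.
	private readonly loserToWinner = new Map<string, string>();

	// Must be called in sequenced order; that order is what makes the result deterministic.
	processAttach(op: AttachOp): AttachOutcome {
		if (op.raceId === undefined) {
			return { kind: "untracked" };
		}
		const existing = this.winners.get(op.raceId);
		if (existing === undefined || existing === op.channelId) {
			this.winners.set(op.raceId, op.channelId);
			return { kind: "won", winnerChannelId: op.channelId };
		}
		this.loserToWinner.set(op.channelId, existing);
		return { kind: "lost", winnerChannelId: existing };
	}

	// Channel ops addressed to a known loser are dropped.
	shouldDropOp(op: ChannelOp): boolean {
		return this.loserToWinner.has(op.channelId);
	}
}

// Two clients replaying the same sequenced stream reach the same outcome.
const sequenced: AttachOp[] = [
	{ raceId: "canvas", channelId: "canvas#aaa" },
	{ raceId: "canvas", channelId: "canvas#bbb" },
	{ channelId: "plain-channel" },
];
const clientA = new RaceResolver();
const clientB = new RaceResolver();
const outcomesA = sequenced.map((op) => clientA.processAttach(op).kind);
const outcomesB = sequenced.map((op) => clientB.processAttach(op).kind);
```

The sketch keys everything off sequenced order rather than local creation time, which is why a client that created its channel first locally can still lose.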

What's in this PR

Commit
c141675b API surface
adaf797c Core FWW resolution in FluidDataStoreRuntime
68b5d4db Unit tests (4 new) — validation paths
3e8d85c1 .races summary blob (write + async rehydrate)
314e2d96 Changeset + regenerated API reports

Build: green. Tests: 46/46 passing in @fluidframework/datastore.

v1 limitations / explicitly deferred

  • Race-id handles — handles still carry the minted internal channel id, not the race id. Loser handles will be broken after resolution; apps must use onLost to migrate.
  • Data-store-level races — only DDS create is in scope.
  • IChannel.dispose() — no public dispose; loser context is removed but apps must stop using the loser channel themselves.
  • Async onLost — callback is sync (scheduled via queueMicrotask); merge work cannot be awaited by op processing.
  • minVersionForCollab gate — older clients won't recognize the raceId field; relying on alpha tag + opt-in API for now. Recommended to add before GA.
  • Detached / staging mode — race overload is rejected; races are resolved by attach-op sequencing which only happens once globally visible.
  • Summary rehydrate is best-effort — ops to historical losers may transiently apply during the constructor's async read window.
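
To make the rehydrate caveat concrete, here is a minimal sketch of a round trip for the loserToWinner map. The blob layout is an assumption (a JSON array of `[loser, winner]` pairs); the PR does not state the actual format, and `serializeRaces` and `rehydrateRaces` are hypothetical names. The sketch merges the loaded entries into the live map instead of replacing it, so redirects learned while the read was in flight are not lost.

```typescript
type LoserToWinner = Map<string, string>;

// Serializes the map for a summary blob. Returns undefined when there is nothing
// to write, mirroring the PR's "only when the map is non-empty" behavior.
// Assumption: entries are sorted so that equal maps produce identical blobs.
function serializeRaces(map: LoserToWinner): string | undefined {
	if (map.size === 0) {
		return undefined;
	}
	const pairs = Array.from(map.entries()).sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0));
	return JSON.stringify(pairs);
}

// Merges a loaded blob into the live map without overwriting entries that
// were recorded from live ops while the asynchronous read was pending.
function rehydrateRaces(live: LoserToWinner, blob: string): void {
	const parsed: unknown = JSON.parse(blob);
	if (!Array.isArray(parsed)) {
		throw new Error("malformed races blob");
	}
	for (const entry of parsed) {
		if (
			!Array.isArray(entry) ||
			entry.length !== 2 ||
			typeof entry[0] !== "string" ||
			typeof entry[1] !== "string"
		) {
			throw new Error("malformed races entry");
		}
		if (!live.has(entry[0])) {
			live.set(entry[0], entry[1]);
		}
	}
}

// A joiner that already learned one redirect from live ops loads a historical one.
const summarizerMap: LoserToWinner = new Map([["canvas#bbb", "canvas#aaa"]]);
const blob = serializeRaces(summarizerMap);
const joinerMap: LoserToWinner = new Map([["notes#ddd", "notes#ccc"]]);
if (blob !== undefined) {
	rehydrateRaces(joinerMap, blob);
}
```

Merging closes only part of the gap: ops to historical losers that arrive before the blob is read would still be applied, which is the transient window the limitation describes.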

Follow-ups (tracked separately)

  • Race-id handle kind + serializer integration
  • minVersionForCollab gate + back-compat tests
  • Cross-runtime end-to-end test (singleton-canvas scenario)
  • Race-id for data-store create

Comparison

vs. PR #27286 / #27291 (claims): claims add a general FWW key→value primitive at runtime or DDS level; consumers must restructure to read-before-create. This PR keeps optimistic create-first semantics and only forces app awareness when the race actually fires.


🤖 Generated with Copilot CLI

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

kian-thompson and others added 5 commits May 15, 2026 22:55
…e to claims)

Adds the public API surface for race-id-tagged channel creation:

- IAttachMessage gains optional 'raceId' field (only emitted when document
  schema indicates support; older clients in mixed sessions never see it).
- IFluidDataStoreRuntime.createChannel gains a new 3-argument overload:
  createChannel(raceId, type, { onLost }). The overload is the opt-in for
  race semantics; existing 2-arg call sites are unchanged.
- New OnRaceLost type and 'raceResolved' event.

Implementation follows in subsequent commits.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Core race resolution in FluidDataStoreRuntime:
- createChannel(raceId, type, raceOptions) overload mints unique
  internal channel id (${raceId}#${guid}) and tracks the entry.
- Outbound IAttachMessage now carries optional raceId.
- processAttachMessages applies first-wins resolution across clients:
  loser context is removed from contexts; loser->winner redirect is
  recorded; onLost callback is scheduled via queueMicrotask.
- Inbound channel ops to known-loser channel ids are dropped on every
  client via the loserToWinner map, keeping resolution deterministic.
- raceResolved event emitted for diagnostics.

v1 scope: summary persistence of redirects, doc-schema gate, tests,
and changeset still pending per plan.md.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Covers synchronous validation paths of the new 3-arg createChannel
overload: detached-state rejection, derived channel id format
(${raceId}#...), duplicate-raceId rejection per client, and empty
raceId rejection.

Full FWW resolution across two runtimes (inbound attach ops,
loser context teardown, onLost scheduling) requires more harness
plumbing and is deferred to a follow-up.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Writes a '.races' top-level blob containing the loserToWinner map in
FluidDataStoreRuntime.summarize() when the map is non-empty, and reads
it back asynchronously during construction. This lets mid-session
joiners drop late ops to historical losers deterministically once the
load completes.

v1 caveat: the read is best-effort and not awaited before op
processing. Ops to historical losers may transiently be applied during
the load window. Tracked as a follow-up.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

- Adds .changeset/race-id-dds-create.md describing the new alpha
  surface (raceId, OnRaceLost, raceResolved event) and v1 caveats.
- Adds @Alpha overload declaration on FluidDataStoreRuntime.createChannel
  so the implementation's release tag matches the @Alpha OnRaceLost
  type it references (fixes ae-incompatible-release-tags).
- Escapes '->' in tsdoc comments per tsdoc-escape-greater-than.
- Regenerates api-report files for datastore-definitions,
  runtime-definitions, and datastore.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@github-actions
Contributor

Hi! Thank you for opening this PR. Want me to review it?

Based on the diff (484 lines, 9 files), I've queued these reviewers:

  • Correctness — logic errors, race conditions, lifecycle issues
  • Security — vulnerabilities, secret exposure, injection
  • API Compatibility — breaking changes, release tags, type design
  • Performance — algorithmic regressions, memory leaks
  • Testing — coverage gaps, hollow tests

How this works

  • Adjust the reviewer set by ticking/unticking boxes above. Reviewer toggles alone don't trigger anything.

  • Tick Start review below to dispatch the review fleet.

  • After review finishes, tick Start review again to request another run — it auto-resets after each dispatch.

  • This comment updates as new commits land; your reviewer selections are preserved.

  • Start review

@github-actions
Contributor

🔗 No broken links found! ✅

Your attention to detail is admirable.

