KAFKA-20268: Initialize group epochs to 1 for assignment offload #21700

Merged
dajac merged 5 commits into apache:trunk from confluentinc:squah-kip-1263-start-group-epochs-at-2
Mar 11, 2026

squah-confluent (Contributor) commented Mar 10, 2026

Once assignment offload is implemented, newly created groups may not
have their first assignment at epoch 1 available yet. We cannot give the
member an epoch of 0, since it is reserved for new members joining a
group. The new member's epoch must be greater than 0 and less than the
group epoch.

To resolve this, we initialize groups at epoch 1 with target assignment
epoch 1. Epoch 1 contains an empty assignment and the first computed
assignment will have epoch 2. Members joining a new group will reconcile
towards epoch 1 while waiting for epoch 2's assignment.

Streams groups already start group epochs at 2. We update streams groups
to use the same approach as consumer and share groups.

Reviewers: Lucas Brutschy <lbrutschy@confluent.io>, David Jacot <djacot@confluent.io>
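
The epoch scheme in the description above can be sketched as a small model. This is only an illustration under stated assumptions: the class `GroupEpochSketch` and its methods are hypothetical and are not the group coordinator's actual classes. It shows that a group starting at epoch 1 can hand a joining member an epoch that is greater than 0 and less than the bumped group epoch, even before the offloaded assignment is ready.

```java
// Hypothetical, simplified model of the scheme described in this PR.
// None of these names come from the Kafka code base.
final class GroupEpochSketch {
    // Epoch 0 is reserved: it is what a member sends when joining a group.
    static final int NEW_MEMBER_EPOCH = 0;
    // New groups start at epoch 1 with an empty target assignment at epoch 1.
    static final int INITIAL_EPOCH = 1;

    private int groupEpoch = INITIAL_EPOCH;
    private int targetAssignmentEpoch = INITIAL_EPOCH;

    // A member joins: the group epoch is bumped, but the new assignment is
    // computed later (offloaded). The member reconciles toward the latest
    // target assignment that is already available.
    int join() {
        groupEpoch++;
        return targetAssignmentEpoch;
    }

    // The offloaded computation finishes: the target assignment catches up.
    void completeAssignment() {
        targetAssignmentEpoch = groupEpoch;
    }

    int groupEpoch() {
        return groupEpoch;
    }

    int targetAssignmentEpoch() {
        return targetAssignmentEpoch;
    }

    public static void main(String[] args) {
        GroupEpochSketch group = new GroupEpochSketch();
        int memberEpoch = group.join();
        // prints "member epoch = 1, group epoch = 2"
        System.out.println("member epoch = " + memberEpoch + ", group epoch = " + group.groupEpoch());
        group.completeAssignment();
        // prints "target assignment epoch = 2"
        System.out.println("target assignment epoch = " + group.targetAssignmentEpoch());
    }
}
```

If the group instead started at epoch 0, the first join would bump it to 1 and there would be no epoch strictly between 0 and 1 to give the member.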

github-actions bot added the triage (PRs from the community) and group-coordinator labels Mar 10, 2026
dajac added the ci-approved label and removed the triage (PRs from the community) label Mar 10, 2026
dajac (Member) left a comment

@squah-confluent Thanks for the PR. I left a few minor comments.

Comment on lines 43 to 44
this.assignmentEpoch = 1;
this.assignmentTimestamp = 0L;
dajac (Member):

Should we use TargetAssignmentMetadata.INITIAL to init those two?

squah-confluent (Contributor Author):

That's a good idea, yes.
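
A minimal sketch of what this suggestion could look like, assuming `TargetAssignmentMetadata` is a simple value type exposing the epoch and timestamp. The record shape, accessor names, and `GroupBuilderSketch` class below are hypothetical stand-ins; only the constant name `INITIAL` and the values `1` and `0L` come from this thread.

```java
// Hypothetical builder standing in for the test builders under review.
final class GroupBuilderSketch {
    private final int assignmentEpoch;
    private final long assignmentTimestamp;

    GroupBuilderSketch() {
        // Read the defaults from one shared constant instead of repeating
        // the literals 1 and 0L in every builder.
        this.assignmentEpoch = TargetAssignmentMetadata.INITIAL.assignmentEpoch();
        this.assignmentTimestamp = TargetAssignmentMetadata.INITIAL.assignmentTimestamp();
    }

    int assignmentEpoch() {
        return assignmentEpoch;
    }

    long assignmentTimestamp() {
        return assignmentTimestamp;
    }

    public static void main(String[] args) {
        GroupBuilderSketch builder = new GroupBuilderSketch();
        System.out.println(builder.assignmentEpoch() + " / " + builder.assignmentTimestamp());
    }
}

// Hypothetical stand-in; the real class in Kafka may be shaped differently.
record TargetAssignmentMetadata(int assignmentEpoch, long assignmentTimestamp) {
    // Assumed initial state of a new group: epoch 1, timestamp 0.
    static final TargetAssignmentMetadata INITIAL = new TargetAssignmentMetadata(1, 0L);
}
```

Keeping the initial values in one constant means a later change to the starting epoch only has to be made in one place.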

Comment on lines 42 to 43
this.assignmentEpoch = 1;
this.assignmentTimestamp = 0L;
dajac (Member):

ditto.

Comment on lines 45 to 46
this.targetAssignmentEpoch = 1;
this.targetAssignmentTimestamp = 0L;
dajac (Member):

ditto.

Comment on lines -22614 to -22618
// The group epoch is bumped.
GroupCoordinatorRecordHelpers.newConsumerGroupEpochRecord(groupId, 1, 0),
// The target assignment is created.
GroupCoordinatorRecordHelpers.newConsumerGroupTargetAssignmentRecord(groupId, memberId1, Map.of()),
GroupCoordinatorRecordHelpers.newConsumerGroupTargetAssignmentMetadataRecord(groupId, 1, context.time.milliseconds()),
dajac (Member):

Let's replace those with a comment explaining why they are not here.

squah-confluent (Contributor Author):

Added a comment

dajac (Member) commented Mar 10, 2026

@squah-confluent Could you check the failed tests? They seem to be related.

github-actions bot added the core (Kafka Broker) label Mar 11, 2026
dajac (Member) left a comment

lgtm. waiting on @lucasbru to stamp it too.

lucasbru (Member) left a comment

This is a nice improvement, cleaner than the previous logic. Thanks, LGTM!

dajac merged commit 5c8e3c5 into apache:trunk Mar 11, 2026
24 checks passed
dajac deleted the squah-kip-1263-start-group-epochs-at-2 branch March 11, 2026 08:42
squah-confluent (Contributor Author) commented

CI on trunk is broken again. I opened a fix at #21707.
