KAFKA-18494-3: solution for the bug relating to gaps in the share partition cachedStates post initialization #18696

Open: 2 commits to merge into base branch trunk


chirag-wadhwa5
Contributor

This PR fixes a bug caused by gaps in the read state response during share partition initialization. A window is maintained to track the gaps in cachedState, and it is updated after the records that fall in those gaps are acquired.
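
As an illustration of the idea described above, a gap-tracking window could be sketched roughly as follows. This is only a sketch under stated assumptions: the class `GapWindow` and its members are hypothetical names chosen for this example and are not taken from the PR, and it assumes the window is bounded by a fixed end offset and shrinks from the front as records are acquired.

```java
/** Hypothetical gap window; names are illustrative and not taken from the PR. */
final class GapWindow {
    // Assumed: last offset covered by the read state response (inclusive).
    private final long windowEndOffset;
    // Assumed: first offset that may still lie inside an unvisited gap.
    private long gapStartOffset;

    GapWindow(long gapStartOffset, long windowEndOffset) {
        this.gapStartOffset = gapStartOffset;
        this.windowEndOffset = windowEndOffset;
    }

    /** Moves the window forward once records up to lastAcquiredOffset are acquired. */
    void onAcquired(long lastAcquiredOffset) {
        // The window never moves backwards, even if an older range is reported.
        gapStartOffset = Math.max(gapStartOffset, lastAcquiredOffset + 1);
    }

    /** The window is only needed while unvisited offsets remain inside it. */
    boolean isActive() {
        return gapStartOffset <= windowEndOffset;
    }

    long gapStartOffset() {
        return gapStartOffset;
    }
}
```

With a window covering offsets 10 to 20, acquiring up to offset 14 would move the start to 15, and acquiring up to offset 20 would leave the window inactive.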

…are partition cachedState after initialization
github-actions bot added the labels triage (PRs from the community), core (Kafka Broker), and KIP-932 (Queues for Kafka) on Jan 24, 2025
Collaborator

apoorvmittal10 left a comment

Thanks for the PR, but shouldn't we have more tests?

@@ -5752,13 +5752,13 @@ public void testMaybeInitializeWhenReadStateRpcReturnsZeroAvailableRecords() {
for (int i = 0; i < 500; i++) {
stateBatches.add(new PersisterStateBatch(234L + i, 234L + i, RecordState.ACKNOWLEDGED.id, (short) 1));
}
stateBatches.add(new PersisterStateBatch(232L, 232L, RecordState.ARCHIVED.id, (short) 1));
stateBatches.add(new PersisterStateBatch(233L, 233L, RecordState.ARCHIVED.id, (short) 1));
Collaborator

Don't we need more tests, given that the existing ones did not catch this problem in the share partition?

Contributor Author

Thanks for the review. I have added unit test cases in the latest commit.

@@ -462,6 +468,10 @@ public CompletableFuture<Void> maybeInitialize() {
// in the cached state are not missed
findNextFetchOffset.set(true);
endOffset = cachedState.lastEntry().getValue().lastOffset();
// initialReadGapOffset is not required, if there are no gaps in the read state response
if (isGapPresentInCachedState()) {
Collaborator

Rather than iterating again in this method (though the cost is low), should we set a variable in the earlier for loop of the method that records whether a gap exists? What do you think?

Contributor Author

Thanks for the review. I made this change in the latest commit.
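
The suggestion in this thread, recording whether a gap exists during the loop that already walks the state batches, could be sketched as below. This is a minimal sketch, not the code from the PR: `GapCheck`, `OffsetRange`, and `hasGap` are hypothetical names, the ranges are assumed to be sorted by first offset and non-overlapping, and a hole between the start offset and the first range is assumed to count as a gap.

```java
import java.util.List;

final class GapCheck {
    /** Hypothetical stand-in for a persisted state batch: an inclusive offset range. */
    static final class OffsetRange {
        final long firstOffset;
        final long lastOffset;

        OffsetRange(long firstOffset, long lastOffset) {
            this.firstOffset = firstOffset;
            this.lastOffset = lastOffset;
        }
    }

    /**
     * Detects gaps in a single pass.
     * Assumes sortedRanges is ordered by firstOffset and the ranges do not overlap.
     */
    static boolean hasGap(long startOffset, List<OffsetRange> sortedRanges) {
        boolean gapFound = false;
        long expectedNext = startOffset;
        for (OffsetRange range : sortedRanges) {
            // A real initialization loop would also populate the cached state here.
            if (range.firstOffset > expectedNext) {
                gapFound = true; // offsets expectedNext..firstOffset-1 are missing
            }
            expectedNext = range.lastOffset + 1;
        }
        return gapFound;
    }
}
```

The flag is computed as a side effect of the existing iteration, so no second pass over the cached state is needed after the loop.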

Comment on lines +1286 to +1288
if (lastAcquiredOffset > endOffset) {
endOffset = lastAcquiredOffset;
}
Collaborator

Can we write a comment here regarding when this can happen, given that we are acquiring a new batch?

Contributor Author

Thanks for the review. I have added the comment explaining this scenario.
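
One plausible reading of the guarded update discussed in this thread is shown in the sketch below. It assumes, without confirmation from the PR, that a batch acquired from inside a gap ends below the current endOffset, so the guard keeps endOffset from moving backwards. `EndOffsetTracker` is a hypothetical class used only for illustration.

```java
/** Hypothetical holder for endOffset; only the guarded update mirrors the diff. */
final class EndOffsetTracker {
    private long endOffset;

    EndOffsetTracker(long initialEndOffset) {
        this.endOffset = initialEndOffset;
    }

    /** Applies the same condition as the reviewed lines after a batch is acquired. */
    void onAcquired(long lastAcquiredOffset) {
        // Assumption: records acquired inside a gap end below endOffset,
        // in which case endOffset must stay where it is.
        if (lastAcquiredOffset > endOffset) {
            endOffset = lastAcquiredOffset;
        }
    }

    long endOffset() {
        return endOffset;
    }
}
```

Under that assumption, acquiring a batch that ends at offset 500 while endOffset is 733 leaves endOffset unchanged, whereas a batch that ends at 800 advances it to 800.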

github-actions bot removed the triage (PRs from the community) label on Jan 25, 2025
Labels: ci-approved, core (Kafka Broker), KIP-932 (Queues for Kafka)
3 participants