Optimize vector get last written LSN #553

Merged: 11 commits, Jan 14, 2025

Conversation

knizhnik

No description provided.

@hlinnaka (Contributor) left a comment

Looks correct compared with the old code. I have some doubts, though, that apply to both the old and the new version:

  1. Why does GetLastWrittenLSNv() call SetLastWrittenLSNForBlock() for pages that were not in the cache previously? There are no comments on that and I find it pretty surprising. Luckily the description of the PR that added it ("Remember last written LSN when it is first requested", #412) explains it. Let's copy that explanation to a comment in GetLastWrittenLSNv().

  2. Is it correct to release the lock and then call SetLastWrittenLSNForBlock()? If the page is concurrently updated and written out, the LSN we set here might override the LSN that was set by the concurrent updater. There was discussion on PR #412 ("Remember last written LSN when it is first requested") about a concurrent read scenario. ISTM it was considered to be just a performance issue, but isn't it also a potential correctness issue if the last-written LSN of a page is moved backwards?

  3. I don't know if it matters from a performance point of view in practice, but it seems wasteful to release LastWrittenLsnLock when SetLastWrittenLSNForBlock() will immediately re-acquire it. Instead of calling SetLastWrittenLSNForBlock(), you could hold the lock and copy some of the logic from SetLastWrittenLSNForBlock() into GetLastWrittenLSNv() itself, or introduce some common helper functions to reduce the duplication (see the sketch below). That would also address the previous point about concurrent updates.
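
To illustrate point 3: below is a minimal standalone sketch of the "locked helper" idea, where the public setter and the getter share one internal routine that assumes the lock is already held, so the getter never releases and re-acquires the lock on a cache miss. This is only a model, not the actual Neon code: a pthread rwlock stands in for LastWrittenLsnLock, a fixed-size array stands in for the real hash table, and all names are hypothetical.

```c
/*
 * Standalone model of the "locked helper" idea from point 3 above.
 * A pthread rwlock stands in for LastWrittenLsnLock and a fixed-size
 * array stands in for the last-written-LSN hash table; all names here
 * are hypothetical.
 */
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>

typedef uint64_t XLogRecPtr;    /* as in PostgreSQL, an LSN is a 64-bit value */
#define N_BLOCKS 1024

static XLogRecPtr lwlsn_cache[N_BLOCKS];    /* 0 means "not cached" */
static pthread_rwlock_t lwlsn_lock = PTHREAD_RWLOCK_INITIALIZER;

/* Core update; the caller must already hold lwlsn_lock exclusively. */
static void
set_lwlsn_locked(uint32_t blkno, XLogRecPtr lsn)
{
    if (lsn > lwlsn_cache[blkno])   /* never move the cached value backwards */
        lwlsn_cache[blkno] = lsn;
}

/* Public setter: acquires the lock itself, then reuses the helper. */
static void
set_lwlsn(uint32_t blkno, XLogRecPtr lsn)
{
    pthread_rwlock_wrlock(&lwlsn_lock);
    set_lwlsn_locked(blkno, lsn);
    pthread_rwlock_unlock(&lwlsn_lock);
}

/*
 * Getter: on a cache miss it records a conservative value via the helper
 * while still holding the same lock acquisition, instead of releasing the
 * lock and calling the public setter (which would re-acquire it).  For
 * simplicity this sketch always takes the lock exclusively; the shared vs.
 * exclusive distinction is discussed further down in this thread.
 */
static XLogRecPtr
get_lwlsn(uint32_t blkno, XLogRecPtr fallback_lsn)
{
    XLogRecPtr result;

    pthread_rwlock_wrlock(&lwlsn_lock);
    result = lwlsn_cache[blkno];
    if (result == 0)
    {
        set_lwlsn_locked(blkno, fallback_lsn);
        result = fallback_lsn;
    }
    pthread_rwlock_unlock(&lwlsn_lock);
    return result;
}

int
main(void)
{
    set_lwlsn(7, 100);
    printf("blk 7: %llu\n", (unsigned long long) get_lwlsn(7, 50));  /* 100 */
    printf("blk 8: %llu\n", (unsigned long long) get_lwlsn(8, 42));  /* 42, now cached */
    return 0;
}
```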

@knizhnik (Author)

Let's copy that explanation to a comment in GetLastWrittenLSNv().

Done

Is it correct to release the lock and then call SetLastWrittenLSNForBlock()? ISTM it was considered to be just a performance issue, but isn't it also a potential correctness issue if the last-written LSN of a page is moved backwards?

Values cannot be moved backward, because there is a check in SetLastWrittenLSN, performed in the critical section, which prevents it. But see 3).
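
For reference, a tiny standalone sketch of the kind of check meant here (hypothetical names; the real check lives inside SetLastWrittenLSN's critical section): the stored value can only advance, so even if a stale reader calls the setter after a newer write has been recorded, the cached LSN cannot go backwards.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t XLogRecPtr;

/* Hypothetical stand-in for the guard inside the critical section. */
static void
advance_lwlsn(XLogRecPtr *stored, XLogRecPtr lsn)
{
    if (lsn > *stored)          /* only ever move the value forward */
        *stored = lsn;
}

int
main(void)
{
    XLogRecPtr cached = 0;

    advance_lwlsn(&cached, 200);    /* a concurrent updater stored a newer LSN */
    advance_lwlsn(&cached, 100);    /* a stale reader then offers an older one */
    assert(cached == 200);          /* ...but the value never moves backwards  */
    return 0;
}
```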

I don't know if it matters from a performance point of view in practice, but it seems wasteful to release LastWrittenLsnLock, when SetLastWrittenLSNForBlock() will immediately re-acquire it.

Fixed. I did it only for pg17 because it is just an optimization.

@MMeent left a comment

I think this whole "optimization" is invalid, as it completely ignores LW_EXCLUSIVE / LW_SHARED conflicts.

With this, you're updating a shared hash table while it's locked, regardless of whether you're holding a read lock or a write lock. That's just plain incorrect.
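
For context, here is a minimal standalone model of the rule being pointed out (a pthread read/write lock stands in for an LWLock and LW_SHARED/LW_EXCLUSIVE; the array and names are hypothetical, not the Neon code): a lookup may run under the shared lock, but on a miss the code has to drop it and take the exclusive lock before modifying the shared structure, re-checking the entry after the upgrade.

```c
/*
 * Model of the lock-mode rule: never modify shared state while holding
 * only the shared (read) lock.  pthread_rwlock_t stands in for an LWLock;
 * the array stands in for the shared hash table.  Hypothetical names.
 */
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>

typedef uint64_t XLogRecPtr;
#define N_BLOCKS 1024

static XLogRecPtr cache[N_BLOCKS];  /* 0 means "not cached" */
static pthread_rwlock_t lock = PTHREAD_RWLOCK_INITIALIZER;

static XLogRecPtr
lookup_or_init(uint32_t blkno, XLogRecPtr fallback_lsn)
{
    XLogRecPtr result;

    /* Fast path: read-only lookup under the shared lock. */
    pthread_rwlock_rdlock(&lock);
    result = cache[blkno];
    pthread_rwlock_unlock(&lock);
    if (result != 0)
        return result;

    /*
     * Miss: initializing the entry modifies shared state, so re-acquire
     * the lock in exclusive mode.  Re-check the entry afterwards, since
     * another backend may have filled it while the lock was released.
     */
    pthread_rwlock_wrlock(&lock);
    result = cache[blkno];
    if (result == 0)
    {
        cache[blkno] = fallback_lsn;
        result = fallback_lsn;
    }
    pthread_rwlock_unlock(&lock);
    return result;
}

int
main(void)
{
    printf("blk 3: %llu\n", (unsigned long long) lookup_or_init(3, 123));  /* 123 */
    printf("blk 3: %llu\n", (unsigned long long) lookup_or_init(3, 999));  /* still 123 */
    return 0;
}
```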

@knizhnik (Author)

I think this whole "optimization" is invalid, as it completely ignores LW_EXCLUSIVE / LW_SHARED conflicts.

With this, you're updating a shared hash table while it's locked, regardless of whether you're holding a read lock or a write lock. That's just plain incorrect.

Sorry, stupid error:(
Fixed.

@alexanderlaw

By the way, isn't the comment:

But GetLastWrittenLSN(InvalidOid) is used only by zenith_dbsize which is not performance critical.

above GetLastWrittenLSN outdated?
(I can't find zenith_dbsize in the current code.)

@MMeent left a comment

One bug, and some suggestions to address a lack of comments in the new code.

Four review threads on src/backend/access/transam/xlog.c (outdated, resolved).
@knizhnik (Author)

By the way, isn't the comment:

But GetLastWrittenLSN(InvalidOid) is used only by zenith_dbsize which is not performance critical.

above GetLastWrittenLSN outdated? (I can't find zenith_dbsize in the current code.)

It was renamed to neon_dbsize.
Will update comment.

@MMeent

MMeent commented Jan 13, 2025

But GetLastWrittenLSN(InvalidOid) is used only by zenith_dbsize which is not performance critical.

Yeah, I think that's been renamed to neon_dbsize.

@knizhnik requested a review from MMeent, January 13, 2025 18:28
@MMeent left a comment

Some nitpicks, but apart from that LGTM.

Two review threads on src/backend/access/transam/xlog.c (resolved).
github-merge-queue bot pushed a commit to neondatabase/neon that referenced this pull request Jan 14, 2025
## Problem

See #10281

pg17 performs an extra lock/unlock operation when fetching the LwLSN.

## Summary of changes

Perform all lookups under one lock, moving initialization of not-found keys to a separate loop.

Related Postgres PR:
neondatabase/postgres#553

---------

Co-authored-by: Konstantin Knizhnik <[email protected]>
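
The "Summary of changes" in the commit message above amounts to a two-pass pattern, sketched below as standalone C under the same stand-ins as the earlier sketches (a pthread rwlock in place of LastWrittenLsnLock, a fixed array in place of the hash table, hypothetical names throughout): pass one does all lookups under a single shared-lock acquisition and records which blocks were missing; pass two, only when there were misses, takes the exclusive lock once to initialize the not-found keys.

```c
/*
 * Standalone sketch of the vectorized lookup: one shared-lock pass over
 * all requested blocks, then (only on misses) one exclusive-lock pass to
 * initialize the not-found keys.  Stand-ins and names are hypothetical.
 */
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t XLogRecPtr;
#define N_BLOCKS 1024

static XLogRecPtr cache[N_BLOCKS];  /* 0 means "not cached" */
static pthread_rwlock_t lock = PTHREAD_RWLOCK_INITIALIZER;

/* Returns the maximum last-written LSN over blocks [first, first + n). */
static XLogRecPtr
get_lwlsn_vector(uint32_t first, int n, XLogRecPtr fallback_lsn,
                 XLogRecPtr *lsns /* out array of length n */)
{
    XLogRecPtr max_lsn = 0;
    bool       have_miss = false;

    /* Pass 1: all lookups under a single shared-lock acquisition. */
    pthread_rwlock_rdlock(&lock);
    for (int i = 0; i < n; i++)
    {
        lsns[i] = cache[first + i];
        if (lsns[i] == 0)
            have_miss = true;
    }
    pthread_rwlock_unlock(&lock);

    /* Pass 2: initialize the not-found keys, once, under the exclusive lock. */
    if (have_miss)
    {
        pthread_rwlock_wrlock(&lock);
        for (int i = 0; i < n; i++)
        {
            if (lsns[i] != 0)
                continue;
            if (cache[first + i] == 0)      /* re-check after the lock upgrade */
                cache[first + i] = fallback_lsn;
            lsns[i] = cache[first + i];
        }
        pthread_rwlock_unlock(&lock);
    }

    for (int i = 0; i < n; i++)
        if (lsns[i] > max_lsn)
            max_lsn = lsns[i];
    return max_lsn;
}

int
main(void)
{
    XLogRecPtr out[4];
    XLogRecPtr max = get_lwlsn_vector(10, 4, 777, out);

    return max == 777 ? 0 : 1;
}
```
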
@knizhnik merged commit 0f8da73 into REL_17_STABLE_neon on Jan 14, 2025
1 check passed
@knizhnik deleted the lwlsn_optimization branch on January 14, 2025 at 05:55
4 participants