forked from git/git
-
Notifications
You must be signed in to change notification settings - Fork 142
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bundle: avoid closing file descriptor twice #1857
Open
dscho
wants to merge
1
commit into
gitgitgadget:maint-2.48
Choose a base branch
from
dscho:fix-fetch-bundle-double-close
base: maint-2.48
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Already when introduced in c7a8a16 (Add bundle transport, 2007-09-10), the `bundle` transport had a bug where it would open a file descriptor to the bundle file and then close it _twice_: First, the file descriptor (`data->fd`) is passed to `unbundle()`, which would use it as the `stdin` of the `index-pack` process, which as a consequence would close it via `start_command()`. However, `data->fd` would still hold the numerical value of the file descriptor, and `close_bundle()` would see that and happily close it again. This seems not to have caused too many problems in almost two decades, but I encountered a situation today where it _does_ cause problems: In i686 variants of Git for Windows, it seems that file descriptors are reused quickly after they have been closed. In the particular scenario I faced, `git fetch <bundle> <ref>` gets the same file descriptor value when opening the bundle file and importing its embedded packfile (which implicitly closes the file descriptor) and then when opening a pack file in `fetch_and_consume_refs()` while looking up an object's header. Later on, after the bundle has been imported (and the `close_bundle()` function erroneously closes the file descriptor that has _already_ been closed when using it as `stdin` for `git index-pack`), the same file descriptor value has now been reused via `use_pack()`. Now, when either the recursive fetch (which defaults to "on", unfortunately) or a commit-graph update needs to `mmap()` the packfile, it fails due to a now-invalid file descriptor that _should_ point to the pack file but doesn't anymore. To fix that, let's invalidate `data->fd` after calling `unbundle()`. That way, `close_bundle()` does not close a file descriptor that may have been reused for something different. While at it, document that `unbundle()` closes the file descriptor, and ensure that it also does that when failing to verify the bundle. Luckily, this bug does not affect the bundle URI feature, it only affects the `git fetch <bundle>` code path. Note that this patch does not _completely_ clarifies who is responsible to close that file descriptor, as `run_command()` may fail _without_ closing `cmd->in`. Addressing this issue thoroughly, however, would require a rather thorough re-design of the `start_command()` and `finish_command()` functionality to make it a lot less murky who is responsible for what file descriptors. At least this here patch is relatively easy to reason about, and addresses a hard failure (`fatal: mmap: could not determine filesize`) at the expense of leaking a file descriptor under very rare circumstances in which `git fetch` would error out anyway. Signed-off-by: Johannes Schindelin <[email protected]>
dscho
added a commit
to git-for-windows/git-for-windows-automation
that referenced
this pull request
Jan 25, 2025
As part of the snapshot builds via the `git-artifacts` workflow, we try to import the tag from a bundle. This uncovered a bug that is over 17 years old, and which I am trying to fix via gitgitgadget/git#1857. The symptom is that just after fetching the refs, when looking through the fetched revisions whether a recursive fetch is needed, Git fails to `mmap()` the newly-imported packfile. The symptom looks like this: From [...]/git.bundle * tag <tag> -> FETCH_HEAD * [new tag] <tag> -> <tag> fatal: mmap: could not determine filesize Curiously, this only happens on 32-bit Windows, not on 64-bit Windows, nor on Linux or macOS. The explanation is most likely to be found in how quickly file descriptor values are used again after closing them, which would appear to be _really_ quickly on i686 Windows. Be that as it may, to work around this issue, we simply avoid any operation that would need to access the just-imported packfile in `git fetch <bundle>`. Signed-off-by: Johannes Schindelin <[email protected]>
/submit |
Submitted as [email protected] To fetch this version into
To fetch this version to local tag
|
On the Git mailing list, Junio C Hamano wrote (reply to this): "Johannes Schindelin via GitGitGadget" <[email protected]>
writes:
> This seems not to have caused too many problems in almost two decades,
> but I encountered a situation today where it _does_ cause problems: In
> i686 variants of Git for Windows, it seems that file descriptors are
> reused quickly after they have been closed.
Nice finding. It pays not to be in too uniform a monoculture.
> This is a really old bug. Deviating from Git's practice, I did not apply
> this on top of the bugged commit. In fact, I did not even apply this on
> top of an older maint-* branch because it would cause conflicts even
> cherry-picking it on top of maint-2.47.
Just for fun, attached is my backport to maint-2.44 track. It
merges, with conflict which is rather trivial to resolve, back to
v2.48.1, on which your original patch is based, and gives the
merge result identical to the result of direct application of your
patch to v2.48.1.
I am not sure what to feel about the FD leak mentioned in the log
message, though.
Thanks, will queue.
--- >8 ---
From: Johannes Schindelin <[email protected]>
Date: Sat, 25 Jan 2025 23:57:36 +0000
Subject: [PATCH] bundle: avoid closing file descriptor twice
Already when introduced in c7a8a16239 (Add bundle transport,
2007-09-10), the `bundle` transport had a bug where it would open a file
descriptor to the bundle file and then close it _twice_: First, ...
... At least this here patch is relatively easy to reason about, and
addresses a hard failure (`fatal: mmap: could not determine filesize`)
at the expense of leaking a file descriptor under very rare
circumstances in which `git fetch` would error out anyway.
Signed-off-by: Johannes Schindelin <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
---
bundle.c | 4 +++-
bundle.h | 2 ++
transport.c | 1 +
3 files changed, 6 insertions(+), 1 deletion(-)
Obviously this is hard to test ;-)
diff --git a/bundle.c b/bundle.c
index a9744da255..7b3f2da40f 100644
--- a/bundle.c
+++ b/bundle.c
@@ -616,8 +616,10 @@ int unbundle(struct repository *r, struct bundle_header *header,
{
struct child_process ip = CHILD_PROCESS_INIT;
- if (verify_bundle(r, header, flags))
+ if (verify_bundle(r, header, flags)) {
+ close(bundle_fd);
return -1;
+ }
strvec_pushl(&ip.args, "index-pack", "--fix-thin", "--stdin", NULL);
diff --git a/bundle.h b/bundle.h
index 021adbdcbb..8b187be7cf 100644
--- a/bundle.h
+++ b/bundle.h
@@ -50,6 +50,8 @@ int verify_bundle(struct repository *r, struct bundle_header *header,
*
* Before unbundling, this method will call verify_bundle() with the
* given 'flags'.
+ *
+ * Note that the `bundle_fd` will be closed as part of the operation.
*/
int unbundle(struct repository *r, struct bundle_header *header,
int bundle_fd, struct strvec *extra_index_pack_args,
diff --git a/transport.c b/transport.c
index df518ead70..25e2233da0 100644
--- a/transport.c
+++ b/transport.c
@@ -184,6 +184,7 @@ static int fetch_refs_from_bundle(struct transport *transport,
get_refs_from_bundle_inner(transport);
ret = unbundle(the_repository, &data->header, data->fd,
&extra_index_pack_args, 0);
+ data->fd = -1; /* `unbundle()` closes the file descriptor */
transport->hash_algo = data->header.hash_algo;
return ret;
}
--
2.48.1-293-gbb90fdfe3c
|
This patch series was integrated into seen via git@d69419e. |
dscho
added a commit
to git-for-windows/git-for-windows-automation
that referenced
this pull request
Jan 26, 2025
As part of the snapshot builds via the `git-artifacts` workflow, we try to import the tag from a bundle. This uncovered a bug that is over 17 years old, and which I am trying to fix via gitgitgadget/git#1857. The symptom is that just after fetching the refs, when looking through the fetched revisions whether a recursive fetch is needed, Git fails to `mmap()` the newly-imported packfile. The symptom looks like this: From [...]/git.bundle * tag <tag> -> FETCH_HEAD * [new tag] <tag> -> <tag> fatal: mmap: could not determine filesize Curiously, this only happens on 32-bit Windows, not on 64-bit Windows, nor on Linux or macOS. The explanation is most likely to be found in how quickly file descriptor values are used again after closing them, which would appear to be _really_ quickly on i686 Windows. Be that as it may, to work around this issue, we simply avoid any operation that would need to access the just-imported packfile in `git fetch <bundle>`. Signed-off-by: Johannes Schindelin <[email protected]>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This is a really old bug. Deviating from Git's practice, I did not apply this on top of the bugged commit. In fact, I did not even apply this on top of an older
maint-*
branch because it would cause conflicts even cherry-picking it on top ofmaint-2.47
.