fix: Try improving relay_datagram_send_channel() #3118

Merged
merged 2 commits into from
Jan 10, 2025

matheus23
Contributor

@matheus23 matheus23 commented Jan 10, 2025

Description

Closes #3067

Since multiple Connections can have multiple AsyncUdpSocket IpPollers, it's incorrect to assume that a single AtomicWaker can wake up these tasks correctly.
With only one AtomicWaker shared by all of them, each registration overwrites the previous one, so only the last registered waker will be woken when there's capacity again.
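
To illustrate the lost wakeup, here is a minimal sketch using only the standard library. `SingleSlot` is a hypothetical stand-in for an AtomicWaker (it is not the type used in iroh), and `CountingWaker` and `demo_lost_wakeup` are helpers invented for this example.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};
use std::task::{Wake, Waker};

/// Hypothetical single-slot waker cell, standing in for an AtomicWaker:
/// registering a new waker replaces the previously stored one.
#[derive(Default)]
struct SingleSlot {
    slot: Mutex<Option<Waker>>,
}

impl SingleSlot {
    fn register(&self, waker: &Waker) {
        *self.slot.lock().unwrap() = Some(waker.clone());
    }

    fn wake(&self) {
        // Take the waker out first so the lock is not held while waking.
        let waker = self.slot.lock().unwrap().take();
        if let Some(waker) = waker {
            waker.wake();
        }
    }
}

/// Example helper: counts how often the "task" was woken.
#[derive(Default)]
struct CountingWaker(AtomicUsize);

impl Wake for CountingWaker {
    fn wake(self: Arc<Self>) {
        self.0.fetch_add(1, Ordering::SeqCst);
    }
}

/// Two tasks register with the same slot, then capacity becomes available.
/// Returns (wakeups seen by task A, wakeups seen by task B).
fn demo_lost_wakeup() -> (usize, usize) {
    let a = Arc::new(CountingWaker::default());
    let b = Arc::new(CountingWaker::default());
    let slot = SingleSlot::default();

    slot.register(&Waker::from(a.clone()));
    slot.register(&Waker::from(b.clone())); // overwrites task A's waker
    slot.wake();

    (a.0.load(Ordering::SeqCst), b.0.load(Ordering::SeqCst))
}
```

Task A registered first but is never woken, which is the situation this PR is meant to avoid.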

I'm changing this such that poll_writable will actually add the current task's waker to a list, if it hasn't been added yet.
Successfully receiving an item will then wake all registered tasks.
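
A rough sketch of the waker-list idea, assuming a mutex-protected `Vec<Waker>`; the actual data structure in the PR may differ. `WakerList`, `CountingWaker` and `demo_wake_all` are hypothetical names for this example. Deduplication relies on `Waker::will_wake`, which the standard library documents as best effort.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};
use std::task::{Wake, Waker};

/// Hypothetical list of wakers, one entry per waiting task.
#[derive(Default)]
struct WakerList {
    wakers: Mutex<Vec<Waker>>,
}

impl WakerList {
    /// Adds `waker` unless an equivalent waker is already in the list.
    fn register(&self, waker: &Waker) {
        let mut wakers = self.wakers.lock().unwrap();
        if !wakers.iter().any(|w| w.will_wake(waker)) {
            wakers.push(waker.clone());
        }
    }

    /// Wakes every registered task and empties the list.
    fn wake_all(&self) {
        // Swap the list out first so the lock is not held while waking.
        let wakers = std::mem::take(&mut *self.wakers.lock().unwrap());
        for waker in wakers {
            waker.wake();
        }
    }
}

/// Example helper: counts how often the "task" was woken.
#[derive(Default)]
struct CountingWaker(AtomicUsize);

impl Wake for CountingWaker {
    fn wake(self: Arc<Self>) {
        self.0.fetch_add(1, Ordering::SeqCst);
    }
}

/// Two tasks register, then an item is received and everyone is woken.
/// Returns (wakeups seen by task A, wakeups seen by task B).
fn demo_wake_all() -> (usize, usize) {
    let a = Arc::new(CountingWaker::default());
    let b = Arc::new(CountingWaker::default());
    let list = WakerList::default();

    list.register(&Waker::from(a.clone()));
    list.register(&Waker::from(b.clone()));
    list.wake_all(); // stands in for "an item was successfully received"
    list.wake_all(); // list is empty now, so nothing is woken twice

    (a.0.load(Ordering::SeqCst), b.0.load(Ordering::SeqCst))
}
```

Waking everyone can cause some tasks to find no capacity and go back to pending, but no task is left waiting forever.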

A second issue was a race between the call to self.sender.capacity() (returning 0) and self.waker.register: in between, another thread might have successfully received an item and called self.waker.wake().
This means that the first thread could've registered a waker while the capacity is actually non-zero. Not terrible (it's very likely that it'll be woken up when the capacity jumps to 2 this time), but also not great.
The fix is to re-check the capacity after registering the waker.
The downside is that we keep the waker registered and thus potentially get a spurious wakeup, but that's fine.
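
The check, register, re-check pattern can be sketched as follows. This is a simplified model, not the code from the PR: `poll_writable` here is a free function with an assumed signature, the capacity is supplied by a closure so the race can be simulated deterministically, and `NoopWake` and `demo_recheck` are hypothetical helpers.

```rust
use std::cell::Cell;
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};

/// Example helper: a waker that does nothing when woken.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Simplified model of the fixed logic (assumed signature).
fn poll_writable(
    capacity: &dyn Fn() -> usize,
    wakers: &Mutex<Vec<Waker>>,
    cx: &mut Context<'_>,
) -> Poll<()> {
    if capacity() > 0 {
        return Poll::Ready(());
    }
    // No capacity: register so the receiving side can wake us later.
    wakers.lock().unwrap().push(cx.waker().clone());
    // Re-check: the receiver may have freed a slot (and woken the old
    // waker set) between the first check and the registration above.
    if capacity() > 0 {
        // The waker stays registered, so a spurious wakeup may follow.
        return Poll::Ready(());
    }
    Poll::Pending
}

/// Simulates the race: capacity is 0 on the first check and 1 afterwards.
/// Returns (whether the poll was ready, number of wakers left registered).
fn demo_recheck() -> (bool, usize) {
    let calls = Cell::new(0usize);
    let capacity = || {
        let n = calls.get();
        calls.set(n + 1);
        if n == 0 { 0 } else { 1 }
    };
    let wakers = Mutex::new(Vec::new());
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);

    let ready = poll_writable(&capacity, &wakers, &mut cx).is_ready();
    let registered = wakers.lock().unwrap().len();
    (ready, registered)
}
```

Without the second check, the simulated poll would return `Poll::Pending` even though a slot is free.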

Notes & open questions

This... "fixes" the concerns, but I'd actually like to write a loom test for the concerns eventually.

It's just... a lot of work, and I'd rather prioritize other things, but also improve this part of the code at the same time.

Change checklist

  • Self-review.
  • Documentation updates following the style guide, if relevant.
  • Tests if relevant.
  • All breaking changes documented.

@matheus23 matheus23 self-assigned this Jan 10, 2025

github-actions bot commented Jan 10, 2025

Documentation for this PR has been generated and is available at: https://n0-computer.github.io/iroh/pr/3118/docs/iroh/

Last updated: 2025-01-10T13:41:04Z

github-actions bot commented Jan 10, 2025

Netsim report & logs for this PR have been generated and are available at: LOGS
This report will remain available for 3 days.

Last updated for commit: 7db1d47

@matheus23
Contributor Author

matheus23 commented Jan 10, 2025

Netsim results:

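| test | case | throughput_gbps | throughput_transfer |
| --- | --- | --- | --- |
| iroh | 1_to_1 | 1.41 | 1.41 |
| iroh | 1_to_3 | 4.26 | 4.27 |
| iroh | 1_to_5 | 7.20 | 7.21 |
| iroh | 1_to_10 | 12.25 | 12.27 |
| iroh | 2_to_2 | 3.03 | 3.03 |
| iroh | 2_to_4 | 5.57 | 5.57 |
| iroh | 2_to_6 | 9.03 | 9.05 |
| iroh | 2_to_10 | 13.69 | 13.74 |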

versus main:

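| test | case | throughput_gbps | throughput_transfer |
| --- | --- | --- | --- |
| iroh | 1_to_1 | 1.38 | 1.38 |
| iroh | 1_to_3 | 4.27 | 4.28 |
| iroh | 1_to_5 | 6.99 | 7.00 |
| iroh | 1_to_10 | 12.01 | 12.03 |
| iroh | 2_to_2 | 3.06 | 3.07 |
| iroh | 2_to_4 | 5.86 | 5.87 |
| iroh | 2_to_6 | 9.20 | 9.22 |
| iroh | 2_to_10 | 13.62 | 13.65 |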

No noticeable difference.

No differences in relay-only performance (cargo run --release -p iroh-bench --all-features -- iroh --only-relay).

Contributor

@flub flub left a comment

nice!

@matheus23 matheus23 added this pull request to the merge queue Jan 10, 2025
@matheus23 matheus23 added this to the v0.31.0 milestone Jan 10, 2025
Merged via the queue into main with commit 594b861 Jan 10, 2025
26 checks passed
@matheus23 matheus23 deleted the matheus23/relay-datagram-channels branch January 10, 2025 15:02
Development

Successfully merging this pull request may close these issues.

Relay send path can overwrite wakers and might report Poll::Pending more than necessary
2 participants