Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid cloning live declarations #15419

Closed
wants to merge 1 commit into from

Conversation

MichaReiser
Copy link
Member

@MichaReiser MichaReiser commented Jan 11, 2025

Summary

I don't know what I'm doing but all tests still pass ;)

The declarations get merged below so maybe this is no longer needed?

Test Plan

cargo test

@MichaReiser MichaReiser added the red-knot Multi-file analysis & type inference label Jan 11, 2025
@@ -310,7 +310,7 @@ impl SymbolState {
visibility_constraints: VisibilityConstraintPerBinding::default(),
},
declarations: SymbolDeclarations {
live_declarations: self.declarations.live_declarations.clone(),
live_declarations: Declarations::default(),
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we initialize all vecs here with their existing capacity or even use the max capacity between a and b? It seems to me that we always add at least all existing entries.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that live_declarations is a BitSet, not a vector. Or do you think it would make sense to add something like a capacity parameter for large BitSets that have overflowed onto the heap?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't the bitset dynamically sized? I'd suggest to have a BitSet::with_capacity so that it can allocate the right inner vector. We could consider doing the same for all other fields. E.g. visibility_constraint's length is at least as large as the max of a's and b's visibility constraints (if I understand the logic correctly)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't the bitset dynamically sized?

It is, I was mostly just responding to the "initialize all vecs here".

For small sizes, the BitSet is stack-allocated with a fixed capacity, so as long as we're below that threshold, reserving wouldn't help. But in general, adding reserve/with_capacity calls here sounds like a reasonable thing to try!

Copy link
Contributor

ruff-ecosystem results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

@MichaReiser MichaReiser marked this pull request as ready for review January 11, 2025 10:30
@AlexWaygood
Copy link
Member

I'm not the expert in this area, but if this is correct, you could further simplify it:

--- a/crates/red_knot_python_semantic/src/semantic_index/use_def/symbol_state.rs
+++ b/crates/red_knot_python_semantic/src/semantic_index/use_def/symbol_state.rs
@@ -127,7 +127,7 @@ type VisibilityConstraintsIntoIterator = smallvec::IntoIter<InlineVisibilityCons
 
 /// Live declarations for a single symbol at some point in control flow, with their
 /// corresponding visibility constraints.
-#[derive(Clone, Debug, PartialEq, Eq)]
+#[derive(Clone, Debug, PartialEq, Eq, Default)]
 pub(super) struct SymbolDeclarations {
     /// [`BitSet`]: which declarations (as [`ScopedDefinitionId`]) can reach the current location?
     pub(crate) live_declarations: Declarations,
@@ -177,7 +177,7 @@ impl SymbolDeclarations {
 
 /// Live bindings for a single symbol at some point in control flow. Each live binding comes
 /// with a set of narrowing constraints and a visibility constraint.
-#[derive(Clone, Debug, PartialEq, Eq)]
+#[derive(Clone, Debug, PartialEq, Eq, Default)]
 pub(super) struct SymbolBindings {
     /// [`BitSet`]: which bindings (as [`ScopedDefinitionId`]) can reach the current location?
     live_bindings: Bindings,
@@ -244,7 +244,7 @@ impl SymbolBindings {
     }
 }
 
-#[derive(Clone, Debug, PartialEq, Eq)]
+#[derive(Clone, Debug, PartialEq, Eq, Default)]
 pub(super) struct SymbolState {
     declarations: SymbolDeclarations,
     bindings: SymbolBindings,
@@ -303,18 +303,7 @@ impl SymbolState {
         b: SymbolState,
         visibility_constraints: &mut VisibilityConstraints,
     ) {
-        let mut a = Self {
-            bindings: SymbolBindings {
-                live_bindings: Bindings::default(),
-                constraints: ConstraintsPerBinding::default(),
-                visibility_constraints: VisibilityConstraintPerBinding::default(),
-            },
-            declarations: SymbolDeclarations {
-                live_declarations: Declarations::default(),
-                visibility_constraints: VisibilityConstraintPerDeclaration::default(),
-            },
-        };
-
+        let mut a = Self::default();
         std::mem::swap(&mut a, self);

It looks like it's a 1% speedup on the cold benchmark: https://codspeed.io/astral-sh/ruff/branches/micha%2Fdont-clone-live-declarations

@sharkdp
Copy link
Contributor

sharkdp commented Jan 13, 2025

The declarations get merged below so maybe this is no longer needed?

Yes, thank you. This is an oversight by me from #15019. There is no semantic difference, we just inserted declarations twice into the BitSet (where the second insertion was a no-op).

Your solution here might be the best way, but for comparison, I opened #15451 which changes the code back to the state before statically-known branches. It uses BitSet::union instead of inserting elements one by one. The solution here might still be faster, because with the statically-known branches work, we now have to iterate over live declarations (to construct merged visibility constraints). So it might make sense to compute the union "as we go".

@MichaReiser
Copy link
Member Author

Thanks @sharkdp. I like your solution better because it simplifies the merge flow (and already goes in the direction of incorporating the expected size with_capacity for all structs). I'll close this in favor of #15451

sharkdp added a commit that referenced this pull request Jan 13, 2025
## Summary

In `SymbolState` merging, use `BitSet::union` instead of inserting
declarations one by one. This used to be the case but was changed in
#15019 because we had to iterate
over declarations anyway.

This is an alternative to #15419
by @MichaReiser. It's similar in performance, but a bit more
declarative and less imperative.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
red-knot Multi-file analysis & type inference
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants