-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
C++: Fix join order problem in TaintedAllocationSize
#18564
Conversation
Before this did not really terminate on `silentearth/curl2`. After the barrier looks like: ``` [2025-01-22 21:22:55] Evaluated non-recursive predicate TaintedAllocationSize::TaintedAllocationSizeConfig::isBarrier/1#6f365b45@37d8bfho in 5168ms (size: 240345). Evaluated relational algebra for predicate TaintedAllocationSize::TaintedAllocationSizeConfig::isBarrier/1#6f365b45@37d8bfho with tuple counts: 43 ~0% {1} r1 = JOIN Allocation::HeuristicAllocationFunction#class#57f08eba WITH `Function::Function.getAParameter/0#dispred#511fd682` ON FIRST 1 OUTPUT Rhs.1 43 ~0% {1} | JOIN WITH `DataFlowUtil::Node.asParameter/0#dispred#81f7eba7_10#join_rhs` ON FIRST 1 OUTPUT Rhs.1 228072 ~0% {1} r2 = JOIN `TaintedAllocationSize::readsVariable/2#e074f316_10#join_rhs` WITH `TaintedAllocationSize::hasUpperBoundsCheck/1#9dd5da82` ON FIRST 1 OUTPUT Lhs.1 228209 ~0% {1} | JOIN WITH `Operand::Operand.getDef/0#dispred#a70e8079_10#join_rhs` ON FIRST 1 OUTPUT Rhs.1 228125 ~0% {1} | JOIN WITH DataFlowUtil::OperandNode#3e3b23f6_20#join_rhs ON FIRST 1 OUTPUT Rhs.1 1 ~0% {1} r3 = CONSTANT(unique int)[38] 103 ~0% {1} | JOIN WITH exprs_10#join_rhs ON FIRST 1 OUTPUT Rhs.1 206 ~0% {1} | JOIN WITH `Expr::Operation.getAnOperand/0#dispred#4dc2ee04#bf` ON FIRST 1 OUTPUT Rhs.1 8944 ~2% {1} r4 = `Bounded::bounded/1#e7aa9c09` UNION r3 4307 ~0% {1} | JOIN WITH `project#ExprNodes::ExprNode.getExpr/1#dispred#81a030dd_10#join_rhs` ON FIRST 1 OUTPUT Rhs.1 70451786 ~8% {2} r5 = JOIN `TaintedAllocationSize::readsVariable/2#e074f316_10#join_rhs` WITH `TaintedAllocationSize::readsVariable/2#e074f316_10#join_rhs` ON FIRST 1 OUTPUT Lhs.1, Rhs.1 70451768 ~1% {3} | JOIN WITH `Instruction::Instruction.getBlock/0#dispred#58a40e80` ON FIRST 1 OUTPUT Lhs.1, Lhs.0, Rhs.1 75573899 ~2% {3} | JOIN WITH `Operand::Operand.getDef/0#dispred#a70e8079_10#join_rhs` ON FIRST 1 OUTPUT Lhs.1, Lhs.2, Rhs.1 90437235 ~0% {3} | JOIN WITH `Operand::Operand.getDef/0#dispred#a70e8079_10#join_rhs` ON FIRST 1 OUTPUT Rhs.1, Lhs.1, Lhs.2 75675394 ~2% {3} | JOIN WITH DataFlowUtil::OperandNode#3e3b23f6_20#join_rhs ON FIRST 1 OUTPUT Lhs.2, Lhs.1, Rhs.1 5218044 ~0% {4} | JOIN WITH `project#IRGuards::Cached::compares_eq/6#511a0d6d_102#join_rhs` ON FIRST 1 OUTPUT Lhs.1, Rhs.2, Lhs.2, Rhs.1 51350331 ~2% {3} | JOIN WITH `IRGuards::IRGuardCondition.valueControls/2#eb6b9b19_120#join_rhs` ON FIRST 2 OUTPUT Rhs.2, Lhs.3, Lhs.2 25351 ~227% {1} | JOIN WITH `ValueNumberingInternal::tvalueNumber/1#f03b58f9` ON FIRST 2 OUTPUT Lhs.2 257826 ~6% {1} r6 = r1 UNION r2 UNION r4 UNION r5 return r6 ```
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
Tip: If you use Visual Studio Code, you can request a review from Copilot before you push from the "Source Control" tab. Learn more
The RA you posted as the "after" looks pretty abysmal - it's joining |
Thanks for your input. And, yes, this is definitely not the right approach (see also the slow downs in DCA). Let me have another think about this first. Note that I've also been discussing this with @MathiasVP . |
I tried to play the "where's fanout" game, and found the following: The entire thing is complicated by the |
Yeah, I'm aware of this problem 😭 This happened after we started using value numbering in the guards library (here). This was mitigated by inline'ing the predicates that perform The problem is that, if we have code like: if(x > 0) {
// ...
}
if(x > 0) {
// ...
}
...
if(x > 0) {
// ...
} many many times (which we have seen happen often in autogenerated code), then In retrospect, it probably means value numbering is not the right tool to use in a guards library. And once someone (cough, cough) creates a shared guards library C++ will jump on that bandwagon quite fast since we really need the smartness we got out of the value numbering-solution, but we would like to get that smartness without relying on value numbering (because of the problem above). |
It's slightly less abysmal when I drop the
|
Agree.
Right, yes, hrm, I guess I should put that closer to the top of my priorities. |
I will say, though, that simply getting a shared version of the Java guards library won't solve all of the problems since the most important thing C++ needs from a guards library is the ability to reason about unary and binary equalities and inequalities and, as far as I know, the Java guards library gets that from range analysis. And C++ is still not fully onboarded to using the shared range analysis (we're still not using it in the Code Scanning suite). So even if you magically conjure up a PR for a shared guards library (which would be amazing!) C++ still has some work to do before we can actually make use of such a library |
While we all agree that there's future work to be done on the guards library, I think this PR mitigates the problems we're seeing on that project on the current |
It's causing massive slowdowns on Kamalio and neovim, so that's not an option. |
Oh, ouch. Yeah, I'm seeing that now |
Alright, I've got a fixed join order for you now. PR incoming. (Fingers crossed that it's good across all repos.) |
Before this did not really terminate on
silentearth/curl2
. After the barrier looks like:Pull Request checklist
All query authors
.qhelp
. See the documentation in this repository.Internal query authors only
.ql
,.qll
, or.qhelp
files. See the documentation (internal access required).