AMDGPU is missing simplify demanded bits optimizations of readfirstlane and similar operations #128390
Comments
@llvm/issue-subscribers-backend-amdgpu Author: Matt Arsenault (arsenm)
We are missing known bits and demanded bits optimizations which look through readfirstlane, readlane, and DPP operators. We need to insert extensions to produce a legal type, but these imply inserting instructions to set the high bits of the input value appropriately. Those high bits are never needed, so starting from the use context of the trunc, we should be able to delete the instructions that set them.
This should be accomplished by implementing SimplifyDemandedBitsForTargetNode in SITargetLowering. It should handle INTRINSIC_WO_CHAIN operations, with Intrinsic::amdgcn_readfirstlane as the base example. Examples of the missed optimization appear in the tests from #128388.
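To illustrate why the high bits are dead, here is a minimal, self-contained toy model in C++. It is not LLVM code and does not use the real `SimplifyDemandedBitsForTargetNode` hook; `Wave`, `kWaveSize`, `readFirstLane`, `demandedSourceBits`, and `agreeOnDemandedBits` are hypothetical names made up for this sketch. It assumes readfirstlane simply broadcasts the value held by lane 0, which means each result bit depends only on the same bit position of the source.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Toy model only: a "wave" of 4 lanes, each holding a 32-bit value.
// The lane count is an arbitrary assumption chosen to keep the example small.
constexpr std::size_t kWaveSize = 4;
using Wave = std::array<std::uint32_t, kWaveSize>;

// Model of readfirstlane: broadcast the first lane's value to every lane.
// Assumes lane 0 is the first active lane.
Wave readFirstLane(const Wave &src) {
  Wave out{};
  out.fill(src[0]);
  return out;
}

// Demanded-bits transfer function for the model. The operation moves values
// between lanes but never between bit positions, so the bits demanded of the
// source are exactly the bits demanded of the result.
std::uint32_t demandedSourceBits(std::uint32_t demandedResultBits) {
  return demandedResultBits;
}

// True if readFirstLane(a) and readFirstLane(b) agree on every demanded bit
// in every lane.
bool agreeOnDemandedBits(const Wave &a, const Wave &b, std::uint32_t demanded) {
  const Wave ra = readFirstLane(a);
  const Wave rb = readFirstLane(b);
  for (std::size_t lane = 0; lane < kWaveSize; ++lane) {
    if ((ra[lane] & demanded) != (rb[lane] & demanded))
      return false;
  }
  return true;
}

// Two inputs that differ only in the high 16 bits of lane 0 are
// indistinguishable to a user that truncates the result to 16 bits.
void selfCheck() {
  const Wave zeroExtended{0x00001234u, 1u, 2u, 3u};
  const Wave garbageHigh{0xFFFF1234u, 7u, 8u, 9u};
  assert(agreeOnDemandedBits(zeroExtended, garbageHigh, 0x0000FFFFu));
  assert(!agreeOnDemandedBits(zeroExtended, garbageHigh, 0xFFFFFFFFu));
}
```

In this model, a 16-bit truncation of the result demands only the low 16 bits, and `demandedSourceBits` passes that mask to the source unchanged, so whatever sets the high bits of the source cannot affect the result. A real implementation would have to express the same reasoning in terms of the SelectionDAG nodes named above.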
Hi! This issue may be a good introductory issue for people new to working on LLVM. If you would like to work on this issue, your first steps are:
If you have any further questions about this issue, don't hesitate to ask via a comment in the thread below.
I'd like to work on this. Could I get assigned?