Commit
Summary:
Fix for pytorch/pytorch#129104

The num_warps heuristic was already computing the optimal value, but the result was being capped at a maximum of 8 warps.

This gives a 1% speedup on HF and TIMM in inference and a 2% speedup in TIMM training; it is neutral otherwise.

Ultimately, I think we want live variable analysis for register usage, but this is still worth landing now.

X-link: pytorch/pytorch#132458
Approved by: https://github.com/Chillee, https://github.com/shunting314
Reviewed By: jovianjaison
Differential Revision: D61308271
Pulled By: eellison
fbshipit-source-id: 3ceafd3701ab712693abfdd1ebe40aed845d3e6f
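
To illustrate how a hard cap can override an otherwise sensible num_warps estimate, here is a minimal sketch. It is not the code changed by this commit: the function names `next_power_of_2` and `pick_num_warps`, the `elems_per_warp` divisor, and the cap values are all hypothetical and chosen only for illustration.

```python
def next_power_of_2(n: int) -> int:
    """Return the smallest power of two >= n, for n >= 1."""
    return 1 << (n - 1).bit_length()


def pick_num_warps(
    numel: int,
    elems_per_warp: int = 256,  # hypothetical divisor, not taken from the commit
    min_warps: int = 1,
    max_warps: int = 8,         # the cap the commit message says was too low
) -> int:
    """Estimate a warp count from the block size, then clamp and round it.

    Illustrative heuristic only: one warp per `elems_per_warp` elements,
    clamped to [min_warps, max_warps], rounded up to a power of two.
    """
    wanted = numel // elems_per_warp
    clamped = min(max(wanted, min_warps), max_warps)
    return next_power_of_2(clamped)


if __name__ == "__main__":
    # A 4096-element block "wants" 16 warps under this toy heuristic.
    print(pick_num_warps(4096, max_warps=8))   # capped result
    print(pick_num_warps(4096, max_warps=16))  # cap raised (illustrative value)
```

In this sketch the estimate itself does not change between the two calls; only the clamp does, which is the situation the summary describes.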