Add channel last path for max_unpooling3d #1420

Open
wants to merge 3 commits into
base: main

chunhuanMeng
Contributor

@chunhuanMeng chunhuanMeng commented Feb 28, 2025

Optimize MaxUnpooling3d for channels-last inputs by adding a dedicated kernel, so that MaxUnpooling3d no longer needs a memory-format conversion.
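To illustrate what a conversion-free channels-last path has to compute, here is a minimal CPU reference sketch. It is not the kernel from this PR. It assumes the channels-last memory order is N, D, H, W, C and that each index is a flat offset into the output's D*H*W volume for its (batch, channel) pair. The function name `max_unpool3d_channels_last_ref` and its parameters are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Reference (CPU) max-unpooling over a channels-last 5-D tensor.
// Assumed memory order: N, D, H, W, C (C is the fastest-moving dimension).
// Assumed index semantics: indices[i] is a flat offset into the output's
// D*H*W volume for the (batch, channel) pair that input[i] belongs to.
std::vector<float> max_unpool3d_channels_last_ref(
    const std::vector<float>& input,
    const std::vector<int64_t>& indices,
    int64_t batch,
    int64_t channels,
    int64_t in_spatial,     // iD * iH * iW
    int64_t out_spatial) {  // oD * oH * oW
  assert(static_cast<int64_t>(input.size()) == batch * in_spatial * channels);
  assert(input.size() == indices.size());

  std::vector<float> output(batch * out_spatial * channels, 0.0f);
  const int64_t numel = static_cast<int64_t>(input.size());
  for (int64_t i = 0; i < numel; ++i) {
    const int64_t c = i % channels;                 // channel is innermost
    const int64_t n = i / (channels * in_spatial);  // batch is outermost
    const int64_t idx = indices[i];
    assert(idx >= 0 && idx < out_spatial);
    // Scatter straight into the channels-last output; no layout conversion.
    output[(n * out_spatial + idx) * channels + c] = input[i];
  }
  return output;
}
```

Because the channel is the innermost dimension on both sides, neighbouring input elements write to neighbouring output addresses whenever they share a spatial index, which is the access pattern a dedicated channels-last kernel can exploit.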

@chunhuanMeng
Contributor Author

| dtype | shape | op (us) | op (us) |
| --- | --- | --- | --- |
| bf16 | ['bfloat16[2, 32, 2, 2, 2]', 'int64[2, 32, 2, 2, 2]', [64, 64, 64], [32, 32, 32], [0, 0, 0]] | 52.656 | 50.048 |
| bf16 | ['bfloat16[4, 33, 1, 1, 1]', 'int64[4, 33, 1, 1, 1]', [64, 64, 64], [33, 33, 33], [0, 0, 0]] | 91.680 | 86.112 |
| bf16 | ['bfloat16[16, 32, 1, 1, 1]', 'int64[16, 32, 1, 1, 1]', [32, 32, 32], [32, 32, 32], [0, 0, 0]] | 94.080 | 68.864 |
| bf16_ChannelsLast | ['bfloat16[2, 32, 2, 2, 2]', 'int64[2, 32, 2, 2, 2]', [64, 64, 64], [32, 32, 32], [0, 0, 0]] | 92.336 | 83.440 |
| bf16_ChannelsLast | ['bfloat16[4, 33, 1, 1, 1]', 'int64[4, 33, 1, 1, 1]', [64, 64, 64], [33, 33, 33], [0, 0, 0]] | 194.256 | 164.464 |
| bf16_ChannelsLast | ['bfloat16[16, 32, 1, 1, 1]', 'int64[16, 32, 1, 1, 1]', [32, 32, 32], [32, 32, 32], [0, 0, 0]] | 144.768 | 82.864 |
| fp16 | ['float16[2, 32, 2, 2, 2]', 'int64[2, 32, 2, 2, 2]', [64, 64, 64], [32, 32, 32], [0, 0, 0]] | 52.320 | 50.768 |
| fp16 | ['float16[4, 33, 1, 1, 1]', 'int64[4, 33, 1, 1, 1]', [64, 64, 64], [33, 33, 33], [0, 0, 0]] | 90.816 | 88.928 |
| fp16 | ['float16[16, 32, 1, 1, 1]', 'int64[16, 32, 1, 1, 1]', [32, 32, 32], [32, 32, 32], [0, 0, 0]] | 93.392 | 67.280 |
| fp16_ChannelsLast | ['float16[2, 32, 2, 2, 2]', 'int64[2, 32, 2, 2, 2]', [64, 64, 64], [32, 32, 32], [0, 0, 0]] | 93.344 | 83.392 |
| fp16_ChannelsLast | ['float16[4, 33, 1, 1, 1]', 'int64[4, 33, 1, 1, 1]', [64, 64, 64], [33, 33, 33], [0, 0, 0]] | 191.936 | 169.408 |
| fp16_ChannelsLast | ['float16[16, 32, 1, 1, 1]', 'int64[16, 32, 1, 1, 1]', [32, 32, 32], [32, 32, 32], [0, 0, 0]] | 146.608 | 80.784 |
| fp32 | ['float32[2, 32, 2, 2, 2]', 'int64[2, 32, 2, 2, 2]', [64, 64, 64], [32, 32, 32], [0, 0, 0]] | 64.032 | 61.968 |
| fp32 | ['float32[4, 33, 1, 1, 1]', 'int64[4, 33, 1, 1, 1]', [64, 64, 64], [33, 33, 33], [0, 0, 0]] | 171.568 | 168.192 |
| fp32 | ['float32[16, 32, 1, 1, 1]', 'int64[16, 32, 1, 1, 1]', [32, 32, 32], [32, 32, 32], [0, 0, 0]] | 104.688 | 77.952 |
| fp32_ChannelsLast | ['float32[2, 32, 2, 2, 2]', 'int64[2, 32, 2, 2, 2]', [64, 64, 64], [32, 32, 32], [0, 0, 0]] | 103.712 | 97.056 |
| fp32_ChannelsLast | ['float32[4, 33, 1, 1, 1]', 'int64[4, 33, 1, 1, 1]', [64, 64, 64], [33, 33, 33], [0, 0, 0]] | 263.616 | 228.128 |
| fp32_ChannelsLast | ['float32[16, 32, 1, 1, 1]', 'int64[16, 32, 1, 1, 1]', [32, 32, 32], [32, 32, 32], [0, 0, 0]] | 155.472 | 91.984 |
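The two timing columns are easier to compare as a ratio. The helper below is only a reading aid: the comment does not say which column corresponds to which build, so the sketch just divides the first column by the second. The name `timing_ratio` is hypothetical.

```cpp
#include <cassert>

// Ratio of the first timing column to the second for one benchmark row.
// A value above 1.0 means the second column is the faster one.
double timing_ratio(double first_us, double second_us) {
  assert(first_us > 0.0 && second_us > 0.0);
  return first_us / second_us;
}
```

For the bf16_ChannelsLast row with input shape [16, 32, 1, 1, 1], `timing_ratio(144.768, 82.864)` is about 1.75, while for the contiguous bf16 row with input shape [2, 32, 2, 2, 2], `timing_ratio(52.656, 50.048)` is about 1.05.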

@@ -252,7 +252,7 @@ void max_unpooling3d_forward_kernel(
offsetZ);

int64_t work_group_size_w = 32;
- int64_t work_group_size_h = syclMaxWorkGroupSize(kfn) / 32;
+ int64_t work_group_size_h = 8;
Contributor

Should this be syclWorkGroupSizePerEU / work_group_size_w?
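The suggestion amounts to deriving the work-group height from a device-dependent work-item budget rather than hard-coding it. A minimal sketch in plain C++, where `work_item_budget` stands in for whatever device query the kernel ends up using; `GroupShape` and `make_group_shape` are hypothetical names, not helpers from this repository:

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

struct GroupShape {
  int64_t w;  // work-items along the width dimension
  int64_t h;  // work-items along the height dimension
};

// Derive the work-group height from a per-group work-item budget.
// `work_item_budget` is a placeholder for a device query result (assumption).
GroupShape make_group_shape(int64_t work_item_budget, int64_t width) {
  assert(work_item_budget > 0 && width > 0);
  const int64_t w = std::min(width, work_item_budget);           // never exceed the budget
  const int64_t h = std::max<int64_t>(1, work_item_budget / w);  // at least one row
  return {w, h};
}
```

With a budget of 256 work-items and a width of 32, this yields a height of 8, matching the hard-coded value in the diff; a larger or smaller budget scales the height accordingly.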

outputWidth,
output);

int64_t group_size = syclMaxWorkGroupSize(kfn);
Contributor

Why do we use WorkGroupSizePerEU for Unpool2d but MaxWorkGroupSize for Unpool3d?
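Whichever query supplies `group_size`, a one-dimensional launch still needs enough work-groups to cover every element. A minimal sketch of that rounding-up step, with the hypothetical name `num_groups_for`:

```cpp
#include <cassert>
#include <cstdint>

// Number of work-groups needed so that groups * group_size covers numel elements.
int64_t num_groups_for(int64_t numel, int64_t group_size) {
  assert(numel >= 0 && group_size > 0);
  return (numel + group_size - 1) / group_size;  // ceiling division
}
```

A smaller `group_size` gives more, smaller groups for the same tensor, so the choice between the two queries changes the launch configuration but not the amount of work covered.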


2 participants