
Aot compiler fix #9634

Merged: 2 commits merged into pytorch:main from aot_compiler_fix on Mar 26, 2025
Conversation

@mcr229 (Contributor) commented Mar 25, 2025

Summary

Changes:

  1. When initializing Llama2 for aot_compiler, checkpoints can only be downloaded from Hugging Face, so we initialize Llama2 with uninitialized weights. The problem is that during quantization, the histogram observer can error out if those uninitialized values are NaN. We fix this by initializing the weights with zeros when no checkpoint is provided, which ensures the quantization step still works (see the first sketch after this list).
  2. Quant Type in the AoT compiler. Among the model options available to XNNPACK, everything is currently quantized with per-tensor static quantization, which isn't the best option for all models. For example, transformer-based models like Llama and MobileBert would likely prefer dynamically quantized per-channel weights, whereas CNNs like MobileNet would prefer statically quantized per-channel weights. We add this Quant Type to the existing model options. This also helps with test timeouts: per-tensor static quantization on a model like Llama introduces many q/dq nodes and creates complex partitions, so proposing partitions takes a long time due to the repeated BFS used to find the largest possible partition. Specifying a more apt scheme, such as dynamic per-channel quantization, avoids this complexity (see the second sketch below).
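
For change (1), here is a minimal sketch of the zero-fill fallback. The `init_model_weights` helper is a hypothetical name for illustration, not the actual `export_llama` code path:

```python
import torch
import torch.nn as nn


def init_model_weights(model: nn.Module, checkpoint_path=None) -> nn.Module:
    """Load weights from a checkpoint if given; otherwise zero-fill every
    parameter so quantization observers never see NaNs left over from
    uninitialized memory."""
    if checkpoint_path is not None:
        model.load_state_dict(torch.load(checkpoint_path, map_location="cpu"))
    else:
        with torch.no_grad():
            for param in model.parameters():
                param.zero_()  # replaces any NaN garbage with 0.0
    return model


# Usage with a stand-in module (the real code path builds Llama2):
model = init_model_weights(nn.Linear(16, 16))
assert not any(torch.isnan(p).any() for p in model.parameters())
```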

Overall this should help with the flaky [nan, nan] errors in the quantization histogram, and it should also help with CI timing out.
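
For change (2), a minimal sketch of how a Quant Type option could select the quantization config for the XNNPACKQuantizer. The `QuantType` enum and `make_quantizer` helper are illustrative names only, and the import path reflects the torch.ao quantizer rather than this PR's exact plumbing:

```python
from enum import Enum

from torch.ao.quantization.quantizer.xnnpack_quantizer import (
    XNNPACKQuantizer,
    get_symmetric_quantization_config,
)


class QuantType(Enum):
    STATIC_PER_TENSOR = "static_per_tensor"      # previous one-size-fits-all default
    STATIC_PER_CHANNEL = "static_per_channel"    # CNNs such as MobileNet
    DYNAMIC_PER_CHANNEL = "dynamic_per_channel"  # transformers such as Llama, MobileBert


def make_quantizer(quant_type: QuantType) -> XNNPACKQuantizer:
    # With dynamic quantization, activation scales are computed at runtime,
    # so far fewer q/dq nodes are inserted into the graph statically.
    config = get_symmetric_quantization_config(
        is_per_channel=(quant_type is not QuantType.STATIC_PER_TENSOR),
        is_dynamic=(quant_type is QuantType.DYNAMIC_PER_CHANNEL),
    )
    return XNNPACKQuantizer().set_global(config)
```

A transformer model would then be prepared with make_quantizer(QuantType.DYNAMIC_PER_CHANNEL), while a CNN like MobileNet would use QuantType.STATIC_PER_CHANNEL.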

Test plan

OSS XNNPACK CI for all model delegation

cc @digantdesai @cbilgin

pytorch-bot commented Mar 25, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/9634

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure

As of commit ccf664e with merge base 4b8ac94:

NEW FAILURE - The following job has failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the "CLA Signed" label on Mar 25, 2025
@mcr229 added the "module: xnnpack" label on Mar 25, 2025
@mcr229 force-pushed the aot_compiler_fix branch from 10408cd to ccf664e on March 26, 2025 00:39
@mcr229 added the "release notes: xnnpack" and "release notes: examples" labels on Mar 26, 2025
@mergennachin (Contributor) left a comment:

Great!

@mcr229 (Contributor, Author) commented Mar 26, 2025

phi_4_mini CI tests are failing with:

.ci/scripts/test_model.sh: line 75: 30893 Killed                  "${PYTHON_EXECUTABLE}" -m examples.models.llama.export_llama --model "${MODEL_NAME}" -c examples/models/llama/params/demo_rand_params.pth -p examples/models/phi_4_mini/config.json

There is no error message in the run, so I don't think anything is actually failing; the test is just getting killed. Running it locally on my laptop, it seems to pass.

@mcr229 merged commit 91be93c into pytorch:main on Mar 26, 2025
249 of 254 checks passed
@mcr229 deleted the aot_compiler_fix branch on March 26, 2025 19:59
@jackzhxng (Contributor) left a comment:

Seems fine; it doesn't seem like this should have affected the phi test. If it's consistently getting killed, it might be OOMing.

Labels

ciflow/trunk
CLA Signed (This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.)
module: xnnpack (Issues related to xnnpack delegation and the code under backends/xnnpack/)
release notes: examples (Changes to any of our example LLMs integrations, such as Llama3 and Llava)
release notes: xnnpack (Changes to the XNNPack backend delegate)