
Aot compiler fix #9634

Merged: 2 commits merged into pytorch:main from aot_compiler_fix on Mar 26, 2025
Conversation

@mcr229 (Contributor) commented Mar 25, 2025

Summary

Changes:

  1. When initializing Llama2 for aot_compiler, checkpoints can only be downloaded from Hugging Face, so we initialize Llama2 with uninitialized weights. The problem is that during quantization, the histogram observer can error out if those uninitialized values are NaN. We fix this by initializing the weights with zeros when no checkpoint is provided, which ensures the quantization step still works (see the first sketch after this list).
  2. Quant Type in the AoT compiler. Among the model options available to XNNPACK, everything is currently quantized with per-tensor static quantization, which isn't the best option for all models. For example, transformer-based models like Llama and MobileBert would likely prefer dynamically quantized per-channel weights, whereas CNNs like MobileNet would prefer statically quantized per-channel weights. We add this Quant Type to the existing model options. This also helps with test timeouts: per-tensor static quantization on a model like Llama introduces many q/dq nodes and creates complex partitions, so proposing partitions takes a long time due to the repeated BFS used to find the largest possible partition. Specifying a more apt scheme, such as dynamic per-channel quantization, avoids this complexity (see the second sketch below).
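
For change (1), here is a minimal sketch of the zero-fill fallback. The `init_model_weights` helper is a hypothetical name for illustration, not the actual `export_llama` code path:

```python
import torch
import torch.nn as nn


def init_model_weights(model: nn.Module, checkpoint_path=None) -> nn.Module:
    """Load weights from a checkpoint if given; otherwise zero-fill every
    parameter so quantization observers never see NaNs left over from
    uninitialized memory."""
    if checkpoint_path is not None:
        model.load_state_dict(torch.load(checkpoint_path, map_location="cpu"))
    else:
        with torch.no_grad():
            for param in model.parameters():
                param.zero_()  # replaces any NaN garbage with 0.0
    return model


# Usage with a stand-in module (the real code path builds Llama2):
model = init_model_weights(nn.Linear(16, 16))
assert not any(torch.isnan(p).any() for p in model.parameters())
```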

Overall this should help with the flaky [nan, nan] errors in the quantization histogram, and it should also help with CI timing out.
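
For change (2), a minimal sketch of how a Quant Type option could select the quantization config for the XNNPACKQuantizer. The `QuantType` enum and `make_quantizer` helper are illustrative names only, and the import path reflects the torch.ao quantizer rather than this PR's exact plumbing:

```python
from enum import Enum

from torch.ao.quantization.quantizer.xnnpack_quantizer import (
    XNNPACKQuantizer,
    get_symmetric_quantization_config,
)


class QuantType(Enum):
    STATIC_PER_TENSOR = "static_per_tensor"      # previous one-size-fits-all default
    STATIC_PER_CHANNEL = "static_per_channel"    # CNNs such as MobileNet
    DYNAMIC_PER_CHANNEL = "dynamic_per_channel"  # transformers such as Llama, MobileBert


def make_quantizer(quant_type: QuantType) -> XNNPACKQuantizer:
    # With dynamic quantization, activation scales are computed at runtime,
    # so far fewer q/dq nodes are inserted into the graph statically.
    config = get_symmetric_quantization_config(
        is_per_channel=(quant_type is not QuantType.STATIC_PER_TENSOR),
        is_dynamic=(quant_type is QuantType.DYNAMIC_PER_CHANNEL),
    )
    return XNNPACKQuantizer().set_global(config)
```

A transformer model would then be prepared with make_quantizer(QuantType.DYNAMIC_PER_CHANNEL), while a CNN like MobileNet would use QuantType.STATIC_PER_CHANNEL.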

Test plan

OSS XNNPACK CI for all model delegation

cc @digantdesai @cbilgin

pytorch-bot commented Mar 25, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/9634

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure

As of commit ccf664e with merge base 4b8ac94:

NEW FAILURE - The following job has failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the "CLA Signed" label on Mar 25, 2025
@mcr229 added the "module: xnnpack" label on Mar 25, 2025
@mcr229 force-pushed the aot_compiler_fix branch from 10408cd to ccf664e on March 26, 2025 00:39
@mcr229 added the "release notes: xnnpack" and "release notes: examples" labels on Mar 26, 2025
@mergennachin (Contributor) left a comment:

Great!

@mcr229 (Contributor, Author) commented Mar 26, 2025

phi_4_mini CI tests are failing with:

.ci/scripts/test_model.sh: line 75: 30893 Killed                  "${PYTHON_EXECUTABLE}" -m examples.models.llama.export_llama --model "${MODEL_NAME}" -c examples/models/llama/params/demo_rand_params.pth -p examples/models/phi_4_mini/config.json

There is no error message in the run, so I don't think anything is actually failing; the test is just getting killed. Running it locally on my laptop, it seems to pass.

@mcr229 merged commit 91be93c into pytorch:main on Mar 26, 2025
249 of 254 checks passed
@mcr229 deleted the aot_compiler_fix branch on March 26, 2025 19:59
@jackzhxng (Contributor) left a comment:

Seems fine; it doesn't seem like this should have affected the phi test. If it's consistently getting killed, it might be OOMing.

Labels

ciflow/trunk
CLA Signed (This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.)
module: xnnpack (Issues related to xnnpack delegation and the code under backends/xnnpack/)
release notes: examples (Changes to any of our example LLMs integrations, such as Llama3 and Llava)
release notes: xnnpack (Changes to the XNNPack backend delegate)