[OV] Add --all-layers argument to CLI #713
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
optimum/commands/export/openvino.py
Outdated
    action="store_true",
    default=None,
    help=(
        "Whether embeddings and last MatMul layers should be compressed to a primary precision (usually, INT4)."
Is there a non-INT4 use case? I see below that, if it is provided, all_layers is set to None if is_int8 else self.args.all_layers, so it seems to be ignored for INT8. If it is only for INT4, it would be good to clarify that in the help message, and possibly to make it mutually exclusive with INT8.
This is for INT4 only.
Thanks for your comment! By default we compress those layers to INT8, even if the number of bits is set to 4. This flag allows compressing those layers to INT4 as well. I've updated the description; hopefully it is clearer now (since the only primary precision currently supported is INT4, I've rephrased it a bit).
LGTM, thanks @nikita-savelyevv
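For readers following the thread, the behavior discussed above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code in optimum/commands/export/openvino.py: the parser, the --bits option, the help wording, and the resolve_all_layers helper are all hypothetical names introduced here. Only the store_true/default=None flag definition and the "None if is_int8 else args.all_layers" rule are taken from the review comments.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical stand-in for the export CLI parser; not the real optimum-cli parser.
    parser = argparse.ArgumentParser(prog="export-sketch")
    # Hypothetical option used only to pick between 8-bit and 4-bit compression.
    parser.add_argument("--bits", type=int, choices=(4, 8), default=8)
    # Mirrors the flag definition quoted in the diff: a store_true flag whose
    # default is None, so "not passed" can be told apart from an explicit value.
    parser.add_argument(
        "--all-layers",
        action="store_true",
        default=None,
        help="Illustrative help text: also compress embeddings and last MatMul layers to INT4.",
    )
    return parser


def resolve_all_layers(args: argparse.Namespace):
    # Rule described in the review: the flag is ignored for INT8 and
    # forwarded unchanged for INT4.
    is_int8 = args.bits == 8
    return None if is_int8 else args.all_layers


if __name__ == "__main__":
    parser = build_parser()
    for argv in (["--bits", "4", "--all-layers"], ["--bits", "4"], ["--bits", "8", "--all-layers"]):
        print(argv, "->", resolve_all_layers(parser.parse_args(argv)))
```

Keeping default=None rather than False lets downstream code distinguish "flag not given" from an explicit choice, which is presumably why the INT8 branch can simply pass None.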
What does this PR do?
Add the --all-layers quantization argument to the openvino export CLI.
Before submitting