fix falcon dummy input generator #1825
Conversation
Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Thanks for the addition @eaidova
if normalized_config.new_decoder_architecture and normalized_config.multi_query:
    self.num_kv_heads = normalized_config.num_attention_heads
elif normalized_config.new_decoder_architecture:
    self.num_kv_heads = normalized_config.num_kv_heads
else:
    self.num_kv_heads = 1
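For context, the num_kv_heads value computed above drives the shapes of the dummy past_key_values tensors. Below is a minimal sketch of how a past-key-values generator might use it, assuming a (batch_size, num_kv_heads, sequence_length, head_dim) cache layout; the exact layout and names in optimum may differ.

```python
import torch

def dummy_past_key_values(batch_size, sequence_length, num_layers, num_kv_heads, head_dim):
    # One (key, value) pair per decoder layer; the head axis uses num_kv_heads,
    # which collapses to 1 for multi-query attention.
    shape = (batch_size, num_kv_heads, sequence_length, head_dim)
    return [(torch.zeros(shape), torch.zeros(shape)) for _ in range(num_layers)]
```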
shouldn't it be:

Suggested change, replacing

if normalized_config.new_decoder_architecture and normalized_config.multi_query:
    self.num_kv_heads = normalized_config.num_attention_heads
elif normalized_config.new_decoder_architecture:
    self.num_kv_heads = normalized_config.num_kv_heads
else:
    self.num_kv_heads = 1

with

if normalized_config.new_decoder_architecture:
    self.num_kv_heads = normalized_config.num_attention_heads
else:
    self.num_kv_heads = normalized_config.num_kv_heads if not normalized_config.multi_query else 1
?
elif normalized_config.new_decoder_architecture:
    self.num_kv_heads = normalized_config.num_kv_heads
else:
    self.num_kv_heads = 1
self.head_dim = self.hidden_size // self.num_attention_heads
also why not have this value set in the normalized_config directly and use it in both FalconDummyPastKeyValuesGenerator and ORTFalconForCausalLM @fxmarty? (removing the need to have this check in both places)
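A hypothetical sketch of that idea (the class name and wiring are illustrative, not the actual optimum API): compute num_kv_heads once on the normalized config so that FalconDummyPastKeyValuesGenerator and ORTFalconForCausalLM both read the same value.

```python
class NormalizedFalconConfig:  # illustrative stand-in, not optimum's real class
    def __init__(self, config):
        self.config = config

    @property
    def num_kv_heads(self):
        # Same branching as in this PR, kept in a single place.
        if self.config.new_decoder_architecture and self.config.multi_query:
            return self.config.num_attention_heads
        if self.config.new_decoder_architecture:
            return self.config.num_kv_heads
        return 1
```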
This PR has been marked as stale because it has been open for 90 days with no activity. This thread will be automatically closed in 30 days if no further activity occurs.
What does this PR do?
While enabling Falcon-40B export for OpenVINO, I found that this model is not covered by the dummy input generator configuration and fails with a shape mismatch. This PR fixes that.
I originally proposed these changes in optimum-intel, but the proper place for them is here.
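For illustration, the branch the generator takes can be read directly from the Falcon config flags. The snippet below builds two config styles (the values are assumptions, not the exact settings of any released checkpoint) and prints the flags the generator branches on.

```python
from transformers import FalconConfig

# Illustrative configs: a multi-query style model and a new-decoder (grouped-KV) style model.
mqa_style = FalconConfig(multi_query=True, new_decoder_architecture=False)
gqa_style = FalconConfig(multi_query=False, new_decoder_architecture=True, num_kv_heads=8)

for name, cfg in [("multi-query style", mqa_style), ("new-decoder style", gqa_style)]:
    print(name, cfg.new_decoder_architecture, cfg.multi_query, cfg.num_kv_heads)
```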
Who can review?