Support deepseek models export #1155

Merged: 12 commits, Feb 24, 2025

eaidova (Collaborator) commented Feb 11, 2025

What does this PR do?

Fixes # (issue)

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@eaidova added the openvino-test (Trigger OpenVINO slow tests) label Feb 11, 2025
@eaidova force-pushed the ea/deepseek branch 2 times, most recently from f8cb843 to b1a2950, February 11, 2025 05:52
@eaidova removed the openvino-test (Trigger OpenVINO slow tests) label Feb 13, 2025
eaidova (Collaborator, Author) commented Feb 14, 2025

@echarlaix @IlyasMoutawwakil could you please take a look?

IlyasMoutawwakil (Member) left a comment

LGTM

@@ -3575,6 +3575,299 @@ def __exit__(self, exc_type, exc_value, traceback):
block.self_attn.forward = block.self_attn._orig_forward


class DeepseekPatcher(DecoderModelPatcher):
Collaborator

deepseek-v3 will be included in transformers v4.50 (huggingface/transformers#35926); it could make sense to make sure everything is compatible before merging.

Collaborator, Author

The PR states "code relies heavily on original remote code."; it is exactly the same reference code that I used.

eaidova (Collaborator, Author) commented Feb 24, 2025

@IlyasMoutawwakil @echarlaix could you please merge if there are no additional comments?

@IlyasMoutawwakil merged commit 63bee4e into huggingface:main Feb 24, 2025
19 of 22 checks passed
@eaidova deleted the ea/deepseek branch February 24, 2025 12:56
4 participants