-
Notifications
You must be signed in to change notification settings - Fork 125
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add OpenVINO Tokenizers #513
Merged
Merged
Changes from 22 commits
Commits
Show all changes
26 commits
Select commit
Hold shift + click to select a range
94bc226
Convert tokenizers with openvino_tokenizers
slyalin 6bb395f
Update optimum/exporters/openvino/__main__.py
slyalin 7d16ec7
Refactor and Add Tests
apaniukov f0933ad
Fix t5 Test
apaniukov 24cc616
Add Warning
apaniukov 49337b0
Return Tests
apaniukov 7709043
Move export_tokenizer to convert.py
apaniukov dbd609b
Avoid Double Tokenizer Save
apaniukov 7e24f10
Fix Style
apaniukov 2cf460d
Refactor After Review
apaniukov 57782d1
Skip Tokenizers Tests If No Package Installed
apaniukov db668da
Merge branch 'main' into openvino_tokenizers
apaniukov e7cd70f
Style Fix
apaniukov 40cf117
Fix OV Tokenizers Check
apaniukov 901f48a
Fix Tests
apaniukov 1a76bd4
Add Missing return
apaniukov f2b2237
Turn off tokenizer message if not installed
apaniukov 3066ade
Merge branch 'main' into openvino-tokenizers
apaniukov 9a2e271
Merge branch 'main' into openvino-tokenizers
apaniukov 7ee347e
Move tokenizers to OV dependencies
apaniukov 8d2ec41
Check OV Compatibility
apaniukov 32a7274
Bump OV Version
apaniukov 8c029e0
Move OpenVINO Tokenizers To Optional Dependencies
apaniukov 09b067f
Add --convert-tokenizer Option to CLI
apaniukov f3b8ce8
Merge branch 'main' into openvino_tokenizers
apaniukov 3c27fbd
Fix SD Tokenizer
apaniukov File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would prefer to add an argument to trigger this export
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add
--convert-tokenizer
Option to CLIThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks a lot @apaniukov, after our internal discussion I'm OK with both solutions so we can either revert 09b067f or keep
convert-tokenizer
(+ some tests needs to be fixed), let me know what work best for you and will merge afterwardsThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, resolved merge conflicts and pushed a fix for the tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@apaniukov, consider creating a PR with removing the mandatory
--convert-tokenizer
. Instead of enabling conversion we probably need a key to disable conversion, something like--disable-convert-tokenizer
. There are options for how to deprecate the old option:I like option [2] considering I don't know the level of adoption of the existing key. But if there is evidence that nobody has started using
--convert-tokenizer
, option 3 is better, and if we are bold enough, go with option [1].There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#580